Glossary

Data Validation

What is Data Validation?

Data validation is an important step in artificial intelligence: it ensures the quality of input data before that data is used to develop models and insights. The data validation process typically includes the following checks:

  • Checking that the data is the right type: integer, date, string, Boolean, etc.
  • Checking the range and format of values: minimum/maximum bounds and proper format (spelling, phone numbers).
  • Checking for data validity: applying application-specific rules like valid part number, valid customer code, etc.
  • Checking for consistency: birth date is before death date, mother is older than child, etc.
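
The four checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular product implements validation: the record layout, the field names, the age bounds, the phone format, and the VALID_CUSTOMER_CODES set are all hypothetical.

```python
import re
from datetime import date

# Hypothetical reference data: customer codes the application accepts.
VALID_CUSTOMER_CODES = {"C001", "C002", "C003"}

# Hypothetical phone format: NNN-NNN-NNNN.
PHONE_PATTERN = re.compile(r"\d{3}-\d{3}-\d{4}")


def validate_record(record):
    """Return a list of problems found in one record (empty if it is clean)."""
    errors = []

    # Type check. bool is a subclass of int in Python, so exclude it explicitly.
    age = record.get("age")
    if not isinstance(age, int) or isinstance(age, bool):
        errors.append("age must be an integer")
    # Range check, applied only once the type is known to be correct.
    elif not 0 <= age <= 130:
        errors.append("age is out of range")

    # Format check: the whole string must match the expected pattern.
    phone = record.get("phone")
    if not isinstance(phone, str) or not PHONE_PATTERN.fullmatch(phone):
        errors.append("phone is not in NNN-NNN-NNNN format")

    # Validity check: an application-specific rule (known customer code).
    code = record.get("customer_code")
    if not isinstance(code, str) or code not in VALID_CUSTOMER_CODES:
        errors.append("unknown customer code")

    # Consistency check: birth date must not come after death date.
    birth = record.get("birth_date")
    death = record.get("death_date")  # None means no death date recorded
    if not isinstance(birth, date):
        errors.append("birth_date must be a date")
    elif death is not None:
        if not isinstance(death, date):
            errors.append("death_date must be a date")
        elif death < birth:
            errors.append("death_date is before birth_date")

    return errors
```

Returning a list of messages rather than stopping at the first failure lets a pipeline report every problem in a record at once. A record with a string age and a death date earlier than its birth date, for example, would produce one message for each failed check.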

Why is Data Validation Important?

AI and machine learning models can only produce valid results if they are built using valid data, so it’s critical to perform all of the steps listed above in the data integration phase to ensure models are working with clean data.

How C3.ai Enables Organizations to Perform Data Validation

C3.ai makes it easy to perform data validation as part of the C3 AI® Data Studio. The C3 AI Suite is a complete, end-to-end platform for designing, developing, deploying, and operating enterprise AI applications at industrial scale. C3 AI Data Studio is a set of visual tools to ingest disparate internal, external, and sensor data into a unified, federated image that can be used to design and explore the sources, structure, and content of the resultant C3 AI Data Model. Its data management and data explorer tools set up and provide access to integration pipelines and allow visual exploration and transformations to ensure the data is ready for analysis.