C3 AI Data Studio

C3 AI Data Studio provides data engineers, data scientists, and business analysts with a low-code / no-code environment to rapidly explore disparate data sources and model them into a unified, common, and extensible data image. Built using the model-driven architecture of the C3 AI Suite, C3 AI Data Studio provides over 25 pre-built connectors to commonly used data stores, file systems, queuing systems, and streaming systems to populate a unified data image using both data ingestion pipelines and data virtualization mechanisms.

C3 AI Data Studio enables users to configure and test data ingestion pipelines, inferring metadata from the source and offering AI-assisted auto-mapping. Users can build pipelines with the C3 AI Suite’s expression library or write custom transforms in JavaScript, Python, or R. C3 AI Data Studio also allows users to explore data models, set up validation checks to ensure data quality, and define access control policies.

Low-code / no-code platform for data integration​

  • Leverage C3 AI Data Studio’s low-code / no-code environment to set up continuous data ingestion pipelines to read data from external data stores and persist data into relational, key value, or file storage systems.​
  • Create reusable modular multi-step transforms between source and target data stores.​
  • Manually inject large datasets into ingestion pipelines for testing or persisting purposes.

Unified data image

  • Explore, navigate, and visualize your data on a unified data image using C3 AI Data Studio.​
  • Make your enterprise and external data available for consumption through a common presentation layer.​
  • Present your data uniformly, abstracted from all the complexities such as data type (relational vs. time series), data source (internal vs. external), or data store type (key value vs. object storage).​
  • Add data models either by manually defining the attributes or by automatically inferring metadata from existing sample files or external data sources.

Data virtualization​

  • Use C3 AI Data Studio’s data virtualization mechanism to leverage existing enterprise data lakes.​
  • Reduce data center and cloud hosting costs by minimizing data replication across enterprise data stores.​
  • View a unified data model abstracted from the underlying source system implementation. C3 AI Data Studio infers metadata from source systems and presents users with an extensible data model.

Enterprise catalog

C3 AI Data Studio enables developers to access all relevant metadata on data objects, features, and machine learning models through an enterprise catalog.​

  • Capture technical and business metadata about data and machine learning models via a flexible and powerful cataloging system​
  • Discover and trace lineage from source systems to machine learning models for an end-to-end view of your data​
  • Easily track where the data comes from, who owns it, and who uses it, to support data analysis and extended collaboration
  • Accelerate data discovery by applying custom tags to the metadata

Pre-built connectors​

C3 AI Data Studio provides over 25 pre-built connectors to access cloud and on-premises data sources without having to develop any custom integrations.

  • Databases and big-data stores including Snowflake, Impala, HBase, Postgres, CosmosDB, MongoDB, Oracle, AWS Redshift, SQL Server
  • Cloud applications including Salesforce, HubSpot​
  • Queue-based systems including Apache Kafka, Azure Event Hubs, Azure Topics, AWS Kinesis, AWS SQS
  • File systems including AWS S3, Azure Data Lake Store gen2, Azure Blob, HDFS, local file system​

Continuous data ingestion validation

  • Set up simple validation rules to ensure data quality on each C3 AI Model – check data type, nullability, and allowed values.
  • Leverage the expression engine to validate objects by their own data or by other correlated data inside the unified data image.​
  • Detect and report any unexpected transformation errors in the data ingestion pipeline for further investigation.
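
The three rule kinds named above (data type, nullability, allowed values) can be pictured with a short, generic Python sketch. This is not the C3 AI Data Studio API: FieldRule, validate_record, and the sample field names are hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical rule for a single field (not a C3 AI type)."""
    name: str
    expected_type: type
    nullable: bool = False
    allowed_values: Optional[FrozenSet[Any]] = None


def validate_record(record: Dict[str, Any], rules: List[FieldRule]) -> List[str]:
    """Return one message per violated rule; an empty list means the record is valid."""
    errors: List[str] = []
    for rule in rules:
        value = record.get(rule.name)  # a missing field is treated as null
        if value is None:
            if not rule.nullable:
                errors.append(f"{rule.name}: null not allowed")
            continue
        if not isinstance(value, rule.expected_type):
            errors.append(
                f"{rule.name}: expected {rule.expected_type.__name__}, "
                f"got {type(value).__name__}"
            )
            continue
        if rule.allowed_values is not None and value not in rule.allowed_values:
            errors.append(f"{rule.name}: {value!r} not in allowed values")
    return errors


# Assumed example rules for an illustrative meter-reading model.
RULES = [
    FieldRule("meter_id", str),
    FieldRule("status", str, allowed_values=frozenset({"active", "inactive"})),
    FieldRule("reading", float, nullable=True),
]
```

Calling validate_record on each incoming record and collecting the non-empty results gives a simple report of rows that need further investigation.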

Data exploration

  • Explore data across C3 AI Models through a common interface.​
  • Rapidly access and filter data regardless of the underlying system – source data stores, file storage systems, or object stores.

Extensive transformation engine​

  • Choose from more than 150 pre-built expressions to map source data to target data models.​
  • Define your own reusable custom expressions and share them across development teams.
  • Implement code-based transform methods in JavaScript, Python, or R to handle complex data transformations.
  • Leverage AI-assisted auto-mapping to reduce the time to create mappings between source data and the unified data image.
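
As a rough illustration of what a code-based transform might do, the Python sketch below maps one source row to a target shape. It does not use any C3 AI interface: the function name, the source columns (METER_NO, READ_TS, WH), and the target fields are all assumptions, as is the choice to treat source timestamps as UTC.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def transform_reading(source: Dict[str, Any]) -> Dict[str, Any]:
    """Map one hypothetical source row to a hypothetical target record."""
    # Assumption: source timestamps are naive strings that represent UTC.
    read_at = datetime.strptime(source["READ_TS"], "%Y-%m-%d %H:%M:%S")
    return {
        # Normalize the identifier: trim whitespace, upper-case.
        "meterId": source["METER_NO"].strip().upper(),
        # Emit an ISO 8601 timestamp with an explicit UTC offset.
        "timestamp": read_at.replace(tzinfo=timezone.utc).isoformat(),
        # Convert watt-hours to kilowatt-hours.
        "kwh": round(float(source["WH"]) / 1000.0, 3),
    }
```

A transform like this covers cases that a single pre-built expression may not, such as combining cleanup, unit conversion, and timestamp normalization in one step.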