Shortening the data integration process from weeks to days, connecting critical timeseries data for a manufacturing predictive maintenance application

When manufacturers fall behind on their production schedule, leading to delays in customer orders, it’s often due to equipment failure. Even a minor malfunction can throw plans into disarray.

This is where AI comes in: in particular, enterprise AI designed to predict maintenance issues and deliver accurate, timely alerts. Any robust predictive maintenance program needs a strong foundation of timestamped data. Without time series data, the application can’t monitor and report anomalies or incidents that may require immediate attention.

Take the case of one C3 AI customer, a large pharmaceutical manufacturer that wanted to build a predictive maintenance application to help monitor its facilities, predict repair rescheduling needs, and prevent unplanned downtime. Deploying this application quickly directly benefits the customer’s bottom line, and C3 AI can shorten a customer’s time-to-value from months to weeks.

The company relies on the Amazon Web Services stack to store its data. For facility-related time series data, it uses Amazon Timestream to host sensor measurements for all plants, because this database service provides an easy way to store and analyze the company’s high-velocity time series data. To build a predictive maintenance application for this customer, the C3 AI team therefore needed to connect the application, and the C3 AI Platform that hosts it, to the data stored in Amazon Timestream.

How the C3 AI Platform Accelerates Application Development

Typically, C3 AI can easily connect a customer’s database(s) to the C3 AI Platform using one of over 200 pre-built connectors. But sometimes a connector does not yet exist, and that was the case for this pharma company. To build the C3 AI Amazon Timestream Connector on the C3 AI Platform, the C3 AI team relied on platform features, including the proprietary model-driven architecture, to deliver results quickly.

The model-driven architecture built into the C3 AI Platform and the C3 AI Type System work together to accelerate application development so that customers can skip the typically arduous data preparation that other providers require. The C3 AI Type System is a feature built into the platform that lets different types of application components, including data, algorithms, and programs, work together seamlessly even if they are not written in the same language or based on the same model. This is what allows customers to connect and integrate any type of database or data in days rather than weeks.

How the Connector Improves Data-Retrieval Workflows

Beyond connecting a database to the C3 AI Platform, the C3 AI Amazon Timestream Connector provides other benefits. Data scientists often need to query data from different database systems, including both SQL and NoSQL ones. This can be challenging, as it requires learning a new language and writing complex code to join tables and transform data in data science notebooks. The C3 AI Amazon Timestream Connector simplifies this process by providing a single line of code that can query both types of databases. This eliminates the need for complex code and data preprocessing, which can significantly reduce the time it takes to perform exploratory data analysis (EDA) and train machine learning (ML) models.

For this customer, implementing the C3 AI Amazon Timestream Connector helped reduce the time it took to perform EDA and train an ML model from around a week to just two hours.

The custom-built C3 AI Amazon Timestream Connector not only leverages customers’ investment in the Amazon Timestream database but, more importantly, enables integration with the C3 AI Type System, accelerating the application development process and decreasing time to value for customers from years to months.

How We Built the C3 AI Amazon Timestream Connector

Although building a connector allowed the C3 AI Platform to easily access Timestream data, the first phase of this project still required data standardization, preparation, and unification, because the customer’s data was scattered across Timestream in multiple databases and tables for each plant. The goal was to build a unified, federated data image that provides consistency in data labeling and storage. Unifying the plant sensor data also gave all plant operators easy access to it, enabling greater collaboration, increasing productivity, and simplifying workflows. This phase also involved correlating the customer’s time series data with the manufacturing batch process records and work orders.

To bring the Timestream data into a single federated image, the connector made use of the C3 AI Type System, which provided a standard framework and a way to manage the metadata.

The C3 AI Type System allows you to make a single fetch call to retrieve the data instead of querying multiple tables. To improve performance across the system, all Amazon Timestream data is persisted into tables, and those tables are scattered across multiple databases; this means that the complexity of the ML model increases as you scale the AI application across the enterprise. A unified Type binds all of the tables together through metadata management. It hides the complexity of the source data models, which improves productivity for data scientists and simplifies the machine learning pipelines.
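The idea of one fetch call backed by table metadata can be illustrated with a small plain-Python sketch. This is not the C3 AI Type System API; `TableRef`, `PLANT_TABLES`, `build_query`, and `fetch_unified` are hypothetical names, the database and table names are invented, and the query assumes single-measure records with a `measure_value::double` column.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableRef:
    """Physical location of one plant's sensor data in Timestream."""
    database: str
    table: str


# Hypothetical metadata: plant id -> where that plant's sensor table lives.
PLANT_TABLES = {
    "plant_a": TableRef("plant_a_db", "sensors"),
    "plant_b": TableRef("plant_b_db", "line_sensors"),
}


def build_query(ref, lookback="24h"):
    """Build the SQL for a single physical table."""
    return (
        f'SELECT time, measure_name, measure_value::double AS value '
        f'FROM "{ref.database}"."{ref.table}" '
        f"WHERE time >= ago({lookback})"
    )


def fetch_unified(plant_ids, run_query, lookback="24h"):
    """Single entry point: query each plant's table and merge the rows.

    run_query is any callable that takes a SQL string and returns a list
    of dict rows. Callers never see database or table names.
    """
    rows = []
    for plant_id in plant_ids:
        ref = PLANT_TABLES[plant_id]
        for row in run_query(build_query(ref, lookback)):
            rows.append({"plant": plant_id, **row})
    # Return one time-ordered view across all plants.
    return sorted(rows, key=lambda r: r["time"])
```

Because the mapping lives in metadata, onboarding another plant in this sketch means adding one entry to `PLANT_TABLES` rather than changing the code that consumes the data.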

How We Built the Connector: A Step-by-Step Workflow

Configuration

AWS Python SDK

AWS provides a Python SDK that allowed the C3 AI team to easily configure access to and query the customer’s Amazon Timestream database. The C3 AI Platform supports Python natively and allows developers to seamlessly invoke any Python APIs using the AWS authentication framework.

Authentication

To authenticate to the Timestream database:

  1. Create an AWS Identity and Access Management (IAM) role: An IAM role is an identity with a set of permissions that trusted users, services, or accounts can assume. For Timestream, we needed to create an IAM role that allows access to the database.
  2. Establish a trust relationship for cross-account access: This step is necessary to allow an IAM role from another AWS account to access the Timestream database.
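As an illustration of the second step, a cross-account trust relationship is expressed as a trust policy attached to the role. The sketch below builds such a policy as a Python dictionary; `build_trust_policy` is a hypothetical helper and the account ID is a placeholder, not a value from this project.

```python
import json


def build_trust_policy(trusted_account_id):
    """Return a trust policy that lets principals in another AWS account assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


# 111122223333 is a placeholder account ID used only for illustration.
print(json.dumps(build_trust_policy("111122223333"), indent=2))
```

In practice the principal would usually be narrowed to a specific role or user in the trusted account rather than the whole account.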

With these two steps completed, the C3 AI Amazon Timestream Connector can connect to the database. The connector uses the AWS Security Token Service (AWS STS) to request temporary, limited-privilege credentials for AWS IAM users or for users that you authenticate (federated users).
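A minimal sketch of this flow with the AWS Python SDK (boto3) might look like the following. It is not the connector's actual code: the function names, the session name, the role ARN, and the region are all assumptions made for illustration.

```python
def temporary_credentials(sts_client, role_arn, session_name="timestream-connector"):
    """Assume the cross-account role and return keyword arguments for boto3.client."""
    response = sts_client.assume_role(RoleArn=role_arn, RoleSessionName=session_name)
    creds = response["Credentials"]
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }


def timestream_query_client(role_arn, region_name):
    """Create a Timestream query client that uses the temporary credentials."""
    import boto3  # imported here so the helper above can be tested without the SDK

    sts = boto3.client("sts")
    return boto3.client(
        "timestream-query",
        region_name=region_name,
        **temporary_credentials(sts, role_arn),
    )


# Example call (hypothetical role ARN and region):
# client = timestream_query_client(
#     "arn:aws:iam::111122223333:role/TimestreamReadOnly", "us-east-1"
# )
```

The credentials returned by STS expire, so a long-running process would need to call `assume_role` again before the expiration time.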

C3 AI Amazon Timestream Connector Implementor Type

C3 AI decided to implement the fetch method to retrieve data from Timestream because the method is highly resilient to failures; it is designed to handle temporary errors and AWS service issues. It’s also extremely customizable.

To implement this new connector, the C3 AI team used the AwsTimestreamConnector type, which:

  1. Authenticates with the Amazon Timestream database.
  2. Submits the Timestream queries using the AWS Python SDK runtime.
  3. Parses the queries and validates their syntax.
  4. Handles errors.
  5. Retries the queries using exponential backoffs if failures occur due to AWS service downtime or throttling issues.
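The query-submission step can be sketched as follows, assuming a boto3 `timestream-query` client. This is an illustration rather than the `AwsTimestreamConnector` implementation; `run_query` is a hypothetical helper, and it only handles scalar columns.

```python
def run_query(client, sql):
    """Run a Timestream query, follow pagination, and return rows as dicts."""
    rows = []
    next_token = None
    while True:
        kwargs = {"QueryString": sql}
        if next_token:
            kwargs["NextToken"] = next_token
        page = client.query(**kwargs)

        # Column names come from ColumnInfo; values arrive positionally in Data.
        names = [col["Name"] for col in page["ColumnInfo"]]
        for row in page["Rows"]:
            # Null cells carry no ScalarValue key, so they map to None.
            values = [cell.get("ScalarValue") for cell in row["Data"]]
            rows.append(dict(zip(names, values)))

        next_token = page.get("NextToken")
        if not next_token:
            return rows
```

Timestream returns scalar values as strings, so a real pipeline would also convert them using the type information in `ColumnInfo`.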

The fetch method is resilient because it retries failed queries according to configurable retry logic: you can specify the number of times to retry a query, the delay between retries, and the maximum backoff time. This lets the connector recover from temporary errors and still fetch the data from Timestream.

The retry logic is also designed to be aware of AWS service downtime and throttling issues, which helps to prevent the connector from overloading the service with too many requests.
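The retry behavior described above can be sketched in a few lines of Python. This is a generic exponential-backoff pattern under stated assumptions, not the connector's code: `TransientError` stands in for throttling or temporary service errors, and the parameter names and defaults are illustrative.

```python
import time


class TransientError(Exception):
    """Stand-in for a throttling or temporary service error."""


def fetch_with_retry(fetch, max_retries=5, base_delay=0.5, max_backoff=30.0,
                     sleep=time.sleep):
    """Call fetch(), retrying transient failures with exponential backoff.

    The delay doubles after each failed attempt and is capped at max_backoff.
    After max_retries retries, the last error is re-raised.
    """
    for attempt in range(max_retries + 1):
        try:
            return fetch()
        except TransientError:
            if attempt == max_retries:
                raise
            delay = min(max_backoff, base_delay * 2 ** attempt)
            sleep(delay)
```

Spacing retries further apart after each failure is what keeps a client from adding load to a service that is already throttling. Production implementations often add random jitter to the delay so that many clients do not retry at the same moment.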

How the C3 AI Platform and the Amazon Timestream Connector Work Together

The benefits of using the C3 AI Amazon Timestream Connector map to the benefits of using the C3 AI Platform — both accelerate time to production and application development.

A valued feature of the C3 AI Platform is the library of over 200 pre-built database connectors, which now includes the C3 AI Amazon Timestream Connector. These ready-to-use connectors allow developers to link and integrate legacy systems and databases quickly and easily.

Then, C3 AI’s data virtualization engine gives customers the ability to bypass the typically long process to extract, transform, and load (ETL) data into a new system. The virtualization engine helps retrieve data and process it in-memory without making multiple copies of it, significantly reducing the costs of ETL tools and data management overhead. In addition, the ML pipelines are source agnostic and allow you to seamlessly point to a new Timestream table or a new Timestream environment.

These are the types of features that help organizations build and deploy AI applications quickly with C3 AI.

Now that this connector is built, the company can onboard a new facility into the predictive maintenance workflow in a single day, where it previously took months.

This is just one example of how adaptable the C3 AI Platform is: custom database connectors can be built quickly for any type of legacy storage system using these functions and features. The underlying architecture of C3 AI’s products, designed with customers in mind, enables us to work with organizations to launch AI applications into production fast, ultimately helping customers pull value from their data through intelligent insights, such as predicting machine repair needs at a manufacturing facility, in a matter of weeks.