The problems that have to be addressed to enable today’s AI and IoT applications are nontrivial. Massively parallel elastic computing and storage capacity are prerequisites. These services are provided today at increasingly low cost by Microsoft Azure, AWS, and others. The elastic cloud is a major breakthrough that has dramatically transformed modern computing. In addition to the cloud, multiple data services are necessary to develop, provision, and operate AI and IoT applications.
Figure 1 depicts the array of capabilities and services necessary for the domain of AI and IoT applications. Each of these utilities represents a development problem comparable in scale to a relatively simple enterprise software application such as CRM. This is not a trivial problem. Let’s take a look at some of these requirements.
Reference Architecture of AI Suite
The successful development of AI and IoT applications requires a complete suite of tools and services that are fully integrated and designed to work together.
Data Integration: This problem has haunted the computing industry for decades. Prerequisite to machine learning and AI at industrial scale is the availability of a unified, federated image of all the data contained in the multitude of (1) enterprise information systems – ERP, CRM, SCADA, HR, MRP – typically thousands of systems in each large enterprise; (2) sensor IoT networks – SIM chips, smart meters, programmable logic arrays, machine telemetry, bioinformatics; and (3) relevant extraprise data – weather, terrain, satellite imagery, social media, biometrics, trade data, pricing, market data, etc.
Data Persistence: The data aggregated and processed in these systems includes every type of structured and unstructured data imaginable: personally identifiable information, census data, images, text, video, telemetry, voice, and network topologies. There is no “one size fits all” database that is optimized for all these data types. This results in the need for a multiplicity of database technologies including but not limited to relational, NoSQL, key-value stores, distributed file systems, graph databases, and blobs.
Platform Services: A myriad of sophisticated platform services are necessary for any enterprise AI or IoT application. Examples include access control, data encryption in motion, encryption at rest, ETL, queuing, pipeline management, autoscaling, multitenancy, authentication, authorization, cybersecurity, time-series services, normalization, data privacy, GDPR compliance, NERC-CIP compliance, and SOC 2 compliance.
Analytics Processing: The volumes and velocity of data acquisition in such systems are blinding and the types of data and analytics requirements are highly divergent, requiring a range of analytics processing services. These include continuous analytics processing, MapReduce, batch processing, stream processing, and recursive processing.
Machine Learning Services: The whole point of these systems is to enable data scientists to develop and deploy machine learning models. There is a range of tools necessary to enable that, including Jupyter Notebooks, Python, DIGITS, R, and Scala. Increasingly important is an extensible curation of machine learning libraries such as TensorFlow, Caffe, Torch, Amazon Machine Learning, and AzureML. An effective AI and IoT platform needs to support them all.
Data Visualization Tools: Any viable AI architecture needs to enable a rich and varied set of data visualization tools including Excel, Tableau, Qlik, Spotfire, Oracle BI, Business Objects, Domo, Alteryx, and others.
Developer Tools and UI Frameworks: An organization’s IT development and data science teams have each adopted and become comfortable with a set of application development frameworks and user interface (UI) development tools. An AI and IoT platform must support all of these tools – including, for example, the Eclipse IDE, vi, Visual Studio, React, Angular, RStudio, and Jupyter – or it will be rejected as unusable by the IT development teams.
Open, Extensible, Future-Proof: The current pace of software and algorithm innovation is blinding. The techniques used today will be obsolete in 5 to 10 years. An AI and IoT platform architecture must therefore provide the capability to replace any components with their next-generation improvements. Moreover, the platform must enable the incorporation of any new open source or proprietary software innovations without adversely affecting the functionality or performance of an organization’s existing applications. This is a level-zero requirement.
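One way to picture the replaceability requirement is a thin contract between applications and platform components. The Python sketch below is illustrative only and does not describe how the C3 AI Suite is implemented: the names `Store`, `InMemoryKeyValueStore`, `NextGenStore`, `Platform`, and `record_reading` are all hypothetical, and in-memory dictionaries stand in for real databases. It shows application code that depends only on a contract, so the component behind it can be swapped for a newer one without changing the application.

```python
from typing import Any, Dict, Protocol


class Store(Protocol):
    """Hypothetical minimal contract every storage component must satisfy."""

    def put(self, key: str, value: Any) -> None: ...

    def get(self, key: str) -> Any: ...


class InMemoryKeyValueStore:
    """Illustrative stand-in for an existing key-value store."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}

    def put(self, key: str, value: Any) -> None:
        self._data[key] = value

    def get(self, key: str) -> Any:
        return self._data[key]


class NextGenStore(InMemoryKeyValueStore):
    """Illustrative stand-in for a next-generation replacement.

    It honors the same contract and adds a write counter as its
    (hypothetical) new capability.
    """

    def __init__(self) -> None:
        super().__init__()
        self.writes = 0

    def put(self, key: str, value: Any) -> None:
        self.writes += 1
        super().put(key, value)


class Platform:
    """Hypothetical registry mapping a data type to a replaceable store."""

    def __init__(self) -> None:
        self._stores: Dict[str, Store] = {}

    def register(self, data_type: str, store: Store) -> None:
        # Registering again under the same data type replaces the component.
        self._stores[data_type] = store

    def store_for(self, data_type: str) -> Store:
        return self._stores[data_type]


def record_reading(platform: Platform, meter_id: str, kwh: float) -> None:
    """Application code: written against the contract, not a concrete store."""
    platform.store_for("telemetry").put(meter_id, kwh)


platform = Platform()
platform.register("telemetry", InMemoryKeyValueStore())
record_reading(platform, "meter-1", 4.2)

# Swap in the newer component; record_reading itself is unchanged.
platform.register("telemetry", NextGenStore())
record_reading(platform, "meter-1", 4.5)
```

The same registry could map different data types (telemetry, images, graph data) to different kinds of stores, which is one simple reading of the multiple-database requirement described under Data Persistence. The sketch deliberately ignores data migration: swapping the store here starts from an empty component, which a real platform would have to handle.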
To meet this extensive set of requirements, C3 AI has spent the last decade, and invested more than $800 million, developing and enhancing the C3 AI Suite. The C3 AI Suite has been refined, tested, and proven in the most demanding industries and production environments – electric utilities, manufacturing, oil and gas, and defense – involving petabyte-scale datasets from thousands of vastly disparate source systems, massive volumes of high-frequency time series data from millions of devices, and hundreds of thousands of machine learning models.