What is Enterprise AI?

 

Awash in “AI Platforms”

Industry analysts estimate that organizations will invest more than $250 billion annually in digital transformation software by 2025. According to McKinsey, companies will generate more than $13 trillion annually in added value from the use of these new technologies. This is the fastest-growing enterprise software market in history and represents an entire replacement market for enterprise application software.

Today the market is awash in open source “AI Platforms” that purport to be solutions sufficient to design, develop, provision, and operate enterprise AI and IoT applications. In this era of AI hype, hundreds of these products are on the market, and the number increases every day.

Figure 2

A Sea of “AI Platforms”

The market is awash with hundreds of open source components, each purporting to be an “AI platform.” Each component can provide value, but none provides a complete platform by itself.


Examples include Cassandra, Cloudera, DataStax, Databricks, AWS IoT, and Hadoop. AWS, Azure, IBM, and Google each offer an elastic cloud computing platform. In addition, each offers an increasingly innovative library of microservices that can be used for data aggregation, ETL, queuing, data streaming, MapReduce, continuous analytics processing, machine learning services, data visualization, etc.

They all appear to do the same thing and they all appear to provide a complete AI platform. While many of these products are useful, the simple fact is that none offers the scope of utility necessary and sufficient to develop and operate an enterprise AI or IoT application.

Consider Cassandra, for example. It is a wide-column data store, a special-purpose database that is particularly useful for storing and retrieving longitudinal data, like telemetry. For that purpose, it is an effective product. But that functionality represents perhaps one percent of the required solution. Likewise, HDFS is a distributed file system, useful for storing unstructured data. TensorFlow, an open source machine learning library published by Google, is useful in enabling certain types of machine learning models. Databricks provides distributed data processing, allowing data scientists or application developers to manipulate very large data sets across a cluster of computers. AWS IoT is a utility for gathering data from machine-readable IoT sensors. The point is: these utilities are all useful, but none is sufficient by itself. Each addresses only a small part of what is required to develop and deploy an AI or IoT application.
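To make the Cassandra example concrete, here is a minimal sketch of the one job such a store does well: keeping telemetry grouped by device and ordered by time so that a range of readings can be fetched quickly. This is a toy in-memory stand-in written for illustration, not Cassandra's API; the class name `TelemetryStore`, its methods, and the device names are all hypothetical.

```python
from bisect import bisect_left, bisect_right, insort
from collections import defaultdict


class TelemetryStore:
    """Toy stand-in for a partitioned, time-ordered data store (hypothetical)."""

    def __init__(self):
        # device id -> list of (timestamp, value) tuples, kept sorted by timestamp
        self._partitions = defaultdict(list)

    def write(self, device_id, timestamp, value):
        # Insert in sorted position so later range reads need no sorting.
        insort(self._partitions[device_id], (timestamp, value))

    def read_range(self, device_id, start, end):
        """Return readings for one device with start <= timestamp <= end."""
        rows = self._partitions.get(device_id, [])
        lo = bisect_left(rows, (start,))
        hi = bisect_right(rows, (end, float("inf")))
        return rows[lo:hi]


store = TelemetryStore()
store.write("pump-7", 30, 71.5)
store.write("pump-7", 10, 70.1)
store.write("pump-7", 20, 70.9)
store.write("fan-2", 15, 40.0)

print(store.read_range("pump-7", 10, 20))
```

Everything else an application needs (ingesting the sensor data, joining it with other enterprise data, training and serving models, security, and the user interface) lies outside this sketch, which is the point of the paragraph above.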

Moreover, these utilities are written in different languages, with different computational models and frequently incompatible data structures, developed by programmers of varying levels of experience, training, and professionalism. They were not designed to work together. Few, if any, were written to commercial programming standards. Most have not proven commercially viable, and their source code has been contributed to the open source community. The open source community is a kind of superstore in the cloud, with a growing collection of hundreds of programs whose source code anyone can download, modify at will, and use at no cost.