Last month we announced that AutoML is now available in C3 AI Ex Machina. Ex Machina has already provided tremendous value to companies looking to deliver insight on large datasets quickly, without wasting time on data wrangling and prep. Native ML capabilities have deepened these insights for many analysts, and AutoML makes machine learning with Ex Machina even more powerful, efficient, and accessible.

Building an ML model manually requires extensive domain expertise and significant time spent on trial and error. Selecting the training features, the model type, and the specific model configuration that optimizes accuracy is difficult even for expert data scientists. Automated machine learning takes over this tedious experimentation process, saving time and trouble for both business experts and data scientists. Within Ex Machina, AutoML comprises feature selection, hyperparameter optimization, and model selection.

Feature Selection

Feature selection is the process of determining which features in a dataset are the strongest indicators of the model output. When a dataset is high-dimensional, meaning it has a large number of features compared to the number of data points available (think a high number of columns vs. rows in a table), the model runs the risk of overfitting the training data. In other words, it will become overly tailored to one specific dataset and will not perform well on new, unseen data. A model is most effective when it is trained only with the features that contribute most heavily to the output. For a binary classification model that predicts whether a customer will churn, the customer’s monthly charges might be a highly relevant feature, while features like the customer’s age or gender may be less relevant. However, the relevant features for a model are not always obvious to the viewer, so automated feature selection helps produce an accurate model even when the dataset is hard to interpret.

In Ex Machina, users select the target variable (the variable the model should predict) and the scope of the training feature search (the features from which the model selects). Once the highest-performing model has been identified and trained, users can see the relative importance of each feature to the model outcome.
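
To make the idea concrete, here is a minimal sketch of one simple approach to feature selection: scoring each candidate feature by the absolute value of its correlation with the target and keeping the top few. This is an illustration only, not a description of how Ex Machina selects features. The function names, the toy churn data, and the choice of Pearson correlation as the score are all assumptions made for the example.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no signal
    return cov / sqrt(var_x * var_y)

def rank_features(columns, target):
    """Return (feature name, score) pairs, strongest indicator first."""
    scores = {name: abs(pearson(values, target))
              for name, values in columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

def select_top_k(columns, target, k):
    """Keep only the k features most strongly correlated with the target."""
    return [name for name, _ in rank_features(columns, target)[:k]]

# Toy data, made up for illustration: 1 = customer churned, 0 = stayed.
columns = {
    "monthly_charges": [20, 25, 30, 70, 80, 90, 85, 22],
    "age":             [34, 51, 29, 45, 38, 60, 27, 42],
    "tenure_months":   [48, 60, 36, 5, 3, 2, 8, 50],
}
churned = [0, 0, 0, 1, 1, 1, 1, 0]

selected = select_top_k(columns, churned, k=2)
for name, score in rank_features(columns, churned):
    print(f"{name}: {score:.2f}")
```

In this toy dataset, monthly charges and tenure track churn closely while age does not, so age is the feature that gets dropped. The ranked scores play a role similar to the relative feature importance that users see after training.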

Hyperparameter Optimization

Hyperparameters are configuration settings chosen before training that control how a model learns; tuning them well is what optimizes the model's accuracy. Examples of hyperparameters include the number of trees or depth of trees for decision tree models, regularization parameters for regression models, or the learning rate. The same model performs differently depending on the hyperparameters, so data scientists typically need to test the same model with many different combinations of hyperparameters to determine the best choice before they move on. AutoML, however, can optimize hyperparameters on several different models simultaneously. In Ex Machina, analysts can specify the breadth of the hyperparameter search, allowing them to save time by running the AutoML node through fewer hyperparameter values, or to execute a more exhaustive search through a wider range of hyperparameter values.
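
The trade-off between a narrow and a broad search can be sketched with a plain grid search. The example below is a simplified illustration under stated assumptions, not Ex Machina's search procedure: it fits a one-feature linear model by gradient descent, treats the learning rate and an L2 regularization strength as the hyperparameters, and scores each combination on a held-out validation set. The data, grids, and function names are hypothetical.

```python
from itertools import product

def train_linear(xs, ys, learning_rate, l2, steps=200):
    """Fit y ~ w * x by gradient descent with an L2 penalty on w."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / n + 2 * l2 * w
        w -= learning_rate * grad
    return w

def mse(w, xs, ys):
    """Mean squared error of the model y = w * x."""
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def grid_search(train, valid, grid):
    """Try every hyperparameter combination; keep the lowest validation MSE."""
    best = None
    for lr, l2 in product(grid["learning_rate"], grid["l2"]):
        w = train_linear(*train, learning_rate=lr, l2=l2)
        score = mse(w, *valid)
        if best is None or score < best["mse"]:
            best = {"learning_rate": lr, "l2": l2, "mse": score}
    return best

# Toy data, made up for illustration (roughly y = 2x).
train = ([1.0, 2.0, 3.0, 4.0], [2.1, 3.9, 6.2, 7.8])
valid = ([1.5, 2.5, 3.5], [3.0, 5.1, 6.9])

# A narrow search is faster; a broad search tries more combinations.
narrow_grid = {"learning_rate": [0.01], "l2": [0.0, 1.0]}
broad_grid = {"learning_rate": [0.001, 0.01, 0.1], "l2": [0.0, 0.1, 1.0]}

narrow_best = grid_search(train, valid, narrow_grid)
broad_best = grid_search(train, valid, broad_grid)
print("narrow:", narrow_best)
print("broad: ", broad_best)
```

Because the narrow grid here is a subset of the broad one, the broad search can never return a worse validation score; it just costs more training runs (nine instead of two in this sketch).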

Model Selection

The number of algorithms available to solve classification, regression, and clustering problems makes selecting the right one a challenging and iterative process. In Ex Machina, users first select the kind of problem they want to solve. Then, they narrow down the available algorithms as much as they want. Users with limited data science knowledge may want the AutoML node to search through every available model, while experienced data scientists might limit the search to a subset of algorithms they know are likely to perform well. Regardless of the scope of the model search, Ex Machina automatically returns the highest-performing model with hyperparameters already optimized.
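
As a rough sketch of what searching over algorithms involves, the example below scores a few candidate regression models with k-fold cross-validation and returns the one with the lowest error, optionally restricted to a user-chosen subset. The three candidate models, the toy data, and all names are assumptions chosen to keep the example small; they are not the algorithms or the procedure used by Ex Machina.

```python
def fit_mean(xs, ys):
    """Baseline: always predict the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Ordinary least squares with one feature and an intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / var_x
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept

def fit_nearest_neighbor(xs, ys):
    """Predict the target of the closest training point."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

CANDIDATES = {
    "mean_baseline": fit_mean,
    "linear_regression": fit_linear,
    "nearest_neighbor": fit_nearest_neighbor,
}

def cross_val_mse(fit, xs, ys, folds=4):
    """Average squared error over held-out folds (every folds-th point)."""
    errors = []
    for i in range(folds):
        train_x = [x for j, x in enumerate(xs) if j % folds != i]
        train_y = [y for j, y in enumerate(ys) if j % folds != i]
        held_out = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j % folds == i]
        predict = fit(train_x, train_y)
        errors.extend((predict(x) - y) ** 2 for x, y in held_out)
    return sum(errors) / len(errors)

def select_model(xs, ys, allowed=None):
    """Score each allowed candidate and return (best name, all scores)."""
    names = allowed if allowed is not None else list(CANDIDATES)
    scores = {name: cross_val_mse(CANDIDATES[name], xs, ys) for name in names}
    return min(scores, key=scores.get), scores

# Toy data, made up for illustration (roughly y = 3x + 1).
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [4.1, 6.9, 10.2, 12.8, 16.1, 19.0, 21.9, 25.2]

best_overall, all_scores = select_model(xs, ys)
best_restricted, _ = select_model(xs, ys, allowed=["mean_baseline", "nearest_neighbor"])
print(best_overall, all_scores)
print(best_restricted)
```

Passing `allowed` mirrors the choice described above: leaving it unset searches every candidate, while supplying a list limits the search to the algorithms the user already trusts.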

AutoML in Ex Machina offers varying levels of automation depending on the use case and the user’s background knowledge. A model can be trained more quickly with a smaller range of hyperparameter values and a narrower subset of algorithm choices, whereas a more complex project might warrant a broad search of all possible algorithms and values.

Finally, unlike other AutoML tools available on the market, C3 AI Ex Machina provides all the capabilities needed to generate, act on, and scale predictions, all within one interface. Ex Machina makes an extensive collection of data connectors available so that analysts can access the data they need in a matter of clicks, wherever it resides. Ex Machina also provides dozens of data blend, prep, and wrangling capabilities to speed up the process of making data ready for analysis. Analysts can communicate their findings to others quickly and effectively with dashboarding capabilities and connectors to their common business apps. And they can take action on predictions by publishing them to their other business applications.

All of these capabilities are provided in a single, unified, intuitive interface, so users stay in one product for all their no-code AI work. And they are all delivered via a cloud-native web app that business analysts can scale up and down on their own, without help from IT.

AutoML is a powerful technology that automates mundane data science work and makes a new class of analysis available to business experts. When delivered within a complete product that facilitates the entire no-code workflow, AutoML enables analysts to transform their operations and create immense business value for their organizations.

About the author

Matt Connor is a Senior Product Manager focused on C3 AI Ex Machina. Matt has worked on the platform product management team and the Ex Machina team since joining C3.ai in 2017 and has worked with clients across multiple industries to deliver AI solutions. Prior to C3.ai, Matt worked as a quantitative analyst at Makena Capital, a leading endowment-style investment firm. Matt earned his MBA from The Wharton School of the University of Pennsylvania and his BSE from Princeton University.