Overfitting

What is Overfitting?

Overfitting is the production of an analysis that corresponds too closely or exactly to a particular set of data, and therefore may fail to fit additional data or predict future observations reliably. In machine learning, it occurs when a model fits the training data so closely that it captures the noise and quirks of that particular sample rather than the underlying structure of the data. Such a model does not generalize to the actual trends in incoming real-world data and produces less accurate predictions.
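
The effect can be illustrated with a small sketch. It assumes a hypothetical data-generating process (a straight line plus Gaussian noise) and uses NumPy's polynomial fitting; the names make_data, mse, simple, and flexible are illustrative and not part of any product. A degree-9 polynomial has enough freedom to pass through all 10 training points, so its training error is essentially zero, yet it is scored on fresh data it has never seen.

```python
import numpy as np
from numpy.polynomial import Polynomial


def make_data(rng, n, noise=0.5):
    # Assumed process for illustration: y = 2x + 1 plus Gaussian noise.
    x = np.linspace(-1.0, 1.0, n)
    y = 2.0 * x + 1.0 + rng.normal(0.0, noise, size=n)
    return x, y


def mse(model, x, y):
    # Mean squared error of a fitted polynomial on the points (x, y).
    return float(np.mean((model(x) - y) ** 2))


rng = np.random.default_rng(0)
x_train, y_train = make_data(rng, 10)   # small training sample
x_test, y_test = make_data(rng, 200)    # fresh data from the same process

# A simple model (straight line) and a very flexible one (degree 9).
simple = Polynomial.fit(x_train, y_train, deg=1)
flexible = Polynomial.fit(x_train, y_train, deg=9)

results = {
    "simple": {"train": mse(simple, x_train, y_train),
               "test": mse(simple, x_test, y_test)},
    "flexible": {"train": mse(flexible, x_train, y_train),
                 "test": mse(flexible, x_test, y_test)},
}

for name, scores in results.items():
    print(f"{name:8s} train MSE = {scores['train']:.4f}  test MSE = {scores['test']:.4f}")
```

The flexible model reproduces the training points almost exactly, but its error on the fresh data is far higher than its training error. That gap between training and test performance is the usual symptom of overfitting; in setups like this one, the simpler model typically also scores better on the fresh data.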

 

Why is Overfitting Important?

Overfitting means that the model development process learned the particulars of the training data rather than a fit that accurately predicts future data. An overfit model can look accurate during development and still perform poorly once it is deployed. To reduce the risk of overfitting, try:

  • Selecting the right features to evaluate
  • Trying alternative modeling approaches, such as simpler models or regularization
  • Validating against a properly selected data set independent of the training data
  • Continuously evaluating models against new data
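
As one possible illustration of validating against data independent of the training set, the sketch below holds out part of a data set and uses it to choose among candidate models. The data, the polynomial model family, and the helper names holdout_split and select_degree are assumptions made for this example, not a prescribed method.

```python
import numpy as np
from numpy.polynomial import Polynomial


def holdout_split(x, y, val_fraction=0.25, seed=0):
    # Shuffle the data and set aside a validation set the model never trains on.
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_val = max(1, int(round(val_fraction * len(x))))
    val_idx, train_idx = order[:n_val], order[n_val:]
    return (x[train_idx], y[train_idx]), (x[val_idx], y[val_idx])


def select_degree(x, y, degrees, val_fraction=0.25, seed=0):
    # Fit one polynomial per candidate degree on the training portion only,
    # then keep the degree with the lowest error on the held-out portion.
    (x_tr, y_tr), (x_val, y_val) = holdout_split(x, y, val_fraction, seed)
    val_errors = {}
    for deg in degrees:
        model = Polynomial.fit(x_tr, y_tr, deg=deg)
        val_errors[deg] = float(np.mean((model(x_val) - y_val) ** 2))
    best = min(val_errors, key=val_errors.get)
    return best, val_errors


# Hypothetical noisy data: a straight line plus Gaussian noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=40)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.5, size=40)

best_degree, val_errors = select_degree(x, y, degrees=range(1, 10))
print("validation MSE by degree:", {d: round(e, 3) for d, e in val_errors.items()})
print("selected degree:", best_degree)
```

Because the validation points play no part in fitting, a model that merely memorizes the training sample gains no advantage in this comparison. Repeating the same check on newly arriving data is one simple way to keep evaluating a model after it is deployed.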

 

How C3.ai Enables Organizations to Avoid Overfitting

C3.ai provides a rich machine learning development environment – C3.ai ML Studio – as part of the C3 AI Suite to enable data scientists to develop, train, test, deploy, and operate ML models at scale. Functions like “experiment” and “model management” make it easier to avoid or correct for overfitting during each phase of the development process and to continually monitor the performance of deployed models to maintain and maximize accuracy over time.