Best Practices in Ongoing Operations

Setting your team up for success in prototyping is a necessary step toward positively impacting your business with AI-generated insights. Sustaining that initial success, however, requires a thoughtful approach to scaling and monitoring AI algorithms, ensuring user adoption, and tracking the business value generated. In this article, we outline best practices distilled from our years at C3 AI of operating and monitoring AI models in production at global scale: millions of models operating against ongoing data updates within single-environment instances.

AI as Part of the Software Development Process

Developing and maintaining complex AI use cases requires a sophisticated and rigorous approach that includes implementing a method to periodically improve the deployed algorithms, designing for and monitoring nuanced edge cases, creating a robust set of automated tests to prevent regressions, and gracefully alerting administrators to issues. Because the process to implement highly scalable solutions is analogous to modern software development and deployment, those processes can be used as a model for AI development.
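As a minimal sketch of what an automated regression test with administrator alerting might look like, the example below compares a candidate model's evaluation metrics against those of the deployed baseline. The function names, the metric names, the tolerance value, and the assumption that higher metric values are better are all illustrative; none of them come from a specific C3 AI API.

```python
def find_regressions(baseline, candidate, tolerance=0.01):
    """Return the metrics on which the candidate is worse than the baseline.

    Assumes higher is better for every metric (hypothetical convention).
    A metric missing from the candidate is treated as a regression.
    """
    regressions = []
    for name, baseline_value in baseline.items():
        candidate_value = candidate.get(name)
        if candidate_value is None or candidate_value < baseline_value - tolerance:
            regressions.append(name)
    return regressions


def format_alert(regressions):
    """Build an alert message; a real system would send it to administrators."""
    if not regressions:
        return "OK: no regressions detected"
    return "ALERT: candidate model regressed on: " + ", ".join(regressions)


# Illustrative numbers only, not measured results.
deployed = {"precision": 0.90, "recall": 0.80}
retrained = {"precision": 0.85, "recall": 0.81}

print(format_alert(find_regressions(deployed, retrained)))
# prints "ALERT: candidate model regressed on: precision"
```

Run as part of the automated test suite, a check like this could block a retrained model from replacing the deployed one until someone has reviewed the drop.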

Given the highly iterative nature of algorithm configuration and application logic development, we recommend that algorithm development and application development proceed in lockstep, using modern software development approaches.

A typical development process involves six steps to ensure that reliable, performant code is released to end users:

  1. Code reviews: Developers/data scientists review each other’s code to identify potential bugs and to steer solutions toward simple, elegant designs.
  2. Unit and integration testing: Developers/data scientists test new functions with existing programs to identify and resolve issues. Common issues arise when data input requirements of a new machine learning model do not match the available data format in existing programs.
  3. Generation of a release candidate: Once a candidate “green build” is generated that includes the required functionality and passes unit and integration tests, the software build is deployed to the QA environment.
  4. Quality assurance: QA testers use the QA environment to test the new functionality. Bugs are identified and prioritized to be resolved quickly.
  5. Testing in preproduction: After QA is complete, the program is promoted to a preproduction environment. Preproduction is the final validation step to ensure the new features and bug fixes are fully functional before they are released to all users.
  6. Production deployment: A final version of the program is released and available for end users.



Figure 37. The software development process requires code reviews, testing, release, QA, preproduction, and production phases.
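The input mismatch described in step 2, where a new machine learning model expects data in a format that existing programs do not supply, is the kind of problem an integration test can catch before a release candidate is built. The sketch below is one possible way to express such a check. The `Field` class, the `schema_mismatches` function, and the field names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One input the model requires: a field name and its expected Python type."""
    name: str
    dtype: type


def schema_mismatches(required, record):
    """Return a list of problems; an empty list means the record is compatible."""
    problems = []
    for field in required:
        if field.name not in record:
            problems.append(f"missing field: {field.name}")
        elif not isinstance(record[field.name], field.dtype):
            actual = type(record[field.name]).__name__
            problems.append(
                f"{field.name}: expected {field.dtype.__name__}, got {actual}"
            )
    return problems


# Hypothetical input requirements of a new model.
MODEL_INPUTS = [
    Field("sensor_id", str),
    Field("temperature_c", float),
    Field("timestamp", str),
]

# A record as an existing program might emit it: temperature arrives as text.
sample = {
    "sensor_id": "pump-17",
    "temperature_c": "71.3",
    "timestamp": "2024-01-01T00:00:00Z",
}

print(schema_mismatches(MODEL_INPUTS, sample))
# prints ['temperature_c: expected float, got str']
```

Returning a list of problems rather than raising on the first one lets a single test run report every mismatch between the model and its upstream data.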