By Peter Kowalchuk, Engagement Director, C3 AI; Chetan Tewari, Senior AI Solution Manager, C3 AI; Sina Pakazad, Vice President, Data Science, C3 AI; Henrik Ohlsson, Chief Data Scientist, Data Science, C3 AI
In the oil and gas industries, real-time monitoring is central to safe and efficient operations. From drilling and well management in real-time operations centers (RTOCs), to command centers overseeing hydraulic fracturing, to production teams tracking well performance, every stage of the production process generates massive, fast-moving streams of sensor data. The challenge lies in turning this flood of data into clear, reliable insights to drive decision making. This is especially critical when the correct course of action requires context, history, and expert judgment.
At C3 AI, we’ve engineered new AI architectures that overcome this challenge of context. Our newest advances, retrieval-augmented explainability (RAE) and the expert companion (EC), embed domain knowledge, situational awareness, and continuous learning directly into workflows, enabling operators in the oil and gas industry to make faster, safer, and more informed decisions.
In this blog, we’ll take a closer look at how RAE and EC work — and how they’re transforming decision making across mission-critical operations.
Beyond Alerts: Why Real-Time Monitoring Needs an AI Update
In most real-time monitoring operations, engineers track key metrics — torque, pressure, flow rate, rate of penetration in drilling, proppant concentration in fracturing, or production rates across wells — to spot anomalies or unusual trends. Time series models can flag issues, but alerts like “anomaly detected” or “trend deviation” rarely provide the actionable context needed to respond effectively.
For example, drilling dysfunction alerts such as vibration or stick-slip may stem from weight-on-bit changes or mud motor problems. A pressure drop during fracturing could indicate proppant bridging, a faulty sensor, or a shift in operations. In production, a decline in liquid output might point to a stuck plunger, a failed controller, or changes in wellhead pressure. Without clarity on root causes, teams are left interpreting signals on their own, which slows response times and increases risk. Moving beyond basic anomaly detection toward systems that explain the cause behind each alert enables RTOCs to improve both decision-making speed and overall effectiveness.
How C3 AI Reinvented Time-Series Decision Support
Retrieval-Augmented Explainability: Learning from the Past in Real Time
Retrieval-augmented explainability (RAE), a form of retrieval-augmented generation (RAG), goes beyond simply flagging anomalies to explain why they happened. It does this by retrieving historically similar time periods, enriched with expert annotations and operational context, helping teams understand the underlying patterns and causes behind anomalies. This approach contrasts starkly with traditional decision support methods that present the raw numbers without any interpretation or context.
In our drilling dysfunction use case, let’s say that a model detects abnormal vibration patterns on the drill string. Using RAE, the system immediately retrieves similar historical time windows where expert annotations identified the issue’s root cause as bit balling. These cases are presented with confidence scores that compare the current abnormal vibration to those historical cases, giving engineers both a clear, context-based recommendation and the reasoning behind it.
Explaining RAE in Action: Real-Time Monitoring with Context-Aware AI
- Real-time data is embedded in vector form using a foundation time series model.
As with retrieval-augmented generation (RAG), the process begins by chunking and embedding time series data into vector representations. In this setup, each time period, including real-time data as it streams in, is encoded using a foundation time series model and stored in a vector database. Real-time chunks are treated similarly to historical data but can also be compared against a dual-vector store architecture: one for general time series patterns and another for annotated segments curated by domain experts. This enables the system to dynamically retrieve not just similar operational states but also relevant contextual insights from annotated events, supporting real-time decision making with embedded historical knowledge.
- Top-k similar cases are retrieved from dual vector stores, both raw and expert-annotated.
During the retrieval phase, the system uses the embedded query, often derived from the most recent or incoming real-time data, to search across both the general time series vector store (containing raw time chunks) and the expert-annotated vector store. RAE then dynamically merges the results, returning the top-k most similar historical periods. This combined retrieval approach ensures that the output includes not only operationally similar patterns but also, when available, high-value domain expert annotations. The dual-vector store design enhances both accuracy and explainability, forming the backbone of RAE’s retrieval engine and enabling it to surface relevant precedents and expert insights in response to time-sensitive queries.
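To make the retrieval step concrete, here is a minimal Python sketch of a merged top-k search over two stores. It is an illustration under stated assumptions, not the RAE implementation: the `Chunk` type, the `retrieve_top_k` function, cosine similarity as the distance measure, and the simple pool-and-sort merge are all hypothetical choices, and the short hand-written vectors stand in for embeddings that would come from a foundation time series model.

```python
import math
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass
class Chunk:
    """One embedded time window (hypothetical record layout)."""
    chunk_id: str
    vector: Sequence[float]
    annotation: Optional[str] = None  # None for raw, un-annotated chunks


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_top_k(
    query: Sequence[float],
    raw_store: List[Chunk],
    annotated_store: List[Chunk],
    k: int = 3,
) -> List[Tuple[float, Chunk]]:
    """Score every chunk in both stores and return the k most similar."""
    pooled = list(raw_store) + list(annotated_store)
    scored = [(cosine(query, chunk.vector), chunk) for chunk in pooled]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Placeholder embeddings; real ones would come from the embedding model.
raw_store = [
    Chunk("raw-1", [1.0, 0.0, 0.0]),
    Chunk("raw-2", [0.0, 1.0, 0.0]),
]
annotated_store = [
    Chunk("ann-1", [0.9, 0.1, 0.0], annotation="bit balling"),
    Chunk("ann-2", [0.0, 0.0, 1.0], annotation="stick-slip"),
]

results = retrieve_top_k([1.0, 0.05, 0.0], raw_store, annotated_store, k=2)
for score, chunk in results:
    print(f"{chunk.chunk_id}: similarity={score:.3f}, annotation={chunk.annotation}")
```

A production system would use an approximate nearest-neighbor index rather than a full scan, and might weight annotated matches differently from raw ones; the sketch only shows that results from both stores end up in one ranked list.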
- Uncertainty is quantified and cases are re-ranked based on annotation probability and pattern similarity.
After retrieving the top-k most similar time periods, RAE passes them, along with their similarity scores and any associated annotations, to its uncertainty quantification module. This module estimates the probability that a given annotation is relevant to the current query period by evaluating how frequently each annotation appears across the retrieved cases, weighted nonlinearly by their similarity to the query. Using non-parametric methods, RAE generates a probabilistic profile for each annotation, enabling it to assess the confidence of its recommendations. The system compares this distribution against a uniform baseline to flag low-confidence outputs, which may signal knowledge gaps or operational regime shifts. Finally, RAE re-ranks the retrieved time periods based on both similarity and annotation probability, surfacing the most actionable and trustworthy insights. This approach ensures that downstream users and systems receive not just relevant information, but indicators to assess the confidence of the recommendations, a critical component for high-stakes operational decision making.
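One way to picture the scoring and re-ranking described above is the short Python sketch below. The specific choices are assumptions made for illustration and are not taken from RAE itself: an exponential weight `exp(similarity / temperature)` stands in for the nonlinear similarity weighting, a fixed `margin` over the uniform probability stands in for the baseline comparison, and the product of similarity and annotation probability stands in for the re-ranking score. Only annotated cases are considered.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Case = Tuple[float, str]  # (similarity to the query, expert annotation)


def annotation_probabilities(cases: List[Case], temperature: float = 0.1) -> Dict[str, float]:
    """Similarity-weighted frequency of each annotation, normalized to sum to 1."""
    weights: Dict[str, float] = defaultdict(float)
    for similarity, annotation in cases:
        # Exponential weighting: closer matches count far more than distant ones.
        weights[annotation] += math.exp(similarity / temperature)
    total = sum(weights.values())
    return {annotation: w / total for annotation, w in weights.items()}


def is_low_confidence(probs: Dict[str, float], margin: float = 0.2) -> bool:
    """Flag the result when the best annotation barely beats a uniform guess."""
    if not probs:
        return True
    uniform = 1.0 / len(probs)
    return max(probs.values()) - uniform < margin


def rerank(cases: List[Case], probs: Dict[str, float]) -> List[Case]:
    """Order cases by similarity multiplied by their annotation's probability."""
    return sorted(cases, key=lambda c: c[0] * probs[c[1]], reverse=True)


# The single closest case says "stick-slip", but two near matches say "bit balling".
cases = [(0.92, "stick-slip"), (0.90, "bit balling"), (0.88, "bit balling")]
probs = annotation_probabilities(cases)
ranked = rerank(cases, probs)
flagged = is_low_confidence(probs)

print(probs)
print(ranked)
print("low confidence:", flagged)
```

In this example the two "bit balling" cases outweigh the single closest "stick-slip" case, so they move to the top of the ranking, while the narrow gap between the two annotations still trips the low-confidence flag. The `temperature` and `margin` values are arbitrary and would need tuning against real data.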
- The anomaly is explained using verbalized, LLM-enhanced summaries and decision recommendations.
Once the relevant time periods are retrieved, scored, and re-ranked, RAE generates a natural language explanation to make the insights interpretable to engineers and operators. This explanation process is powered by an LLM, which takes in the top-ranked time chunks, associated annotations, and uncertainty scores to construct concise yet context-rich summaries. These summaries do more than describe what happened: they also highlight how the current scenario compares to known historical patterns, referencing expert annotations when available. Based on this understanding, the LLM also verbalizes decision recommendations grounded in domain knowledge, enabling the system to go beyond alerting to actively support human judgment. The result is a transparent, traceable explanation of the anomaly that improves user trust, accelerates situational awareness, and promotes confident action in time-sensitive operational environments.
This approach with RAE enables engineers to see not only that something is wrong, but also what it resembles, what experts concluded, and how confident the system is in that match.
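As a rough illustration of the explanation step, the Python sketch below assembles the ranked cases, their annotations, and their scores into a text prompt that could be handed to an LLM. The function name `build_explanation_prompt`, the prompt wording, and the tuple layout are hypothetical; no model is called, and the actual prompt design used by RAE is not described in this post.

```python
from typing import List, Tuple

# (similarity, expert annotation, annotation probability)
RankedCase = Tuple[float, str, float]


def build_explanation_prompt(
    signal: str,
    cases: List[RankedCase],
    low_confidence: bool,
) -> str:
    """Turn ranked historical cases into a plain-text prompt for an LLM."""
    lines = [
        f"An anomaly was detected in: {signal}.",
        "Similar historical periods, ranked by relevance:",
    ]
    for rank, (similarity, annotation, probability) in enumerate(cases, start=1):
        lines.append(
            f"{rank}. expert annotation: {annotation}; "
            f"similarity: {similarity:.2f}; "
            f"annotation probability: {probability:.2f}"
        )
    if low_confidence:
        # Pass the uncertainty flag through so the summary does not overstate the match.
        lines.append("Note: confidence is low. State this clearly and avoid a firm diagnosis.")
    lines.append(
        "Explain how the current period compares to these cases and recommend a next step."
    )
    return "\n".join(lines)


prompt = build_explanation_prompt(
    "drill string vibration",
    [(0.90, "bit balling", 0.60), (0.92, "stick-slip", 0.40)],
    low_confidence=True,
)
print(prompt)
```

Keeping the scores and the low-confidence note in the prompt is one way to make the generated summary traceable back to the retrieved cases.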
Expert Companion: Codifying Field Expertise at Scale
The expert companion (EC) takes RAE a step further by making expert knowledge an integral, evolving part of the monitoring loop. While RAE retrieves similar past events with expert annotations to explain current anomalies, the EC ensures that expert insights continue to grow and refine the system over time. In real-time monitoring, when experts encounter new anomalies or unusual operational events, they can annotate and interpret these occurrences with their domain expertise. The EC captures and structures these annotations, linking them with the associated time series patterns, operational context, and corrective actions taken. These enriched insights are then added to the retrieval base, allowing the system to learn from each event and decision. As a result, human expertise and AI work together to explain, contextualize, and respond to future anomalies with greater accuracy and relevance. The expert companion is a strong example of innovation for AI in oil and gas operations.
The EC serves as the critical feedback loop that drives continuous improvement of the entire monitoring system. By capturing expert actions, decisions, and annotations during real-time operations, the EC feeds this knowledge back into the system, refining detection models and retrieval processes over time. This creates a cycle where humans and AI work together, combining the adaptability and speed of AI with human judgment and expertise. As a result, the monitoring system becomes more accurate, context-aware, and insightful with each event it processes, driving safer and more effective decision making across drilling, fracturing, and production operations.
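The feedback loop can be sketched in a few lines of Python. This is an assumption-laden illustration rather than the EC's actual data model: the `ExpertAnnotation` fields, the `AnnotatedStore` class, and the example values (including the context keys) are all hypothetical names chosen to mirror the elements named above, namely the time series pattern, the operational context, and the corrective action taken.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Sequence, Set


@dataclass
class ExpertAnnotation:
    """One expert-labeled event (hypothetical record layout)."""
    window_id: str
    vector: Sequence[float]          # embedding of the annotated time window
    root_cause: str
    corrective_action: str
    context: Dict[str, str] = field(default_factory=dict)


class AnnotatedStore:
    """In-memory stand-in for the expert-annotated vector store."""

    def __init__(self) -> None:
        self._records: List[ExpertAnnotation] = []

    def add(self, record: ExpertAnnotation) -> None:
        """Add a record so later retrievals can surface it; reject empty labels."""
        if not record.root_cause.strip():
            raise ValueError("root_cause must not be empty")
        self._records.append(record)

    def root_causes(self) -> Set[str]:
        """Distinct root causes currently known to the store."""
        return {record.root_cause for record in self._records}

    def __len__(self) -> int:
        return len(self._records)


store = AnnotatedStore()
store.add(
    ExpertAnnotation(
        window_id="window-001",
        vector=[0.9, 0.1, 0.0],
        root_cause="bit balling",
        corrective_action="example corrective action entered by the expert",
        context={"operation": "drilling"},
    )
)
print(len(store), store.root_causes())
```

Each record added this way becomes a candidate in the next retrieval, which is how a newly annotated event can inform the explanation of a later, similar anomaly.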
Real-World Use Cases in Upstream Oil & Gas Operations
Let’s look at a few potential high-impact use cases where RAE and EC systems can transform operations using AI in energy production:




Putting AI into Practice in Oil & Gas
The upstream industry faces a clear challenge: the need for actionable, explainable insights delivered at critical moments during operations. However, there is an upside: the oil and gas industry has vast amounts of data that can be leveraged to produce these crucial insights. To evolve industrial operations, the next step is to fully integrate systems like RAE and EC into frontline upstream workflows such as drilling and production. This will give operators not just basic data collection, but real-time, intelligent decision support.
As operational complexity continues to increase, and experienced field personnel become scarcer, adopting and advancing these intelligent systems will be critical. The future of upstream operations depends on making insight accessible, timely, and trustworthy, and these systems turn data into decisions that empower teams on the ground.
Visit our industry page to learn more about C3 AI’s solutions for oil and gas customers.
About the Authors
Peter Kowalchuk is an AI Engagement Director at C3 AI, with over 20 years of experience in AI and data engineering, delivering enterprise-scale solutions across the energy and industrial sectors. He has led transformative AI and cloud programs at C3 AI, Microsoft, and Halliburton, driving operational efficiency and innovation. He holds an MBA from Texas A&M, a Master of Science in Data Science from CUNY, and a bachelor’s degree in electrical engineering.
Chetan Tewari is a Senior AI Solution Manager at C3 AI, with experience delivering enterprise-scale machine learning systems across the energy and public sectors. At C3 AI, he leads technical teams to architect and deploy high-impact AI solutions for Fortune 500 and government clients. He previously built AI programs within the Government of Alberta and developed data science platforms for renewable energy and drilling operations. Chetan holds a Master of Science in Analytics from the Georgia Institute of Technology, and a Master of Engineering in Petroleum Engineering from the University of Calgary.
Sina Khoshfetrat Pakazad is the Vice President of Data Science at C3 AI, where he leads research and development in Generative AI, machine learning, and optimization. He holds a Ph.D. in Automatic Control from Linköping University and a Master of Science in Systems, Control, and Mechatronics from Chalmers University of Technology. With experience at Ericsson, Waymo, and C3 AI, he has contributed to AI-driven solutions across healthcare, finance, automotive, robotics, aerospace, telecommunications, supply chain optimization, and process industries. His recent research has been published in leading venues such as ICLR and EMNLP, focusing on multimodal data generation, instruction-following and decoding from large language models, and distributed optimization. Beyond this, he has co-invented patents on enterprise AI architectures and predictive modeling for manufacturing processes, reflecting his impact on both theoretical advancements and real-world AI applications.
Henrik Ohlsson is the Vice President and Chief Data Scientist at C3 AI. Before joining C3 AI, he held academic positions at the University of California, Berkeley, the University of Cambridge, and Linköping University. With over 70 published papers and 30 issued patents, he is a recognized leader in the field of artificial intelligence, with a broad interest in AI and its industrial applications. He is also a member of the World Economic Forum, where he contributes to discussions on the global impact of AI and emerging technologies.


