Introducing a novel retrieval augmented generation approach for enterprises
By Graham Neubig, Associate Professor of Computer Science, Carnegie Mellon University
One question every enterprise asks nowadays is, “How can I use the power of generative AI to better access and use my data?” This can take many forms, but one of the most valuable is getting answers to important questions from the large amounts of data gathering dust in the stores of PDFs, Word documents, and rarely touched structured databases that most companies have in abundance.
In my work at Carnegie Mellon, I focus on fundamentals and applications of large language models (LLMs) to use cases such as question answering, software engineering, translation, and dialog. I have been working with C3 AI on its generative AI offerings, and I will share why, from a technical perspective, I find it to be a highly adept solution to a challenging yet important problem.
Using generative AI in enterprise-critical situations is no easy task. It requires a solution that overcomes a number of issues that emerge when using machine learning and natural language processing. These include:
Even when these hurdles have been overcome, there are also logistical hurdles in deploying these systems in enterprise settings.
The C3 Generative AI offering tackles these problems by accurately answering complex questions over large enterprise document collections, providing evidence for its reasoning, mitigating hallucinations, respecting privacy via access controls, and adapting easily to new settings. The C3 Generative AI product is a sophisticated retrieval-augmented generation (RAG) framework that retrieves relevant information in response to the user query, and then uses an LLM to generate coherent results. C3 AI incorporates a number of novel improvements over the alternatives, all aimed at making the system robust and applicable to enterprise use cases.
High-precision, controllable information retrieval over multi-domain data sources. The first step in C3 AI’s technical approach is ensuring high-precision information retrieval over large collections of structured datasets (including relational tables and sensor data) and unstructured documents, a difficult task when handling a variety of data formats and domain-specific vocabulary. This is achieved by incorporating state-of-the-art information retrieval methods that semantically match each part of the query to relevant parts of the retrieved documents. This is all backed by robust information extractors that parse and chunk passages from a variety of documents before indexing. This approach greatly reduces the number of hallucinations by ensuring that appropriate evidence is available to the model when it is generating results.
The underlying models can further be fine-tuned to specific enterprise use cases with the appropriate domain terminology. This is combined with the ability to rapidly engineer and automatically benchmark prompts designed for particular use cases.
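To make the chunk, index, and retrieve flow described above concrete, here is a minimal sketch. It is not C3 AI's implementation: the function names are hypothetical, and a toy bag-of-words cosine score stands in for the learned semantic matching the article describes, so it only matches on shared words.

```python
import math
import re
from collections import Counter


def chunk_words(text, size=40, overlap=10):
    """Split text into overlapping windows of `size` words (illustrative chunker)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks


def vectorize(text):
    """Toy stand-in for an embedding model: lowercase token counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = vectorize(query)
    return sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)[:k]
```

In a production setting the `vectorize` step would be replaced by a trained encoder and the sorted scan by a vector index; the overlap between windows is one common way to avoid cutting a relevant sentence in half at a chunk boundary.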
Furthermore, many of C3 AI’s enterprise customers require strict access controls over the types of information that can be retrieved, and such controls are a key component of the C3 AI Platform. For instance, retrieval can be restricted to adhere to access controls, ensuring no data is leaked to unauthorized viewers. There are also mechanisms for metadata filtering, which allow fresh information to be retrieved in preference to stale information, or otherwise encourage the model to pay attention to particular types of documents or authors when answering questions.
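One way to picture access-controlled retrieval with metadata filtering is to drop every passage the user may not see before any ranking happens. The sketch below assumes a hypothetical `Chunk` record with an `allowed_groups` set and an `updated` date; these fields and the function name are illustrative and are not taken from the C3 AI Platform.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Chunk:
    text: str
    allowed_groups: frozenset  # groups permitted to read this passage (assumed field)
    updated: date              # last-modified date used as freshness metadata (assumed field)


def visible_chunks(chunks, user_groups, not_before=None):
    """Keep only chunks the user may read, optionally dropping stale ones.

    Results are ordered newest first so fresher passages are considered first.
    """
    user_groups = frozenset(user_groups)
    kept = []
    for chunk in chunks:
        if not (chunk.allowed_groups & user_groups):
            continue  # user shares no group with this chunk: never expose it
        if not_before is not None and chunk.updated < not_before:
            continue  # older than the freshness cutoff
        kept.append(chunk)
    return sorted(kept, key=lambda c: c.updated, reverse=True)
```

Filtering before ranking, rather than after generation, means restricted text never reaches the LLM prompt in the first place.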
Complex reasoning over heterogeneous data. C3 AI’s technical approach makes it possible to answer complex questions where many other approaches fail. This is done by scaling up a new framework that C3 AI calls Read-Extract-Answer, inspired by the Demonstrate-Search-Predict paradigm.
C3 AI’s method works by gradually synthesizing information from multiple sources, retrieving new information as it becomes necessary to the model’s reasoning process. It’s particularly useful for the complex questions enterprises need to ask of their datasets.
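The article does not spell out the internals of Read-Extract-Answer, so the following is only a generic sketch of iterative retrieval: gather evidence, let a reader decide whether it can answer, and issue a follow-up query if it cannot. The `retrieve` and `read` callables and their return shapes are assumptions made for illustration.

```python
def iterative_answer(question, retrieve, read, max_hops=3):
    """Generic multi-hop retrieval loop (illustrative, not C3 AI's algorithm).

    retrieve(query) -> list of passage strings
    read(question, evidence) -> (answer or None, follow_up_query or None)
    Returns (answer or None, evidence gathered so far).
    """
    evidence = []
    query = question
    for _ in range(max_hops):
        # Accumulate new passages without duplicating earlier ones.
        for passage in retrieve(query):
            if passage not in evidence:
                evidence.append(passage)
        answer, follow_up = read(question, evidence)
        if answer is not None:
            return answer, evidence
        if follow_up is None:
            break  # the reader has no further query to try
        query = follow_up
    return None, evidence
```

Returning the accumulated evidence alongside the answer keeps the result traceable to the passages that supported it, and the hop limit bounds cost when a question cannot be answered from the collection.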
LLM-agnostic implementation. C3 AI’s method isn’t just a wrapper on top of a particular LLM, nor is it locked into a particular LLM ecosystem. This allows both the use of the latest open-source and commercial models, and implementation in air-gapped environments that provide the strongest privacy and security guarantees. C3 AI pre-trains and fine-tunes open-source models to maximize accuracy on each step of the reasoning process, and tailors the LLM for specific use cases.
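A common way to keep a pipeline independent of any one model is to code against a small interface and plug backends in behind it. The sketch below uses Python's `typing.Protocol`; the `TextGenerator` interface, the `EchoModel` placeholder, and the prompt wording are all hypothetical and do not describe C3 AI's actual API.

```python
from typing import Protocol


class TextGenerator(Protocol):
    """Anything with a generate(prompt) -> str method can serve as a backend."""

    def generate(self, prompt: str) -> str: ...


class EchoModel:
    """Placeholder backend for local testing: returns the last prompt line.

    A real backend would call a hosted or on-premises model here.
    """

    def generate(self, prompt: str) -> str:
        return prompt.splitlines()[-1]


def answer_with_evidence(model: TextGenerator, question: str, passages: list[str]) -> str:
    """Build an evidence-grounded prompt and delegate to whichever backend is supplied."""
    context = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    prompt = f"Use only the numbered passages.\n{context}\nQuestion: {question}"
    return model.generate(prompt)
```

Because `answer_with_evidence` depends only on the interface, swapping a commercial API for a locally hosted open-source model in an air-gapped deployment would mean writing one new backend class, with no change to the pipeline code.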
Automated evaluation framework. C3 AI rigorously benchmarks performance by using an automated evaluation framework based on three feedback loops. The innermost loop is fully automated and uses ML metrics to evaluate model accuracy. The second loop uses ground truth to measure accuracy against expectations. The third and outermost loop involves the LLM grading itself; this evaluation loop requires prompt engineering and periodic calibration with human assistance.
This evaluation framework assesses each individual component of the RAG system, including the retrieval and generative components, to identify errors.
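Scoring the retrieval and generation components separately can be sketched with two standard metrics. The article does not name the metrics C3 AI uses, so recall@k for retrieval and normalized exact-match accuracy for answers are assumptions chosen for illustration, as are the function names.

```python
def recall_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the relevant passages that appear in the top-k retrieved results."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    return len(set(retrieved_ids[:k]) & relevant) / len(relevant)


def normalize(text):
    """Lowercase and collapse whitespace so trivial formatting differences do not count as errors."""
    return " ".join(text.lower().split())


def exact_match_accuracy(predictions, references):
    """Correct answers divided by total questions, compared after normalization."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(normalize(p) == normalize(r) for p, r in zip(predictions, references))
    return correct / len(references)
```

Tracking the two numbers side by side helps locate a failure: low recall points to the retriever missing the evidence, while high recall with low answer accuracy points to the generation step.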
In a business intelligence and analysis use case for a large agricultural firm, C3 Generative AI achieved almost 90% accuracy [1] on a client-provided dataset that includes complex questions nested in dense tabular content in PDFs. Notably, this involved a quick ramp-up from a baseline approach that achieved approximately 12% accuracy to a highly performant system using C3 AI’s library of adaptation strategies.
C3 AI also deployed its rapid adaptation framework to a manufacturing application to analyze scores of complex technical manuals. Within a week, C3 AI configured a RAG pipeline that achieved more than 90% accuracy [2]; when the same technical questions were posed to OpenAI’s ChatGPT, it achieved just 37% accuracy.
C3 AI is innovating on the best methodology in generative AI to improve the ability of enterprises to better use their existing data. And this will serve as a solid platform for some of the things that are up next, including:
With strong expertise building enterprise-ready generative AI solutions and a broad array of domain-specific adaptation techniques, C3 AI has the ability to quickly build powerful RAG-based applications for a variety of enterprise use cases.
[1] Accuracy: Human evaluation of the number of correct answers generated by C3 AI’s RAG solution that align with expected answers divided by the total of 45 questions.
[2] Accuracy: Human evaluation of the number of correct answers generated by C3 AI’s RAG solution that align with expected answers divided by the total of 70 questions.