
Connecting LLMs with Proprietary Knowledge and Data

Have you ever wondered whether ChatGPT can converse about your company’s proprietary data and documents? LLMs (Large Language Models) such as ChatGPT and Bard are trained on immense amounts of information, but they are unlikely to know specifics about your company’s products, and (hopefully) weren’t trained on your internal documentation. However, there are many use cases where access to enterprise knowledge in a dialog with an LLM can prove beneficial, or even essential. To name a few:


  • Customer service–a chatbot with access to an extensive knowledge base can lead users to a solution in an interactive, conversational manner. Such interfaces can be more intuitive and user-friendly than reading user guides and support articles, and can deliver significantly faster time to resolution. 

  • Analytics and insights–an LLM can help formulate insights across large bodies of proprietary text and data that might otherwise require extensive preparation and analysis.

  • Internal knowledge systems–conversational access to internal documents can support rapid new-hire onboarding, ongoing training, and more efficient day-to-day operations. 


Enter RAG (Retrieval Augmented Generation). RAG works by dynamically retrieving relevant information from your company’s data stores (databases, document libraries, etc.) during a conversation with an LLM, and then generating responses based on both the retrieved information and the LLM’s existing knowledge. This approach allows for accurate, contextually relevant, and up-to-date responses, especially in highly specialized fields or rapidly changing environments. (Note: there are many ways to retrieve documents by relevance to a query. Vector databases such as Pinecone and embedding APIs such as those in Google Vertex AI specialize in this type of retrieval; this will be the subject of a future article.) 
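
To make the retrieval step concrete, below is a minimal sketch of relevance-based retrieval using embeddings and cosine similarity. It assumes the OpenAI Python client and numpy; the embedding model name, the example documents, and the retrieve_relevant helper are illustrative placeholders rather than any specific product’s API.

```python
# Minimal sketch: rank documents by semantic similarity to a query.
# Assumes the OpenAI Python client (>= 1.0) and numpy; the model name and
# the in-memory "document store" are illustrative assumptions only.
from openai import OpenAI
import numpy as np

client = OpenAI()  # reads OPENAI_API_KEY from the environment

documents = [
    "XR-8 User Guide: flash settings are under Menu > page 3 > Flash On/Off/Auto.",
    "Support case 1042: customer could not disable the flash on an XR-8.",
    "XR-5 Quick Start: charging and battery care.",
]

def embed(texts):
    """Return one embedding vector per input text."""
    resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return np.array([d.embedding for d in resp.data])

doc_vectors = embed(documents)  # in production these would live in a vector database

def retrieve_relevant(query, k=3):
    """Return the k documents most similar to the query."""
    q = embed([query])[0]
    # Cosine similarity between the query vector and each document vector.
    scores = doc_vectors @ q / (np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q))
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

print(retrieve_relevant("how can I disable the flash on the XR-8 model?"))
```

A dedicated vector database performs the same similarity search at much larger scale, with indexing that keeps lookups fast as the document store grows.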


An example in practice

Pixy, Inc. sells a range of digital cameras. It has an extensive knowledge base about the features and operation of these cameras, including user guides and previous support cases. 

  1. Jane enters a customer support chat with Pixy’s ChatGPT-powered chatbot, and asks “how can I disable the flash on the XR-8 model?”

  2. The chatbot queries the Pixy document store for content relevant to the user’s question, and retrieves the 3 documents with the most relevant information, including the XR-8 User Guide and some support cases involving flash operation. The content of these documents is added to the “context” of the chat, without showing them to the user (see the code sketch after this list). Now that this information is in ChatGPT’s context, it can converse about it in natural language, tailored to the user’s question.

  3. The bot replies, “you can disable the flash on the XR-8 by pressing the Menu button, scrolling to the 3rd page, and switching ‘Flash On/Off/Auto’ to Off. Does this answer your question?”

  4. The user can now ask follow-up questions about flash operation (which the chatbot may be able to answer based on the previous information retrieval), or may ask about other topics (which may trigger additional retrievals).
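
Putting these steps together, here is a minimal sketch of the chat loop: retrieve relevant documents, add them to the context, and generate a grounded reply. It reuses the hypothetical retrieve_relevant helper from the earlier sketch; the model name, prompt wording, and message structure are illustrative assumptions, not Pixy’s or ChatGPT’s actual implementation.

```python
# Minimal sketch of the RAG chat loop: retrieve, add to context, generate.
# Builds on the retrieve_relevant() helper sketched earlier; the model name
# and prompt wording are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

history = [
    {"role": "system",
     "content": "You are Pixy's support assistant. Answer using the provided documents."}
]

def answer(user_question):
    # 1. Retrieve the most relevant documents for this question.
    docs = retrieve_relevant(user_question, k=3)
    # 2. Add them to the context as a hidden message the user never sees.
    history.append({"role": "system",
                    "content": "Relevant documents:\n" + "\n---\n".join(docs)})
    # 3. Add the user's question and ask the model for a grounded reply.
    history.append({"role": "user", "content": user_question})
    resp = client.chat.completions.create(model="gpt-4o-mini", messages=history)
    reply = resp.choices[0].message.content
    # 4. Keep the reply in the history so follow-up questions have context.
    history.append({"role": "assistant", "content": reply})
    return reply

print(answer("How can I disable the flash on the XR-8 model?"))
print(answer("And how do I switch it back to Auto?"))  # follow-up reuses prior context
```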


Figure 1: Illustration of interactions in a RAG-enabled LLM

RAG vs. Fine Tuning

Why not just fine-tune ChatGPT on your data? Fine-tuning a model on specific datasets can indeed give an LLM specialized knowledge, but the approach has drawbacks. It is inherently static: fine-tuning bakes in a snapshot of information at a point in time, which can quickly become obsolete. Each fine-tuning run also demands significant computational resources, so keeping up with rapidly changing data can be costly. RAG, by contrast, is a dynamic, scalable alternative that can access the current state of your enterprise knowledge with trivial additional computational demands, ensuring relevance and recency by tapping into the latest data on every query.


“It’s the difference between an open-book and a closed-book exam… In a RAG system, you are asking the model to respond to a question by browsing through the content in a book, as opposed to trying to remember facts from memory,” says Luis Lastras, director of language technologies at IBM Research. When used properly, RAG can bring the natural language capabilities of an LLM to a variety of proprietary knowledge applications and support a broad range of outcomes, from better user experiences to increased productivity. 


To learn more about how RAG can benefit your products or knowledge platforms, contact us via the "Chat With Us" box on this site, or via email at info@developedby.ai.

 
 
 
