Unlocking AI's Potential: The Power of Retrieval-Augmented Generation


Generative Artificial Intelligence seems to have taken over the world. In the two short years since ChatGPT was released, institutions of all kinds have scrambled to accommodate the juggernaut of AI. Whether in educational systems, art collectives, or global supply chains, AI and its implications, particularly Large Language Models (LLMs), now find a place in public discourse. The customizability, convenience, and adaptability that AI provides are driving transformation across industries. It is no surprise, then, that in Gartner's poll of more than 1,400 executives, 55% of organizations reported piloting generative AI or running it in production (Gartner).


However, organizations must temper this pedal-to-the-metal approach to AI, because LLMs have repeatedly propagated misinformation and bias, creating public furor. A recent high-profile case came in February 2024, when a disgruntled customer took Air Canada to a tribunal over incorrect information its chatbot gave him about the airline's bereavement fares. Similarly, LLMs have been found to endorse eugenics and race-based medicine, with potentially disastrous consequences if integrated into healthcare (Omiye et al.). Consequently, the share of people who distrust AI now exceeds the share who accept it, at 35% versus 30%, respectively (2024 Edelman Trust Barometer).


With benefits and harms in such sharp conflict, executives must make an informed decision before integrating Generative AI into their systems. This delicate balancing act between ensuring compliance and keeping pace with innovation is the Gordian knot of many boardroom discussions. Where does the solution lie?


RAGs to Riches: Retrieval-Augmented Generation as a Solution


The two fundamental problems of LLMs are a lack of sources and a tendency to 'hallucinate.' GenAI can confidently proclaim that its information is 'based on verified sources' without ever naming one. This lack of veracity stems from the probabilistic nature of the text it generates: the model simply predicts the most likely next word in a sentence, with no built-in mechanism to confirm that the result is plausible. Hallucination, relatedly, occurs when the model asserts a nonsensical or factually incorrect statement. To mitigate these harms, engineers have developed a technique known as Retrieval-Augmented Generation (RAG). In technical terms, a RAG system converts an organization's information into numerical vectors and stores them in a vector database; when a query arrives, the most relevant passages are retrieved from that database so the LLM can ensure its response matches the information available. The best way to explain RAG is through a Venn diagram.


Suppose we imagine the 'base' LLM to be the universal set. RAG then provides a specialized 'subset' of information and ensures that the model consults that subset before composing any response to a user's prompt. Harking back to the Air Canada example, RAG would have forced the chatbot to check the company's actual pricing list before giving the consumer any information. In practical application, pairing LLMs with RAG has proven massively beneficial. Algo Communications, for example, was geared for massive growth but could not train customer service representatives at the pace its operations were scaling, so it began using GenAI to train new hires. Viewed through the lens of past failures, a plain LLM could have caused more trouble than good: what if the model fed an employee the wrong process because of a hallucination? Even worse, what if the impact of that false training reached the consumer? To mitigate these risks, Algo Communications implemented RAG. Here is what Ryan Zoehner, the company's VP of Commercial Operations, had to say about their first foray into this tandem system (Bendersky):


"In just two months after adding RAG, Algo's customer service team was able to more quickly and efficiently complete cases, which helped them move on to new inquiries 67% faster."
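The "check in with the subset before responding" behavior can be sketched as a simple guardrail: if nothing in the knowledge base matches the query closely enough, the system refuses rather than improvises. The function name, pricing entries, and word-overlap threshold below are all illustrative stand-ins, not anyone's production system.

```python
# Toy guardrail: answer only from the knowledge base, and refuse
# when no stored document covers the query well enough.
def grounded_answer(query: str, knowledge_base: list[str], min_overlap: int = 2) -> str:
    query_words = set(query.lower().split())
    # Pick the document sharing the most words with the query.
    best = max(knowledge_base, key=lambda doc: len(query_words & set(doc.lower().split())))
    if len(query_words & set(best.lower().split())) < min_overlap:
        return "I don't have verified information on that."
    return f"Per policy: {best}"

# Illustrative pricing "subset" a chatbot would be required to consult.
pricing = [
    "Bereavement fares: refund requests are accepted within 90 days of travel.",
    "Change fees: 75 dollars for domestic economy bookings.",
]

print(grounded_answer("what is the refund window for bereavement fares", pricing))
print(grounded_answer("tell me a joke", pricing))  # falls outside the subset: refused
```

A production guardrail would use embedding similarity rather than word overlap, but the shape is the same: retrieve, check coverage, and only then answer.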


At first glance, RAG looks like the perfect solution for businesses seeking a worry-free GenAI system. However, other factors still need to be taken into consideration. Even though RAG may reduce inaccurate responses, users still need some familiarity with prompt engineering to get the best possible answer for their needs. Furthermore, although a small, curated dataset helps keep answers accurate, it may narrow the scope of what your LLM can do.
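The vector-database mechanics described earlier can be sketched in a few lines of Python. Real systems use learned embedding models and dedicated vector stores; here a bag-of-words counter stands in for the embedding and a plain list for the database, so every name and document below is illustrative.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy embedding: bag-of-words counts. Production systems use
    # learned embedding models, but the retrieval logic is the same.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Standard cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# The "vector database": each document stored beside its vector.
documents = [
    "Checked baggage allowance is one bag up to 23 kg.",
    "Seat selection fees are non-refundable.",
    "Boarding begins 40 minutes before departure.",
]
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query: str, k: int = 1) -> list[str]:
    # Rank stored documents by similarity to the query vector.
    q = embed(query)
    ranked = sorted(index, key=lambda pair: cosine(q, pair[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

def build_prompt(query: str) -> str:
    # Ground the LLM by prepending the retrieved passages to its prompt.
    context = "\n".join(retrieve(query))
    return f"Answer using ONLY this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("what is the baggage allowance"))
```

The final prompt is what actually gets sent to the LLM: the retrieved passage travels with the question, which is what keeps the model's answer tied to the curated subset rather than its general training data.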


Applications of RAG and Future Scope


While there are some drawbacks, Retrieval-Augmented Generation has great scope across industries. As we saw with Algo Communications, companies can drastically reduce onboarding and training time. RAG also offers a more straightforward way to reach into company databases and surface policies and practices: instead of delving into archives or records, an executive could simply describe a scenario in a chat box and ask for the company's standard procedure for handling it.


In the future, a holistic implementation of Generative AI with checks and balances such as RAG can create massive value for organizations by streamlining access to information. A secondary benefit is that building RAG pipelines pushes companies to curate high-quality databases, which can then be repurposed for further applications. Marrying the continuous updating of RAG databases with the fluidity of LLMs may be the way forward for companies wary of GenAI's 'shoot from the hip' approach to information dissemination. Over time, it may also help people regain their trust in responsible AI practices and embrace the organizations that adopt them.





Works Cited


2024 Edelman Trust Barometer. Edelman, www.edelman.com/trust/2024/trust-barometer.

Bendersky, Ari. "RAG — the Hottest 3 Letters in Generative AI Right Now." Salesforce, 28 May 2024, www.salesforce.com/blog/what-is-retrieval-augmented-generation.

"Gartner Poll Finds 55% of Organizations Are in Piloting or Production." Gartner, 3 Oct. 2023, www.gartner.com/en/newsroom/press-releases/2023-10-03-gartner-poll-finds-55-percent-of-organizations-are-in-piloting-or-production-mode-with-generative-ai.

Omiye, Jesutofunmi A., et al. "Large Language Models Propagate Race-based Medicine." Npj Digital Medicine, vol. 6, no. 1, Oct. 2023, https://doi.org/10.1038/s41746-023-00939-z.

