If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves.

This is exactly what I’ve been trying to communicate in my org over the past few months. It’s 2024 and organizations still don’t have a proper search engine for finding relevant information across their various sources. While that problem remains unsolved, organizations are adopting RAG and AI into their tooling, but are missing the most important letter in RAG: the R, for Retrieval. I’ve been advocating for prioritizing search engines over any AI-related tool for a while now, and I found it refreshing to read the same argument somewhere else:

Imagine you’re a company that wants to build an LLM-powered documentation experience. If you think of a vector database as just providing an expanded memory to your language model, you might just embed all of your company’s product docs, and then let users ask questions to your bot. When a user hits enter, you do a vector search for their query, find all of the chunks, load them into context, and then have your language model try to answer the question. In fact, that’s the approach we initially took at Stripe when I worked on their AI docs product.
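The naive flow described above can be sketched in a few lines. This is an illustrative toy, not Stripe's actual implementation: the `embed` function here is a bag-of-words stand-in for a real embedding model (which would be an API or model call), and the example doc titles are made up.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model; a bag-of-words vector
    # keeps the sketch self-contained and runnable.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # The naive approach: embed every chunk and the query,
    # load the top-k most similar chunks into the LLM's context.
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]

docs = [
    "How to create a refund via the API",
    "Webhooks: verifying event signatures",
    "Testing payments with test card numbers",
]
context = retrieve("refund a payment", docs)
```

Note that `retrieve` always returns k chunks, however weak the match — which is exactly the failure mode discussed next: irrelevant chunks get loaded into context anyway.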

Ultimately though, I found that approach to be a dead-end. The crux is that while vector search is better along some axes than traditional search, it’s not magic. Just like regular search, you’ll end up with irrelevant or missing documents in your results. Language models, just like humans, can only work with what they have and those irrelevant documents will likely mislead them.

If you want to make a good RAG tool that uses your documentation, you should start by making a search engine over those documents that would be good enough for a human to use themselves. This is likely something your organization has considered before, and if it doesn’t exist it’s because building a good search engine has traditionally been a significant undertaking.

source
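A "search engine good enough for a human" usually starts with plain keyword ranking. Below is a minimal BM25 sketch — the classic lexical ranking function — written from scratch so it is self-contained; in practice you would reach for an existing engine (Elasticsearch, Meilisearch, SQLite FTS5, etc.), and the doc titles here are hypothetical.

```python
import math
from collections import Counter

def bm25_index(docs: list[str]):
    # Tiny in-memory index: per-doc term frequencies, document
    # frequencies, and average document length.
    toks = [d.lower().split() for d in docs]
    tfs = [Counter(t) for t in toks]
    df = Counter(term for tf in tfs for term in set(tf))
    avgdl = sum(len(t) for t in toks) / len(toks)
    return tfs, df, avgdl, len(docs)

def bm25_scores(query: str, tfs, df, avgdl, n, k1=1.5, b=0.75):
    # Standard Okapi BM25 scoring of every doc against the query.
    scores = []
    for tf in tfs:
        dl = sum(tf.values())
        s = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            f = tf[term]
            s += idf * f * (k1 + 1) / (f + k1 * (1 - b + b * dl / avgdl))
        scores.append(s)
    return scores

def search(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Unlike the naive vector retriever, non-matching docs score
    # zero and are dropped instead of being padded into the results.
    tfs, df, avgdl, n = bm25_index(docs)
    scores = bm25_scores(query, tfs, df, avgdl, n)
    ranked = sorted(zip(scores, docs), reverse=True)
    return [d for s, d in ranked if s > 0][:k]

docs = [
    "How to create a refund via the API",
    "Webhooks: verifying event signatures",
    "Testing payments with test card numbers",
]
results = search("refund api", docs)
```

Once a ranking like this is good enough for a human to find the right page, the same `search` call can serve as the R in RAG, optionally blended with vector similarity as a hybrid.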
