I'm trying to decide between:
Bleeding edge: Implementing something like LightRAG or GraphRAG.
Proven stack: Standard Hybrid Search (Weaviate/Elastic + Reranking) orchestrated by tools like Dify.
For those who have built RAG at this scale:
What is your preferred stack for 2025?
Is the complexity of Graph/LightRAG worth it over standard chunking/retrieval for this volume?
How do you handle maintenance and updates efficiently?
Looking for architectural advice and war stories.
Would the 10M documents be searched with a single vector search, or would they be pre-filtered by other columns in your table first? If some pre-filtering is happening, it naturally makes things faster. You will likely want to use regular text / tsvector-based search as well, and potentially feed those results to the LLM too, since vector search isn't perfect.
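To make the pre-filter plus hybrid idea concrete, here is a rough in-memory sketch, not a production setup. The `tenant` column, the `Doc` shape, and the toy keyword scorer are all made up for illustration; in practice the filter would be a WHERE clause and the keyword side would be tsvector or BM25. How to merge the two result lists isn't specified above, so this assumes reciprocal rank fusion (RRF), which is one common choice.

```python
import math
from dataclasses import dataclass


@dataclass
class Doc:
    doc_id: str
    tenant: str  # hypothetical metadata column used for pre-filtering
    text: str
    embedding: list[float]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def vector_rank(docs, query_vec, k):
    """Brute-force nearest neighbours; a real system would use an ANN index."""
    ranked = sorted(docs, key=lambda d: cosine(d.embedding, query_vec), reverse=True)
    return [d.doc_id for d in ranked[:k]]


def keyword_rank(docs, query, k):
    """Toy stand-in for tsvector/BM25: rank by how often query terms occur."""
    terms = query.lower().split()
    scored = [(sum(d.text.lower().count(t) for t in terms), d.doc_id) for d in docs]
    hits = [(score, doc_id) for score, doc_id in scored if score > 0]
    hits.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in hits[:k]]


def rrf(rankings, c=60):
    """Reciprocal rank fusion: each list contributes 1 / (c + rank) per document."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (c + rank)
    return sorted(scores, key=scores.get, reverse=True)


def hybrid_search(docs, tenant, query, query_vec, k=3):
    # Pre-filter first so both searches only touch the relevant slice.
    candidates = [d for d in docs if d.tenant == tenant]
    fused = rrf([vector_rank(candidates, query_vec, k),
                 keyword_rank(candidates, query, k)])
    return fused[:k]
```

The point of filtering before ranking is that both searches run over a much smaller candidate set, and documents outside the filter can never leak into the context.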
You would then decide whether to re-rank before handing the results to the final LLM context window. These days models are pretty good, so they will do their own re-ranking to some extent, but it depends a bit on cost, latency, and the quality of results you are looking for.
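As a rough illustration of keeping re-ranking optional, the sketch below packs passages into a fixed character budget and only reorders them if a reranker is supplied. The `build_context` name, the character budget, and the `overlap_reranker` toy scorer are assumptions for the example; a real setup would plug in a cross-encoder or a hosted rerank API with whatever signature it actually has.

```python
from typing import Callable, Optional, Sequence

# Hypothetical signature: takes the query and candidate passages,
# returns one relevance score per passage (higher is better).
Reranker = Callable[[str, Sequence[str]], Sequence[float]]


def build_context(query: str,
                  passages: Sequence[str],
                  max_chars: int,
                  reranker: Optional[Reranker] = None) -> list[str]:
    """Pick passages in order until the budget is spent; rerank first if asked."""
    ordered = list(passages)
    if reranker is not None:
        scores = reranker(query, ordered)
        pairs = sorted(zip(scores, ordered), key=lambda pair: pair[0], reverse=True)
        ordered = [passage for _, passage in pairs]
    picked, used = [], 0
    for passage in ordered:
        if used + len(passage) > max_chars:
            break  # budget exhausted; everything after this is dropped
        picked.append(passage)
        used += len(passage)
    return picked


def overlap_reranker(query: str, passages: Sequence[str]) -> list[int]:
    """Toy stand-in for a real reranker: counts words shared with the query."""
    query_words = set(query.lower().split())
    return [len(query_words & set(p.lower().split())) for p in passages]
```

With a tight budget the ordering decides what the LLM sees at all, which is where a reranker earns its cost; with a generous budget you can often skip it and let the model sort things out.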