AEGIS-RAG

See Project with Full Code on GitHub

🎯 The Objective

AEGIS-RAG is a production-grade Retrieval-Augmented Generation system designed for reliable question answering over policy documents.
It focuses on retrieval quality, ranking precision, and safe abstention to minimize hallucinations.

🏗️ The Architecture & Methodology

The system follows a three-stage pipeline:

  • Hybrid Retrieval: Dense (semantic) and BM25 (lexical) search, combined via Reciprocal Rank Fusion (RRF)
  • Reranking: A cross-encoder rescores the fused candidates to improve top-K relevance
  • Generation: An LLM answers from the retrieved context only, with a no-answer fallback

This separation enables independent optimization of retrieval, ranking, and generation.
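
As an illustration of the fusion step, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not the project's code: the function name `rrf_fuse`, the document IDs, and the constant `k=60` (a commonly used default) are assumptions chosen for the example.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60, top_n=None):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    where rank starts at 1. k=60 is an assumed, commonly used constant.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    fused = sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
    return fused[:top_n] if top_n is not None else fused


# Hypothetical results from the dense and BM25 retrievers.
dense_hits = ["doc_a", "doc_b", "doc_c"]
bm25_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf_fuse([dense_hits, bm25_hits], top_n=3)
print(fused)
```

Because RRF uses only ranks, it avoids having to normalize dense similarity scores and BM25 scores onto a common scale before combining them.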

📊 Key Performance Metrics

  • Recall@5: 0.85
  • nDCG@5: 0.80
  • Semantic Match: 0.75
  • Latency: ~0.73s (with reranking)

Reranking improves ranking accuracy but increases latency roughly 7× compared with running the pipeline without it.
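
For readers unfamiliar with the ranking metrics, the sketch below shows one common way to compute Recall@k and nDCG@k. The page does not state how the reported numbers were computed, so binary relevance labels and the helper names `recall_at_k` and `ndcg_at_k` are assumptions made for illustration.

```python
import math


def recall_at_k(retrieved, relevant, k=5):
    """Fraction of the relevant documents that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = len(set(retrieved[:k]) & set(relevant))
    return hits / len(relevant)


def ndcg_at_k(retrieved, relevant, k=5):
    """nDCG@k with binary relevance (assumed): gain is 1 for a relevant doc."""
    relevant = set(relevant)
    # Position i (0-based) is discounted by log2(i + 2).
    dcg = sum(
        1.0 / math.log2(i + 2)
        for i, doc_id in enumerate(retrieved[:k])
        if doc_id in relevant
    )
    # Ideal ranking places all relevant documents first.
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(i + 2) for i in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0


# Hypothetical query: two relevant documents, retrieved at ranks 1 and 3.
retrieved = ["d1", "d2", "d3", "d4", "d5"]
relevant = {"d1", "d3"}

print(recall_at_k(retrieved, relevant))
print(ndcg_at_k(retrieved, relevant))
```

Recall@k only asks whether relevant documents made it into the top k, while nDCG@k also rewards placing them higher, which is why reranking tends to affect the second metric more.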

💡 Core Insights & Business Impact

  • Retrieval quality is the main driver of performance
  • Reranking improves precision but reduces throughput
  • Safe abstention (declining to answer when the retrieved context is insufficient) effectively prevents hallucinations
  • Fine-tuning is not necessary for small, structured datasets
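
One way safe abstention could be implemented is to gate generation on the reranker's scores. The sketch below is an assumption, not the project's implementation: the threshold `min_score=0.3`, the fallback message, and the function names are hypothetical, and the scores are stand-ins for cross-encoder outputs.

```python
NO_ANSWER = "I could not find this in the policy documents."


def select_context(scored_passages, min_score=0.3, top_k=5):
    """Keep the top_k passages whose reranker score meets the threshold."""
    ranked = sorted(scored_passages, key=lambda pair: pair[1], reverse=True)
    return [text for text, score in ranked[:top_k] if score >= min_score]


def build_prompt_or_abstain(question, scored_passages, min_score=0.3):
    """Return (prompt, None) when context is usable, else (None, NO_ANSWER)."""
    context = select_context(scored_passages, min_score=min_score)
    if not context:
        # No passage is relevant enough: abstain without calling the LLM.
        return None, NO_ANSWER
    prompt = (
        "Answer using only the context below. "
        f"If the answer is not in the context, reply: {NO_ANSWER}\n\n"
        "Context:\n" + "\n".join(f"- {p}" for p in context) +
        f"\n\nQuestion: {question}"
    )
    return prompt, None


# Hypothetical reranked passages as (text, score) pairs.
passages = [("Employees accrue 20 vacation days per year.", 0.82),
            ("The office closes at 6 pm.", 0.05)]

prompt, fallback = build_prompt_or_abstain("How many vacation days?", passages)
print(prompt if prompt else fallback)
```

The threshold trades coverage for safety: a higher value abstains more often, so it would need tuning against labeled answerable and unanswerable questions.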

⚙️ Technical Stack

Python, LangChain, Chroma, SentenceTransformers, BM25, Cross-Encoders, Ollama, RAGAS
