Multi-Stage Ranking

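Multi-Stage Ranking is a retrieval architecture that passes a shrinking set of candidate documents through progressively more accurate and more expensive models. Cheap models handle the full collection and costly models see only the survivors, which is how the pipeline balances the “Efficiency vs. Effectiveness” trade-off.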

The Funnel Analogy

Searching millions of documents is like finding a needle in a haystack. You can’t use a microscope (expensive model) on every straw.

  1. First Stage (Retrieval): Use a leaf-blower (BM25 or DPR) to quickly clear away everything except the top 1000 candidates.
  2. Second Stage (Reranking): Use a magnifying glass (a neural reranker such as monoBERT) to find the best 100 of those.
  3. Final Stage (Optional): Use a microscope (a heavy LLM) to pick the final top 10.
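
The funnel above can be sketched as a short cascade in which each stage rescores only the survivors of the previous one. This is a minimal illustration, not a production pipeline: `cascade`, `top_k`, `cheap_overlap`, and `phrase_aware` are hypothetical names, and the two toy scorers merely stand in for a fast first stage (such as BM25) and a slower reranker (such as monoBERT).

```python
from typing import Callable, List, Tuple

# A scorer maps (query, document) to a relevance score; higher is better.
Scorer = Callable[[str, str], float]


def top_k(query: str, docs: List[str], scorer: Scorer, k: int) -> List[str]:
    """Score every document with `scorer` and keep the k best."""
    ranked = sorted(docs, key=lambda d: scorer(query, d), reverse=True)
    return ranked[:k]


def cascade(query: str, corpus: List[str],
            stages: List[Tuple[Scorer, int]]) -> List[str]:
    """Run the stages in order; each sees only the previous stage's output."""
    candidates = corpus
    for scorer, k in stages:
        candidates = top_k(query, candidates, scorer, k)
    return candidates


def cheap_overlap(query: str, doc: str) -> float:
    """Toy first stage: number of distinct query terms found in the document."""
    return float(len(set(query.lower().split()) & set(doc.lower().split())))


def phrase_aware(query: str, doc: str) -> float:
    """Toy reranker: term overlap plus a bonus for an exact phrase match."""
    bonus = 1.0 if query.lower() in doc.lower() else 0.0
    return cheap_overlap(query, doc) + bonus


corpus = [
    "bm25 lexical retrieval baseline",
    "neural reranking with cross encoders",
    "reranking search results",
    "cooking pasta at home",
    "neural networks for image classification",
]

# Keep 3 candidates after the cheap stage, then 1 after the costlier stage.
result = cascade("neural reranking", corpus,
                 [(cheap_overlap, 3), (phrase_aware, 1)])
print(result)  # ['neural reranking with cross encoders']
```

The stage list is plain data, so adding an optional third stage only means appending another `(scorer, k)` pair.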

Standard Pipeline Structure

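| Stage          | Model type               | Documents handled | Speed                    | Quality |
|----------------|--------------------------|-------------------|--------------------------|---------|
| Retrieval      | BM25, DPR                | Millions → 1000   | Milliseconds             | Medium  |
| Reranking      | monoBERT, cross-encoders | 1000 → 100        | Hundreds of milliseconds | High    |
| Fine reranking | monoT5, LLMs             | 100 → 10          | Seconds                  | Highest |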
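
A back-of-the-envelope calculation shows why the funnel shape matters. The per-document costs below are assumptions chosen only to match the orders of magnitude in the table; they are not measurements of any particular system.

```python
# All costs are in microseconds and are illustrative assumptions.
FIRST_STAGE_US = 10_000        # assumed fixed cost of one index lookup
RERANKER_US_PER_DOC = 200      # assumed cost of one cross-encoder forward pass
LLM_US_PER_DOC = 20_000        # assumed cost of one LLM relevance judgment


def scoring_cost_us(num_docs: int, us_per_doc: int) -> int:
    """Total time to score `num_docs` documents one at a time."""
    return num_docs * us_per_doc


# Cascade: index lookup, then 1000 reranker passes, then 100 LLM judgments.
cascade_us = (
    FIRST_STAGE_US
    + scoring_cost_us(1_000, RERANKER_US_PER_DOC)
    + scoring_cost_us(100, LLM_US_PER_DOC)
)

# Alternative: run the reranker over the whole one-million-document corpus.
brute_force_us = scoring_cost_us(1_000_000, RERANKER_US_PER_DOC)

print(cascade_us / 1_000_000)      # 2.21 (seconds)
print(brute_force_us / 1_000_000)  # 200.0 (seconds)
```

Under these assumed numbers the full cascade takes about 2 seconds, while applying only the middle-stage model to every document would take over 3 minutes per query.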

Trade-offs: Efficiency vs. Effectiveness

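  • Latency: The reranker's per-document cost limits how many candidates fit in the latency budget, so a slower reranker forces the first stage to pass on fewer documents.
  • Recall: If the first stage misses a relevant document, the reranker can never recover it, so first-stage recall is a ceiling on the quality of the final list.
  • Cost: Running a Transformer over 1000 candidates means 1000 forward passes per query, which is computationally expensive.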
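
The latency and recall points can be made concrete with two small helpers. This is a sketch under stated assumptions: `max_candidates` and `recall_at_k` are hypothetical names, and the budget, timings, and document IDs are made up for illustration.

```python
from typing import List, Set


def max_candidates(budget_ms: int, first_stage_ms: int,
                   rerank_ms_per_doc: int) -> int:
    """How many documents the reranker can score within the latency budget."""
    remaining = budget_ms - first_stage_ms
    return max(remaining // rerank_ms_per_doc, 0)


def recall_at_k(ranked_ids: List[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant documents that appear in the top k."""
    if not relevant:
        return 0.0
    hits = len(set(ranked_ids[:k]) & relevant)
    return hits / len(relevant)


# Latency: a 500 ms budget, 20 ms retrieval, and 2 ms per reranked document
# leave room for 240 candidates; a slower reranker would allow fewer.
print(max_candidates(500, 20, 2))  # 240

# Recall: the first stage returned five documents, but "d8" is relevant and
# was never retrieved, so no reranker can place it in the final list.
first_stage = ["d3", "d7", "d1", "d9", "d4"]
relevant = {"d1", "d4", "d8"}
ceiling = recall_at_k(first_stage, relevant, k=5)
print(round(ceiling, 3))  # 0.667
```

Whatever order the reranker imposes on those five candidates, the final list can contain at most two of the three relevant documents.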
