Cross-Encoder

A cross-encoder concatenates the query and document into a single input sequence for a transformer. Because self-attention runs over both sets of tokens at once, it can model fine-grained interactions between query and document tokens. This makes it highly effective but computationally expensive, since every query-document pair needs its own forward pass.

Input:  [CLS] query tokens [SEP] document tokens [SEP]
                    ↓
              [Transformer]
                    ↓
              [CLS] → relevance score
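
The input layout above can be sketched in a few lines of Python. This is only an illustration: `build_pair_input` and `max_len` are hypothetical names, whitespace splitting stands in for a real subword tokenizer, and the segment IDs (0 for the query part, 1 for the document part) follow the usual BERT convention rather than anything stated in these notes.

```python
from typing import List, Tuple


def build_pair_input(query: str, document: str, max_len: int = 32) -> Tuple[List[str], List[int]]:
    """Build [CLS] query [SEP] document [SEP] plus segment IDs.

    Whitespace tokenization is an assumption made for brevity.
    """
    q_tokens = query.lower().split()
    d_tokens = document.lower().split()

    # Three positions are reserved for [CLS] and the two [SEP] markers.
    budget = max_len - len(q_tokens) - 3
    if budget < 0:
        raise ValueError("query alone exceeds max_len")
    # Truncate the document, never the query.
    d_tokens = d_tokens[:budget]

    tokens = ["[CLS]"] + q_tokens + ["[SEP]"] + d_tokens + ["[SEP]"]
    # Segment 0 covers [CLS], the query and the first [SEP];
    # segment 1 covers the document and the final [SEP].
    segment_ids = [0] * (len(q_tokens) + 2) + [1] * (len(d_tokens) + 1)
    return tokens, segment_ids


tokens, segment_ids = build_pair_input("neural ranking", "Cross-encoders rerank documents")
print(tokens)
print(segment_ids)
```

Truncating only the document is one possible design choice; it keeps the full query visible to every attention layer.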

Scoring

$s(q, d) = \sigma\left(w^{\top} \cdot \mathrm{BERT}_{[\mathrm{CLS}]}([q; d])\right)$
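
As a rough sketch of the scoring head only, the formula can be written as a sigmoid over the dot product of a weight vector `w` and the [CLS] output vector. The transformer itself is not run here: `h_cls` is a hand-written stand-in for $\mathrm{BERT}_{[\mathrm{CLS}]}([q; d])$, and the 4-dimensional values are made up for illustration.

```python
import math
from typing import Sequence


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def relevance_score(w: Sequence[float], h_cls: Sequence[float]) -> float:
    """s(q, d) = sigmoid(w^T h_cls), with h_cls standing in for the [CLS] output."""
    if len(w) != len(h_cls):
        raise ValueError("w and h_cls must have the same dimension")
    logit = sum(wi * hi for wi, hi in zip(w, h_cls))
    return sigmoid(logit)


# Hypothetical values; a real model would use a much larger hidden size.
w = [0.5, -1.0, 0.25, 2.0]
h_cls = [1.0, 0.5, -2.0, 0.25]
score = relevance_score(w, h_cls)
print(score)
```

The sigmoid maps the unbounded logit to the interval (0, 1), so the score can be read as an estimated probability of relevance.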

Cross-Encoder vs Bi-Encoder

Property	Cross-Encoder	Bi-Encoder
Query-doc interaction	Full (self-attention)	None (independent encoding)
Effectiveness	Higher	Lower
Latency	High ($O(n)$ transformer forward passes per query for $n$ candidate documents)	Low (document representations pre-computed offline)
Use case	Reranking top-k	First-stage retrieval
Can pre-compute docs?	❌	✅

In Practice

Cross-encoders are too expensive for full-collection retrieval, so they are used as rerankers in Multi-Stage Ranking: a first stage retrieves the top-k candidates with BM25 or Dense Retrieval, and the cross-encoder then reranks only those k documents.
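
The retrieve-then-rerank pattern can be sketched as follows. Everything here is a simplified stand-in: `first_stage` ranks by raw term overlap instead of real BM25, and `toy_pair_scorer` is a hypothetical placeholder for a trained cross-encoder. The point is the control flow, in which the expensive pair scorer is called only on the k candidates and not on the whole collection.

```python
from typing import Callable, List


def first_stage(query: str, docs: List[str], k: int) -> List[int]:
    """Cheap retrieval: rank by number of shared terms (a stand-in for BM25)."""
    q_terms = set(query.lower().split())
    scored = [(len(q_terms & set(d.lower().split())), i) for i, d in enumerate(docs)]
    # Highest overlap first; ties broken by document index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:k]]


def rerank(query: str, docs: List[str], candidates: List[int],
           pair_scorer: Callable[[str, str], float]) -> List[int]:
    """Score each (query, document) pair jointly and reorder the candidates."""
    scored = [(pair_scorer(query, docs[i]), i) for i in candidates]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored]


def toy_pair_scorer(query: str, doc: str) -> float:
    """Hypothetical stand-in for a cross-encoder: share of doc tokens that are query terms."""
    q_terms = set(query.lower().split())
    d_tokens = doc.lower().split()
    if not d_tokens:
        return 0.0
    return sum(t in q_terms for t in d_tokens) / len(d_tokens)


docs = [
    "cross encoder reranking with transformers and many other unrelated words here",
    "cooking pasta at home",
    "cross encoder reranking",
    "encoder hardware for video",
]
query = "cross encoder reranking"

candidates = first_stage(query, docs, k=3)   # cheap pass over the whole collection
ranking = rerank(query, docs, candidates, toy_pair_scorer)  # expensive pass over k docs
print(candidates, ranking)
```

With a real cross-encoder in place of `toy_pair_scorer`, the cost per query is k forward passes, regardless of collection size.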

Appears In

IR-L05 - Neural IR Intro & Reranking
