Learning to Rank (LTR)

Learning to Rank applies machine learning to the problem of ranking. Instead of relying on a fixed scoring formula like BM25, LTR models are trained on historical relevance data (e.g., click logs or human judgments) to learn a ranking function that combines many features, often hundreds.

The Three Approaches

Pointwise: Treats ranking as a regression/classification problem. Predicts a score $s$ for a single $(q, d)$ pair.

Loss: $L (y, f (q, d))$ (e.g., MSE)

Pairwise: Focuses on the relative order of pairs of documents. Given $d_{i}$ and $d_{j}$, the model learns whether $d_{i} \succ d_{j}$, i.e., whether $d_{i}$ should rank above $d_{j}$ for query $q$.

Loss: $L (f (q, d_{i}), f (q, d_{j}), y_{ij})$ (e.g., the RankNet cross-entropy loss)

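Listwise: Optimizes a loss defined over the entire ranked list, typically a surrogate for IR metrics like NDCG or MAP, since these metrics are not differentiable in the scores themselves.

Loss: $L (\text{ranked list}, \text{ground truth})$ (e.g., LambdaMART)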
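To make the three loss shapes concrete, here is a minimal pure-Python sketch for a single query. The function names are illustrative and not taken from any library. The pairwise function is a RankNet-style logistic loss, and the listwise example computes NDCG as an evaluation metric rather than a trainable loss, assuming the common gain $2^{y} - 1$ and a $\log_2$ rank discount.

```python
import math

def pointwise_mse(scores, labels):
    """Pointwise: each (q, d) score is compared with its own label."""
    return sum((s - y) ** 2 for s, y in zip(scores, labels)) / len(scores)

def pairwise_logistic(scores, labels):
    """Pairwise (RankNet-style): penalize pairs where the more relevant
    document d_i is not scored clearly above the less relevant d_j."""
    losses = []
    for i in range(len(scores)):
        for j in range(len(scores)):
            if labels[i] > labels[j]:
                # log(1 + exp(-(s_i - s_j))) shrinks as s_i rises above s_j
                losses.append(math.log1p(math.exp(-(scores[i] - scores[j]))))
    return sum(losses) / len(losses) if losses else 0.0

def dcg(labels_in_rank_order):
    """Discounted cumulative gain; position 0 is the top of the list."""
    return sum((2 ** y - 1) / math.log2(pos + 2)
               for pos, y in enumerate(labels_in_rank_order))

def ndcg(scores, labels):
    """Listwise metric: quality of the whole list induced by the scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ideal = dcg(sorted(labels, reverse=True))
    return dcg([labels[i] for i in order]) / ideal if ideal > 0 else 0.0

labels = [2, 1, 0]              # graded relevance for three documents
good = [3.0, 2.0, 1.0]          # scores that agree with the labels
bad = [1.0, 2.0, 3.0]           # scores that invert the ideal order
print(pairwise_logistic(good, labels), pairwise_logistic(bad, labels))
print(ndcg(good, labels), ndcg(bad, labels))
```

The pointwise loss never looks at other documents, the pairwise loss only sees relative order within pairs, and NDCG depends on the full ordering, with mistakes near the top costing more.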

Beyond One Feature

Traditional IR is like a one-string guitar: it relies mostly on word overlap. LTR is an orchestra. It can use BM25 scores alongside signals such as PageRank, document age, user location, heading font size, and URL length. The ML model learns from data how to weight and combine these features, rather than relying on a hand-tuned “recipe”.
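As a rough illustration of combining several signals, the sketch below scores documents with a linear model over a small feature vector. The feature names and the weights are hypothetical. In a real LTR system the weights (or a nonlinear model such as a tree ensemble) would be learned from relevance data rather than written by hand.

```python
from dataclasses import dataclass

@dataclass
class Features:
    bm25: float          # lexical match score
    pagerank: float      # link-based authority
    doc_age_days: float  # freshness signal
    url_length: int      # crude quality signal

def to_vector(f: Features) -> list:
    return [f.bm25, f.pagerank, f.doc_age_days, float(f.url_length)]

# Hypothetical weights for illustration only; normally learned from data.
WEIGHTS = [1.0, 2.0, -0.01, -0.005]

def score(f: Features) -> float:
    """Linear combination of features: the simplest possible ranking function."""
    return sum(w * x for w, x in zip(WEIGHTS, to_vector(f)))

doc_a = Features(bm25=12.0, pagerank=0.30, doc_age_days=10, url_length=40)
doc_b = Features(bm25=12.5, pagerank=0.01, doc_age_days=400, url_length=120)

# doc_b has the higher BM25 score, but doc_a wins once other signals count.
print(score(doc_a), score(doc_b))
```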

Key Algorithms

LambdaMART: A Gradient Boosted Decision Tree (GBDT) approach, long regarded as a state-of-the-art method for feature-based LTR.
RankNet: An early neural pairwise approach.
LambdaLoss: A generalized framework for optimizing listwise metrics.
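The idea shared by LambdaRank and LambdaMART can be sketched as follows: take the RankNet pairwise gradient and scale it by how much NDCG would change if the two documents swapped positions. The code below only computes these per-document "lambda" values for one query. It assumes a sigmoid steepness of 1, returns them as forces (positive means "push this score up"), and omits the tree fitting that LambdaMART performs on them. The function name is illustrative.

```python
import math

def lambda_forces(scores, labels):
    """Per-document forces: RankNet pair gradients weighted by |delta NDCG|."""
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i], reverse=True)
    pos = {doc: p for p, doc in enumerate(order)}        # current rank of each doc
    gain = [2 ** y - 1 for y in labels]
    disc = [1.0 / math.log2(p + 2) for p in range(n)]    # discount per position
    ideal = sum(g * d for g, d in zip(sorted(gain, reverse=True), disc))
    forces = [0.0] * n
    if ideal == 0:
        return forces                                    # no relevant docs at all
    for i in range(n):
        for j in range(n):
            if labels[i] > labels[j]:
                # RankNet term: large when d_i is scored below d_j
                rho = 1.0 / (1.0 + math.exp(scores[i] - scores[j]))
                # NDCG change if d_i and d_j swapped their current positions
                delta = abs((gain[i] - gain[j])
                            * (disc[pos[i]] - disc[pos[j]])) / ideal
                forces[i] += rho * delta                 # push d_i up
                forces[j] -= rho * delta                 # push d_j down
    return forces

# The most relevant document (label 2) currently has the lowest score.
forces = lambda_forces(scores=[0.0, 1.0, 2.0], labels=[2, 0, 1])
print(forces)
```

Because every pair adds and subtracts the same amount, the forces sum to zero across the query; pairs whose swap would move NDCG the most dominate the update.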

Connections

Usage: Typically used as a reranker (Stage 2) after a fast model like BM25 (Stage 1) suggests candidates.
Features: Uses scores from BM25, Vector Space Model, and others as inputs.
Modern Context: Often integrated with BERT for IR in multi-stage pipelines.
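The two-stage arrangement can be sketched in a few lines. Here a simple term-overlap count stands in for BM25 in Stage 1, and the Stage 2 model is a hypothetical toy function; in practice it would be a trained LTR model. All names are illustrative.

```python
def first_stage(query, docs, k):
    """Stage 1: cheap candidate generation over the whole collection.
    Counts query-term occurrences (a stand-in for BM25) and keeps the top k."""
    q_terms = set(query.lower().split())
    scored = [(sum(tok in q_terms for tok in text.lower().split()), doc_id)
              for doc_id, text in docs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in scored[:k] if s > 0]

def rerank(query, candidates, docs, model):
    """Stage 2: apply the (more expensive) model only to the candidates."""
    return sorted(candidates, key=lambda d: model(query, docs[d]), reverse=True)

def toy_model(query, text):
    """Hypothetical reranker: number of distinct query terms matched."""
    return len(set(query.lower().split()) & set(text.lower().split()))

docs = {
    "a": "learning to rank with gradient boosted trees",
    "b": "rank rank rank rank spam page about rank",
    "c": "cooking pasta at home",
}
query = "learning to rank"
candidates = first_stage(query, docs, k=2)
final = rerank(query, candidates, docs, toy_model)
print(candidates, final)
```

Stage 1 is fooled by the repeated term in document "b", while Stage 2 corrects the order using a different signal. The expensive model only ever sees the k candidates, which is what keeps the pipeline fast.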

Appears In

IR-L10 - Learning to Rank
