Maximal Marginal Relevance (MMR)

Definition

Maximal Marginal Relevance

MMR is a greedy post-hoc re-ranking method that builds a recommendation (or retrieval) list one item at a time, at each step picking the item that maximizes a linear trade-off between its relevance to the query/user and its redundancy (maximum similarity) with the items already selected. It is the classic technique for injecting Diversity into a ranked list without retraining the underlying scorer — originally proposed by Carbonell & Goldstein (SIGIR 1998).

Intuition

Relevant but not redundant

A pure relevance ranking tends to stack near-identical items at the top (all five “action thrillers”). MMR fixes this at decoding/re-ranking time: after you have picked a few items, the marginal value of yet another near-duplicate is low. MMR re-scores each remaining candidate as “how relevant is it” minus “how similar is it to what I already chose,” so the next pick is relevant and adds something new.

In the RS-L04 - Generative Recommendation decoding pipeline this is exactly the “re-rank afterwards” knob: after Beam Search returns look-alike items sharing a Semantic ID prefix (e.g. all (12, 48, ·)), MMR rebuilds the list preferring items unlike those already chosen, so the user actually has a choice.

Mathematical Formulation

At each step, given the query/user representation $q$ , the set of candidate items $R$ , and the set of already-selected items $S$ , the next item is:

$MMR = ar g max_{d_{i} \in R ∖ S} [λ \cdot Sim_{1} (d_{i}, q) - (1 - λ) \cdot max_{d_{j} \in S} Sim_{2} (d_{i}, d_{j})]$

where:

$d_{i}$ — a candidate item not yet selected ( $d_{i} \in R ∖ S$ )
$q$ — the query / user-history representation (relevance target)
$Sim_{1} (d_{i}, q)$ — relevance score of candidate $i$ (e.g. the recommender’s predicted score, or the generative model’s log-likelihood $lo g p_{θ} (z_{i} ∣ x)$ )
$Sim_{2} (d_{i}, d_{j})$ — similarity between candidate $i$ and an already-selected item $j$ (e.g. cosine similarity of item embeddings, or shared category)
$max_{d_{j} \in S} Sim_{2} (d_{i}, d_{j})$ — the redundancy term: the candidate’s similarity to its nearest already-chosen item (its worst-case overlap)
$λ \in [0, 1]$ — trade-off knob: $λ = 1$ recovers pure relevance ranking; $λ = 0$ ranks purely for novelty/diversity; intermediate $λ$ balances the two
$S$ — the partial output list built so far (starts empty)

The first item is the most relevant one (the max over $S = \emptyset$ contributes nothing); every later item is chosen by the full criterion above.

Key Properties / Variants

Greedy, not globally optimal: MMR is a heuristic. It maximizes marginal gain step by step and does not solve the combinatorial “best diverse set of size K” problem (which is NP-hard, related to max-sum/max-min dispersion). It runs in roughly $O (K \cdot ∣ R ∣)$ similarity evaluations.
Post-processing intervention: in the RS-L02 - Evaluation Beyond Accuracy taxonomy MMR is a heuristic post-processing re-ranker — it operates on the output ranked list and never touches training. This is the same “greedy search to re-rank items” family as the heuristic methods in the FairDiverse toolkit.
Accuracy–diversity trade-off: raising diversity (lowering $λ$ ) typically costs a little top-1 accuracy but improves Intra-List Distance / Catalog Coverage / Novelty. This is the canonical Diversity-vs-Accuracy trade-off; $λ$ is the explicit dial.
Choice of $Sim_{2}$ defines “diversity”: embedding cosine gives content diversity; category/genre overlap gives categorical diversity. The metric you optimize against (e.g. ILD) should match the similarity you penalize.
MMR in GenRec decoding: one of two decoding-time diversity remedies in generative recommendation (the other being sampling / temperature and diverse beam search). It complements, rather than replaces, fixing diversity at the tokenizer (so popular items do not collapse onto a shared prefix) or rewarding diversity in RL fine-tuning.
Relation to fairness re-ranking: structurally identical to exposure-fairness re-rankers that add a per-step penalty/score to the relevance score; MMR’s penalty targets redundancy, fairness re-rankers’ penalty targets under-exposed groups.

Greedy MMR re-ranking, building a list of size $K$ :

Algorithm: MMR Re-Ranking
─────────────────────────────────────────────
Input:  candidate set R, query q, λ ∈ [0,1], list size K
Output: ordered list S
 
S ← []                                  # selected items (in order)
while |S| < K and R is not empty:
    for each d_i in R:                  # score every remaining candidate
        rel_i ← Sim1(d_i, q)
        if S is empty:
            red_i ← 0
        else:
            red_i ← max over d_j in S of Sim2(d_i, d_j)   # redundancy
        mmr_i ← λ * rel_i − (1 − λ) * red_i
    d* ← argmax_{d_i in R} mmr_i
    append d* to S
    remove d* from R
return S

Connections

Adds: Diversity (the objective MMR optimizes for) — measured by Intra-List Distance, Novelty, Catalog Coverage
Trades against: Top-K Recommendation accuracy (NDCG / Recall) — the Diversity-vs-Accuracy trade-off
Applied as: a heuristic post-processing re-ranking step (see Reranking)
Used in GenRec to fix homogeneous Beam Search output over Semantic IDs
Sibling re-rankers: exposure-based Fairness in Recommendation post-processors (same per-step penalty structure)

Study Notes

Explorer

Maximal Marginal Relevance (MMR)

Maximal Marginal Relevance (MMR)

Definition

Intuition

Mathematical Formulation

Key Properties / Variants

Connections

Appears In

Graph View

Table of Contents

Backlinks