Maximal Marginal Relevance (MMR)

Definition

Maximal Marginal Relevance

MMR is a greedy post-hoc re-ranking method that builds a recommendation (or retrieval) list one item at a time, at each step picking the item that maximizes a linear trade-off between its relevance to the query/user and its redundancy (maximum similarity) with the items already selected. It is the classic technique for injecting Diversity into a ranked list without retraining the underlying scorer — originally proposed by Carbonell & Goldstein (SIGIR 1998).

Intuition

Relevant but not redundant

A pure relevance ranking tends to stack near-identical items at the top (all five “action thrillers”). MMR fixes this at decoding/re-ranking time: after you have picked a few items, the marginal value of yet another near-duplicate is low. MMR re-scores each remaining candidate as “how relevant is it” minus “how similar is it to what I already chose,” so the next pick is relevant and adds something new.

In the RS-L04 - Generative Recommendation decoding pipeline this is exactly the “re-rank afterwards” knob: after Beam Search returns look-alike items sharing a Semantic ID prefix (e.g. all (12, 48, ·)), MMR rebuilds the list preferring items unlike those already chosen, so the user actually has a choice.

Mathematical Formulation

At each step, given the query/user representation , the set of candidate items , and the set of already-selected items , the next item is:

where:

  • — a candidate item not yet selected ()
  • — the query / user-history representation (relevance target)
  • — relevance score of candidate (e.g. the recommender’s predicted score, or the generative model’s log-likelihood )
  • — similarity between candidate and an already-selected item (e.g. cosine similarity of item embeddings, or shared category)
  • — the redundancy term: the candidate’s similarity to its nearest already-chosen item (its worst-case overlap)
  • — trade-off knob: recovers pure relevance ranking; ranks purely for novelty/diversity; intermediate balances the two
  • — the partial output list built so far (starts empty)

The first item is the most relevant one (the max over contributes nothing); every later item is chosen by the full criterion above.

Key Properties / Variants

  • Greedy, not globally optimal: MMR is a heuristic. It maximizes marginal gain step by step and does not solve the combinatorial “best diverse set of size K” problem (which is NP-hard, related to max-sum/max-min dispersion). It runs in roughly similarity evaluations.
  • Post-processing intervention: in the RS-L02 - Evaluation Beyond Accuracy taxonomy MMR is a heuristic post-processing re-ranker — it operates on the output ranked list and never touches training. This is the same “greedy search to re-rank items” family as the heuristic methods in the FairDiverse toolkit.
  • Accuracy–diversity trade-off: raising diversity (lowering ) typically costs a little top-1 accuracy but improves Intra-List Distance / Catalog Coverage / Novelty. This is the canonical Diversity-vs-Accuracy trade-off; is the explicit dial.
  • Choice of defines “diversity”: embedding cosine gives content diversity; category/genre overlap gives categorical diversity. The metric you optimize against (e.g. ILD) should match the similarity you penalize.
  • MMR in GenRec decoding: one of two decoding-time diversity remedies in generative recommendation (the other being sampling / temperature and diverse beam search). It complements, rather than replaces, fixing diversity at the tokenizer (so popular items do not collapse onto a shared prefix) or rewarding diversity in RL fine-tuning.
  • Relation to fairness re-ranking: structurally identical to exposure-fairness re-rankers that add a per-step penalty/score to the relevance score; MMR’s penalty targets redundancy, fairness re-rankers’ penalty targets under-exposed groups.

Greedy MMR re-ranking, building a list of size :

Algorithm: MMR Re-Ranking
─────────────────────────────────────────────
Input:  candidate set R, query q, λ ∈ [0,1], list size K
Output: ordered list S
 
S ← []                                  # selected items (in order)
while |S| < K and R is not empty:
    for each d_i in R:                  # score every remaining candidate
        rel_i ← Sim1(d_i, q)
        if S is empty:
            red_i ← 0
        else:
            red_i ← max over d_j in S of Sim2(d_i, d_j)   # redundancy
        mmr_i ← λ * rel_i − (1 − λ) * red_i
    d* ← argmax_{d_i in R} mmr_i
    append d* to S
    remove d* from R
return S

Connections

Appears In