Inverse Propensity Weighting

Definition

Inverse Propensity Weighting (IPW) is a general technique for obtaining unbiased estimates from selection-biased data. The core idea: weight observations inversely to their probability of being selected/observed.

In the context of Learning to Rank, IPW corrects for Position Bias by reweighting clicks by the inverse of their examination probability: a click on a document at rank k counts as 1 / P(E = 1 | k) units of relevance evidence, rather than 1.

Intuition

Simple analogy: Imagine surveying people on the street:

  • Rich people are less likely to stop and answer your questions (low propensity)
  • Poor people are more likely to participate (high propensity)
  • Without correction, your survey is biased toward poor respondents
  • Solution: Up-weight the opinions of rich people and down-weight poor people’s opinions
  • Result: An unbiased view of the whole population

In ranking:

  • Clicks at position 1 have high propensity (100% examined)
  • Clicks at position 8 have low propensity (40% examined)
  • Without correction, you overlearn from position 1 and ignore position 8
  • Solution: Down-weight position 1 clicks, up-weight position 8 clicks
  • Result: An unbiased estimate of relevance independent of position

Mathematical Formulation

General IPW

For a selection process where each unit i with value y_i is observed with propensity p_i (observation indicator o_i ∈ {0, 1}, E[o_i] = p_i):

    μ̂_IPW = (1/n) · Σ_i (o_i / p_i) · y_i

This is an unbiased estimator:

    E[μ̂_IPW] = (1/n) · Σ_i (E[o_i] / p_i) · y_i = (1/n) · Σ_i y_i = μ
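The street-survey intuition above can be simulated directly. In this sketch (all values illustrative), high-value units are under-observed; averaged over many replicates, the naive mean stays biased low while the inverse-propensity-weighted (Horvitz–Thompson) mean recovers the true population mean:

```python
import random

random.seed(0)

# Toy population: unit value y, observation propensity p.
# High-value units (y >= 50) are rarely observed: p = 0.2 vs 0.9.
population = [(float(y), 0.9 if y < 50 else 0.2) for y in range(100)]
true_mean = sum(y for y, _ in population) / len(population)

def one_survey():
    observed = [(y, p) for y, p in population if random.random() < p]
    naive = sum(y for y, _ in observed) / len(observed)       # biased low
    ipw = sum(y / p for y, p in observed) / len(population)   # Horvitz-Thompson
    return naive, ipw

# Average over many replicates: IPW is correct on average, naive is not.
results = [one_survey() for _ in range(2000)]
avg_naive = sum(n for n, _ in results) / len(results)
avg_ipw = sum(i for _, i in results) / len(results)
print(f"true={true_mean:.1f}  naive={avg_naive:.1f}  ipw={avg_ipw:.1f}")
```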

IPW for Position Bias

Under the Position-Based Click Model, the click probability factorizes into examination and relevance:

    P(C = 1 | d, k) = P(E = 1 | k) · P(R = 1 | d)

The unbiased estimator of relevance divides each observed click c(d, k) by the examination propensity:

    r̂(d) = c(d, k) / P(E = 1 | k)

Or for a ranking, summing inverse-propensity-weighted clicks over all clicked documents:

    Δ̂_IPW(f | q) = Σ_{d : c(d) = 1} 1 / P(E = 1 | rank(d))
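A minimal PBM simulation illustrates the correction. Here the propensities, relevance values, and fixed ranks are assumed for illustration: two equally relevant documents are shown at ranks 1 and 8, and dividing click counts by P(E = 1 | k) recovers equal relevance estimates despite very different raw CTRs:

```python
import random

random.seed(1)

# Position-Based Click Model: P(click) = P(E=1|k) * r(d).
# Illustrative examination propensities and true relevances.
propensity = {1: 1.0, 8: 0.4}
true_relevance = {"a": 0.8, "b": 0.8}   # equally relevant documents
ranks = {"a": 1, "b": 8}                # "a" always at rank 1, "b" at rank 8

n_sessions = 50_000
clicks = {d: 0 for d in ranks}
for _ in range(n_sessions):
    for doc, k in ranks.items():
        if random.random() < propensity[k] * true_relevance[doc]:
            clicks[doc] += 1

for doc, k in ranks.items():
    naive = clicks[doc] / n_sessions                  # raw click-through rate
    ipw = clicks[doc] / (propensity[k] * n_sessions)  # clicks weighted by 1/P(E=1|k)
    print(f"{doc}: naive CTR={naive:.2f}  IPW relevance={ipw:.2f}")
```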

Inverse Propensity Weights

Concrete weights for standard examination propensities:

    Rank   P(E = 1 | k)   Weight 1/P(E = 1 | k)
    1      1.00           1.00x
    2      1.00           1.00x
    3      0.90           1.11x
    4      0.80           1.25x
    5      0.70           1.43x
    6      0.60           1.67x
    7      0.50           2.00x
    8      0.40           2.50x

A click at rank 8 is worth 2.5× a click at rank 1 in terms of relevance evidence.
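The weights are simply reciprocals of the propensities; a quick check (using the table's propensity values):

```python
# Examination propensities per rank, as in the table above.
propensities = {1: 1.00, 2: 1.00, 3: 0.90, 4: 0.80,
                5: 0.70, 6: 0.60, 7: 0.50, 8: 0.40}

# Inverse propensity weight for each rank.
weights = {rank: round(1.0 / p, 2) for rank, p in propensities.items()}
print(weights)  # a click at rank 8 carries weight 2.5 vs 1.0 at rank 1
```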

Key Properties

Unbiasedness

IPW produces an unbiased estimate under the assumed model:

    E[r̂(d)] = P(R = 1 | d) = r(d)

This means:

  • On average, across many replicates, the estimate is correct
  • It doesn’t mean any single estimate is correct
  • It doesn’t mean low variance

Variance Problem

IPW suffers from high variance when propensities are low:

    Var(r̂(d)) ∝ 1 / P(E = 1 | k)

When P(E = 1 | k) is small, the weight 1 / P(E = 1 | k) is large, amplifying noise.

Example:

  • If P(E = 1 | k) = 0.1, the weight is 10x
  • A single noisy click at rank 8 becomes a massive signal
  • Variance explodes
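The blow-up is easy to see by simulating the estimator at different propensities (a toy setup with assumed relevance 0.5 and 100 impressions per estimate): the estimator stays correct on average, but its variance grows roughly as 1/p.

```python
import random

random.seed(2)

def ipw_estimate(p, n=100, relevance=0.5):
    # IPW relevance estimate from n impressions examined with propensity p:
    # each click gets weight 1/p.
    clicks = sum(1 for _ in range(n) if random.random() < p * relevance)
    return clicks / (p * n)

def est_variance(p, replicates=2000):
    # Empirical variance of the estimator across replicates.
    ests = [ipw_estimate(p) for _ in range(replicates)]
    mean = sum(ests) / len(ests)
    return sum((e - mean) ** 2 for e in ests) / len(ests)

variances = {p: est_variance(p) for p in (1.0, 0.4, 0.1)}
for p, v in variances.items():
    print(f"propensity={p:.1f}  Var(estimate)={v:.4f}")
```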

Variance-Bias Trade-off

Unbiasedness is a property of the expected value, not the actual estimate quality.

        Biased, Low Var    Unbiased, High Var    Unbiased, Low Var
        ________________   ________________      ________________
            ╱╲                    │                     ╱╲
           ╱  ╲                   │                    ╱  ╲
          ╱    ╲                  │                   ╱    ╲
       __|______|__          _____|_____         ___|______|___
       True Value            True Value          True Value
       (Consistent           (Correct            (Correct & 
        miss)                 on average)        stable)

Practical lesson: Unbiasedness alone is not sufficient. We need both unbiasedness AND low variance.

Practical Considerations

Propensity Clipping

Prevent extreme weights by clipping propensities from below:

    p̃_k = max(P(E = 1 | k), τ)

Common choices: a fixed threshold such as τ = 0.1, which caps weights at 1/τ = 10x.

Trade-off: Introduces slight bias (low-propensity clicks are under-weighted) but dramatically reduces variance.
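A minimal clipping helper (the threshold τ = 0.1 is an illustrative choice):

```python
def clipped_weight(propensity, tau=0.1):
    # Clip the propensity from below at tau, capping weights at 1/tau (10x here).
    return 1.0 / max(propensity, tau)

print(clipped_weight(0.4))   # unchanged: 2.5
print(clipped_weight(0.02))  # capped at 10.0 instead of 50.0
```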

Non-Zero Propensities

IPW requires all items to have non-zero propensity:

    P(E = 1 | k) > 0 for every position k an item can occupy

Problem: In top-k ranking (k=10), items at rank 11+ have zero propensity. Standard IPW fails.

Solutions:

  • Randomize rankings so every item has some chance of appearing in the top k
  • Restrict estimates to items that were actually displayed

Propensity Estimation Error

IPW depends on accurate propensity estimates p̂_k ≈ P(E = 1 | k).

If estimates are wrong:

  • Weights are wrong
  • Estimates become biased

Estimation error compounds, especially for low-propensity items.

Assumptions

1. Correct User Model

Users must behave according to the assumed model (e.g., Position-Based Click Model).

Violation: If users follow Cascading Position Bias instead, PBM-based IPW fails.

2. Correct Propensity Estimation

Must have accurate estimates of P(E = 1 | k).

How to ensure:

  • Online randomization (gold standard)
  • Intervention harvesting (good if assumptions hold)

3. Overlap / Positivity

All items must have non-zero propensity.

Origin & Connection to Importance Sampling

IPW is essentially Importance Sampling applied to observational data.

In importance sampling, to estimate E_{x~p}[f(x)] when samples are drawn from a different distribution q:

    E_{x~p}[f(x)] = E_{x~q}[f(x) · p(x) / q(x)]

In ranking:

  • p = the true relevance distribution (what we want to estimate under)
  • q = the observed distribution (biased by position)
  • Weight = p(x)/q(x) = inverse propensity
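A quick numerical check of the identity (the distributions and f are illustrative; q over-samples small x, analogous to position bias over-sampling top ranks):

```python
import random

random.seed(3)

# Estimate E_p[f(X)] using samples drawn from q (importance sampling).
support = list(range(10))
p = [0.1] * 10                                   # target: uniform
q = [0.19, 0.17, 0.15, 0.13, 0.11, 0.09, 0.07, 0.05, 0.03, 0.01]

def f(x):
    return x * x

true_value = sum(pi * f(x) for x, pi in zip(support, p))

n = 200_000
samples = random.choices(support, weights=q, k=n)
naive = sum(f(x) for x in samples) / n                    # biased toward small x
weighted = sum(f(x) * p[x] / q[x] for x in samples) / n   # importance weights p/q
print(f"true={true_value:.2f}  naive={naive:.2f}  weighted={weighted:.2f}")
```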

Strengths & Weaknesses

Strengths

✓ Theoretically guaranteed unbiasedness (under assumptions)
✓ Works across any user model (if you model it correctly)
✓ Simple to implement
✓ No learned parameters needed (only propensities)

Weaknesses

✗ High variance from low-propensity items
✗ Sensitive to propensity estimation errors
✗ Fails with zero propensities
✗ Variance-reduction via clipping introduces bias
✗ Can be unstable in practice

Connections

  • Generalization: Doubly Robust Estimation combines IPW with learned models for lower variance
  • Alternative: Click Models provide lower variance but weaker guarantees
  • Causal inference: IPW is a core technique in causal inference for observational studies
  • Importance sampling: IPW is the observational version of importance sampling

Appears In