Outlier Bias

Definition

Outlier Bias is the tendency of items with distinctive or unusual features (visual outliers) to receive disproportionately more clicks, independent of their relevance.

An item is an outlier if it has:

  • Distinctive visual appearance (color, size, layout)
  • Unusual feature values (much higher/lower price, rating, etc.)
  • Salient characteristics (new label, exclusive offer)

These properties attract user attention above-and-beyond position-based and relevance-based factors.

Intuition

Visual Example

E-commerce product search:

Displayed products:
[1] Blue widget, $20, 4.0★
[2] Blue widget, $22, 3.8★
[3] Red widget (OUTLIER!), $50, 4.2★ ← Unusual color, higher price
[4] Blue widget, $18, 3.9★
[5] Blue widget, $25, 4.1★

Users notice item 3 due to color difference.
Item 3 gets more clicks than expected by position + relevance.

Why It Happens

Humans have evolved attention mechanisms to notice anomalies:

  • Different colors/shapes → attention
  • Extreme values → attention
  • Scarcity signals → attention

Evolutionary value: notice unusual things (danger, opportunity).

In modern UI: outliers get clicked more, even if lower quality.

Types of Outliers

1. Visual Outliers

Distinctive appearance:

  • Different color in a homogeneous list
  • Larger/smaller text or images
  • Special formatting (badges, icons)
  • High-contrast presentation

2. Feature-Based Outliers

Unusual feature values:

Price outliers: [20, 22, 20, 150, 19] ← 150 is outlier
Rating outliers: [4.0, 4.1, 3.9, 2.0, 4.0] ← 2.0 is outlier
Popularity: [100k views, 95k, 102k, 2M, 98k] ← 2M is outlier

3. Contextual Outliers

Unusual in context:

  • Scientific article in a shopping list
  • Educational content in entertainment feed
  • Expensive item among budget options

Mathematical Formulation

Outlier-Aware Position-Based Model

Extend Position-Based Click Model to include outlier effects:

where:

If rank contains an outlier:

  • Examination probability increases by (e.g., +10%)

Context-Aware Position Model

More refined: examination at rank depends on surrounding items’ features:

Neural parameterization:

Empirical Evidence

E-commerce Study

Sarvi et al. (2023) analyzed outlier effects:

Scenario: Grocery shopping results
  Typical items: Similar price, features, appearance
  Outlier: Premium/luxury version

Click-through rate by outlier position:
  Position 1: 15% (position bias)
  Position 1 (outlier): 20% (+5% outlier bonus)
  Position 5: 8% (lower position)
  Position 5 (outlier): 14% (+6% outlier bonus)

Outlier bonus ≈ 5-10% CTR increase, varies with context.

Magnitude

  • Strong outliers (e.g., bright color in grayscale): 20-40% CTR increase
  • Moderate outliers (e.g., price 2x neighbors): 5-15% increase
  • Weak outliers (slight distinction): 1-5% increase

Why It Matters

Learning Problem

A ranking model trained on clicks without accounting for outlier bias:

1. Model learns: "Red items are good"
   (Actually: red items are visually distinctive)

2. Model learns: "Expensive items are good"
   (Actually: expensive items stand out in price listings)

3. Model learns: "Rare items are good"
   (Actually: rare items get attention)

Result: Model optimizes for visibility, not relevance.

Ranking Consequences

True ranking (by relevance):
  [B (great), A (good), D (ok), C (poor)]

Biased ranking (based on clicks with outlier bias):
  [C (visually outlier), B (great), A (good), D (ok)]

Poor-quality item C gets ranked first due to visual distinctiveness.

Fairness Issues

  • Bias against plain items: Equally relevant but less distinctive items get lower ranking
  • Bias toward novel items: New/unusual items overexposed
  • Bias toward extreme values: Both extremely good and bad items overrepresented

Detection

How to Detect Outlier Bias

Method 1: Residual Analysis

For each item:
  Expected CTR = f(position, relevance)
  Observed CTR = actual clicks
  Residual = Observed - Expected

High positive residual → item is clicked more than predicted
High negative residual → item is clicked less than predicted

Analyze residuals vs. feature distinctiveness.

Method 2: A/B Testing

Test 1: Show items in ranked order
Test 2: Shuffle items to remove ranking cues

If outlier items still get more clicks in Test 2:
  → Outlier bias exists
  → Effect size ≈ Test2_CTR(outlier) - Test2_CTR(non-outlier)

Method 3: Visual Analysis

Manual review of high-residual items:

  • Are they visually distinctive?
  • Do they have unusual feature values?
  • Are they contextual outliers?

Modeling Approaches

Approach 1: Feature-Based Outlier Detection

Identify outliers using statistical methods:

outlier_score(item_d) = 
  max over features f:
    |value_f(d) - median_f| / std_f(d)

If outlier_score > threshold: item is outlier.

Approach 2: Learning Outlier Effects

Fit a model with explicit outlier terms:

Learn which feature dimensions cause outlier effects.

Approach 3: Neural Context Modeling

Use neural networks to learn interaction effects:

Input: Features of items at positions k-1, k, k+1

Neural network learns:
  - How does position k's features affect examination?
  - How do neighbors' features modulate examination?
  
Output: P(Exam_k | context)

Approach 4: Visual Feature Extraction

For visual distinctiveness:

1. Compute visual features of item display (color, contrast, etc.)
2. Compare to surrounding items
3. Compute distinctiveness score
4. Include in click model

Correction Methods

Method 1: Explicit Modeling

Include outlier terms in click model:

Fit model to separate position bias, outlier bias, and relevance.

Method 2: Post-Processing

After learning initial ranking:

1. Apply outlier correction to feature importance
2. Down-weight distinctive features
3. Retrain ranking model

Method 3: Diverse Display

Avoid creating outliers:

Design principle: Show diverse items to avoid outliers
  Instead of: [Blue, Blue, Red, Blue] (Red is outlier)
  Use: [Red, Blue, Yellow, Green] (All equally distinctive)

Reduces comparative distinctiveness.

Method 4: Doubly Robust with Outlier Term

Use Doubly Robust Estimation with explicit outlier correction:

Method 5: User Studies

Collect manual relevance judgments to ground truth:

Users: Rate items 1-5 for relevance (no position cues)
System: Compare to click-based judgments
Result: Identify which items are over/under-clicked
         due to visual features.

Real-World Examples

E-commerce: Price Outliers

Camera search results (typical prices: $400-500):
  [Canon: $450, 4.5★]
  [Sony: $470, 4.6★]
  [Premium: $1500, 4.4★] ← Outlier

Premium camera gets extra clicks due to:
  1. High price is unusual (stands out)
  2. Users might be curious
  3. Visual distinctiveness in price listing

But if purchased: Often lower satisfaction
(users didn't mean to click premium option).

Recommendation: Notification Badges

Netflix recommendations:
  [Movie A: normal card]
  [Movie B: normal card]
  [Movie C: normal card + "NEW!" badge] ← Outlier

Badged item gets more clicks, not due to relevance
but due to visual distinctiveness of badge.

News Feed: Sensational Headlines

Social media feed (typical headlines):
  "Tech company releases update"
  "Scientists discover new species"
  "SHOCKING: Celebrity scandal!!!" ← Outlier

ALL-CAPS, sensational headlines get more clicks
independent of news value or relevance.

Ethical Considerations

Negative Effects

  1. User manipulation: Distinctive features exploit attention biases
  2. Quality degradation: Relevant but plain items get lower visibility
  3. Filter bubble: Unusual/extreme content dominates
  4. Misinformation: Sensational false content gets amplified

Positive Effects

  1. Exploration: Outliers can introduce serendipity and discovery
  2. Novelty: New items get a fair chance (not buried by popularity)
  3. Diversity: Prevents homogeneous results

Trade-offs

Pure Relevance Ranking:
  + Optimal utility for users
  - Boring, low diversity
  - Misses novel high-quality items

With Outlier Effects:
  + Exploration, serendipity
  + Novel items get exposure
  - Can promote low-quality distinctive items
  - Reduces pure relevance

Connections

Appears In