Contrastive Learning

Contrastive Learning is a training paradigm that learns representations by encouraging positive pairs to be close together in the embedding space and negative pairs to be far apart.

InfoNCE Loss

For a query $q$, a positive document $d^+$, and a set of negative documents $\{d_1^-, \dots, d_n^-\}$, the contrastive loss is often defined as:

$$
\mathcal{L}_{\text{InfoNCE}} = -\log \frac{\exp(\mathrm{sim}(q, d^+)/\tau)}{\exp(\mathrm{sim}(q, d^+)/\tau) + \sum_{i=1}^{n} \exp(\mathrm{sim}(q, d_i^-)/\tau)}
$$

where:

  • $\mathrm{sim}(\cdot, \cdot)$ — A similarity metric (e.g., dot product or cosine similarity)
  • $\tau$ — Temperature parameter scaling the sharpness of the distribution
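The loss above can be sketched directly. The following is a minimal NumPy version for a single query, assuming dot-product similarity and pre-computed embeddings (the function name and temperature default are illustrative, not from any particular library):

```python
import numpy as np

def info_nce_loss(q, d_pos, d_negs, tau=0.05):
    """InfoNCE loss for one query.

    q:      (dim,)    query embedding
    d_pos:  (dim,)    positive document embedding
    d_negs: (n, dim)  negative document embeddings
    tau:    temperature
    """
    # Dot-product similarity, scaled by temperature
    s_pos = q @ d_pos / tau          # scalar
    s_negs = d_negs @ q / tau        # (n,)

    # -log softmax of the positive among all candidates
    logits = np.concatenate([[s_pos], s_negs])
    return -(s_pos - np.log(np.sum(np.exp(logits))))
```

The loss is minimized by pushing $\mathrm{sim}(q, d^+)$ up relative to every negative; a lower temperature sharpens the softmax and penalizes hard negatives more strongly.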

In Information Retrieval

Contrastive learning is the foundation of Dense Retrieval models like DPR.

  • Positive Pair: (Query, Relevant Document).
  • Negative Pair: (Query, Irrelevant Document).

Negative Strategies

The choice of negatives is one of the most important factors in contrastive training:

  • Random Negatives: Documents sampled uniformly from the collection; usually too easy to provide a strong learning signal.
  • In-batch Negatives: Reusing the positive documents of the other queries in the current training batch as negatives, which adds negatives at no extra encoding cost.
  • Hard Negative Mining: Deliberately selecting documents that “look” relevant but aren’t (e.g., high BM25 score but labeled irrelevant).
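The in-batch strategy can be sketched as a batched version of the loss: with a batch of $B$ aligned (query, positive) pairs, the $B \times B$ similarity matrix has the positives on its diagonal and the off-diagonal entries serve as negatives. A minimal NumPy sketch (function name and defaults are illustrative):

```python
import numpy as np

def in_batch_info_nce(Q, D, tau=0.05):
    """In-batch InfoNCE: Q[i] matches D[i]; every other D[j] is a negative.

    Q: (B, dim) query embeddings
    D: (B, dim) positive document embeddings, aligned with Q
    """
    logits = Q @ D.T / tau                          # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)     # numerical stability

    # Row-wise log-softmax; diagonal entries are the positives
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

One batch thus yields $B - 1$ negatives per query for free, which is why this strategy is the default in DPR-style bi-encoder training.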

Connections

  • Main application: Training DPR bi-encoders.
  • Optimization technique: Hard Negative Mining.
  • Metric: Often evaluated using MRR or Recall@k.