Information Retrieval

Information Retrieval

Information Retrieval (IR) is the activity of obtaining information system resources that are relevant to an information need from a collection of those resources. It involves finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections.

Core Components

Document collection (corpus): The set of documents to search
Query: A user’s expression of their information need
Relevance: The degree to which a document satisfies the information need
Ranking function: Scores and orders documents by estimated relevance

IR Pipeline

Query → [Query Processing] → [Matching/Retrieval] → [Ranking] → Ranked Results
                                     ↑
                              [Inverted Index]
                                     ↑
                     Documents → [Indexing] → [Text Processing]

Retrieval Paradigms

Paradigm	Example	How it works
Sparse retrieval	BM25, TF-IDF	Exact term matching via Inverted Index
Dense retrieval	DPR, ColBERT	Semantic matching via learned embeddings
Learned sparse	SPLADE	Neural term weights in inverted index
Generative	DSI, GENRE	Generate document IDs directly
Reranking	MonoBERT	Re-score top-k from first-stage retrieval

Appears In

IR-L02 - IR Fundamentals
All IR lectures

Study Notes

Explorer

Information Retrieval

Information Retrieval

Core Components

IR Pipeline

Retrieval Paradigms

Appears In

Graph View

Table of Contents

Backlinks