Session-based Recommendation

Definition

Session-based Recommendation

Session-based recommendation predicts the next item(s) a user will interact with based only on the current session’s short-term browsing behavior — the ordered sequence of clicks/views/adds within one visit — rather than a long-lived user profile. It is a paradigm of Sequential Recommendation: items arrive in chronological order and the model conditions on that order.

Canonical example: a user browsing a phone is shown a phone-case; the recommendation is driven by what is happening now, not by who the user is across all time.

Intuition

Why "session" and not "user"

Classic matrix factorization (MF) and collaborative filtering (CF) treat interactions as an unordered set tied to a stable user identity. That assumption fails in the two situations session-based recommenders target:

Anonymous / logged-out users — there is no persistent profile, only the events seen so far in this session.

Short-term intent that overrides long-term taste — a user who usually buys rock music is, right now, building a workout playlist. The session signal is more informative than the historical average.

So we drop the user vector and instead build a representation of the session itself (a short ordered list of items) and decode the next item from it. The split is: Sequential Recommendation = use interaction order; session-based = the special case where the conditioning context is the current session only (often with no user ID), as opposed to user-based recommendation which spans a persistent history.

Mathematical Formulation

Given the current session as an ordered sequence of interacted items $s = \langle i_{1}, i_{2}, \dots, i_{t} \rangle$, the task is next-item prediction: model the conditional distribution over the catalog $I$ for the next position,

$P(i_{t+1} \mid i_{1}, i_{2}, \dots, i_{t})$
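To make the prediction target concrete, here is a minimal sketch of how one ordered session can be split into (prefix, next item) pairs, one per position. The function name `next_item_examples` and the toy item names are illustrative assumptions, not part of any particular library.

```python
from typing import List, Tuple


def next_item_examples(session: List[str]) -> List[Tuple[List[str], str]]:
    """Split one ordered session into (prefix, next_item) pairs.

    For a session <i_1, ..., i_t> this yields t - 1 examples:
    (<i_1>, i_2), (<i_1, i_2>, i_3), ...
    """
    return [(session[:t], session[t]) for t in range(1, len(session))]


# Toy session (hypothetical item names).
session = ["phone", "phone_case", "charger"]
pairs = next_item_examples(session)
for prefix, target in pairs:
    print(prefix, "->", target)
```

A session with a single item yields no pairs, because there is no observed next item to predict.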

A simple first-order Markov Chain approximates this with only the last item, $P(i_{t+1} \mid i_{t})$, estimated from a global transition matrix. Modern session-based models instead encode the whole prefix into a hidden state. The first deep model, GRU4Rec [Hidasi et al., 2015], uses a GRU RNN that consumes the items left-to-right (each one-hot item ID is mapped to its embedding $e_{i_{t}}$):

$h_{t} = \mathrm{GRU}(h_{t-1}, e_{i_{t}}), \qquad \hat{r}_{s} = h_{t} W_{out}$

where:

$h_{t-1}, h_{t}$ — recurrent hidden state before/after consuming item $i_{t}$; $h_{t}$ summarizes the session so far
$e_{i_{t}}$ — dense embedding of the current item (each item has its own learned embedding)
$W_{out}$ — output projection producing a score $\hat{r}_{s,j}$ for every candidate item $j \in I$
the highest-scoring items become the recommendation list
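
As a rough illustration of the recurrence and scoring step above, the following pure-Python sketch runs a GRU cell over a toy session. The weights are random and untrained, biases are omitted, and every name (`gru_step`, `score_session`, `N_ITEMS`, `DIM`) is an illustrative assumption rather than GRU4Rec's actual code. `W_out` is stored here with one row per item, i.e. the transpose of the matrix in the formula.

```python
import math
import random

random.seed(0)
N_ITEMS, DIM = 5, 4  # toy catalog size and hidden/embedding size (assumed)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


E = rand_matrix(N_ITEMS, DIM)                            # item embeddings e_i
W_z, U_z = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)  # update gate
W_r, U_r = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)  # reset gate
W_h, U_h = rand_matrix(DIM, DIM), rand_matrix(DIM, DIM)  # candidate state
W_out = rand_matrix(N_ITEMS, DIM)                        # one row per candidate item


def gru_step(h_prev, e):
    """One GRU update: h_t = (1 - z) * h_{t-1} + z * candidate."""
    z = [sigmoid(a + b) for a, b in zip(matvec(W_z, e), matvec(U_z, h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(W_r, e), matvec(U_r, h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(W_h, e), matvec(U_h, gated))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]


def score_session(session):
    """Consume item indices in order, then score every candidate item."""
    h = [0.0] * DIM
    for item in session:
        h = gru_step(h, E[item])
    return matvec(W_out, h)  # r_hat_{s,j} for every j in the catalog


scores = score_session([0, 3, 1])
ranking = sorted(range(N_ITEMS), key=lambda j: scores[j], reverse=True)
```

In practice the weights would be learned and the step would come from a deep-learning framework; the sketch only shows the data flow from session to per-item scores.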

GRU4Rec is trained with the pairwise BPR loss (a positive next item vs. sampled negatives):

$L_{BPR} = -\frac{1}{N_{S}} \sum_{j=1}^{N_{S}} \log \sigma(\hat{r}_{s,i} - \hat{r}_{s,j})$

where:

$\hat{r}_{s,i}$ — score for the true next item $i$ in session $s$
$\hat{r}_{s,j}$ — score for negative sample $j$
$N_{S}$ — number of negative samples per positive instance
$\sigma(\cdot)$ — sigmoid; the objective pushes the true next item above sampled negatives
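
The BPR objective above can be evaluated for one positive instance with a few lines of Python. This is a minimal sketch: the function name `bpr_loss` and the example scores are assumptions for illustration, and a real training loop would compute gradients through a framework rather than just the loss value.

```python
import math
from typing import List


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def bpr_loss(pos_score: float, neg_scores: List[float]) -> float:
    """-(1 / N_S) * sum_j log sigmoid(pos_score - neg_scores[j])."""
    n_s = len(neg_scores)
    return -sum(log_sigmoid(pos_score - s) for s in neg_scores) / n_s


# Hypothetical scores: the true next item vs. two sampled negatives.
well_ranked = bpr_loss(5.0, [0.0, 1.0])
badly_ranked = bpr_loss(0.0, [0.0, 1.0])
```

The loss shrinks as the positive score moves above the negatives, and equals $\log 2$ when the positive item and a single negative receive the same score.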

Key Properties / Variants

Conditioning context: the current session only. Contrast with user-based sequential recommendation, which conditions on a persistent cross-session history. Both are sub-paradigms of Sequential Recommendation.
No (or optional) user ID: well-suited to anonymous traffic; user/item representations can still be augmented with side-information / content features when available.
Targets beyond single items: the next-step output can be a single item, a basket, a bundle, or a playlist (next-basket recommendation).
Model lineage (increasing capacity, all usable as session-based when fed the current session):
- First-order Markov Chain / Factorized Personalized Markov Chains (FPMC) — short, sparse sessions; first-order transitions only.
- GRU4Rec (GRU/RNN) — the original deep session-based model; captures short within-session temporal patterns; BPR / TOP1-max loss.
- Self-Attentive Sequential Recommendation (SASRec) — Self-Attention with item + positional embeddings and a causal mask; reported to train faster and rank more accurately than RNN- and CNN-based baselines.
- BERT4Rec — bidirectional Transformer trained with a Cloze/masked-item objective; appends a [MASK] at the end at inference for next-item prediction.
Loss matters as much as architecture (Klenitskiy & Vasilev, 2023): SASRec trained with full cross-entropy or BCE with many negatives (“SASRec+”) beats BERT4Rec; too few negatives causes overconfidence. BPR / BCE / CE are model-agnostic.
Data sparsity is the core difficulty for count-based session models; mitigated by skipping, clustering, and mixture-of-orders, or sidestepped by embedding-based deep models.
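
The count-based first-order Markov Chain at the start of the lineage above, and the sparsity problem it runs into, can be sketched as follows. The helper names (`fit_transitions`, `next_item_probs`) and the toy sessions are hypothetical.

```python
from collections import Counter, defaultdict


def fit_transitions(sessions):
    """Count first-order transitions prev -> next across all sessions."""
    counts = defaultdict(Counter)
    for s in sessions:
        for prev, nxt in zip(s, s[1:]):
            counts[prev][nxt] += 1
    return counts


def next_item_probs(counts, last_item):
    """Estimate P(next | last_item) by normalizing one row of counts."""
    row = counts.get(last_item)
    if not row:
        # Sparsity: no transition out of this item was ever observed.
        return {}
    total = sum(row.values())
    return {item: c / total for item, c in row.items()}


sessions = [
    ["phone", "case", "charger"],
    ["phone", "case"],
    ["phone", "charger"],
]
counts = fit_transitions(sessions)
probs = next_item_probs(counts, "phone")
```

Here `charger` never appears as a non-final item, so the model has nothing to say after it. Embedding-based models avoid this particular failure because every item still has a vector to score against.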

Generic next-item inference for an encoder-style session model:

Algorithm: Session-based Next-Item Recommendation (inference)
─────────────────────────────────────────────────────────────
Input: current session s = <i_1, ..., i_t>, item embedding table E
  h ← initial state
  for k = 1 ... t:                 # consume the session in order
    h ← Encoder(h, E[i_k])         # GRU step, or self-attention over prefix
  scores ← h · E^T                 # score ALL candidate items
  mask out items already in s (optional, domain-dependent)
  return Top-K items by score
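
The pseudocode above translates fairly directly into Python. The sketch below is one possible rendering under stated assumptions: `recommend` and `decayed_sum` are hypothetical names, and `decayed_sum` is a deliberately simple stand-in for the encoder (a decayed sum of item embeddings), not a GRU or a self-attention block.

```python
from typing import Callable, List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def recommend(
    session: List[int],
    E: List[Vector],
    encoder_step: Callable[[Vector, Vector], Vector],
    k: int = 3,
    exclude_seen: bool = True,
) -> List[int]:
    """Encode the session in order, score all items, return the top-k indices."""
    h = [0.0] * len(E[0])                    # initial state
    for item in session:                     # consume the session in order
        h = encoder_step(h, E[item])
    scores = [dot(h, e) for e in E]          # score ALL candidate items
    seen = set(session) if exclude_seen else set()
    candidates = [j for j in range(len(E)) if j not in seen]
    return sorted(candidates, key=lambda j: scores[j], reverse=True)[:k]


def decayed_sum(h: Vector, e: Vector, decay: float = 0.5) -> Vector:
    """Stand-in encoder: older items are down-weighted by `decay` each step."""
    return [decay * hi + ei for hi, ei in zip(h, e)]


# Toy 2-d embeddings: items 0 and 1 are similar, items 2 and 3 are similar.
E = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
top = recommend([2], E, decayed_sum, k=2)
```

With the session `[2]`, the nearest unseen item is 3, so it ranks first. Setting `exclude_seen=False` lets item 2 itself come back, which is the domain-dependent choice the pseudocode flags.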

Connections

Sub-paradigm of: Sequential Recommendation (uses interaction order); contrast with user-based history
Departs from: Matrix Factorization / Collaborative Filtering (which ignore order)
Probabilistic basis: Markov Chain, FPMC
Deep models: Gated Recurrent Unit (GRU), Recurrent Neural Network (RNN), SASRec, BERT4Rec, Self-Attention, Transformer Model
Trained with: Bayesian Personalized Ranking (BPR), Negative Sampling
Output framing: Next-Item Prediction, Top-K Recommendation
Evaluated with: Recall, MRR, NDCG, Hit Rate (and Beyond-Accuracy Metrics like Diversity)
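
For next-item prediction there is one held-out target per session, so two of the metrics listed above reduce to short functions. This is a minimal sketch with hypothetical names (`recall_at_k`, `mrr_at_k`) and toy data; with a single target per session, Recall@K coincides with Hit Rate@K.

```python
def recall_at_k(ranked_lists, targets, k):
    """Fraction of sessions whose true next item appears in the top k."""
    hits = sum(1 for ranked, t in zip(ranked_lists, targets) if t in ranked[:k])
    return hits / len(targets)


def mrr_at_k(ranked_lists, targets, k):
    """Mean reciprocal rank of the true next item, counting 0 if outside the top k."""
    total = 0.0
    for ranked, t in zip(ranked_lists, targets):
        top = ranked[:k]
        if t in top:
            total += 1.0 / (top.index(t) + 1)
    return total / len(targets)


# One ranked recommendation list and one true next item per test session.
ranked = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
targets = ["a", "a", "z"]

recall2 = recall_at_k(ranked, targets, k=2)
mrr3 = mrr_at_k(ranked, targets, k=3)
```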
