Spaced Repetition Algorithms: From Ebbinghaus to FSRS — A Deep Dive

The scheduler is the whole game

A card appears. You answer. The app picks the next interval — tomorrow, next week, three months from now. That single decision is the product. Get it right and you keep 95% of what you study with almost no waste. Get it wrong and you either forget (intervals too long) or grind through cards you already own (intervals too short).

Everything else in a flashcard app — the UI, the deck format, the gamification — is wallpaper around that one number. The scheduling engine is the load-bearing wall.

The lineage runs 140 years. A German psychologist memorizing nonsense syllables alone in his apartment in 1885. A Polish molecular biology student writing DOS code in 1987. A Chinese student named Jarrett Ye fitting memory-model weights to hundreds of millions of Anki reviews in 2022. Same problem the whole way: when does a memory next need attention.

This is the history, the math, and the engineering.

1885: Ebbinghaus and the forgetting curve

In 1885 Hermann Ebbinghaus published Über das Gedächtnis (Memory: A Contribution to Experimental Psychology). He was the subject and the experimenter. He memorized lists of 13 CVC nonsense trigrams (WID, ZOF, BUP — chosen because they carried no prior associations) and tested himself from 20 minutes out to 31 days. His metric was savings: how much less time relearning a list took versus learning it cold.

| Time Since Learning | Retention (%) | Lost (%) |
|---|---|---|
| 20 minutes | 58.2 | 41.8 |
| 1 hour | 44.2 | 55.8 |
| 8.8 hours | 35.8 | 64.2 |
| 1 day | 33.7 | 66.3 |
| 2 days | 27.8 | 72.2 |
| 6 days | 25.4 | 74.6 |
| 31 days | 21.1 | 78.9 |

Source: Ebbinghaus (1885), Section 29. Replicated by Murre & Dros (2015) in PLOS ONE with modern controls.

[Figure: the classic shape of Ebbinghaus's forgetting curve: rapid initial loss, then a long, slow tail. Source: Wikimedia Commons (public domain).]

The shape — cliff first, long tail after — has held up across thousands of studies for 140 years. Cepeda et al. (2006) ran a meta-analysis of 184 articles, 317 experiments. Spaced study beat massed study by 10–30% across virtually every condition tested.

Ebbinghaus's deeper claim is the one that matters here: each successful review resets the curve at a shallower slope. Forgetting still happens, but slower. The window before you forget grows. That is the spacing effect — the oldest, most replicated finding in experimental psychology. Everything downstream is engineering on top of it.

Exponential vs. power law — the curve was wrong for 30 years

What function describes the curve? The whole scheduler hangs on the answer.

Ebbinghaus himself fit his data to a power function: b = 100k / ((log t)^c + k). For decades after SuperMemo entered the scene, the field assumed forgetting was exponential: R(t) = e^(-t/S), where S is memory stability. SuperMemo rode this for 30 years.

In 2024 the FSRS team showed the assumption was wrong. A power function fits real-world data better:

R(t, S) = (1 + F · t/S)^C, where F = 19/81 and C = -0.5.

The mechanism: individual memories may decay exponentially, but a flashcard deck is a mixture of memories at different strengths. Mix exponentials and you get a power law in the aggregate. Ebbinghaus had quietly known this — his own data fits a power function better than an exponential. It took the SRS field 139 years to notice.

| Model | Equation | Tail Behavior | Fit to Real Data |
|---|---|---|---|
| Exponential | R = e^(-t/S) | Drops to near-zero quickly | Good for single-item, poor for mixed decks |
| Power law | R = (1 + t/(9S))^(-1) | Heavy tail, slower decay | Better fit across 10K+ user collections |

Source: A technical explanation of FSRS, Expertium's Blog.
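
The mixture claim is easy to check numerically. Below is a minimal sketch, with a made-up spread of stabilities: each memory decays exponentially on its own, yet the average hangs on far longer than a single exponential with a matched stability.

```ruby
# A deck is a mixture of memories at different stabilities. Each memory
# decays exponentially on its own, but the average develops a heavy tail.
stabilities = [1.0, 3.0, 10.0, 30.0, 100.0] # days; an illustrative spread

r_exp = ->(t, s) { Math.exp(-t / s) }                # single-memory exponential
r_pow = ->(t, s) { (1 + (19.0 / 81) * t / s)**-0.5 } # FSRS power curve

[1, 10, 100].each do |t|
  mixture = stabilities.sum { |s| r_exp.call(t, s) } / stabilities.size
  puts format("t=%3d days  mixture=%.3f  exp(S=30)=%.3f  power(S=30)=%.3f",
              t, mixture, r_exp.call(t, 30.0), r_pow.call(t, 30.0))
end
```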

1967: Pimsleur's graduated intervals

Before computers, Paul Pimsleur published a graduated interval recall schedule in 1967, designed for cassette-tape language instruction:

5 sec → 25 sec → 2 min → 10 min → 1 hour → 5 hours → 1 day → 5 days → 25 days → 4 months → 2 years

Each interval is roughly 5× the previous. Hand-tuned for audio, where you cannot flip back. Pimsleur courses still ship this schedule unchanged. Fixed. No adaptation to the learner. Effective anyway — proof that any exponential schedule beats no schedule.

1972: Leitner's boxes — spaced repetition with cardboard

Sebastian Leitner published So lernt man lernen in 1972. Five physical boxes of flashcards. Get a card right, it moves rightward (longer interval). Get it wrong, back to Box 1.

| Box | Review Frequency |
|---|---|
| 1 | Every day |
| 2 | Every 2 days |
| 3 | Every 4 days |
| 4 | Every 9 days |
| 5 | Every 14 days |

[Figure: the Leitner box system. A correct answer promotes the card one box rightward (less frequent review); a wrong answer demotes it back to box 1. Source: Wikimedia Commons (CC0).]

The Leitner system is the only spaced repetition algorithm you can run with cardboard and a kitchen drawer. The insight — spend your study budget on what you know least — is the seed every later algorithm grew from.
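
The whole system fits in a few lines of code. A minimal sketch, using the box intervals from the table above:

```ruby
# Leitner in miniature: a correct answer promotes the card one box,
# a wrong answer sends it straight back to box 1.
INTERVALS = { 1 => 1, 2 => 2, 3 => 4, 4 => 9, 5 => 14 } # box => days

def leitner_step(box, correct)
  next_box = correct ? [box + 1, INTERVALS.size].min : 1
  [next_box, INTERVALS[next_box]]
end

box = 1
box, days = leitner_step(box, true)  # => box 2, review in 2 days
box, days = leitner_step(box, false) # => box 1, review tomorrow
```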

1985–1987: Wozniak and SM-2

On February 25, 1985, a 22-year-old molecular biology student in Poznań named Piotr Wozniak — sick of forgetting English vocabulary and biochem facts — started hand-tracking optimal inter-repetition intervals in a notebook. Two and a half years later, on December 13, 1987, he shipped SuperMemo 1.0 for DOS. The first computer program that scheduled flashcards.

The algorithm inside, SM-2, tracks three variables per card:

  • n: repetition number
  • EF: easiness factor (init 2.5, adjusted by responses)
  • I: inter-repetition interval in days

The interval schedule:

I(1) = 1 day
I(2) = 6 days
I(n) = I(n-1) × EF    for n > 2

EF after each review:

EF' = EF + (0.1 - (5-q) × (0.08 + (5-q) × 0.02))

q is the 0–5 quality rating. EF floored at 1.3.

That is the whole algorithm. Two formulas, three variables, fits in a tweet. One person, no formal memory model, just empirical self-observation. Low-entropy keystrokes from the start.
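
Transcribed into code, it is barely longer than the prose. A sketch; the q < 3 branch follows Wozniak's published rule that a failed card restarts its repetitions with EF unchanged:

```ruby
# SM-2: three variables per card (n, EF, I), two formulas.
# q is the 0-5 response quality; q < 3 restarts repetitions, EF unchanged.
def sm2_review(n, ef, interval, q)
  return [1, ef, 1] if q < 3

  ef += 0.1 - (5 - q) * (0.08 + (5 - q) * 0.02)
  ef = 1.3 if ef < 1.3 # EF floor

  interval =
    case n
    when 1 then 1              # I(1) = 1 day
    when 2 then 6              # I(2) = 6 days
    else (interval * ef).round # I(n) = I(n-1) x EF
    end

  [n + 1, ef, interval]
end

n, ef, i = sm2_review(1, 2.5, 0, 5) # first review, rated 5 => [2, 2.6, 1]
```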

And it works. SM-2 has been running, nearly untouched, inside Anki since 2006, inside Mnemosyne since 2003, and inside dozens of clones. The most widely deployed spaced repetition algorithm in history — written by one student, given away.

The cracks:

| Limitation | Consequence |
|---|---|
| No probability model | Cannot predict how likely you are to recall a card at any moment |
| Fixed initial intervals (1, 6) | No adaptation to card difficulty before first review |
| Linear EF adjustment | Overreacts to single bad reviews; slow to recover |
| No per-user optimization | Same formula for a medical student and a casual hobbyist |
| No forgetting model | When you fail a card, it just resets — no signal about what went wrong |

1989–2016: the SuperMemo divergence

Wozniak kept iterating. SM-4 (1989) introduced an optimization matrix. SM-5 (1989) made it converge faster. SM-8 through SM-18 piled on two-component memory (stability + retrievability), neural-net optimization, and incremental reading.

All of it stayed locked inside a paid Windows product. The rest of the world kept shipping SM-2. The interesting algorithms sat behind a license screen for 20 years while every open-source flashcard app cargo-culted a 1987 design.

Wozniak's own history is one of the most remarkable single-author research programs in software. It is also a case study in what proprietary isolation does to a field.

2016: Duolingo's Half-Life Regression

In 2016 Burr Settles and Brendan Meeder at Duolingo published A Trainable Spaced Repetition Model for Language Learning (ACL 2016). The algorithm: Half-Life Regression (HLR).

HLR models each word's half-life in memory — the time until recall probability drops to 50%. Unlike SM-2 it:

  • Uses logistic regression with psycholinguistic features (word frequency, cognate status, user history)
  • Trains on millions of real review records
  • Predicts actual recall probabilities, not just "next time"

On Duolingo's data, HLR cut prediction error by more than 45% versus baselines. In live A/B tests:

| Metric | Improvement |
|---|---|
| Practice session retention | +9.5% |
| Lesson retention | +1.7% |
| Overall daily activity | +12% |

Source: Settles & Meeder (2016), ACL. Code: github.com/duolingo/halflife-regression.

Proof of concept: ML on real review data beats hand-tuned heuristics by a wide margin. But HLR was wedded to Duolingo's feature set and never escaped the building.
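
The shape of the model is easy to sketch. The weights and features below are invented for illustration; the production model used Duolingo-specific lexeme features:

```ruby
# Half-Life Regression: predicted half-life h = 2^(theta . x); recall
# probability after delta days is p = 2^(-delta / h).
def hlr_half_life(theta, x)
  2.0**theta.zip(x).sum { |w, f| w * f }
end

def hlr_recall(delta_days, half_life)
  2.0**(-delta_days / half_life)
end

theta = [0.3, -0.4, 1.0]        # weights for [times_correct, times_wrong, bias]
x     = [4, 1, 1]               # four successes, one failure so far
h     = hlr_half_life(theta, x) # 2^1.8, about 3.5 days
p     = hlr_recall(7.0, h)      # about 0.25 after a week
```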

2022–2025: FSRS, finally open

In 2022 Jarrett Ye released FSRS — Free Spaced Repetition Scheduler. Open-source, modern ML, written for Anki. By November 2023 Anki shipped it as a native option. By 2025 it was the default for new users.

FSRS models memory with the DSR (Difficulty, Stability, Retrievability) framework:

| Variable | Symbol | Definition | Range |
|---|---|---|---|
| Difficulty | D | How hard it is to increase stability for this card | 1–10 |
| Stability | S | Days for retrievability to drop from 100% to 90% | 0.1–36,500 |
| Retrievability | R | Probability of successful recall right now | 0–1 |

The core equations.

Forgetting curve (power function):

R(t, S) = (1 + F · t/S)^C, where F = 19/81 and C = -0.5

Stability after successful recall:

S'_r = S · (1 + e^(w₈) · (11 - D) · S^(-w₉) · (e^(w₁₀·(1-R)) - 1) · hard_penalty · easy_bonus)

where hard_penalty (w₁₅) applies when the review was rated Hard and easy_bonus (w₁₆) when it was rated Easy; both are 1 for Good.

Stability after forgetting (lapse):

S'_f = w₁₁ · D^(-w₁₂) · ((S+1)^(w₁₃) - 1) · e^(w₁₄·(1-R))

The 19 weights (w₀ through w₁₈) are optimized per-user via gradient descent on review history. That is the move: FSRS treats scheduling as a machine learning problem, with log loss between predicted and actual recall as the objective.

Source: The Algorithm (FSRS Wiki), ABC of FSRS.
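
Transcribed into code (a sketch: w holds the 19 per-user weights w[0]..w[18]; values come from the optimizer, and hard_penalty/easy_bonus correspond to w[15] and w[16]):

```ruby
# The three FSRS equations, almost line for line.
F_CONST = 19.0 / 81
C_CONST = -0.5

# R(t, S): probability of recall t days after the last review.
def retrievability(t, s)
  (1 + F_CONST * t / s)**C_CONST
end

# Stability after a successful review. hard_penalty (w[15], < 1) applies
# when the card was rated Hard; easy_bonus (w[16], > 1) when rated Easy.
def stability_on_recall(s, d, r, w, hard_penalty: 1.0, easy_bonus: 1.0)
  s * (1 + Math.exp(w[8]) * (11 - d) * s**(-w[9]) *
       (Math.exp(w[10] * (1 - r)) - 1) * hard_penalty * easy_bonus)
end

# Stability after a lapse.
def stability_on_lapse(s, d, r, w)
  w[11] * d**(-w[12]) * ((s + 1)**w[13] - 1) * Math.exp(w[14] * (1 - r))
end
```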

The benchmark: receipts

The open-spaced-repetition/srs-benchmark project scores algorithms on real Anki review data across thousands of user collections. Metric: log loss between predicted recall probability and binary outcome. Lower is better.

| Algorithm | Year | Model Type | Parameters | Log Loss ↓ | Notes |
|---|---|---|---|---|---|
| SM-2 (trainable) | 1987 | Linear EF | 2 | 0.346 | Added probability layer for benchmark |
| Leitner | 1972 | Fixed boxes | 0 | ~0.36 | No probability prediction natively |
| HLR (Duolingo) | 2016 | Logistic regression | 3+ | 0.327 | Feature-engineered |
| FSRS v3 | 2022 | DSR exponential | 13 | 0.332 | First release |
| FSRS v4 | 2023 | DSR power | 17 | 0.326 | Power curve, +4 params |
| FSRS-5 | 2024 | DSR power + same-day | 19 | 0.325 | Same-day review handling |
| FSRS-6 | 2025 | DSR power + flat curve | 21 | 0.324 | Optimizable curve flatness |

Source: Benchmark of Spaced Repetition Algorithms, Expertium's Blog. Dataset: 10,000+ Anki user collections.

Headline: FSRS-5 beats SM-2 in 97.4% of user collections. Against SM-17 — one of SuperMemo's recent proprietary algorithms — FSRS-6 wins in 83.3% of collections. The open one beats the locked one.

Translated into hours: users switching from SM-2 to FSRS report 20–30% fewer reviews for the same retention level. For someone doing 200 reviews a day, that is 40–60 fewer cards per session, compounded over years. That is real wall-clock time off your life.

Why FSRS works — three innovations

1. Per-user parameter optimization. SM-2 ships the same formula for everyone. FSRS trains 19 weights on your review history. If you consistently nail 30-day intervals, FSRS notices your stability grows fast and stretches your intervals. If you struggle with kanji compounds, it tightens them. The model adapts to the learner — finally.

2. Difficulty modulates stability growth, not just base interval. A difficult card (D=8) with high stability (S=90 days) gains less stability on a successful review than an easy card (D=3) at the same stability. Hard things need more reinforcement even after you "know" them. SM-2 could not see this; FSRS encodes it directly.

3. Retrievability-aware scheduling. FSRS knows your exact recall probability at any moment. Reviewing at R=0.70 produces a larger stability gain than reviewing at R=0.95 — because retrieving at lower confidence is a desirable difficulty. This is Robert Bjork's theory, implemented in code.
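
The curve also runs in reverse, which is how a desired retention setting becomes a due date. Solving R = (1 + F · t/S)^C for t:

t = (S/F) · (R_d^(1/C) - 1)

With F = 19/81 and C = -0.5 this is t = S · (81/19) · (R_d^(-2) - 1), and at the default target R_d = 0.9 it collapses to t = S exactly: the constants were chosen so that stability is, by definition, the 90%-retention interval.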

The science underneath: why spacing works

The spacing effect is not just an empirical regularity. Three converging lines of evidence:

The testing effect. Roediger & Karpicke (2006): students who tested themselves three times (STTT) recalled 61% after one week. Students who studied four times (SSSS) recalled 40%. Testing is not assessment — it is the most powerful encoding event you have. Rowland's 2014 meta-analysis of 159 studies pegged the effect at Hedges' g = 0.50.

Desirable difficulties. Robert and Elizabeth Bjork coined the framework in 1994: conditions that make learning harder short-term — spacing, interleaving, retrieval practice — produce better long-term retention. The difficulty is the mechanism, not a cost paid for the result.

Consolidation. Memory consolidation during sleep transfers labile hippocampal traces to stable neocortical representations. Spacing reviews across sleep cycles gives consolidation room to operate. Cramming competes with itself for the same resource.

| Study | Year | N | Key Finding | Effect Size |
|---|---|---|---|---|
| Ebbinghaus | 1885 | 1 | Forgetting follows power law decay | |
| Cepeda et al. (meta) | 2006 | 184 articles | Spacing produces 10–30% better retention | d = 0.42–0.77 |
| Roediger & Karpicke | 2006 | 120 | Testing beats restudying at 1 week (61% vs 40%) | large |
| Rowland (meta) | 2014 | 159 studies | Testing effect robust across conditions | g = 0.50 |
| Kornell & Bjork | 2008 | 120 | Interleaving doubles classification accuracy | d = 0.99 |

The canon

Five texts. If you want to go deep, start here.

| Text | Author(s) | Year | Why It Matters |
|---|---|---|---|
| Spaced Repetition for Efficient Learning | Gwern Branwen | 2009 | The definitive overview. 50,000+ words. History, research, practice, software. If you read one thing, read this. |
| Augmenting Long-term Memory | Michael Nielsen | 2018 | A working scientist using Anki daily for years. The "memory is a choice" frame that reset how people thought about SRS. |
| Make It Stick | Brown, Roediger, McDaniel | 2014 | The science of learning distilled for practitioners. Spacing, testing, interleaving, and why most study habits are theatre. |
| Andy Matuschak's notes | Andy Matuschak | 2019– | Frontier work on "mnemonic media" — embedding spaced repetition inside reading itself. |
| A Three-Day Journey from Novice to Expert | Jarrett Ye | 2023 | FSRS's creator walking you from zero to the DSR model in three sittings. |

Gwern earns special mention. Continuously updated since 2009, arguably the most thorough single piece ever written on the subject. His 5-minute rule for what to add to your deck: if you will spend more than five minutes over your lifetime looking the thing up or suffering from not knowing it, it is worth a card. That heuristic ends most "what should I Anki?" arguments.

Nielsen reframed the entire conversation in 2018: "The single biggest change that Anki brings about is that it means memory is no longer a haphazard event, to be left to chance. Rather, it guarantees I will remember something, with minimal effort. Anki makes memory a choice."

Timeline: 1885–2025

| Year | Event | Innovation |
|---|---|---|
| 1885 | Ebbinghaus publishes Über das Gedächtnis | Quantified forgetting for the first time |
| 1939 | H.F. Spitzer tests 3,600+ students | First large-scale spacing effect study |
| 1967 | Pimsleur's graduated interval recall | Hand-tuned schedule for audio learning |
| 1972 | Leitner's box system | Physical spaced repetition without computation |
| 1985 | Wozniak begins self-experiments | Birth of computational spaced repetition |
| 1987 | SuperMemo 1.0 / SM-2 | First computer scheduling algorithm |
| 1989 | SM-4, SM-5 | First adaptive algorithms (optimization matrix) |
| 1991 | SuperMemo 2.0 released as freeware | SM-2 spreads globally |
| 1994 | Bjork coins "desirable difficulties" | Theoretical framework for why spacing works |
| 2003 | Mnemosyne released | First open-source SRS (uses SM-2) |
| 2006 | Anki released (Damien Elmes) | SM-2 goes mainstream; 10M+ users eventually |
| 2006 | Roediger & Karpicke testing effect paper | Landmark retrieval practice evidence |
| 2016 | Duolingo's HLR paper | ML-based scheduling enters the literature |
| 2022 | Jarrett Ye releases FSRS v3 | Open-source DSR model for Anki |
| 2023 | FSRS v4 (power curve) | Power function replaces exponential |
| 2023 | Anki 23.10 ships native FSRS | FSRS reaches millions of users |
| 2024 | FSRS-5 (same-day reviews, 19 params) | Handles short-term memory |
| 2025 | FSRS-6 (21 params) | Optimizable curve flatness |

Implementing FSRS — it is small

Fernando Borretti wrote Implementing FSRS in 100 Lines. He was not exaggerating. The core algorithm fits in one file. The heavy part of the system is the optimizer that trains the 19 parameters against your review history — and even that is a few hundred lines of straightforward gradient descent.
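
A caricature of that optimizer, not Borretti's code: one weight (an initial-stability guess) stands in for all 19, fit by finite-difference gradient descent on the same log-loss objective. FSRS proper backpropagates through every stability update in the review history.

```ruby
# Toy optimizer: fit an initial-stability guess s0 to review outcomes by
# gradient descent on log loss. REVIEWS is fabricated data:
# [[elapsed_days, recalled? (1/0)], ...]
REVIEWS = [[1.0, 1], [3.0, 1], [7.0, 0], [2.0, 1]]

def predict(t, s)
  (1 + (19.0 / 81) * t / s)**-0.5
end

def loss(s0)
  total = REVIEWS.sum do |t, y|
    p = predict(t, s0).clamp(1e-6, 1 - 1e-6)
    -(y * Math.log(p) + (1 - y) * Math.log(1 - p))
  end
  total / REVIEWS.size
end

s0, lr, eps = 2.0, 0.5, 1e-5
200.times do
  grad = (loss(s0 + eps) - loss(s0)) / eps # finite-difference gradient
  s0 -= lr * grad
end
# s0 has drifted toward the stability that best explains the outcomes
```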

My Kanji implements FSRS-5 natively in Ruby. The study session system uses the DSR model to schedule kanji reviews, with per-user weight optimization. It lives in app/services/fsrs_scheduler.rb — single file, ~200 lines, no external dependencies. Evergreen knowledge ported into a Rails app.

What is still unsolved

Spaced repetition is good. It is not done.

1. Cold start. FSRS needs review history to optimize weights. A brand-new user gets defaults. Matuschak notes the first 100 reviews are essentially flying blind.

2. Inter-item interference. Learn 待 (wait) and 持 (hold) on the same day and they collide. No production algorithm models this. The FSRS team has discussed it; it remains open research.

3. Recall is not understanding. Current SRS asks "can you retrieve this?" — not whether you understand it in context, can use it productively, or have integrated it with the rest of your knowledge. Matuschak's mnemonic medium is the most interesting work pushing past pure recall.

4. Emotional engagement. Matuschak argues the critical thing to optimize in an SRS is emotional connection to the review session and its contents. No algorithm does this. The 200th card of a session feels different from the 5th, and the scheduler is blind to that.

5. The right retention target. FSRS lets you set a desired retention rate (default: 90%). Is 90% optimal? Higher means more reviews. Lower means more forgetting. The Expertium benchmark shows diminishing returns above 90%, but the right point depends on the learner's goals, time budget, and material. No algorithm adapts it dynamically yet.
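
Plugging targets into the inverted curve from earlier makes the tradeoff concrete: t(R_d) = S · (81/19) · (R_d^(-2) - 1) gives intervals of about 2.4·S at an 80% target, exactly S at 90%, and about 0.46·S at 95%. That is roughly a five-fold swing in interval length (and hence review frequency) between the 80% and 95% settings.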

The bottom line

Spaced repetition is the closest thing in learning to a free lunch. The spacing effect has been replicated for 140 years across every population, material, and condition anyone has tested. The only question left is how efficiently your algorithm exploits it.

SM-2 was a breakthrough in 1987 and still works. FSRS is measurably better — fewer reviews, accurate predictions, per-user adaptation — and it is open source. Over a year of daily study, a 20% reduction in review load is dozens of hours back. Time you spend learning new material instead of paying tax on a worse scheduler.

The algorithm decides when you forget. Pick a good one.
