The Forgetting Curve — pedagogy.dev

In the 1880s Hermann Ebbinghaus memorized nonsense syllables and tested himself over time. He found that retention drops sharply at first, then levels off — a curve that turns up again and again in both human and machine learning.

The shape

Memory retention R after time t is often modeled as exponential decay:

R = e^(−t / S)

R — proportion retained (1 = perfect, 0 = gone).
t — time since learning.
S — memory strength (stability). Larger S = slower forgetting.
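
As a minimal sketch, the formula can be evaluated directly. The function name `retention` and the sample value S = 5 days are illustrative assumptions, not part of any particular library or study.

```python
import math


def retention(t: float, stability: float) -> float:
    """Predicted proportion retained after time t, using R = e^(-t / S)."""
    if stability <= 0:
        raise ValueError("stability must be positive")
    if t < 0:
        raise ValueError("t must be non-negative")
    return math.exp(-t / stability)


# Illustrative only: S = 5 days is an assumed value, not a measured one.
for days in (0, 1, 5, 20):
    print(f"day {days:>2}: R = {retention(days, 5.0):.3f}")
```

Note that at t = S the model predicts R = 1/e, roughly 37% retained, so S can be read as the time it takes retention to fall to about a third.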

The key feature is that loss is fastest immediately after learning. Most of what you forget, you forget soon.
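
The front-loaded loss follows from the formula: the slope of the curve is −R/S, so the rate of forgetting is proportional to how much is still retained. The sketch below compares the loss during the first unit of time with the loss during a later one. The helper names and S = 5 are assumptions chosen for illustration.

```python
import math


def retention(t: float, stability: float) -> float:
    """R = e^(-t / S), the model from the article."""
    return math.exp(-t / stability)


def loss_between(t0: float, t1: float, stability: float) -> float:
    """Proportion forgotten between times t0 and t1 (t0 <= t1)."""
    return retention(t0, stability) - retention(t1, stability)


S = 5.0  # assumed stability, in days
first_day = loss_between(0, 1, S)   # loss during day 1
tenth_day = loss_between(9, 10, S)  # loss during day 10

# Under exponential decay the ratio depends only on the gap between the
# two windows (9 days here): first_day / tenth_day == e^(9 / S).
print(f"day 1 loss:  {first_day:.3f}")
print(f"day 10 loss: {tenth_day:.3f}")
print(f"ratio:       {first_day / tenth_day:.2f}")
```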

Why it’s a “pattern,” not just a fact

The same exponential form appears across learning systems:

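Spaced repetition algorithms build their schedules on this curve. FSRS explicitly estimates a stability value for each card and schedules the next review for the moment R is predicted to dip to a target, ~90% by default (recent FSRS versions fit a power-law variant of the curve rather than a pure exponential). SM-2 does not model R directly; it approximates the same behavior by multiplying each interval by an ease factor. In both cases each successful review increases S, flattening the curve, and this is the mechanism behind The Spacing Effect.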
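Exponential decay also appears in model training: exponential learning-rate schedules shrink the step size by a constant factor per step, and exponential moving averages down-weight older values geometrically. Both are the same e^(−t/S) skeleton, repurposed.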
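
Both items reduce to small rearrangements of the same formula. The sketch below is only the arithmetic implied by R = e^(−t/S), not the actual scheduling code of SM-2 or FSRS. The function names, the 0.9 target, and the smoothing factor `alpha` are assumptions for illustration.

```python
import math


def next_interval(stability: float, target_retention: float = 0.9) -> float:
    """Time until R = e^(-t / S) falls to target_retention.

    Solving e^(-t / S) = target for t gives t = -S * ln(target).
    """
    if stability <= 0:
        raise ValueError("stability must be positive")
    if not 0 < target_retention < 1:
        raise ValueError("target_retention must be between 0 and 1")
    return -stability * math.log(target_retention)


def ema_time_constant(alpha: float) -> float:
    """Equivalent S for an exponential moving average with smoothing alpha.

    An EMA scales the weight of a sample that is k steps old by
    (1 - alpha)**k, which equals e^(-k / S) when S = -1 / ln(1 - alpha).
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must be between 0 and 1")
    return -1.0 / math.log(1.0 - alpha)


# A card with an assumed S of 10 days, reviewed when R is predicted to hit 90%.
print(f"review after {next_interval(10.0):.2f} days")

# An EMA with alpha = 0.1 forgets at the pace of a curve with this S (in steps).
print(f"EMA time constant: {ema_time_constant(0.1):.2f} steps")
```

Under this model the 90% review point comes at roughly 0.105 × S, which is why raising S stretches the gap between reviews in direct proportion.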

The practical takeaway

You can’t stop forgetting — but you can raise S. Every well-timed retrieval bends the curve flatter. Review just before you’d forget, not after.
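
To see how raising S spreads reviews out, here is a toy simulation. It assumes every review succeeds and that each success multiplies S by a fixed factor. That growth rule and its value of 2.0 are hypothetical simplifications; real schedulers fit the growth from review history.

```python
import math

GROWTH = 2.0  # hypothetical: how much one successful review multiplies S


def review_schedule(initial_stability: float, reviews: int,
                    target_retention: float = 0.9,
                    growth: float = GROWTH) -> list[tuple[float, float]]:
    """Return (day_of_review, interval) pairs for a run of successful reviews.

    Each review is placed where R = e^(-t / S) is predicted to reach
    target_retention, measured from the previous review.
    """
    schedule = []
    stability = initial_stability
    day = 0.0
    for _ in range(reviews):
        interval = -stability * math.log(target_retention)
        day += interval
        schedule.append((day, interval))
        stability *= growth  # a successful retrieval flattens the curve
    return schedule


# Assumed starting point: S = 10 days after first learning.
for day, interval in review_schedule(10.0, 5):
    print(f"review on day {day:6.2f} (after a gap of {interval:5.2f} days)")
```

In this sketch each gap is twice the previous one, so a handful of well-timed reviews covers a much longer span than the same number of evenly spaced ones.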