Knowledge tracing without the black box
Bayesian Knowledge Tracing made legible to the people who teach and the people who build — not just to the model.
Confidentiality: the institution, datasets, and proprietary tooling are abstracted. What follows is the transferable decision logic, not internal detail.
01 — Context
An adaptive-practice product used a knowledge-tracing model to decide what each student saw next. It worked — but no one outside the data team could say why a student was shown a particular problem, and teachers had quietly stopped trusting the recommendation.
A model nobody can question is a model nobody can correct.
02 — The real decision
The question was not “is the model accurate?” It was: can a teacher and an engineer hold the same mental picture of what the model believes, and act on it? Accuracy that can’t be inspected buys very little in a classroom.
A recommendation a teacher can’t interrogate isn’t personalization. It’s a slot machine with a progress bar.
03 — My role
I owned the translation layer: turning the BKT parameters — prior knowledge, learn rate, slip, and guess — into language a teacher could reason with, and a set of guardrails an engineer could enforce. I did not rebuild the model; I made its beliefs legible.
04 — Constraints
- No retraining: the change had to wrap the existing model, not replace it.
- Glanceable: a teacher needed the gist in seconds, the detail on demand.
- Honest about doubt: low-confidence beliefs had to look low-confidence.
05 — The logic used
We exposed the four BKT parameters as a small, named story per skill: what we assumed coming in, how fast this student tends to learn it, and how noisy the evidence is. Slip and guess stopped being hidden knobs and became a stated reason a green (apparently mastered) cell might still be wrong.
prior knowledge → "where we started believing"
learn rate → "how fast this clicks for them"
slip / guess → "how noisy the evidence is"
posterior → "what we believe now, and how sure"
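The four named quantities map onto the standard BKT update step. The following is a minimal sketch in Python, assuming the textbook two-state formulation. The names `BKTParams` and `bkt_update` and the parameter values are illustrative assumptions, not taken from the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    """Illustrative container for the four BKT parameters of one skill."""
    prior: float  # "where we started believing": P(known) before any evidence
    learn: float  # "how fast this clicks": P(unknown -> known) per attempt
    slip: float   # P(wrong answer | skill known)
    guess: float  # P(right answer | skill unknown)


def bkt_update(p_known: float, correct: bool, params: BKTParams) -> float:
    """Return P(known) after one observed answer (standard BKT step)."""
    if correct:
        evidence_known = p_known * (1 - params.slip)
        evidence_unknown = (1 - p_known) * params.guess
    else:
        evidence_known = p_known * params.slip
        evidence_unknown = (1 - p_known) * (1 - params.guess)
    # Bayes' rule: belief conditioned on this single answer.
    conditioned = evidence_known / (evidence_known + evidence_unknown)
    # Learning transition: the skill may have been acquired on this attempt.
    return conditioned + (1 - conditioned) * params.learn


# Hypothetical parameter values, chosen only for illustration.
params = BKTParams(prior=0.3, learn=0.2, slip=0.1, guess=0.2)
after_correct = bkt_update(params.prior, True, params)
after_miss = bkt_update(params.prior, False, params)
```

A correct answer raises the belief and a miss lowers it. How far each one moves the belief depends on slip and guess, which is why they read as "how noisy the evidence is".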
06 — Alternatives considered
We could have shown a single mastery percentage and hidden the machinery. It tested well in demos and badly in classrooms: teachers either over-trusted it or ignored it entirely. Exposing the uncertainty cost us a cleaner-looking UI and bought us a model teachers would actually argue with.
07 — The system designed
Same answers, two stories: a percent-correct score and the BKT posterior disagree about what this student knows.
- Percent: 60% correct, which flattens easy and hard items into one number.
- Posterior: likely mastered; the model read the two misses as probable slips and gave them little weight.
- Why it matters: acting on the percent alone would re-teach what the student already holds.
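One way a 60% score and a high posterior can coexist is sketched below with the standard BKT update. The answer sequence and all four parameter values are made up for illustration; they are not the product's data or its fitted parameters.

```python
def bkt_update(p_known, correct, learn, slip, guess):
    """One standard BKT step: Bayes update on the answer, then the learning transition."""
    if correct:
        known, unknown = p_known * (1 - slip), (1 - p_known) * guess
    else:
        known, unknown = p_known * slip, (1 - p_known) * (1 - guess)
    conditioned = known / (known + unknown)
    return conditioned + (1 - conditioned) * learn


# Hypothetical session: three correct answers, two misses.
answers = [True, False, True, False, True]
# Hypothetical parameters: misses are noisy (high slip), lucky guesses are rare.
prior, learn, slip, guess = 0.5, 0.2, 0.2, 0.1

percent_correct = sum(answers) / len(answers)

posterior = prior
for correct in answers:
    posterior = bkt_update(posterior, correct, learn, slip, guess)
```

With these values `percent_correct` is 0.6 while `posterior` ends above 0.9. The high slip rate makes each miss weak evidence, and the low guess rate makes each correct answer strong evidence. Other parameters or another answer order would tell a different story, which is the point of showing both.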
08 — Validation & quality criteria
- Every recommendation could be traced to a stated belief and its confidence.
- Teachers in review sessions could correctly predict the next item the model would choose — the test that it had become legible.
- A confidently wrong belief was logged as a defect, not smoothed over.
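The first and last criteria can be sketched as a trace record plus a defect check. Everything here is an assumption made for illustration: the names `TracedRecommendation` and `is_confidently_wrong`, the 0.9 threshold, and the three-miss rule. The source does not specify how "confident" or "wrong" were measured.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TracedRecommendation:
    """Hypothetical record tying a recommended item to the belief behind it."""
    item_id: str
    skill: str
    posterior: float  # the belief that drove the choice
    reason: str       # teacher-facing statement of that belief


def is_confidently_wrong(posterior_before, later_answers, high=0.9, min_misses=3):
    """Flag a high-confidence belief that later answers contradict.

    The threshold and miss count are illustrative defaults, not known values.
    """
    misses = sum(1 for correct in later_answers if not correct)
    return posterior_before >= high and misses >= min_misses


rec = TracedRecommendation(
    item_id="item-17",  # hypothetical identifiers
    skill="fractions",
    posterior=0.93,
    reason="Believed mastered; checking with a harder item.",
)
defect = is_confidently_wrong(rec.posterior, [False, False, True, False])
```

Here `defect` is true because the belief was stated at 0.93 and three of the next four answers were misses. A belief of 0.6 followed by the same answers would not be flagged, since it never claimed to be sure.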
09 — Reflections
The win was not a better model; it was a model that earned the right to be questioned. Legibility is not a UI veneer over the math — it is a constraint you design the math to satisfy.