What mastery measurement actually measures

Mastery is one of the most overloaded words in education. It is printed on dashboards, embedded in product names, and used to justify an enormous amount of software. Almost none of that software agrees on what a single cell marked "mastered" is supposed to mean.

Pull on the thread and you find two very different objects hiding under one word. The first is a sequence: a student moved through a set of activities in an expected order. The second is a claim: there is good reason to believe this student can do a specific thing, under specified conditions, again.

A surprising amount of EdTech measures the first while reporting the second. The activity log is real; the inference is decorative.

A claim has falsifiers

The cleanest test I know is to ask: what would make this green cell wrong? If you can answer — “the student would fail items of this type at this difficulty” — you have a measurement. If you cannot, you have a progress bar with good intentions.

Instrumentation should make a teacher a better arguer, not a more obedient one.

This is where psychometrics earns its keep. Item Response Theory lets difficulty and discrimination live in the items rather than in our assumptions. Bayesian Knowledge Tracing lets belief accumulate across encounters instead of resetting with each quiz. Neither is exotic; each is mostly a discipline for being honest about evidence.
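To make those two ideas concrete, here is a minimal sketch. It assumes the common two-parameter logistic form of IRT and the standard BKT update with slip, guess, and learn parameters. The function names and all parameter values are invented for illustration and are not taken from any particular product or dataset.

```python
import math


def p_correct_2pl(theta, a, b):
    """Two-parameter logistic IRT model.

    theta: student ability; a: item discrimination; b: item difficulty.
    Difficulty and discrimination belong to the item, not to the student.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def bkt_update(p_known, correct, p_slip=0.1, p_guess=0.2, p_learn=0.15):
    """One step of Bayesian Knowledge Tracing.

    First condition the belief on the observed response (Bayes' rule),
    then allow for learning between encounters. Default parameter
    values are illustrative assumptions, not calibrated estimates.
    """
    if correct:
        numerator = p_known * (1 - p_slip)
        denominator = numerator + (1 - p_known) * p_guess
    else:
        numerator = p_known * p_slip
        denominator = numerator + (1 - p_known) * (1 - p_guess)
    posterior = numerator / denominator
    return posterior + (1 - posterior) * p_learn


# Belief carries over from one encounter to the next instead of resetting.
belief = 0.3  # hypothetical prior
for outcome in [True, True, False, True]:
    belief = bkt_update(belief, outcome)
```

The point of the sketch is only that both models force you to write down what evidence would move the number, and by how much.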

What this changes in the product

Once mastery is a claim, the interface stops being a wall of green and becomes an argument you can inspect: the claim, the evidence, the confidence. A teacher can disagree with it — which, counterintuitively, is what makes it trustworthy.
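One way to picture that inspectable argument is as a small record that keeps the claim, its evidence, and its confidence together, and can list the evidence that counts against it. This is a hypothetical sketch: the class names, fields, and sample values are all assumptions made for illustration, not a description of any real system.

```python
from dataclasses import dataclass, field


@dataclass
class Evidence:
    item_id: str
    difficulty: float  # on whatever scale the items were calibrated
    correct: bool


@dataclass
class MasteryClaim:
    skill: str         # what the student is claimed to be able to do
    conditions: str    # the conditions under which the claim holds
    confidence: float  # current belief in the claim, between 0 and 1
    evidence: list = field(default_factory=list)

    def falsifiers(self, min_difficulty):
        """Return failed items at or above the difficulty the claim covers."""
        return [
            e for e in self.evidence
            if not e.correct and e.difficulty >= min_difficulty
        ]


# Made-up example data, purely illustrative.
claim = MasteryClaim(
    skill="adds fractions with unlike denominators",
    conditions="unassisted, untimed",
    confidence=0.82,
    evidence=[
        Evidence("q1", 0.4, True),
        Evidence("q2", 0.9, False),
        Evidence("q3", 0.7, True),
    ],
)

# The evidence a teacher could point to when disagreeing with the claim.
against = claim.falsifiers(min_difficulty=0.5)
```

Keeping the counter-evidence one call away is the design choice that matters here: a teacher who disagrees has something specific to point at.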

The cost is restraint. You measure fewer things, more honestly, and you resist the urge to report a number you cannot defend. In my experience that trade is always worth making, and it is almost always unpopular at first.

This essay abstracts patterns from several projects; no institution, dataset, or proprietary system is described.