EuraStudy · The Lab
Research & engineering · 10 entries · Updated 20 June 2026

Edition MMXXVI

The Lab

Bringing AI into the education of Europe.

The Lab is where EuraStudy is worked out in the open — the learning science we build on, the questions we have not settled, and the craft behind every surface a student meets. We treat tutoring, assessment and memory as problems with a literature, build only the methods that literature supports, and show our working: every claim cited, every figure computed to specification, every result we are unsure of named as such.

§ 01 · Methods

Methods we build on

The instruments the platform actually runs on. The evidence behind each is cited under From the field.

M·01
Knowledge tracing

A running, per-topic estimate of what each student has actually mastered, updated with every answer — so practice targets what is not yet secure rather than what is already known. We advance on evidence, not on the calendar.

Read · How a Machine Reads What You Know · A·02 Assessment
Fig. mastery over practice
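
A minimal sketch of the per-answer update, assuming standard Bayesian Knowledge Tracing (the model named in How a Machine Reads What You Know) rather than whatever the platform itself runs. The parameter values and the names `BKTParams`, `update_mastery` and `trace` are illustrative assumptions, not EuraStudy's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BKTParams:
    """Hypothetical per-topic parameters for Bayesian Knowledge Tracing."""
    p_init: float = 0.2    # P(mastered) before any practice
    p_learn: float = 0.15  # P(not mastered -> mastered) after each attempt
    p_slip: float = 0.1    # P(wrong answer | mastered)
    p_guess: float = 0.2   # P(right answer | not mastered)


def update_mastery(p_mastered: float, correct: bool, params: BKTParams) -> float:
    """Return the mastery estimate after observing one answer."""
    if correct:
        known = p_mastered * (1 - params.p_slip)
        unknown = (1 - p_mastered) * params.p_guess
    else:
        known = p_mastered * params.p_slip
        unknown = (1 - p_mastered) * (1 - params.p_guess)
    # Bayes' rule: how likely is mastery, given this answer?
    posterior = known / (known + unknown)
    # Learning step: the attempt itself may have taught the topic.
    return posterior + (1 - posterior) * params.p_learn


def trace(answers, params=BKTParams()):
    """Return the running mastery estimate after each answer in turn."""
    estimates = []
    p = params.p_init
    for correct in answers:
        p = update_mastery(p, correct, params)
        estimates.append(p)
    return estimates


if __name__ == "__main__":
    for step, p in enumerate(trace([True, False, True, True]), start=1):
        print(f"after answer {step}: P(mastered) = {p:.3f}")
```

A practice scheduler built on this would pick topics whose estimate is still below some mastery threshold; that threshold is a further assumption this sketch does not fix.
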
M·02
Item response theory & adaptive testing

Every question carries a calibrated difficulty; the adaptive diagnostic chooses the item that tells us the most about where a student stands — converging on a fair estimate of readiness in a handful of well-chosen questions.

Read · Twenty Questions · A·02 Assessment
Fig. the zone of practice
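
One way to make "the item that tells us the most" concrete, sketched under assumptions: a two-parameter logistic model, item choice by maximum Fisher information, and a grid-based posterior mean for ability with a standard normal prior. The source states only that items carry a calibrated difficulty; the discrimination parameter, the prior, and every name below are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str
    difficulty: float            # b: ability at which P(correct) = 0.5
    discrimination: float = 1.0  # a: how sharply the item separates abilities


def p_correct(theta: float, item: Item) -> float:
    """Two-parameter logistic probability of a correct answer."""
    return 1.0 / (1.0 + math.exp(-item.discrimination * (theta - item.difficulty)))


def information(theta: float, item: Item) -> float:
    """Fisher information the item carries about ability theta."""
    p = p_correct(theta, item)
    return item.discrimination ** 2 * p * (1.0 - p)


def next_item(theta: float, pool, asked) -> Item:
    """Pick the unasked item that is most informative at the current estimate."""
    candidates = [item for item in pool if item.item_id not in asked]
    return max(candidates, key=lambda item: information(theta, item))


def estimate_ability(responses) -> float:
    """Posterior mean of ability on a grid from -4 to 4, standard normal prior.

    `responses` is a list of (Item, bool) pairs.
    """
    grid = [i / 10 for i in range(-40, 41)]
    weights = []
    for theta in grid:
        w = math.exp(-0.5 * theta * theta)  # unnormalised prior
        for item, correct in responses:
            p = p_correct(theta, item)
            w *= p if correct else (1.0 - p)
        weights.append(w)
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)


if __name__ == "__main__":
    pool = [Item("easy", -2.0), Item("medium", 0.1), Item("hard", 2.0)]
    first = next_item(0.0, pool, asked=set())
    print("first item:", first.item_id)
    print("estimate after a correct answer:", round(estimate_ability([(first, True)]), 2))
```

With equal discrimination, the most informative item is the one whose difficulty sits closest to the current ability estimate, which is why a well-run diagnostic feels neither trivially easy nor hopelessly hard.
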
M·03
Spaced repetition & the forgetting curve

Memory decays after study; a review timed just before recall fails lifts it back to full and flattens the next decay, so the gaps between reviews can widen each time. This rests on the spacing and testing effects, two of the most robust results in the science of learning.

Read · The Half-Life of Knowing · A·03 Learning science
Fig. review before you forget
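
A small sketch of why the intervals expand, assuming an exponential forgetting curve with a half-life that grows by a fixed factor after each successful review. The half-life, growth factor and recall threshold below are illustrative assumptions, not the platform's settings.

```python
import math


def recall_probability(days_since_review: float, half_life_days: float) -> float:
    """Exponential forgetting: recall halves every `half_life_days`."""
    return 2.0 ** (-days_since_review / half_life_days)


def days_until_threshold(half_life_days: float, threshold: float = 0.9) -> float:
    """Days until predicted recall falls to `threshold`.

    Solves 2 ** (-t / h) = threshold for t.
    """
    return -half_life_days * math.log2(threshold)


def schedule(first_half_life_days: float = 1.0, growth: float = 2.0,
             reviews: int = 5, threshold: float = 0.9):
    """Return the gaps (in days) between successive reviews.

    Assumes every review succeeds and multiplies the half-life by `growth`.
    """
    intervals = []
    half_life = first_half_life_days
    for _ in range(reviews):
        intervals.append(days_until_threshold(half_life, threshold))
        half_life *= growth
    return intervals


if __name__ == "__main__":
    for number, gap in enumerate(schedule(), start=1):
        print(f"review {number}: wait {gap:.2f} days")
```

Because each review lengthens the half-life, the same recall threshold is reached later each time, and the gaps grow geometrically under these assumptions.
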
§ 02 · Index

Four areas of work

Every dispatch is filed under exactly one area, the kind of fine-grained index a research group publishes under.

A·01
AI tutoring

How the tutor decides what to say — and, more often, what to withhold.

1 entry
Latest · Withholding the Answer

A·02
Assessment & feedback

Measuring answers, marking like an examiner, and judging the tutor itself.

3 entries
Latest · Twenty Questions

A·03
Learning science

What the evidence on practice, memory and difficulty actually supports.

2 entries
Latest · The Half-Life of Knowing

A·04
Design & craft

The instruments, typography and figures the whole platform is built from.

4 entries
Latest · A Calculus of Diagrams

§ 03 · The issue

The work

Newest first — each piece carries the figures it argues from.

Engineering · Design & craft
A Calculus of Diagrams

Most diagrams in educational software are pictures someone drew once. Ours are values in a typed language, compiled to pixels by a function that cannot lie. A formal account of the figure engine — its grammar, its determinism, and the proof obligations that keep nearly two thousand diagrams honest.

Fig. D·01
D·01 · 20 Jun · 9 min read

Research · Learning science
The Half-Life of Knowing

You can know something on Tuesday and not know it on Friday — memory has a half-life, and it is shorter than anyone would like. Spaced repetition is the engineering discipline built on that uncomfortable fact: schedule each review for the moment a memory is about to fade, and a little forgetting becomes the thing that makes learning stick. We trace the idea from Ebbinghaus to the algorithms now built into the tools millions revise with.

Fig. D·02
D·02 · 19 Jun · 14 min read

Research · Assessment & feedback
Twenty Questions

A good adaptive test can pin down what you know in a dozen questions, not fifty — because it chooses each one to be the most revealing it can ask. We trace the quiet mathematics of item response theory and computerized adaptive testing, from the shape of a single question to the loop that learns you in real time, and the places where adaptivity has to be reined in.

Fig. D·03
D·03 · 18 Jun · 13 min read

Research · Assessment & feedback
How a Machine Reads What You Know

Every adaptive tutor rests on a quiet act of inference: guessing the knowledge it cannot see from the answers it can. We trace that idea from Bayesian Knowledge Tracing to its deep-learning successors — and the honest places where the deeper model is not the better one.

Fig. D·04
D·04 · 17 Jun · 9 min read

Engineering · Design & craft
On the Making of a Quiet Machine

A study of the obsessions behind a learning platform built for four national examinations — where nothing is accidental, and restraint is the most exacting discipline of all.

Fig. D·05
D·05 · 14 Jun · 7 min read

Research · AI tutoring
Withholding the Answer

A system that hands over the answer is not teaching. We argue that the central design problem for a machine tutor is not how to explain, but when and how much to withhold.

Fig. D·06
D·06 · 11 Jun · 9 min read

Engineering · Design & craft
Drawn, Not Decorated

Every chart, curve and diagram a student meets is drawn to exact specification by a single figure engine — and verified before it ships. Never faked, never screenshotted.

Fig. D·07
D·07 · 8 Jun · 5 min read

Research · Assessment & feedback
How Should We Measure a Tutor?

A tutor that keeps students busy is not the same as a tutor that helps them learn. We argue for measuring AI tutors by learning gains and transfer — and against the engagement metrics that quietly reward the wrong thing.

Fig. D·08
D·08 · 5 Jun · 8 min read

Engineering · Design & craft
One Platform, Four National Exams

The Austrian Matura and the German Abitur are live; the French Baccalauréat and Spanish Selectividad are on the waitlist. The hard part was never the content — it was deciding what four exams could share without flattening any of them.

Fig. D·09
D·09 · 2 Jun · 6 min read

Research · Learning science
Adaptive Practice and Its Limits

Adaptivity is the most over-promised word in educational technology. Two effects in the learning-science record are real and worth building on; almost everything sold above them is decoration.

Fig. D·10
D·10 · 28 May · 8 min read

Reading the field · Beyond EuraStudy

From the field.

A standing reading of the research on artificial intelligence and learning — the work of others, across decades, that the rest of this notebook is built on. These are published findings by researchers across the field, not EuraStudy’s own results; we summarise them and point to the original work.

  1. 01 · Tutoring efficacy

    One-to-one tutoring moved the average student to the 98th percentile.

    Students who worked with a personal tutor outperformed conventionally taught peers by about two standard deviations — Bloom’s “two sigma” result. It set the central ambition that has driven educational technology ever since: to reproduce, at scale, what a good tutor does for one learner.

    Benjamin S. Bloom · 1984 · The 2 Sigma Problem · Educational Researcher

  2. 02 · Tutoring efficacy

    Intelligent tutoring systems came within a whisker of human tutors.

    Reviewing decades of controlled studies, VanLehn measured human tutoring at roughly 0.79 standard deviations over no tutoring and step-based intelligent tutors at about 0.76 — far below Bloom’s famous 2.0, and close enough to each other to reframe the question from “can a machine tutor?” to “what does effective tutoring actually consist of?”

    Kurt VanLehn · 2011 · The Relative Effectiveness of Human Tutoring, Intelligent Tutoring Systems, and Other Tutoring Systems · Educational Psychologist

  3. 03 · Evidence & meta-analysis

    Across fifty controlled evaluations, intelligent tutors raised scores by about two-thirds of a standard deviation.

    The median effect was about 0.66 standard deviations, but it was far larger on the locally designed tests that match what a system actually taught (around 0.73) than on standardised exams (around 0.13). The gain is real, and a reminder that the size of an effect depends heavily on what you choose to measure.

    James A. Kulik & J. D. Fletcher · 2016 · Effectiveness of Intelligent Tutoring Systems: A Meta-Analytic Review · Review of Educational Research

  4. 04 · Memory & practice

    Being tested on material beats re-reading it — and the gap widens with time.

    Learners who practised retrieving what they had studied remembered substantially more a week later than those who simply restudied, even though the restudiers felt more confident at the time. The “testing effect” is among the most robust results in the science of learning, and the reason retrieval practice, not mere exposure, sits at the centre of exam preparation.

    Henry L. Roediger III & Jeffrey D. Karpicke · 2006 · Test-Enhanced Learning · Psychological Science

  5. 05 · Cognitive load

    Working memory is the bottleneck — and the help a novice needs becomes noise for an expert.

    Cognitive load theory holds that instruction fails when it overwhelms a narrow working memory. Later work on the “expertise-reversal effect” sharpened the point: scaffolding that helps a beginner actively hinders a more advanced learner. Together they argue that good tutoring must adapt its support to the individual, not just to the topic.

    John Sweller · 1988 · Cognitive Load During Problem Solving: Effects on Learning · Cognitive Science

  6. 06 · Learning theory

    Good help is temporary: a scaffold exists in order to be removed.

    Wood, Bruner and Ross named “scaffolding” — the support an expert lends so a learner can do what they cannot yet do alone, an idea since drawn together with Vygotsky’s zone of proximal development. Its defining feature is that it fades: support that never withdraws breeds dependence, not competence. It is the principle behind any tutor that deliberately holds back the answer.

    David Wood, Jerome S. Bruner & Gail Ross · 1976 · The Role of Tutoring in Problem Solving · Journal of Child Psychology and Psychiatry

  7. 07 · Feedback

    Feedback is one of the most powerful influences on learning — and one of the most variable.

    Synthesising hundreds of studies, Hattie and Timperley placed feedback among the strongest levers on achievement, with effects ranging from large to outright negative. What separated effective feedback from the rest was whether it told a learner where they were going, how they were doing, and what to do next. Feedback that grades without directing can achieve nothing at all.

    John Hattie & Helen Timperley · 2007 · The Power of Feedback · Review of Educational Research

  8. 08 · Critical perspectives

    Perhaps the machine should stay simple, and the intelligence should stay human.

    Baker argues the field over-invested in modelling the learner’s mind and under-invested in the simpler, robust systems that actually help — and in keeping teachers in the loop. A standing corrective for anyone building an AI tutor: sophistication is not the goal; better learning is.

    Ryan S. Baker · 2016 · Stupid Tutoring Systems, Intelligent Humans · International Journal of Artificial Intelligence in Education
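
The effect sizes quoted in entries 1 to 3 can be read as percentiles with a short calculation. This is a sketch that assumes normally distributed scores with equal spread in both groups, which is our simplification and not a claim made by the cited papers. Under that assumption, the average tutored student sits at the percentile given by the standard normal cumulative distribution evaluated at the effect size, so an effect of 2.0 lands at about the 98th percentile. The function name and the labels are illustrative.

```python
from statistics import NormalDist


def percentile_of_average_treated_student(effect_size_sd: float) -> float:
    """Percentile of the comparison group reached by the average treated student.

    Assumes both groups are normally distributed with the same spread.
    """
    return 100.0 * NormalDist().cdf(effect_size_sd)


# Effect sizes as quoted in the entries above, in standard deviations.
EFFECT_SIZES = {
    "Bloom 1984, one-to-one tutoring": 2.0,
    "VanLehn 2011, human tutoring": 0.79,
    "VanLehn 2011, step-based tutors": 0.76,
    "Kulik & Fletcher 2016, median (about two-thirds)": 0.66,
}

if __name__ == "__main__":
    for label, d in EFFECT_SIZES.items():
        pct = percentile_of_average_treated_student(d)
        print(f"{label}: d = {d:.2f} -> percentile {pct:.1f}")
```
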

Selected reading · 8 works · a starting point, not a survey

Every figure on the Lab is computed to specification and verified before it ships — never a screenshot. Every external claim is cited to a real, published work. Where we are unsure, we say so.