EGRA-AI: marking children's reading in African languages, automatically

Client: Documented in AI for Education's "Learning by Doing 2024/25" case study series. A research consortium project.
Collaborators: University of Cape Town (Dr Cally Ardington); Western Sydney University, International Centre for Neuromorphic Systems (Dr Saeed Afshar, Dr Sergio Chevtchenko, Nikhil Navas); Binding Constraints Lab (Sipumelele Lucwaba); Funda Wande; DataQuest (field coordination); Neurabuild

The problem

In South Africa, 81% of ten-year-olds cannot read for meaning (PIRLS 2021). Knowing where children struggle means assessing them, and reading assessment is done one child at a time, by hand. At national scale that is slow, costly, and rare, especially across African languages where almost no tooling exists.

Our approach

Neurabuild worked embedded with the ICNS researchers at Western Sydney University, the way we always do: close to the people who understand the research. We built the data collection and app layer on the ReadUp platform, an offline-first reading-assessment app that runs on phones and tablets. Dawid Loubser led the architecture and Graham Withey led development, with Ben Blaine on Voice AI and child-speech data collection. The WSU team led the model work. After an initial Sepedi pilot, we adapted the system to isiXhosa, a structurally different language, and replaced a brittle phonetic-rule approach with a fine-tuned Wav2Vec 2.0 model trained on in-domain child speech. To get there we collected 148,962 recordings from 2,796 learners in Grades 1 to 4, and produced 446,886 labels through triple mother-tongue marking: three marks for every recording.
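To make the triple-marking figures concrete, here is a minimal sketch in Python. It checks the label arithmetic from the numbers above and shows one plausible way to combine three marks into a reference label. The majority-vote rule and the function names (`total_labels`, `consensus`) are assumptions for illustration; the case study does not say how disagreements between markers were resolved.

```python
from collections import Counter

RECORDINGS = 148_962        # figure quoted in the case study
MARKERS_PER_RECORDING = 3   # "triple" marking

def total_labels(recordings: int, markers: int) -> int:
    """Each recording receives one label from each marker."""
    return recordings * markers

def consensus(marks: list[bool]) -> tuple[bool, bool]:
    """Return (majority_label, unanimous) for one recording.

    Majority vote is an assumed rule, not the project's documented method.
    With three binary marks a majority always exists.
    """
    label, votes = Counter(marks).most_common(1)[0]
    return label, votes == len(marks)

labels = total_labels(RECORDINGS, MARKERS_PER_RECORDING)  # 446,886
split_vote = consensus([True, True, False])    # majority "correct", not unanimous
clean_vote = consensus([False, False, False])  # unanimous "incorrect"
```

Tracking whether the markers were unanimous is what allows results to be reported separately for clear-cut recordings and for all recordings.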

Outcome

The isiXhosa model reached 95% item-level accuracy on items where all three human markers agreed, and 91% across all items. On a held-out set the AI's marking correlated with human marking at 0.99. The models work with as few as 50 correct and 50 incorrect examples per item (85.3% diagnostic efficiency), rising to 91.7% at 200 each, which is what makes new languages affordable to add. Children scored almost the same whether a teacher guided them or they self-administered (0.839 correlation), so one fieldworker can assess several children at once. The models are available from Western Sydney University under a free perpetual licence. The methods are now carried forward by AI for Education.
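As a rough illustration of how figures like these can be computed, the sketch below measures item-level accuracy twice (over all items, and over only the items where the human markers were unanimous) and then a Pearson correlation between two sets of scores. All data here is invented toy data, and the record layout, the use of a human majority mark as the reference, and the per-learner score lists are assumptions; the case study does not describe the evaluation code.

```python
from math import sqrt

def accuracy(pairs):
    """Share of (model_mark, human_mark) pairs that match."""
    return sum(m == h for m, h in pairs) / len(pairs)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Toy records (hypothetical): (model_mark, human_reference_mark, humans_unanimous)
items = [
    (1, 1, True),
    (0, 0, True),
    (1, 1, True),
    (1, 0, False),
    (0, 0, False),
]

acc_all = accuracy([(m, h) for m, h, _ in items])
acc_unanimous = accuracy([(m, h) for m, h, u in items if u])

# Toy per-learner totals (hypothetical): AI-marked vs human-marked scores
model_scores = [12, 30, 45, 51]
human_scores = [10, 31, 44, 53]
r = pearson(model_scores, human_scores)
```

The same `pearson` helper would apply to comparing teacher-guided and self-administered scores. Accuracy on the unanimous subset tends to be higher because those are the items humans themselves find unambiguous.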

AI for Education — "Learning by Doing 2024/25" case study
