Automatic Speech Recognition (ASR) Systems
We build automatic speech recognition systems, including for low-resource languages, children's voices, and fully on-device use where there is no connectivity. This is our deepest and hardest-to-copy capability.
The problem
Off-the-shelf ASR is built for adult speakers, major languages, and clean audio with a good connection. The moment you need a child's voice, an under-resourced language, a noisy classroom, or an offline device, the easy options stop working.
Our approach
We embed with the researchers who understand the language and the acoustics, and we engineer ASR that holds up in the real setting. We have fine-tuned models on in-domain child speech to mark reading in isiXhosa and Sepedi, built the data-collection and labelling pipeline behind those models, and shipped Whisper-based recognition for clinical use. We treat data collection, annotation, and on-device constraints as part of the problem, not as an afterthought.
What you walk away with
A speech recognition system built for your language, speakers, and conditions, with the data and deployment path it needs to keep improving.
Led by
- Graham Withey, Development
- Dawid Loubser, Architecture
Common questions
- Can you handle low-resource or African languages?
  - Yes. It is much of what we do.
- Children's voices?
  - Yes. Child speech is a core specialism, not an edge case.
- Can the model run offline, on-device?
  - Yes. Offline-first is central to several of our systems.
- How little data can you start with?
  - Our reading-assessment models reached strong accuracy with as few as 50 correct and 50 incorrect examples per item.
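
To make the last answer concrete, here is a minimal sketch of a data-readiness check: given human-marked recordings, it lists which reading items already have at least 50 correct and 50 incorrect examples. The function name, the item ids, and the `(item_id, is_correct)` label format are assumptions made for illustration, not a description of our pipeline.

```python
from collections import Counter

# Threshold taken from the answer above: 50 correct and 50 incorrect per item.
MIN_PER_CLASS = 50


def items_ready_for_training(labels, min_per_class=MIN_PER_CLASS):
    """Return the set of item ids with enough examples of both classes.

    `labels` is an iterable of (item_id, is_correct) pairs, a hypothetical
    format for recordings that a human marker has already scored.
    """
    # Count examples per (item, class) pair.
    counts = Counter((item_id, bool(is_correct)) for item_id, is_correct in labels)
    item_ids = {item_id for item_id, _ in counts}
    # An item is ready only when both classes meet the threshold.
    return {
        item_id
        for item_id in item_ids
        if counts[(item_id, True)] >= min_per_class
        and counts[(item_id, False)] >= min_per_class
    }


if __name__ == "__main__":
    # Made-up example data: word_01 is balanced, word_02 lacks incorrect readings.
    example = (
        [("word_01", True)] * 50
        + [("word_01", False)] * 50
        + [("word_02", True)] * 80
        + [("word_02", False)] * 12
    )
    print(sorted(items_ready_for_training(example)))
```

A check like this mainly tells you where to aim further collection: items that fall short usually need more incorrect readings, which tend to be rarer than correct ones.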