Nigerian English Speech Recognition

Orinode's Nigerian English ASR achieves 10.9% WER on naturalistic Nigerian-accented English (May 2026, 200 samples) — a substantial improvement over off-the-shelf US/UK-trained ASR systems on the same audio.

Looking for the deployable product? This Nigerian English ASR powers Maraba — an AI call agent for Nigerian businesses that handles Lagos, Abuja, Port Harcourt accents natively, with mid-sentence Hausa/Yorùbá/Igbo code-switching.

What "Nigerian English" actually is

Nigerian English (ISO 639-3 eng-NG) is the local variant of English spoken by an estimated 100+ million Nigerians as a second language. It has stable, systematic phonological and lexical features that diverge from General American or Received Pronunciation:

Vowel mergers — many speakers neutralize /ɪ/–/iː/ ("bit" / "beat"), /ʊ/–/uː/ ("full" / "fool"), /ʌ/–/ɔ/ ("but" / "bought").
Non-rhotic in most regions — /r/ is not pronounced post-vocalically; "car" sounds closer to "kah".
Substrate prosody — speech rhythm is syllable-timed rather than stress-timed, shifting the temporal cues English-trained ASR depends on.
Distinctive vocabulary — flash my phone, okada, kabukabu, NEPA, wahala — domain-specific terms that out-of-the-box ASR transcribes incorrectly or omits.

Why Whisper-large-v3 alone is not enough

OpenAI's Whisper-large-v3 was trained on ~680k hours of multilingual audio, dominated by US and European English. Its English language token (<|en|>) decodes Nigerian English passably but consistently:

Substitutes "beat" for "bit" and vice-versa.
Drops or mis-inserts function words ("the", "a", "to") at higher rates.
Hallucinates entirely on short utterances with strong Nigerian prosody.

Orinode's fine-tuned encoder corrects these systematic errors by training on Nigerian-accented English specifically.

Architecture

Encoder: Whisper-large-v3 fine-tuned on Nigerian English (Common Voice + LDC Nigerian English subset + Orinode-curated radio archives).
Decoder + adapter: shared with the rest of Maraba v1 — Gemma-2-9b-it with LoRA on q/k/v/o projections.
Language token: standard <|en|> — but the encoder has been re-aligned to Nigerian-accented English at the acoustic level.

Performance (May 2026)

Metric	Value	N
WER (normalized)	10.92%	200
WER (raw, case-sensitive)	12.03%	200

Code-switching with Nigerian English

Real Nigerian English speech routinely embeds Hausa, Yorùbá, Igbo, or Pidgin tokens within an otherwise English utterance — e.g. "My boss said sannu, but he no fit even pronounce the name properly". Maraba v1 handles this via a multilingual decoder that can emit any of the six trained languages mid-sentence. See our Naija customer-call code-switch dataset on Hugging Face for representative training data.

Get the model

Open weights on huggingface.co/Orinode. Production API access: [email protected].

For the deployable voice agent built on this model, see Maraba at maraba.ai.