Orinode builds voice AI infrastructure for Nigerian languages. Our models are trained on the speech of real people. Our products are deployed in contexts — healthcare, finance, telecommunications — where AI errors carry real consequences.
This document states plainly how we operate: how we collect data, what contributors are entitled to, how we measure the fairness of what we build, and what we commit to doing when things go wrong. It is written for grant reviewers, research partners, pilot customers, and the communities whose languages we work with.
We do not believe that a framework document substitutes for practice. Every commitment here is either already in operation or has a named deadline. We will update this document when practices change.
All speakers contributing audio to our training corpus go through an explicit, informed consent process before recording begins. The consent form is available in English, Hausa, Yoruba, Igbo, and Nigerian Pidgin.
The consent form explains in plain language:
No deceptive framing is used. Participants are not recruited with misleading descriptions of what the recordings will be used for. We do not record in contexts where participants could reasonably feel pressured to participate (e.g., in employer-organized group sessions without individual opt-out).
We pay contributors at above-market rates for the Nigerian gig economy. Our current target is 1.5× the prevailing hourly rate for comparable transcription or voice annotation work on platforms operating in Nigeria.
Current rate: ₦3,500–₦5,000 per hour of completed recordings, depending on language and session complexity. This will be updated as market rates change.
Payment is processed within 5 business days of session completion. We do not withhold payment pending quality review — contributors are paid for their time, not penalized post-hoc for audio that doesn't meet technical criteria.
Our position: the economic value created by African linguistic data should benefit African contributors directly and immediately, not only as an externality of a company's eventual commercial success.
Contributors have the right to request deletion of their recordings from our corpus at any time, subject to the technical constraint noted in Section 1: recordings already incorporated into published model weights cannot be fully removed from those weights without retraining.
To make a deletion request, email [email protected] with the subject line "Data deletion request" and include the session date and language. We will acknowledge the request within 5 business days and act on it within 30 days.
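As an illustration of how the two response windows above could be tracked, here is a minimal sketch. It assumes that the 30-day window is counted in calendar days and that public holidays are ignored; neither is stated in this document. The function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Return the date `days` business days after `start` (Mon-Fri only).

    Assumption: public holidays are not taken into account.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def deletion_deadlines(received: date) -> dict:
    """Compute the acknowledgement and completion deadlines for a request."""
    return {
        "acknowledge_by": add_business_days(received, 5),
        # Assumption: "30 days" means calendar days from receipt.
        "complete_by": received + timedelta(days=30),
    }


# Example: a request received on Friday, 1 March 2024.
deadlines = deletion_deadlines(date(2024, 3, 1))
print(deadlines["acknowledge_by"], deadlines["complete_by"])
```

For a request received on a Friday, the acknowledgement deadline falls on the following Friday, since the weekend does not count toward the 5 business days.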
We do not sell or share individual contributor data with third parties. Aggregated, anonymized dataset releases do not include personally identifiable information (names, phone numbers, location below state level).
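One way to enforce the exclusion described above is to filter speaker metadata through an allowlist before any aggregated release. The sketch below is illustrative only: the field names (`name`, `phone`, `lga`, `town`, `state`, `language`, `gender`, `duration_seconds`) are hypothetical and do not describe our actual schema.

```python
# Hypothetical set of fields considered safe for an aggregated release.
# Anything not listed here (names, phone numbers, sub-state locations,
# or any field added to the schema later) is dropped by default.
RELEASE_FIELDS = {"language", "gender", "state", "duration_seconds"}


def anonymize_record(record: dict) -> dict:
    """Keep only allowlisted fields from a speaker metadata record."""
    return {key: value for key, value in record.items() if key in RELEASE_FIELDS}


# Example record with made-up values.
raw = {
    "name": "Example Speaker",
    "phone": "0000000000",
    "state": "Kano",
    "lga": "Example LGA",      # location below state level
    "town": "Example Town",    # location below state level
    "language": "Hausa",
    "gender": "female",
    "duration_seconds": 412,
}

released = anonymize_record(raw)
print(released)
```

An allowlist is used here rather than a blocklist so that a newly added metadata field is excluded from releases unless someone explicitly approves it.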
Our data handling practices are aligned with the Nigeria Data Protection Regulation (NDPR) and its successor, the Nigeria Data Protection Act 2023 (NDPA). We do not currently operate in the European Economic Area and have not conducted a GDPR assessment, but we will do so before any EU expansion.
AI systems can encode and amplify bias. In voice AI for Nigerian languages, the most significant risks are:
Our evaluation methodology addresses all three:
Speaker-stratified evaluation. Our held-out test sets are stratified by speaker gender, region of origin, and dominant language. We report Word Error Rate (WER) and Character Error Rate (CER) broken down by each stratum, not only as a single aggregate figure. An aggregate WER that masks 3× worse performance on female speakers is not an acceptable result. See our public benchmark table for current baseline comparisons.
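To make the idea of per-stratum reporting concrete, the following sketch computes WER separately for each value of a stratification key and then the ratio between the worst and best stratum. It is not our evaluation pipeline: the record layout (`reference`, `hypothesis`, `gender`), the function names, and the toy transcripts are all assumptions made for illustration. WER is computed as pooled word-level edit distance divided by pooled reference word count.

```python
from collections import defaultdict


def edit_distance(ref: list, hyp: list) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer_by_stratum(utterances: list, key: str) -> dict:
    """Return {stratum: WER}, pooling errors and reference words per stratum."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for utt in utterances:
        ref = utt["reference"].split()
        hyp = utt["hypothesis"].split()
        errors[utt[key]] += edit_distance(ref, hyp)
        words[utt[key]] += len(ref)
    return {s: errors[s] / words[s] for s in words if words[s] > 0}


def disparity_ratio(wer_table: dict):
    """Worst stratum WER divided by best; None if the best WER is zero."""
    best, worst = min(wer_table.values()), max(wer_table.values())
    return worst / best if best > 0 else None


# Toy data with placeholder tokens, not real transcripts.
utterances = [
    {"gender": "female", "reference": "w1 w2 w3 w4", "hypothesis": "w1 w2 xx w4"},
    {"gender": "male", "reference": "w1 w2 w3 w4", "hypothesis": "w1 w2 w3 w4"},
    {"gender": "male", "reference": "w5 w6 w7 w8", "hypothesis": "w5 w6 w7"},
]

table = wer_by_stratum(utterances, key="gender")
ratio = disparity_ratio(table)
print(table, ratio)
```

The same function can be called with a different `key` (for example a hypothetical `region` or `dominant_language` field) to produce each breakdown, and the disparity ratio gives a single number to flag when an aggregate figure hides a large gap between strata.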
Code-switching evaluation. We maintain a dedicated code-switched test set drawn from real conversational speech. Performance on this set is reported separately from monolingual performance.
Adversarial probing. Before any production deployment, we test the system with inputs specifically designed to elicit failure modes: fast speech, heavy noise, strong regional accents, and multi-party crosstalk. We document failure modes publicly alongside performance claims.
Publication commitment. We commit to publishing our full evaluation results — including results where our system underperforms — in our model release documentation. We will not cherry-pick metrics or restrict evaluation to favorable test conditions.
Our primary pilot sectors — healthcare and financial services — carry elevated risk for AI errors. We apply additional constraints for these deployments:
We commit to transparency on failures, not only successes:
We do not expect to get everything right. We expect to be honest when we don't.
For questions about this framework, data rights requests, or to report a concern:
Data rights: [email protected]
Research ethics: [email protected]
General: [email protected]
Orinode Ltd · Lagos, Nigeria