About Kolokwa
Kolokwa is a crowdsourced voice collection project dedicated to building the first open dataset for Liberian English. Our goal is to enable AI systems -- speech recognition, voice assistants, translation tools -- to understand the way Liberians actually speak.
The Problem
Current AI voice technology is trained primarily on American and British English. When Liberians speak naturally -- using Kolokwa, our everyday English with its unique vocabulary, grammar, and pronunciation -- these systems fail. Voice assistants don't understand us. Transcription is inaccurate. Translation tools are useless for our dialect.
Our Solution
We're building a high-quality, open voice dataset by asking Liberians around the world to record themselves speaking common Kolokwa phrases. Each recording is paired with the phrase's Standard English translation, creating training data that AI systems can learn from.
How It Works
- You see a Kolokwa phrase with its Standard English translation.
- You record yourself saying the phrase naturally.
- Your recording is reviewed and added to the dataset.
- Researchers and developers use the dataset to train AI models.
Open Data
The Kolokwa dataset will be released as an open resource, freely available to researchers, developers, and organizations working on language technology. We believe that language data should be a public good, especially for underrepresented languages and dialects.
Privacy First
We take your privacy seriously. Recordings are associated with anonymous session identifiers, not your personal identity. You can request deletion of your data at any time. Read our privacy policy for full details.
Get Involved
Whether you're in Monrovia or Minnesota, if you speak Kolokwa, we need your voice. Every recording helps, and it only takes a few minutes to make a difference.