About Kolokwa

Kolokwa is a crowdsourced voice collection project dedicated to building the first open dataset for Liberian English. Our goal is to enable AI systems -- speech recognition, voice assistants, translation tools -- to understand the way Liberians actually speak.

The Problem

Current AI voice technology is trained primarily on American and British English. When Liberians speak naturally -- using Kolokwa, our everyday English with its unique vocabulary, grammar, and pronunciation -- these systems fail. Voice assistants don't understand us. Transcription is inaccurate. Translation tools are useless for our dialect.

Our Solution

We're building a high-quality, open voice dataset by asking Liberians around the world to record themselves speaking common Kolokwa phrases. Each recording is paired with its Standard English translation, creating training data that AI systems can learn from.

How It Works

  1. You see a Kolokwa phrase with its Standard English translation.
  2. You record yourself saying the phrase naturally.
  3. Your recording is reviewed and added to the dataset.
  4. Researchers and developers use the dataset to train AI models.
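
For developers curious about how the steps above might look as data, here is a minimal sketch. The Recording fields and the review and build_dataset functions are illustrative assumptions, not the project's actual schema or code, and the phrase and translation are left as placeholders.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Recording:
    """One contribution: a Kolokwa phrase, its translation, and the audio."""
    session_id: str                  # anonymous session identifier (assumed format)
    phrase: str                      # the Kolokwa phrase shown to the contributor
    translation: str                 # the English translation shown with it
    audio_path: str                  # hypothetical location of the audio file
    approved: Optional[bool] = None  # None until a reviewer has decided


def review(recording: Recording, approved: bool) -> Recording:
    """Record the reviewer's decision on a submission (step 3)."""
    recording.approved = approved
    return recording


def build_dataset(recordings: List[Recording]) -> List[Recording]:
    """Keep only the recordings that passed review (input to step 4)."""
    return [r for r in recordings if r.approved is True]
```

In this sketch, a recording that is rejected or still waiting for review never reaches the dataset that researchers and developers train on.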

Open Data

The Kolokwa dataset will be released as an open resource, freely available to researchers, developers, and organizations working on language technology. We believe that language data should be a public good, especially for underrepresented languages.

Privacy First

We take your privacy seriously. Recordings are associated with anonymous session identifiers, not your personal identity. You can request deletion of your data at any time. Read our privacy policy for full details.
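
As a rough illustration of what "anonymous session identifiers" and deletion on request can mean in practice, here is a minimal sketch. The function names and the record layout are assumptions made for this example; the privacy policy remains the authority on how data is actually handled.

```python
import uuid
from typing import Dict, List


def new_session_id() -> str:
    """Create a random identifier that carries no personal information."""
    return uuid.uuid4().hex


def delete_session(records: List[Dict[str, str]], session_id: str) -> List[Dict[str, str]]:
    """Return the records with everything from one session removed."""
    return [r for r in records if r["session_id"] != session_id]
```

Because the identifier is random rather than derived from a name, email address, or phone number, it links a contributor's recordings to each other without saying who the contributor is.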

Get Involved

Whether you're in Monrovia or Minnesota, if you speak Kolokwa, we need your voice. Every recording helps, and it only takes a few minutes to make a difference.