AI and Equality

News

Bridging the AI Divide: Prioritizing African Languages and Local Knowledge

Dr. Lilian Wanzare on Model Selection and two groundbreaking projects: Ken Corpus and Kenyan Sign Language (KSL)


African languages and local communities are frequently sidelined in the global rush to develop Artificial Intelligence. Yet, to create truly inclusive, impactful, and equitable AI, the foundational models must be built on the rich, diverse linguistic and knowledge systems of the continent. The time has come to shift the narrative and place African voices at the core of AI’s development and design.

The African AI & Equality Toolbox emphasizes that Stage Four: Model Selection and Development is a critical checkpoint for rights-respecting AI. For developers working in contexts like Kenya, this stage isn’t just technical—it’s profoundly ethical.

As Dr. Wanzare highlights, the core challenge is the “fundamental problem: there is not enough quality data for different use cases” in African languages. Building a sophisticated model from scratch is often impossible due to limitations in data, compute power, and funding.

So, what’s the solution? Dr. Wanzare’s approach to model selection is a masterclass in strategic constraint management:

  1. Prioritizing Adaptability: Instead of striving for the largest, most data-hungry models, her team prioritizes models that can be fine-tuned with smaller datasets. This focuses effort on what is “practically possible” and allows for incremental improvement.
  2. Transfer Learning with an Audit: While leveraging pre-trained foundational models is an option (transfer learning), her team subjects these global models to a critical local audit. They don’t just look at benchmark scores; they check: Is the model capturing the nuances of dialects? Can it handle code-switching? Is the output culturally and linguistically accurate, as validated by the local community?


This iterative and context-driven model selection is what enables groundbreaking projects like Ken Corpus (empowering African languages) and the Kenyan Sign Language (KSL) Project.

The KSL Project perfectly illustrates how user needs must dictate model choice. The initial assumption was that deaf students could use existing English-based Automatic Speech Recognition (ASR) for translation. However, the community clarified that their minds are visual, not auditory, making ASR inadequate.

This insight fundamentally shifted the model selection:

  • The Problem: Exclusion in higher education.
  • The Priority: Visual understanding and autonomy.
  • The Model Choice: Developing an avatar output and leveraging pose estimation models like MediaPipe.


By focusing on movement and facial features rather than just transcription, the model was designed to capture the complex non-manual features of KSL: the expressions that convey meaning. Dr. Wanzare’s team chose models capable of sequential learning to capture the flow of signs, proving that an effective model is one that genuinely reflects the community’s way of communicating.

Representation and Participation: The Core Principles
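To make the “sequential learning” choice concrete, here is a minimal sketch of how per-frame pose landmarks (the kind of output a pose-estimation library such as MediaPipe produces) are shaped into fixed-length windows for a sequence model. The window and stride sizes are arbitrary choices for illustration, and the frame contents are random placeholders, not real sign data.

```python
# Sketch: shaping per-frame pose landmarks into fixed-length windows so a
# sequence model (RNN/Transformer) can learn the flow of a sign, not just
# single poses. Frame contents are random placeholders.

import random

NUM_LANDMARKS = 33  # MediaPipe's pose model tracks 33 body landmarks
COORDS = 3          # x, y, z per landmark

def frames_to_windows(frames, window=16, stride=8):
    """Flatten each frame's landmarks, then slide a fixed-length window
    over the sequence; each window is one training example."""
    flat = [[c for lm in f for c in lm] for f in frames]
    return [flat[i:i + window]
            for i in range(0, len(flat) - window + 1, stride)]

# 40 fake frames, each with 33 (x, y, z) landmarks
frames = [[(random.random(), random.random(), random.random())
           for _ in range(NUM_LANDMARKS)] for _ in range(40)]
windows = frames_to_windows(frames)
# Each window: 16 frames x (33 landmarks * 3 coords) = 16 x 99 features
```

Windowing a landmark stream this way is what lets a model attend to transitions between poses, which is where much of a sign’s meaning, including its non-manual features, lives.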

For Dr. Wanzare, if she had to prioritize one ethical principle from the Toolbox, it would be representation and participation.

“Is it representative enough of our communities?” is the ultimate filter for model success. This community-centered approach also cleverly addresses the tricky issue of transparency. While large language models are often “black boxes,” the continuous co-creation—where the community provides data, tests the output, sees the errors, and witnesses the model’s iteration—builds trust. They may not see how the neurons fire, but they feel the ownership and the evolution.

Dr. Lilian Wanzare’s work is a global blueprint: African AI innovation thrives when the technical act of model selection is driven by deep social insight, cultural respect, and a commitment to inclusivity. The future of AI is local, and its models must be chosen with the community, not just for them.

We’re creating a global community that brings together individuals passionate about inclusive, human rights-based AI.

Join our AI & Equality community of students, academics, data scientists and AI practitioners who believe in responsible and fair AI.