Problem
Hausa is the second most spoken language in Africa, with an estimated 64 million native speakers. Despite its widespread use, natural language processing (NLP) for Hausa remains significantly underdeveloped, and the gap begins with something basic: there is no accessible online repository of the language. The Center for Nigerian Languages holds a physical copy of the first Hausa-to-Hausa dictionary, but no digital version of the dictionary exists, despite the growing adoption of mobile devices throughout Nigeria and other Hausa-speaking countries.
Solution
We worked with the Centre for Research in Nigerian Languages to create the first online Hausa-to-Hausa living dictionary. We digitized the entire physical copy, which contains more than 28,000 word entries with definitions, word forms, and synonyms, and added support for word pronunciations provided by the Center for Nigerian Languages teams. Kamusun Hausa is a living dictionary: users may suggest new words, which are then queued for approval by language experts at the Center for Nigerian Languages through the Kamusun Hausa Admin Dashboard. The digitized dictionary lets users access language resources without a physical copy and helps preserve the Hausa language.
Additionally, the platform offers an API for users to bulk download dictionary entries, which can be used for other analysis or natural language processing tasks.
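As one illustration of downstream use, the sketch below builds a synonym lookup from a batch of downloaded entries. The `DictionaryEntry` shape and its field names are assumptions based on the fields described above (definitions, word forms, synonyms), not the API's actual response format, and the sample values are placeholders rather than real dictionary content.

```typescript
// Hypothetical shape of one downloaded entry; the real API response may differ.
interface DictionaryEntry {
  word: string;
  definitions: string[];
  wordForms: string[];
  synonyms: string[];
}

// Build a symmetric synonym lookup: if A lists B as a synonym,
// the index maps A -> B and also B -> A.
function buildSynonymIndex(entries: DictionaryEntry[]): Map<string, Set<string>> {
  const index = new Map<string, Set<string>>();
  const link = (from: string, to: string): void => {
    const existing = index.get(from) ?? new Set<string>();
    existing.add(to);
    index.set(from, existing);
  };
  for (const entry of entries) {
    const word = entry.word.trim().toLowerCase();
    for (const synonym of entry.synonyms) {
      const other = synonym.trim().toLowerCase();
      // Skip blank values and entries that list themselves.
      if (other === "" || other === word) continue;
      link(word, other);
      link(other, word);
    }
  }
  return index;
}

// Placeholder data for illustration only.
const sampleEntries: DictionaryEntry[] = [
  { word: "word-a", definitions: ["placeholder"], wordForms: [], synonyms: ["word-b"] },
  { word: "word-c", definitions: ["placeholder"], wordForms: [], synonyms: [] },
];
const synonymIndex = buildSynonymIndex(sampleEntries);
console.log(Array.from(synonymIndex.get("word-b") ?? []));
```

Normalizing case and whitespace before indexing is a design choice for this sketch; a real pipeline would need to decide how to treat tone marks and spelling variants.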
Design
Users vary in their level of access to technology, and we anticipate that most will reach the dictionary through mobile devices. For this reason, Kamusun Hausa was built to be both mobile- and desktop-friendly, with a simple navigation flow across pages. Because Kamusun Hausa is a living dictionary, we designed flows for both user submission and admin approval of new words.
Designing for a language we do not speak ourselves was also a significant challenge. Because the project depends on delivering accurate, reliable information about the Hausa language, we consulted language experts throughout the design process and relied on them to resolve missing information discovered while converting the dictionary.
Tech Stack
Kamusun Hausa is a full-stack Next.js application designed and developed from scratch. The backend uses MongoDB, Prisma, and tRPC for API calls; the frontend uses React, Tailwind CSS, and Material UI. TypeScript is used throughout the application. Kamusun Hausa is deployed on Vercel, with the database hosted on MongoDB Atlas.
Features
Word Pronunciation
To support accurate understanding and long-term language preservation, each dictionary entry includes an associated word pronunciation accessible directly from the word’s page. These pronunciations are recorded by the Center for Nigerian Languages to ensure linguistic consistency and reliability across the platform. While expert recordings form the foundation of this feature, users may also suggest alternative pronunciations, which can be reviewed by language specialists. By pairing written definitions with standardized audio references, the dictionary improves accessibility for learners and researchers alike and helps capture spoken aspects of Hausa that are often lost in purely text-based resources.
Suggest New Words
As a living dictionary, Kamusun Hausa enables community participation by allowing users to suggest new words and enrich existing entries. Contributors can submit proposed words along with definitions, word classes, pronunciations, and other relevant linguistic details, helping the dictionary evolve alongside real-world language use. To maintain accuracy and scholarly integrity, all submissions are reviewed and moderated by administrators at the Center for Nigerian Languages before being incorporated. This collaborative workflow balances open contribution with expert oversight, supporting the project’s broader goal of preserving and expanding access to the Hausa language in a sustainable, community-informed way.
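The submit-then-review flow described above can be sketched as a small in-memory queue. This is a minimal illustration, not the project's implementation: the `SuggestionQueue` class, its method names, and the `WordSuggestion` fields are all hypothetical, and a real version would persist suggestions in the database and check administrator permissions.

```typescript
// All names here are hypothetical; they illustrate the review flow only.
type SuggestionStatus = "pending" | "approved" | "rejected";

interface WordSuggestion {
  id: number;
  word: string;
  definition: string;
  wordClass: string;
  status: SuggestionStatus;
}

class SuggestionQueue {
  private suggestions: WordSuggestion[] = [];
  private published: string[] = [];
  private nextId = 1;

  // A user submits a word; it is queued and not yet part of the dictionary.
  submit(word: string, definition: string, wordClass: string): WordSuggestion {
    const trimmedWord = word.trim();
    const trimmedDefinition = definition.trim();
    if (trimmedWord === "" || trimmedDefinition === "") {
      throw new Error("A suggestion needs both a word and a definition");
    }
    const suggestion: WordSuggestion = {
      id: this.nextId++,
      word: trimmedWord,
      definition: trimmedDefinition,
      wordClass,
      status: "pending",
    };
    this.suggestions.push(suggestion);
    return suggestion;
  }

  // An administrator approves or rejects a pending suggestion.
  // Only approved words are added to the published list.
  review(id: number, decision: "approved" | "rejected"): WordSuggestion {
    const suggestion = this.suggestions.find((s) => s.id === id);
    if (!suggestion) throw new Error(`No suggestion with id ${id}`);
    if (suggestion.status !== "pending") {
      throw new Error(`Suggestion ${id} was already reviewed`);
    }
    suggestion.status = decision;
    if (decision === "approved") this.published.push(suggestion.word);
    return suggestion;
  }

  pending(): WordSuggestion[] {
    return this.suggestions.filter((s) => s.status === "pending");
  }

  publishedWords(): string[] {
    return this.published.slice();
  }
}

// Usage with placeholder values.
const queue = new SuggestionQueue();
const first = queue.submit("word-a", "placeholder definition", "noun");
queue.review(first.id, "approved");
console.log(queue.publishedWords());
```

Rejecting a second review of the same suggestion keeps each decision auditable: a word moves out of the pending state exactly once.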
Admin Dashboard
To support long-term sustainability and data quality, the project includes an administrative dashboard that gives the Center for Nigerian Languages direct oversight of the dictionary’s content. Through this interface, administrators can review and approve user-submitted entries, add or remove words as needed, and manage updates to ensure linguistic accuracy. The dashboard also supports mass uploads of new entries using a custom parser developed during the project, enabling efficient expansion of the dictionary beyond manual, one-by-one updates. In addition, administrators can generate and manage API keys for external users, allowing controlled access to bulk dictionary data for research and natural language processing efforts aligned with the project’s broader mission of expanding access to Hausa language resources.
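To give a sense of what a mass-upload parser involves, the sketch below reads a hypothetical pipe-delimited text format and reports malformed or duplicate lines instead of silently dropping them. The file format, the `parseBulkUpload` function, and the field names are assumptions made for illustration; the project's actual parser and input format are not described here.

```typescript
// Hypothetical upload format, one entry per line:
//   word | word class | definition | synonym1, synonym2
// Blank lines and lines starting with "#" are ignored.
interface ParsedEntry {
  word: string;
  wordClass: string;
  definition: string;
  synonyms: string[];
}

interface ParseResult {
  entries: ParsedEntry[];
  errors: string[];
}

function parseBulkUpload(text: string): ParseResult {
  const result: ParseResult = { entries: [], errors: [] };
  const seen = new Set<string>();

  text.split(/\r?\n/).forEach((rawLine, i) => {
    const lineNumber = i + 1;
    const line = rawLine.trim();
    if (line === "" || line.startsWith("#")) return;

    const fields = line.split("|").map((field) => field.trim());
    // Missing trailing fields default to empty strings.
    const [word = "", wordClass = "", definition = "", synonymField = ""] = fields;

    if (word === "" || definition === "") {
      result.errors.push(`Line ${lineNumber}: a word and a definition are required`);
      return;
    }
    const key = word.toLowerCase();
    if (seen.has(key)) {
      result.errors.push(`Line ${lineNumber}: duplicate word "${word}"`);
      return;
    }
    seen.add(key);

    const synonyms = synonymField
      .split(",")
      .map((s) => s.trim())
      .filter((s) => s !== "");
    result.entries.push({ word, wordClass, definition, synonyms });
  });

  return result;
}

// Usage with placeholder values.
const upload = parseBulkUpload("word-a | noun | placeholder definition | word-b");
console.log(upload.entries.length, upload.errors.length);
```

Collecting errors with line numbers, rather than stopping at the first bad line, lets an administrator fix a large upload file in a single pass.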