Chris Emezue is one of the co-principal investigators for NaijaVoices and a seasoned researcher with a difference.
He is committed to developing intelligent and reliable systems for under-represented contexts (this includes African and indigenous languages and cultures). His research cuts across natural language processing, causality, and reinforcement learning.
In this interview, he explains what Nigeria, and indeed Africa, stand to benefit from the emergence and application of artificial intelligence (AI).
Excerpts:
Introduction:
I am Chris Emezue, a seasoned researcher affiliated with the Mila-Quebec AI Institute in Canada, a prestigious research institute known for its groundbreaking scientific advancements in artificial intelligence. As a researcher, I am committed to developing intelligent and reliable systems for under-represented contexts (this includes African and indigenous languages and cultures).
My research cuts across natural language processing, causality, and reinforcement learning. As a dedicated contributor to the field of AfricaNLP, I have worked on several key projects to improve the representation of low-resource African language technologies (MMTAfrica, OkwuGbe, AfroDigits, IgboAPI). I am also with Lanfrica Labs, a research institute with the mission to foster technological innovation in language technologies. I am one of the principal investigators for the NaijaVoices project. My co-principal investigator is Prof. G.M.T. Emezue.
Gloria Monica Tobechukwu Emezue, popularly known as GMT Emezue, is a Professor of English at the Alex Ekwueme Federal University, Ndufu Alike Ikwo in Ebonyi State, Nigeria. She is a literary critic who could best be described as an Afrocentric scholar with a passion for identifying and promoting positive African values, indigenous practices and knowledge systems, which include African languages.
How is artificial intelligence connected to the NaijaVoices project?
NaijaVoices is to AI what fuel is to a car. Fuel is a much-needed resource for powering a car. Datasets (like the NaijaVoices’) are the needed resources for developing AI applications. Just as fuel scarcity has led to the high cost of fuel and other things in Nigeria, datasets, especially for African technologies, are very scarce and expensive to create.
Therefore, the NaijaVoices project is ushering in a crucial resource to power the creation of artificial intelligence technologies for Nigeria. Imagine being able to write a text in Yoruba, Igbo or Hausa and hear a machine read it out for you in your language. Imagine the immense potential of speech technologies in Nigeria.
All these will only be possible when the underlying data needed to create them is available.
As a co-principal in this project, what has been your biggest challenge and responsibility so far?
The major responsibility for me was management. In the NaijaVoices project, we managed and worked with roughly six thousand people. While my co-principal investigator, Prof. Gloria, managed the on-the-ground activities in Nigeria, I was responsible for global management: financial reporting, progress management, evaluations, and other such responsibilities. This project has by far been one of the largest management experiences that I have been involved in.
And perhaps the greatest task for our managerial team was effectively coordinating six thousand people.
What is the composition of your team like, and from what geographical locations were they drawn?
The NaijaVoices team structure is made up of different teams at different levels. These teams include the managerial team, the local coordinating team, the technical team, the sentence-generating and recording-facilitating team, and the secretarial team, to mention but a few.
We leveraged a participatory approach to construct our teams to ensure that all the needed people and perspectives were present.
For example, we ensured that our sentence-generating and recording teams comprised local native users of these three target languages by collaborating with experts and language users in local communities.
To achieve our objective of getting an adequate representation of voices, we went further and expanded the definition of “native speaker” to cover the whole spectrum of what being one can mean. We included speakers who were born in the native country, speakers born in the native country who live elsewhere, speakers born outside the native country whose parents speak these languages, second-language users with high competence, and so on.
This wide net enabled us to capture diverse perspectives in, for example, generating contextual native sentences.
How long has the project lasted and what notable achievements has your team recorded?
It is an amazing feat in itself that the implementation phase of the NaijaVoices project lasted just four months. A typical project of this scale would require at least one year (which was the final duration of our project proposal). Of course, the short duration made it more intense, but we were able to achieve the goals of the NaijaVoices project partly due to deliberate planning.
For example, one key ingredient here was our managerial team, led by Global KANAC Ltd, which consisted of experts with acumen for simultaneously handling a large number of people in Nigeria. Furthermore, Lanfrica Labs’ provision of compatible technical platforms bolstered our efficiency.
Perhaps our most noteworthy achievement is the NaijaVoices dataset, a unique collection of 1,500 hours of audio and speech data for the three major Nigerian languages. More information about the datasets can be found at https://naijavoices.com/.
How do you intend to handle the criticisms springing up around artificial intelligence, especially as it concerns the African setting and mentality?
The scepticism of many Nigerians concerning artificial intelligence (AI) is indeed a reality. During our recording exercise in the NaijaVoices project, we interacted with close to 6,000 people spanning various religious backgrounds and cultures from different parts of Nigeria. I learned two fundamental things that helped us to achieve success.
Firstly, if you explain AI to someone in their language, using ideas and values that they understand and relate to, it leads to a much better understanding of how this technology can positively affect their lives.
Secondly, the best way to create a life-long positive impact through AI for the Nigerian people is to first investigate and understand the challenges of the Nigerian people and then figure out how to leverage AI to design potential solutions to those challenges. This is important because currently, with the hype around AI in the world, there is a lot of misinformation about AI.
Furthermore, the tech powers of the global North are not readily interested in solving African problems. That is why I believe that it is up to us Africans to create and build our own datasets and our own technological solutions to the challenges in Africa.
What do Nigeria and Africa stand to benefit from this project?
I have always dreamed of a future where I could use the Igbo language to send a WhatsApp voice note to my Hausa (or Yoruba)-speaking friend and they could listen to the voice note in their own Hausa (or Yoruba) language, even though the original voice note was in the Igbo language. For half a decade this has felt like a relatively far-fetched dream due to the scarcity of relevant datasets required to build such technologies.
But with NaijaVoices, this dream is one big step closer to reality! The NaijaVoices datasets are like a seed that has been planted on the complex mosaic of Africa. This seed will usher in a wave of language technologies that redefine how we interact with one another in our multilingual continent. I believe that our multi-linguality and multiculturality should not be a source of division but rather, they should be sources of unity and empowerment.
Thus, by enabling cross-lingual communication across these different languages, which is what NaijaVoices has set out to do, we usher in peace, unity, economic development, and the overall advancement of Nigeria and our African continent in general.
Are there challenges facing the project that require assistance from government or private individuals?
At NaijaVoices, we are on a mission to create empowerment and opportunities through languages. The NaijaVoices dataset is therefore just the beginning. We are spearheading the development of sophisticated speech models for Nigerian applications as well as creating more datasets (which will even include more Nigerian languages).
Having worked in this terrain of high-quality, large dataset creation in Nigeria, we are conversant with all the requirements that would enable the smooth achievement of our goals. The best form of assistance for us right now is financial sponsorship and partnerships that take us closer to achieving our mission.
What should Nigerians expect in years to come regarding the project?
There are over 400 languages/dialects spoken in Nigeria, but current digital technologies do not support most of them. Worryingly, many of these languages are going extinct due to under-appreciation and lack of usage, especially in digital technologies, which are a clear pathway to preserving the languages that will be spoken in tomorrow’s world.
The name “NaijaVoices” represents the preservation and empowerment of the voices (i.e. languages, dialects, experiences, perspectives) of each and every Nigerian by ensuring that digital technologies are accessible in their language.
Our organization, made up of passionate individuals possessing a wide range of expertise, will work in the coming years towards creating more datasets and developing voice technologies that adequately represent all our voices.
I have always had a dream that we can adequately and simultaneously communicate with one another in our different languages, like Ibibio, Fulani, Ijaw, or any of our several Nigerian languages and dialects, without first resorting to the English language as we currently do. During this communication exercise, each person would be speaking his or her language and the other person would understand what has been said in his or her language.