The potential and pitfalls of Arabic language tools

January 20, 2018

There are more than 7,000 known living languages in the world. One of them, Arabic, is a beautiful ancient language that emerged during the Iron Age in northwestern Arabia and is now an official language in over 20 countries. It is the first language of more than 420 million people, making it the sixth most spoken language in the world. This means there is an enormous market for Arabic content online.

Abdullah Arif, a Dubai-based product development manager working at the cutting edge, is leading a number of development projects focused on the Arabic language, including fonts, processors and dictionaries. He has spotted opportunities for investors and other developers to meet a growing demand for Arabic-language tools and products.

A gap in the market

In Dubai and other international cities in the Middle East, there are many different Arab nationalities, and within those groups there are even more dialects. Abdullah saw a need for a dialect-based dictionary to support communication. 'You have these moments when someone is speaking to you and you don't quite get what they're saying.' He spotted the opportunity to develop a user-generated dictionary to help people over these hurdles.

Abdullah created a resource where the public can contribute entries, as on Wikipedia, and users can look up the meaning of expressions from spoken varieties and dialects of Arabic. So far more than 3,000 people have contributed, and the number is growing rapidly.

After spotting this gap and others like it, Arif developed his own font, a user-generated dictionary and a markdown editor just for Arabic because he realised there weren't any products out there that were good enough.

The Arabic language data challenge

Arif has received lots of requests from researchers and programmers for access to the data from his dictionary. Other developers are working on problems such as translation engines from spoken varieties to classical Arabic, but they're finding it hard to get access to data to build on. Abdullah reports that there is actually lots of good research out there, such as PhD theses, but it's hidden and hard work to find.

The inadequacy of bots

Another major challenge facing Arabic-language products is bots. These are a popular web technology that helps people interact with companies online, so there is a large market waiting for them. But bots are a big challenge for Arabic speakers. The main issue is the range of dialects and variations of Arabic: it's hard to get a bot to understand them all. So far no one has used research to create a workable solution, so this is a definite investment opportunity.

Issues with Arabic language features

People assume text is simple. It isn't. There are significant differences in syntax and character formation between Latin-based languages and Arabic. The majority of established markdown editors within the programming community and other processing products were created in a Latin-language environment so the visual interface was designed to reflect that.

Languages written in the Latin script run from left to right and use a simple 26-character alphabet with few variants, whereas Arabic has 28 letters, each of which looks different depending on its position within a word. Regular markdown editors can't handle that, or the right-to-left text direction.
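To make the direction problem concrete, here is a minimal sketch in Python of how a tool might guess whether a line should be laid out right to left. It uses the "first strong character" idea from the Unicode Bidirectional Algorithm and the standard library's unicodedata module. The function name base_direction is an illustrative assumption, not something taken from Arif's editor.

```python
import unicodedata


def base_direction(text):
    """Guess a line's base direction from its first strongly directional character.

    Hypothetical helper for illustration only; a real editor would rely on a
    full implementation of the Unicode Bidirectional Algorithm.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class in ("R", "AL"):  # right-to-left letter, Arabic letter
            return "rtl"
        if bidi_class == "L":  # left-to-right letter
            return "ltr"
    # Only digits, punctuation or spaces were found: no strong direction.
    return "neutral"


if __name__ == "__main__":
    for sample in ("# Heading", "# عنوان", "123"):
        print(repr(sample), base_direction(sample))
```

Note that markdown symbols such as # carry no strong direction of their own, so a heading marker followed by Arabic text is still detected as right to left.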

In Arabic, there are up to four forms for each character (initial, medial, final and isolated) and they need to connect in a cursive style to form the word properly. On top of this, some Arabic words, such as the word for God, Allah, have unique ligatures when they are written, so they look different from what you would expect if you simply joined the letters together. Finally, there are also marks that carry meaning and don't always correspond to Latin marks. Trying to write in Arabic in a Latin-based markdown editor is like trying to fit a square peg in a round hole.
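The positional forms can be inspected with Python's standard unicodedata module. The sketch below looks up the legacy Unicode "presentation form" code points for a letter. It assumes the usual Unicode naming pattern ("ARABIC LETTER BEH INITIAL FORM" and so on), and the helper name positional_forms is hypothetical. In practice, fonts and text-shaping engines choose these shapes automatically, so this is only a way to see the forms, not how an editor should render them.

```python
import unicodedata

POSITIONS = ("ISOLATED", "FINAL", "INITIAL", "MEDIAL")

# Unicode also encodes the ligature for the word Allah as a single code point.
ALLAH_LIGATURE = "\uFDF2"


def positional_forms(letter):
    """Return the presentation forms Unicode defines for one Arabic letter.

    Hypothetical helper for illustration: it builds character names such as
    "ARABIC LETTER BEH MEDIAL FORM" and skips any that do not exist.
    """
    base_name = unicodedata.name(letter)  # e.g. "ARABIC LETTER BEH"
    forms = {}
    for position in POSITIONS:
        try:
            forms[position] = unicodedata.lookup(f"{base_name} {position} FORM")
        except KeyError:
            # Some letters never join to the following letter, so they
            # have no initial or medial form.
            pass
    return forms


if __name__ == "__main__":
    for letter in ("ب", "ا"):
        print(unicodedata.name(letter), sorted(positional_forms(letter)))
    print(unicodedata.name(ALLAH_LIGATURE))
```

The letter beh has all four forms, while alef has only an isolated and a final form, which is why the text says "up to" four.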

Arabic font

Abdullah recognised that there are also typographical needs. The font industry in Arabic is very new, so there isn't much available. He didn't like what was on offer, so he decided to create his own font, a fixed web font. He admits he's no font expert and that it was a learning experience, so he decided to share his knowledge by making the font open source. There is huge growth potential for fonts, but the font industry can only take off if people learn about the process.

Investment opportunities in Arabic language tools

As with any identified digital need, there are opportunities for investment. Arif identifies a number of areas for potential products, but he has high standards. He doesn't want to make something adequate; he wants to create something beautiful. Abdullah envisions products that make it easier to type Arabic and improve bot interaction for dialects. There is also room for products dealing with high-quality translation and oral dictation, such as for messaging. Then there's the potential for Arabic content. This will only grow when Arabic becomes an easier language to work with online, so these tools offer huge scope for investment.


Written by
Alexander Rauser


Alexander Rauser is the author of Boardroom Guide to Digital Accountability and Digital Strategy: A Guide to Digital Business Transformation, and creator of the DSX Program, a digital strategy and transformation program for Enterprises.
