lexiQA, the leading API-based quality control solution, announces new state-of-the-art spellcheckers, starting with four morphologically challenging languages. The first release, due in July, will cover Burmese and three Indian languages (Hindi, Gujarati, and Tamil).
Spellchecking for morphologically rich, low-resource languages has long been a challenge for NLP. Desktop and cloud-based commercial solutions support very few of these languages, and even where support exists, spellchecking performance ranges from mediocre (Hindi and Gujarati) to disappointing (Tamil) and non-existent (Burmese). In open-source spellchecking engines such as Hunspell, these languages are either not supported at all or are covered only by limited, mostly outdated wordlists.
Wordlist-based spellcheckers often fail for these languages because of their agglutinative nature, non-standard transliteration practices, and complex derivational morphology. Manually crafted rules can also cover only a limited number of cases. In Tamil, for example, several words may be joined together into one long compound. A wordlist cannot enumerate every possible combination, so a wordlist-based approach flags many valid compounds as errors. To make things worse, Burmese uses spacing inconsistently and has only recently switched from Zawgyi to Unicode, which makes building an accurate spellchecking engine even more challenging.
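The compounding problem can be illustrated with a minimal sketch. It uses toy English stand-ins rather than real Tamil data, and the names `WORDLIST`, `in_wordlist`, and `split_compound` are hypothetical; this is not lexiQA's implementation, only an illustration of why a flat lookup rejects a valid compound whose parts are all known.

```python
from typing import List, Optional, Set

# Toy English stand-ins for illustration only (assumption, not real Tamil data).
WORDLIST: Set[str] = {"book", "shop", "keeper"}


def in_wordlist(token: str) -> bool:
    """Flat wordlist lookup: the token must appear verbatim in the list."""
    return token in WORDLIST


def split_compound(token: str, vocab: Set[str] = WORDLIST) -> Optional[List[str]]:
    """Try to segment token into known words using dynamic programming.

    Returns one segmentation, or None if no full segmentation exists.
    """
    # best[i] holds one segmentation of token[:i], or None if there is none.
    best: List[Optional[List[str]]] = [None] * (len(token) + 1)
    best[0] = []
    for end in range(1, len(token) + 1):
        for start in range(end):
            prefix = best[start]
            if prefix is not None and token[start:end] in vocab:
                best[end] = prefix + [token[start:end]]
                break
    return best[len(token)]


if __name__ == "__main__":
    compound = "bookshopkeeper"
    print(in_wordlist(compound))     # flat lookup rejects the compound
    print(split_compound(compound))  # segmentation recovers its parts
```

Even this simple segmentation step is only a partial remedy: it accepts any concatenation of known words, valid or not, and it does not model the sound changes that can occur where words join, which is part of why pure wordlist or rule-based approaches struggle.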
lexiQA will bundle these spellcheckers with its brand-new Named Entity Recognition feature. Tests conducted in Q2 show a dramatic decrease in false positives, as lexiQA’s hybrid mechanism can also detect OOV (out-of-vocabulary) words, while false negatives are kept to a minimum, currently averaging fewer than 1 per 1,000 words. Support for more languages will be rolled out in the coming months.
If you are interested in how lexiQA performs compared with some of the most popular QA tools currently on the market, you can request a copy of the results here.
For more information, please visit: https://lexiqa.net/
Get in touch with lexiQA here: https://lexiqa.net/contact/
Lexiqa Limited is headquartered in the United Kingdom, with staff in Germany, Greece, and India. Its product, lexiQA, is an online QA solution that can be used via API within an online translation/localization environment.