Unpacking the State-of-the-Art in Handwritten Text Recognition


In an anachronistic quirk of artificial intelligence (AI), people may soon be able to see a new play by Lope de Vega, one of the foremost writers of Spain’s Golden Age. And no, this is not yet another case of ChatGPT being taken too far — it was written by the playwright himself.

La francesa Laura (The Frenchwoman Laura) may not be one of Lope de Vega’s greatest works, but the story of its discovery garnered plenty of attention when the news broke earlier this year. Once again, AI played the hero, facilitating the arduous process of authorship attribution by digitizing the handwritten text for stylometric analysis.

It was READ-COOP SCE’s Transkribus that aided and abetted Álvaro Cuéllar and Germán Vega’s research into authorship in the Golden Age. A sophisticated text recognition platform designed to revolutionize access to historical documents, Transkribus is rapidly gaining popularity with libraries and archives seeking to digitize primary sources for large-scale search and analysis.

Making printed text machine-readable is hardly new. Optical character recognition (OCR) has been replacing manual data entry for decades, converting printed documents to indexable and searchable formats that can be queried and transformed. Its earliest applications were less concerned with parsing historical records and focused instead on challenges such as reading assistance for people who are blind and automatic mail sorting.

Nowadays, the technology is ubiquitous, underpinning a huge amount of business automation in order to more efficiently slap you with that parking fine or reject your insurance claim. You may have made more direct use of OCR by converting a scanned document to an editable format or translating a sign in an unfamiliar language with Google Lens. Its breadth of application has made text recognition essential in sectors ranging from financial services to healthcare to logistics. It could even help address data scarcity issues in NLP by converting scanned documents to training corpora for lower-resource languages.

Room for Improvement

In recent decades, machine learning approaches have seen OCR progress beyond basic pattern-matching algorithms that compare scanned images of characters against an internal database to more sophisticated feature extraction that allows models to generalize to unseen fonts and handwriting styles. Yet text recognition remains an area of active research with considerable scope for improvement, particularly when it comes to lower-resource languages and scripts, multilingual text, and handwritten text.
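To make the older pattern-matching idea concrete, here is a minimal sketch in Python. The 3x3 glyph bitmaps and the names `TEMPLATES`, `hamming`, and `match_glyph` are invented for illustration and are not drawn from any real OCR engine. The scanned glyph is compared pixel by pixel against each stored template, and the closest one wins.

```python
# Hypothetical 3x3 glyph templates ("1" = ink, "0" = background).
# Real systems would store far larger bitmaps for every supported font.
TEMPLATES = {
    "I": ("111", "010", "111"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
}


def hamming(a, b):
    """Count the pixels that differ between two equally sized bitmaps."""
    return sum(
        pixel_a != pixel_b
        for row_a, row_b in zip(a, b)
        for pixel_a, pixel_b in zip(row_a, row_b)
    )


def match_glyph(bitmap):
    """Return the label of the stored template with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda label: hamming(TEMPLATES[label], bitmap))


# An "L" with one missing pixel in the bottom row still matches "L".
noisy_l = ("100", "100", "110")
print(match_glyph(noisy_l))
```

The weakness is easy to see: a glyph in a font or hand that is not in the table has nothing sensible to match against, which is the gap that learned feature extraction is meant to close.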

Even on printed English text, OCR is prone to some classic errors: confusing lowercase ‘l’, uppercase ‘I’, and the digit ‘1’; interpreting an apostrophe as an acute accent or vice versa; and incorrectly segmenting words. All of these can throw a spanner in the works for downstream tasks. To avoid the need for manual review, error correction techniques that leverage spelling dictionaries or language models can be used to improve the OCR transcription.
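As a rough illustration of the dictionary-based flavor of error correction, the sketch below uses Python's standard `difflib` module. The vocabulary, the confusion table, and the `correct_token` helper are all assumptions made up for this example; a production system would use a far larger lexicon and, most likely, a language model.

```python
import difflib

# Assumed toy vocabulary; a real spelling dictionary would be far larger.
VOCAB = ["handwriting", "recognition", "illegal", "text", "play"]

# Assumed confusion table: characters that OCR commonly mixes up.
CONFUSABLE = str.maketrans({"1": "l", "I": "l", "0": "o"})


def correct_token(token, vocab=VOCAB, cutoff=0.8):
    """Return the closest vocabulary word, or the token unchanged if none is close."""
    if token.lower() in vocab:
        return token
    # Collapse confusable characters before looking for a near match.
    normalized = token.translate(CONFUSABLE).lower()
    matches = difflib.get_close_matches(normalized, vocab, n=1, cutoff=cutoff)
    # Note: the vocabulary form is returned, so the original casing is lost.
    return matches[0] if matches else token


print(correct_token("recogniti0n"))  # digit zero read in place of the letter o
print(correct_token("Transkribus"))  # not in the vocabulary, left untouched
```

The `cutoff` threshold is the knob that matters here: set it too low and unfamiliar words get "corrected" into the wrong dictionary entry, set it too high and genuine recognition errors slip through.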

While error correction can significantly boost text recognition accuracy, it comes with its own drawbacks. It can struggle with jargon, slang, or named entities that fall outside its vocabulary, and it can reduce transcription fidelity by normalizing away errors that exist in the original document. Applications such as automatic marking of handwritten text are tricky because they need the high accuracy that error correction yields while preserving the writer’s own orthographic errors for feedback.
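One conservative way to navigate that tension, sketched below, is to repair only patterns that are very unlikely to come from the writer and leave everything else alone. The rule used here (a 0 or 1 sandwiched between letters is a recognition slip) and the `fix_recognition_slips` helper are hypothetical heuristics chosen for illustration, not an established method.

```python
import re

# Hypothetical rule: a "0" or "1" with a letter on both sides is more likely
# a recognition slip than something the writer actually put on the page.
DIGIT_IN_WORD = re.compile(r"(?<=[A-Za-z])[01](?=[A-Za-z])")
DIGIT_TO_LETTER = {"0": "o", "1": "l"}


def fix_recognition_slips(token):
    """Replace letter-flanked 0/1 digits; leave every other character as written."""
    return DIGIT_IN_WORD.sub(lambda m: DIGIT_TO_LETTER[m.group()], token)


print(fix_recognition_slips("rec0gnition"))  # likely OCR slip, repaired
print(fix_recognition_slips("recieve"))      # human misspelling, preserved
print(fix_recognition_slips("2023"))         # genuine number, preserved
```

A rule this narrow will miss plenty of recognition errors, but it never rewrites a misspelling that a marker would want to see.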

Handwriting’s Decline?

Finding new plays by famous playwrights is not the only benefit of handwriting recognition. Taking notes by hand activates memory and learning centers in the brain in different ways than typing, and may improve retention. However, typed notes are much easier to edit, store, and search. With sophisticated handwritten text recognition, it is possible to have the best of both worlds.

More and more products are emerging to support this, many using online handwriting recognition, which recognizes text as it is written by tracking stroke information on a touchscreen or through a digital pen. Google’s Gboard, Microsoft’s OneNote, and Apple’s Scribble combined with the Apple Pencil all convert handwriting to text in a limited set of languages.
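For a sense of what "stroke information" might look like in practice, here is a small sketch. The `Stroke` class, the millisecond timestamps, and the `to_deltas` helper are assumptions for illustration only and do not reflect how any of the products above represent their input. Each stroke is a timed sequence of pen positions, and the whole sample is flattened into pen movements that a sequence model could consume.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stroke:
    """One continuous pen-down movement as (x, y, timestamp_ms) samples."""
    points: List[Tuple[int, int, int]]


def to_deltas(strokes):
    """Flatten strokes into (dx, dy, dt, pen_lifted) steps between samples.

    pen_lifted is True when the step jumps to the start of a new stroke.
    """
    steps = []
    previous = None
    for stroke in strokes:
        for index, (x, y, t) in enumerate(stroke.points):
            if previous is not None:
                px, py, pt = previous
                steps.append((x - px, y - py, t - pt, index == 0))
            previous = (x, y, t)
    return steps


# A vertical bar followed by a separate dot, roughly like a lowercase "i".
sample = [Stroke([(0, 0, 0), (0, 2, 100)]), Stroke([(0, 3, 300)])]
print(to_deltas(sample))
```

Unlike a scanned image, this representation keeps the order and timing of the pen movements, which is extra signal that offline recognition never gets to see.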

Ironically, while computers are getting better at recognizing handwritten text, schools are debating whether to bother teaching cursive at all. Who knows, with more sophisticated recognition and enticing gadgets to scribble on and with, the technology that led to handwriting’s decline may become the impetus for its revival.