Many of us are familiar with the frustration, stress, and occasional panic caused by unexpected transport disruptions. We may groan when the conductor announces delays due to a signal failure or get swept up in a small stampede as station staff call out a last-minute platform change.
Some of us may be all too familiar with the unpleasant sensation of realizing one is on the wrong train only when an unexpected stop appears on the screen or is rattled off by a text-to-speech (TTS) voice.
If you are a hearing person, you may never have considered that delays, cancellations, and safety announcements are typically communicated via loudspeaker, with corresponding updates to information screens limited, if available at all.
This turns a situation that is merely annoying for hearing people into a far more frustrating one for Deaf and hard of hearing people, and in some cases a dangerous or frightening one. Even standard travel information is rarely available in the native sign language of many Deaf people, who are expected to make do with written information in their second or third language.
The UK’s Network Rail is one of several transport companies around the world working to address this gap in accessibility. As of April 2023, it has rolled out travel advice screens in British Sign Language (BSL) at eight major railway stations across the UK. These screens provide access to both standard announcements and disruption or emergency updates in the preferred language of over 87,000 Deaf people in the UK. Although long overdue, this is an encouraging step towards improving the accessibility and inclusiveness of public spaces.
Surrey Research Park-based startup Signapse is one of the companies behind the new technology. It is working to develop real-time sign language translation software and has already created synthetic signers to be used in contexts such as broadcast, transport, and website accessibility.
With a team comprising both Deaf and hearing people and close partnerships with Deaf organizations, Signapse is attentive to aligning its technology with the values and needs of the Deaf community.
Signapse’s synthetic signers are a clear reflection of this consultative approach. Rather than following the popular trend of signing avatars, which are generally dispreferred by Deaf people, these signers are convincingly human. Signapse uses generative adversarial networks (GANs) to create photo-realistic videos based on recordings of real Deaf actors who are compensated when their image is used.
Much like the original TTS voices used in transport announcements, the BSL videos are pre-recorded and stitched together to generate standard messaging. AI does the work of smoothing the transitions to improve the fluency of the resulting BSL video.
For unusual situations that cannot be communicated through the existing message library, a team of interpreters is on call to help convert new information to BSL so the video announcement is available within an hour.
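To make the workflow above concrete, here is a minimal sketch in Python of how a message might be assembled from a library of pre-recorded clips, with a fallback to human interpreters when a segment is missing. Everything in it is a hypothetical illustration: the segment names, the clip filenames, and the `plan_announcement` function are assumptions, not Signapse's actual system. It also leaves out the AI smoothing step and treats segments as simple message slots, whereas a real BSL message would follow BSL grammar rather than English word order.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical library mapping message segments to pre-recorded BSL clips.
# Segment names and filenames are invented for illustration only.
CLIP_LIBRARY: Dict[str, str] = {
    "the next train to": "next_train_to.mp4",
    "london waterloo": "london_waterloo.mp4",
    "is delayed": "is_delayed.mp4",
    "platform 4": "platform_4.mp4",
}


@dataclass
class Plan:
    clips: List[str]          # clips to stitch together, in order
    needs_interpreter: bool   # True if the library cannot cover the message


def plan_announcement(segments: List[str]) -> Plan:
    """Look up each segment; hand off to interpreters if any is missing."""
    clips: List[str] = []
    for segment in segments:
        clip = CLIP_LIBRARY.get(segment.lower())
        if clip is None:
            # No pre-recorded clip exists: escalate the whole message
            # rather than show a partial announcement.
            return Plan(clips=[], needs_interpreter=True)
        clips.append(clip)
    return Plan(clips=clips, needs_interpreter=False)


standard = plan_announcement(["The next train to", "London Waterloo", "is delayed"])
unusual = plan_announcement(["The next train to", "Narnia"])
```

In this sketch, `standard` can be served entirely from the library, while `unusual` is flagged for the on-call interpreting team because one of its segments has no matching clip.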
With its horizon mission of unconstrained sign language translation, Signapse is not only looking to improve that turnaround time but to extend its technology beyond domain-specific use cases like transport. This will be no mean feat.
A Unique Challenge
There are over 70 million Deaf people worldwide using anywhere from around 150 to over 300 sign languages, yet these natural languages have been largely neglected by the field of natural language processing (NLP). In the same way that there is increased attention on extending language technology to low-resource spoken languages to protect them from endangerment, there is a growing urgency to support sign language technology and avoid perpetuating linguistic suppression in the digital age.
Due to the visual nature of sign languages, most advancements in sign language processing (SLP) have been driven by the computer vision (CV) community. However, CV approaches fail to capture the linguistic structure of the data, something that NLP excels at. NLP researchers have been urged to turn their considerable talents and expertise to developing effective tools for recognizing, translating, and generating sign language in collaboration with CV researchers and the Deaf community.
Sign languages present a unique challenge to NLP for a number of reasons. For one, they have no widely accepted written standard, making it difficult to apply existing text-dependent NLP pipelines.
Using text to capture the intricacies of hand gestures, facial expressions, and head and body movements, as well as the capacity to convey multiple ‘words’ in parallel and use space grammatically, is far from straightforward. Although a number of both universal and language-specific notation systems have been proposed, none has been widely adopted.
This in turn compounds the challenge of data scarcity, as highly specialized annotators trained to use special notation or gloss schemas are required to produce labeled data. Moreover, sign languages are considered extremely low-resource, as even unlabelled data is difficult to collect and anonymize.
So while unsupervised learning on huge datasets has led to impressive jumps in performance for NLP, it is unlikely to be available for sign languages any time soon.
On a more positive note, work on another major challenge, sign language generation, is showing some exciting improvements. If Signapse’s photo-realistic approach can be generalized to a broader range of domains, it might be possible to use the generated video for data augmentation when training new SLP models.
However, there is no doubt that more real-world sign language corpora, along with the ability to anonymize signers in video footage, are needed to improve the state of SLP. As it should be, close collaboration with and involvement of the Deaf community will be the most important factor in determining the future of technology in their native languages.