On the Radar: NLP at Amazon and MIT, Interpreter Trouble in Paradise

Amazon will open a machine learning development facility in Turin, Italy, before the end of the year, the e-commerce and cloud computing giant said on its official blog. It will be the company’s 15th development center in the EU.

The new hub will be dedicated to developing machine learning capabilities for Alexa, with a focus on advancements in speech recognition and natural language understanding. The cloud-based voice service behind Echo Dot and Amazon Fire TV, Alexa uses machine learning to, among other things, detect wake words (i.e., the words that trigger a device to perform an action).

Said Alexa VP and Chief Scientist Rohit Prasad, “Our Turin center will play a critical role in advancing Alexa’s spoken language understanding capabilities.”

Turin was selected for its deep talent pool and proximity to institutions such as the University of Turin and Politecnico di Torino, which, between them, have well over a hundred thousand students.

The Turin Development Center will soon begin hiring speech recognition and natural language understanding scientists to fill an initial 10 posts. Interested candidates are encouraged to visit the company’s jobs website for details.

ESL-to-English Translations, Anyone?

Non-native English speakers could prove to be a more valuable source of linguistic data for machine learning than first imagined. That possibility was put forth recently by a team of researchers at the Massachusetts Institute of Technology (MIT).

“Most of the people who speak English in the world or produce English text are non-native speakers. This characteristic is often overlooked when we study English scientifically or when we do natural language processing for English,” said Yevgeni Berzak, an MIT graduate student.

Berzak led a team that, after thousands of hours of research, “released the first major database of fully annotated English sentences written by non-native speakers,” a July 29, 2016 article in MIT News said. The team, which had previously shown how grammatical nuances could provide linguistic insight, now hopes its dataset can also be applied to machine learning.

If machine learning systems were trained in nonstandard English, the article pointed out, then computers would be better equipped to process the language as written or spoken by non-native English speakers—dropped prepositions, wrong tenses, and misused auxiliary verbs and all.

All 5,124 sentences in the dataset were culled from a collection of exam essays by ESL (English as a second language) students from Cambridge University. The MIT team then added detailed annotations for both grammatically correct and incorrect sentences.

“This could be cast as a machine translation task, where the system learns to translate from ESL to English,” said Joakim Nivre, professor of computational linguistics at Sweden’s Uppsala University.

The MIT team actually hopes the dataset could apply to grammar-correction software for native speakers of other languages.

Unpaid Court Terps Make Themselves Scarce

If the courts need them, they should not be treated like trash. So said a court interpreter working in the Northern Mariana Islands. Interpreters’ fees have been delayed or gone unpaid in the local trial court, causing delays in legal proceedings, the Marianas Variety reported.

Court interpreter rates in the Commonwealth are not fixed. Some interpreters are reportedly paid USD 15 an hour and others USD 20, and compensation is often delayed, if it is paid at all. Interpreters are sometimes even paid by the minute for short arraignments.

Compounding the issue, an interpreter may “go hungry while waiting for the resolution of a case, some of which take up to two years,” the newspaper article quoted one interpreter as saying. The same interpreter lamented that, at times, “there is only one translator for the entire proceeding,” when there should be one for the government, another for the defendant, and a separate interpreter for the court.

Given such conditions in the local trial court, interpreters have flocked to the district court where things are better: an hourly rate of around USD 100 with payment released as soon as the interpreter files an invoice.

In a follow-up story, however, the Marianas Variety said Superior Court clerk Patrick Diaz denied that there were any payment delays in the Commonwealth’s judiciary. Said Diaz, “Once an interpreter submits a bill to us, the turnaround time for payment is about 15 days.”

He added that the judiciary has since reviewed its records and will immediately address any unpaid invoices that may have been missed. Diaz also said the judiciary has submitted a budget increase request for 2017 and that, if it is approved, the court will be able to afford higher hourly rates for professional services.

On July 18, 2016, because there was no available interpreter, an associate judge rescheduled the bench trial of a man charged with beating up his wife. Attorneys for both sides said they had exhausted their list of more than 15 Chinese translators to no avail.

Slator reached out to Diaz for further comment, but received no reply as of press time.

And this concludes Capita’s MoJ adventure…

In other court interpreting news, Capita TI has failed to repeat its achievement of finally hitting the 98% performance target for the supply of court interpreters to the UK Ministry of Justice (MoJ), which Slator reported back in April.

According to the latest figures, published July 21, 2016, perennial MoJ contractor Capita TI completed 97% of language service requests between January and March 2016, once more falling short of the prescribed 98%.

Capita TI has been the MoJ’s sole language service provider since 2012. Its current MoJ contract expires at the end of October and new contractual arrangements will take effect from October 31, 2016.

New MoJ contract winner thebigword said it will transfer a number of staff from incumbent Capita TI once it takes over. The Leeds-based language service provider will handle face-to-face and telephone interpretation as well as translation and transcription for the MoJ.