Neural machine translation (NMT) is infamous for being data-hungry: not only does an NMT engine require a lot of data, it needs clean, high-quality data. This is an issue for so-called low-resource languages, for which there are few sources of training data.
For social media giant Facebook, this is no abstract problem. The social network passed the two-billion-user mark in 2017, and the platform performs 4.5 billion translations a day.
A portion of those billions of translations is for low-resource languages such as Vietnamese, Turkish, and Tagalog, the main language of the Philippines, the social media capital of the world with over 47 million Facebook users.
So Facebook is throwing some cash at the problem: Facebook Research has opened a grant for the academic community to tackle low-resource NMT.
“One of the big challenges is to achieve great translation accuracy in the absence of large quantities of parallel corpora,” Facebook’s announcement read. “This is especially true for Neural Machine Translation (NMT) which uses models with a large amount of parameters.”
Facebook Research will fund up to four proposals, with grants ranging from USD 20,000 to USD 40,000 for a period of one year starting June 2018 and the option of further funding after evaluation. Submissions are open until April 18, 2018, and successful awardees will be notified by May 2018.
Research proposals need to address low-resource NMT specifically. As Facebook put it, topics include, but are not limited to:
- Unsupervised NMT for low-resource language pairs
- Comparable corpora mining, again for low-resource pairs
- Monolingual resources for low-resource NMT
- Or a combination of any of the above
Facebook Research requires applicants to submit a summary of the proposed project, a timeline with quarterly milestones, a draft budget description, and the CVs of participants.
Facebook Research will also take the awardees to a workshop in September 2018, a couple of months before the six-month mark in November 2018, when research progress will be evaluated and “opportunities for a second round of funding will be determined.”
The Facebook AI Research (FAIR) team is fairly active in contributing to open-source materials and research, as well as in submitting papers to the preprint repository arXiv.org. Its most recent research topics include understanding intrinsic and extrinsic uncertainty in NMT models and post-editing involving very simple human interactions.