500-Million-Sentence Dataset Can Boost Machine Translation for Low-Resource Languages
A 500-million-sentence dataset from Jörg Tiedemann, a professor at the University of Helsinki, can improve back translation by centralizing monolingual content for 188 languages.