May 16, 2017
Iconic’s language tech creates first English version of world’s oldest chemical journal
- World’s oldest chemistry abstracts journal made available in English for the first time.
- Machine translation technology, incorporating statistical and neural approaches, customised for historical chemical patent information.
- Over 3 million abstracts and 1 billion words extracted and translated from German.
Dublin – May 16, 2017 – Iconic Translation Machines (Iconic), a leading Machine Translation (MT) software and solutions provider, is pleased to announce its involvement in the creation of ChemZent™, the first and only indexed and searchable English-language version of Chemisches Zentralblatt, the oldest compendium of German chemistry abstracts, covering 1830 to 1969.
Iconic partnered with CAS, a division of the American Chemical Society, to produce ChemZent. This new CAS solution provides immeasurable value to researchers and institutions worldwide by allowing users to access the entire Chemisches Zentralblatt collection in one place using SciFinder®, searchable in English with indexing of relevant chemical substances and concepts for ease of discoverability.
Iconic enabled this solution by developing innovative machine learning technology to extend its existing machine translation and natural language processing solutions. Iconic’s unrivalled expertise, together with CAS’s industry-leading scientific information analysis, made the launch of ChemZent possible within one year of the idea’s inception.
The process of creating ChemZent involved large scale digitisation and translation of 140 years’ worth of German chemical information – journals and patents – for indexing and search. Iconic digitised 800,000 image-based PDF documents via Optical Character Recognition (OCR).
It then extracted individual articles, separated them into fields by author and title, and machine translated them from German into English, before CAS indexed the records for search. On completion, more than 3 million chemical abstracts and one billion words had been translated across the entire Chemisches Zentralblatt collection.
Iconic CEO Dr. John Tinsley spoke of the unique challenge the project presented:
“This was a truly fascinating project to complete. German is a particularly difficult language for machine translation and both the format and style of the original content made it a real challenge. This product was something that CAS was initially unsure was even possible and certainly could not have been achieved without the deep expertise from both sides.”
Iconic and CAS also presented at the recent 253rd Spring National Meeting and Exposition hosted by the American Chemical Society (ACS) in San Francisco, where they detailed their collaboration.
“Access to the historic collection of Chemisches Zentralblatt via ChemZent in SciFinder has opened this scientific resource to chemists the world over now searchable in English with summaries in English as well,” said Dr. Matthew J. Toussant, Senior Vice President of Product and Content Development for CAS. “The ability to do an exhaustive search that includes this important foundational chemistry makes information relevant to research more easily accessible.”
ChemZent is now available in SciFinder and, with extremely high demand, has already exceeded first-year projected sales. Members of the global scientific community, who previously could not access this historical chemical information in English, can now search and find this valuable information, which is expected to fuel future scientific discoveries. The new solution provides insight into the work of early influencers of chemistry including Louis Pasteur, Albert Einstein, and Marie Curie.
Iconic and CAS continue to collaborate on exciting new projects and initiatives delivered through a combination of world-leading technology and deep subject matter expertise.
“An added benefit for us here at Iconic is that this project has allowed us to greatly improve our language processing technology by incorporating neural machine translation and scaling our architecture to turn around large-volume, complex projects like ChemZent in record time,” said Tinsley.
Iconic plans to leverage its new machine translation and natural language processing competencies to further the discoverability of patent information worldwide.
Full case study available here: www.iconictranslation.com/cas-chemzent/
About Iconic Translation Machines
Iconic Translation Machines is a leading machine translation software and solutions provider that specialises in custom solutions tailored with subject matter expertise for specific industry sectors including legal, life sciences and financial services. Iconic is the MT partner of choice for some of the world’s largest translation companies, information providers, and government and enterprise organisations, helping them to translate more content, more accurately and in less time, resulting in significant cost savings and increased revenue. Iconic is based in Dublin, Ireland. www.iconictranslation.com
Diane O’Reilly, Head of Sales & Marketing, Iconic Translation Machines