The Workshop on Machine Translation (WMT) 2021, a prestigious international event whose main objective is to present advances in machine translation research, was held last November. Research groups could also submit the output of systems developed for the WMT competition, which is organized into several categories, including news translation, medical data, similar languages, low-resource languages, automatic post-editing and machine translation quality measurement.
Participants present their machine translation systems, and the resulting translations are analyzed and evaluated as part of the competition. As in the previous edition, the Pangeanic team took an active part in this important conference, sharing our knowledge and experience of artificial intelligence (AI) technologies applied to machine translation and language processing.
Pangeanic at WMT 2021: Data Providers For Romance Languages
Punta Cana, in the Dominican Republic, was the idyllic setting chosen to host the latest edition of WMT from November 7 to 11, 2021, and the event could also be followed online. It was part of EMNLP 2021, a high-level international conference on natural language processing.
Reflecting the growing interest in exploiting the similarity between certain languages to improve machine translation quality for languages with fewer resources, WMT 2021 included machine translation of similar languages in its program. Three language families were identified: Tamil and Telugu (Dravidian languages), Spanish, Catalan, Portuguese and Romanian (Romance languages) and Bambara and Mandinka (West African languages).
Pangeanic’s contribution was to provide parallel translations between Spanish, Catalan and Portuguese to build a test dataset for evaluating the quality and performance of machine translation systems applied to Romance languages, which share many similarities in vocabulary and grammatical structure.
The Importance of Development and Test Data in Machine Translation
Events such as WMT are essential for presenting new trends in machine translation and for analyzing the quality of machine translations across different tasks and fields of application, such as the translation of news or specialized texts (medical, legal, technical, etc.).
These international conferences bring together highly qualified experts and companies specialized in developing and using automatic language processing systems. Their aim is to establish automated metrics that evaluate translation quality as objectively as possible through accurate, segmented scoring systems.
The ultimate goal of the evaluation work carried out at WMT 2021 is to achieve maximum correlation between automatic and human evaluations.
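To make the idea of correlating automatic and human evaluations more concrete, here is a minimal sketch in Python. It computes the Pearson correlation coefficient between an automatic metric and human judgments for a handful of segments. The scores, the variable names and the `pearson` helper are assumptions made up for illustration; this is not the official WMT evaluation procedure, which may rely on other correlation statistics.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances (unnormalized; the factors cancel in the ratio).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five translated segments (invented for this example).
human_scores = [78.0, 55.0, 91.0, 40.0, 66.0]   # e.g. human ratings on a 0-100 scale
metric_scores = [0.71, 0.52, 0.88, 0.35, 0.60]  # e.g. an automatic metric on a 0-1 scale

r = pearson(metric_scores, human_scores)
print(f"Pearson r = {r:.3f}")
```

A value of r close to 1 would suggest that the automatic metric ranks translations much as human evaluators do, while a value near 0 would suggest the metric tells us little about human-perceived quality.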
PangeaMT, AI in Language Processing
Through our PangeaMT division, Pangeanic has developed its own AI technology, which allows us to create customized machine translation solutions that adapt to the specific needs of each client.
We have also created ECO, a language processing ecosystem that, in a matter of minutes, obtains the information it needs to adapt to each user’s language style and extracts valuable, actionable information from both structured and unstructured text data.
Our system allows us to comply with privacy regulations (through anonymization) and to train specialized models in order to deliver increasingly fast and sophisticated machine translations.
Another high-level service we offer our clients is our private cloud, which makes it possible to work from practically anywhere with total security and guaranteed continuity of service at all times, since our servers are located in three major jurisdictions: Europe, the U.S.A. and Japan.
We are able to provide top quality machine translations in over 500 language combinations, and our anonymization is available in 25 languages.
Our participation in important events like WMT 2021 demonstrates and reinforces our commitment to machine translation and natural language processing, backed by the best team and the most advanced and efficient technology.