2 min read

19/11/2021

Human parity: the utopia of machine translation

One of the most ambitious concepts in the machine translation industry is human parity. In today's article, we explore the limits of machine translation and its future, which is getting closer and closer to utopia. 

What is human parity in machine translation? 

Industry experts consider human parity utopian, because achieving it would require a machine translation system to produce output of equal or comparable quality to that of a human translator. Machine translation engines are becoming more and more powerful, and their hyper-specialization allows them to achieve quality that was unthinkable until recently. However, absolute human parity requires going beyond the established limits.

Here is an example of human parity in machine translation engines:

We begin with the following sentence, which the engine has to translate from Spanish to English:

"¡La presentación de nuestros compañeros en la última conferencia de KTLC estuvo genial!"

To assess parity, we use an English reference sentence written by human translators:

"Our coworkers' presentation at the last KTLC conference was awesome!"

Our translation model generates the sentence in English:

"Our colleagues' presentation at the last KTLC conference was great!"

The model's translation has reached human parity if it is equivalent or comparable in quality to the reference sentence.
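The comparison between the model's output and the human reference can be roughed out in code. The sketch below is only an illustration and not the method used by any particular engine: it computes a simple word-overlap F1 score, and the helper names `tokenize` and `unigram_f1` are hypothetical. In practice, parity is usually judged by human evaluators or with established automatic metrics, because a surface overlap score penalizes valid synonyms such as "colleagues" and "coworkers".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens, dropping punctuation."""
    return re.findall(r"\w+", text.lower())


def unigram_f1(hypothesis, reference):
    """Word-overlap F1 between a machine translation and a human reference.

    Illustrative only: a crude proxy for similarity, not a measure of parity.
    """
    hyp = Counter(tokenize(hypothesis))
    ref = Counter(tokenize(reference))
    # Count each shared word at most as often as it appears in both sentences.
    overlap = sum((hyp & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(hyp.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


reference = "Our coworkers' presentation at the last KTLC conference was awesome!"
hypothesis = "Our colleagues' presentation at the last KTLC conference was great!"

score = unigram_f1(hypothesis, reference)
print(f"Unigram F1: {score:.2f}")  # prints "Unigram F1: 0.80"
```

Eight of the ten words match, so the score is 0.80 even though a human reader would consider the two sentences equivalent. That gap is why "equivalent or similar" cannot be reduced to word matching alone.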

Human vs. machine translation: differences and benefits

Despite the rapid advances of machine translation systems, which are increasingly autonomous and require less and less human intervention, there is still a long way to go before reaching parity. One of the main differences between the two kinds of translation lies in the human translator's knowledge of the expressions and subtleties of the languages they work in. In addition, enriching a text with a personal style and approach is something a translation engine could only replicate through deep learning.

Another highly complex matter is context. Machines and artificial intelligence still come up against a great barrier when reading and applying the set of circumstances that allow information to be transferred correctly and the text to be understood. Knowing how to interpret the place, culture and time period of both the sender and the receiver is still a human skill that technology has not yet managed to emulate.

Do you want to know more about the importance of human translation? In the following article, Ángela Franco, Production Coordinator, delves into the subject. Continue reading: When to review a translation?

In contrast, machine translation models can generate translations with near-human parity if the text being translated resembles the data they were trained on. Added to this is the high efficiency of these systems, which can process and translate large volumes of information and data at a speed unattainable for humans.

Is the great challenge in developing translation solutions getting closer to the end? 

The goal in the field of machine translation is clear: to achieve human-like quality with engines trained to be deeply specialized. At Pangeanic, we are developing translation engines capable of matching human capabilities. Mercedes García, leader of the research department, explains how we are working towards this ambitious (and less and less utopian) goal: "Currently, translation engines based on the Transformer neural architecture provide the highest quality translations. At Pangeanic, we use this type of architecture in our deep adaptive technology that trains models quickly and efficiently, providing the best quality and adapting to customer needs in a user-friendly environment."

Do you know about ECO by Pangeanic?

The ECO platform allows you to automatically translate hundreds of millions of words in record time, anonymize content, summarize, extract knowledge and key data, and convert unstructured data into structured content. "At Pangeanic, we have developed the Deep Adaptive ECOsystem technology that allows us to train the machines individually for each user using data with content from a specific domain. Consequently, this system helps the machine translation quality become more and more similar to that of human translation," says Mercedes García, leader of Pangeanic's research department.