Neural Machine Translation Research with Mercedes García-Martínez

Pangeanic is a company in constant technological development: our award-winning R&D focuses on AI and neural machine translation to offer our customers the best-quality translations and innovative services. Mercedes García-Martínez, who holds a PhD in computer science with a specialization in neural machine translation, has recently joined our team.

Mercedes did her PhD at the computer science lab of the University of Le Mans in France. Her thesis, titled "Factored Neural Machine Translation", was one of the first in the world to investigate neural translation models, and the first to apply factored models to this type of machine translation. In broad terms, the approach uses linguistic knowledge to improve translation quality, increasing the available vocabulary without having to increase the size of the neural network. She has also taken specialized courses on neural networks, such as the one given at the prestigious MILA laboratory in Montreal, Canada, and on machine translation (MT Marathon 2012), and has participated in courses on translation technologies and translation process research at the Copenhagen Business School. In addition, she has more than 20 scientific publications in international journals and conferences and 166 citations on Google Scholar. Today she shares her opinion with us about the changes we will see in the near future thanks to neural machine translation engines.

What is your opinion on the technological change that society is undergoing?

Society is evolving by leaps and bounds thanks to artificial intelligence. Looking back just a few years, we see great changes in all disciplines. Today's technology is very sophisticated; machines are capable of learning from large amounts of data, automating tasks and making fairly accurate predictions. One of the main branches of artificial intelligence is artificial neural networks, which are inspired by the physical functioning of the human brain. With them, the machine is able to memorize a lot of data and learn how to solve a task from given examples. Neural architectures are in continuous evolution and are one of the most popular research areas. They can learn any task, and the only drawback is the need for a large volume of data. I don't think this will be a limitation for long, though, as more and more data are being collected today, so neural networks will be very much in the spotlight in the coming years.

What is your experience as a researcher in neural machine translation?

I have been involved in machine translation projects for over seven years. I have carried out projects to incorporate machine translation engines in companies (PangeaMT and Celer Solutions) and taken part in European research projects (CASMACAT). I have also organized summer schools on translation technologies at the Copenhagen Business School: Translation Data Analysis (TDA) and ASR & Eye-tracking Enabled Computer-Assisted Translation (SEECAT). In addition, I have participated in the Machine Translation Workshop, where universities and companies compete to achieve the highest machine translation quality on a given task.

What do you see yourself doing five years from now?

I'm sure that in five years, neural machine translation will have changed a lot. In recent years, some new improvement has been integrated into the neural machine translation process almost every month. There is still much room for improvement and many areas to be researched, because it is a very recent technology and needs time to reach its full potential. With the earlier statistical approaches, it took about 10 years for systems to mature. So, in five years I see myself improving the neural models and integrating new features to make the quality even better.

What do you predict for the future of the translation industry?

The language industry has changed a lot in recent years. For example, it is no longer possible to work without a computer. Since the emergence of the neural network paradigm in machine translation, the translation industry has shown great interest in it because of its higher quality, which is closer to the output of a human translator than that of statistical machine translation. Machines automate and facilitate the arduous, repetitive and uninteresting tasks that the human translator does not need to do, allowing for faster delivery of translation jobs. However, machine translations are not perfect, so a specialist human translator will still be needed to proofread them and achieve the quality demanded by the client. Some domains, such as catalogs, are easy to translate automatically and will require almost no human intervention. By contrast, literary texts, which use many idiomatic expressions and metaphors, are very difficult to translate automatically and will continue to require professional human work for a very long time. Furthermore, translation between closely related and widely used languages is easier to carry out automatically, but for languages from different families or with few speakers, machine translation does not yet achieve good quality and depends on the intervention of a professional human translator.

Amando Estela

08/09/18