
LocWorld & TAUS Barcelona 2011 – Interoperability and DIY SMT

This is a summary of what we learnt, discussed and exchanged while attending both the TAUS and LocWorld events in Barcelona. It highlights many of the issues that rank high not only in the translation industry but in the software industry in general.


Pangeanic and its Machine Translation division, PangeaMT, were well represented at the Localization World edition in Barcelona in June 2011. Manuel Herranz, CEO, who had spoken at the TAUS Executive Forum (see ppt here) in Barcelona the week before, was joined on this occasion by Elia Yuste (Business Development Manager and PangeaMT Lead) and Antonio Lagarda (PangeaMT R&D).

The announcement that Pangeanic’s PangeaMT package will give users a customized API to its different engines became a hot topic after the news that Google will deprecate the free version of its translation API.

Toni and Manuel busy explaining the advantages of open-source, DIY SMT

PangeaMT’s philosophy is rather different in terms of coverage: it develops domain-specific engines, so its APIs can make calls to specialist engines (tourism, real estate, engineering, electronics, legal, even marketing). The PangeaMT booth was visited constantly by translation specialists, consultants and practitioners who operate internationally and are looking for customizable, scalable, open-source MT solutions.
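To make the idea of calling a specialist engine concrete, here is a minimal sketch of how a client might route a request to a domain-specific engine. It is not PangeaMT's actual API: the engine identifiers, the `TranslationRequest` fields and the payload keys are all hypothetical, and only the domain names come from the list above.

```python
from dataclasses import dataclass

# Hypothetical registry. The domain names come from the article;
# the engine identifiers are invented for illustration only.
DOMAIN_ENGINES = {
    "tourism": "engine-tourism",
    "real estate": "engine-real-estate",
    "engineering": "engine-engineering",
    "electronics": "engine-electronics",
    "legal": "engine-legal",
    "marketing": "engine-marketing",
}


@dataclass(frozen=True)
class TranslationRequest:
    """A single translation job (hypothetical structure)."""
    text: str
    source_lang: str
    target_lang: str
    domain: str


def select_engine(request, engines=DOMAIN_ENGINES, fallback="engine-generic"):
    """Return the specialist engine for the request's domain.

    Unknown domains fall back to a generic engine (also hypothetical).
    """
    return engines.get(request.domain.strip().lower(), fallback)


def build_payload(request):
    """Assemble the body a client could send to an MT service."""
    return {
        "engine": select_engine(request),
        "src": request.source_lang,
        "tgt": request.target_lang,
        "text": request.text,
    }
```

The point of the sketch is the design choice rather than the names: the caller states a domain, and the choice of engine stays behind the API, so engines can be added or retrained without changing client code.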

This was an extraordinary setting in which to discuss collaboration and representation avenues and sales opportunities and, most importantly, to demo the powerful PangeaMT DIY technology, which enables users to build their own MT solution with their own data, either in-house or under a SaaS model.

Elia Yuste and Manuel Herranz also took part in the conference program as speakers in the E7 track, hosted by Jaap van der Meer (TAUS) and entitled MT Experiences at IBM, Sony Europe and Sybase. Fausto Prastaro (Sony UK) with Elia Yuste, and Kerstin Bier (Sybase, an SAP company) with Manuel Herranz, presented their two respective use cases (click here to view).

Manuel, Elia and Toni at the Pangeanic-sponsored dinner, Localization World Barcelona 2011


Interoperability is the key point and the buzzword for the future if you are dealing with software or translation, and even more so if you are producing translation software (machine translation software in our case). In a nutshell:

  • It costs money even though most vendors and clients can’t quantify it.
  • Some are calling for organizations or leaders to take charge of the problem.
  • TM leverage drops by 20% when switching vendors because different tools are being used.
  • Some translators simply refuse to use one tool or another. Some tools also make translators less productive, thus pushing rates up.
  • The current mix of free, cloud-based, licensed, SaaS and LSP-hosted tools is too wide an offering, as the tools do not talk to each other. Perhaps new models are required. The focus should be on making human translators’ lives easier rather than on handling formatting.

Several industry players (Iris Orriss from Microsoft, Karen Combe from PTC, Minette Norman from Autodesk and Eric Blassin from Lionbridge) stated that there are interoperability issues between software and documentation, that these cost money, and that a CMS often doesn’t work well with a TMS even when both come from the same supplier. Suppliers can’t find ways to keep up and solve interoperability problems. Ideally, UI strings should be reused in the documentation, but this is very difficult and there is no budget for it, even though inconsistencies do occur. LSPs carry the burden of the costs and of the reduced potential for innovation and efficiency. You are at the mercy of the tools, and sometimes finding the best solution is a matter of trial and error. Lack of interoperability is frustrating.

At TAUS, Smith Yewell, CEO of Welocalize, commented that lack of interoperability impairs productivity and undermines profitability. By Welocalize’s own calculation, passing formats back and forth and the manual work involved cost the company $3M. Because formatting affects the transfer of data from its repositories (TMS or databases in MySQL, Oracle) to the translation environment (Trados), quality issues arise, as the wrong translator can be matched to the wrong job. Personally, I would agree that this is the case with most LSPs, which face conversion problems with different publishing, web and other formats, from InDesign to Flash, HTML and even DOC/RTF. Having a single, industry-wide format would level the playing field and increase efficiency.

Interesting presentations came from the EU, which is the largest buyer of translation services. It is embracing the Moses platform to solve part of its problems in more than 400 language combinations. The task it faces is massive, as it has to work with every kind of combination, from straightforward pairs of Romance languages and English to morphologically rich Eastern European languages and agglutinative Finnish and Hungarian. It is a daunting task, but there are also many EU-sponsored R&D programs whose results can eventually feed back into the solution.

Spyridon Pilos (EU’s DGT) stated that a Commission task force was confirmed in May 2010 to address the need for MT. In December 2010 the ECMT service (a rule-based system, Systran) was suspended, so the EU is looking at open-source software. The new name is MT@EC, and it has to be built on trust, confidentiality and continuity. The EU is building a data-driven system using all its internal TMs: cleaning, filtering, preparing and processing them for MT. Benchmarks are being established internally with basic Moses releases; the next steps are to set up SMT engines and to develop user interfaces and tools for capturing feedback in order to improve them. The EU is also using and evaluating Apertium.
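The cleaning and filtering of internal TMs mentioned above can be sketched in a few lines. This is not the EU's actual pipeline, which was not described in detail; it is a minimal example of the kind of preparation commonly applied to translation memories before SMT training, with thresholds (`max_tokens`, `max_ratio`) and function names chosen purely for illustration.

```python
import re


def normalize(segment: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", segment).strip()


def clean_tm(pairs, max_tokens=80, max_ratio=3.0):
    """Filter (source, target) segment pairs for MT training.

    Drops pairs with an empty side, overlong segments, badly
    mismatched lengths (a likely sign of misalignment) and exact
    duplicates. The thresholds are illustrative assumptions.
    """
    seen = set()
    cleaned = []
    for source, target in pairs:
        source, target = normalize(source), normalize(target)
        if not source or not target:
            continue  # one side is missing
        src_len, tgt_len = len(source.split()), len(target.split())
        if src_len > max_tokens or tgt_len > max_tokens:
            continue  # too long to align reliably
        if max(src_len, tgt_len) / min(src_len, tgt_len) > max_ratio:
            continue  # length mismatch suggests a bad alignment
        if (source, target) in seen:
            continue  # exact duplicate after normalization
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned
```

Called on a list of segment pairs exported from a TM, `clean_tm` returns the pairs that survive all four checks, in their original order. Real pipelines typically add further steps, such as tokenization, truecasing and tag handling, before the data reaches the training stage.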

