POST 20 SEPTEMBER, 2013

Understanding Machine Translation Customization and DIY MT

by Manuel Herranz

The same mistake that was made by many translation agencies, translation companies and now language service providers is being made by some machine translation companies: “My (machine) translation is better than yours”, “my machine translation system works, everybody else’s doesn’t”. Translation companies have learnt that they cannot sell translation services on “translation quality claims” alone, or on “I am better than you because…”, but it seems that some machine translation companies have yet to learn the same lesson. I am referring particularly to those with risky levels of investment or venture capital to repay, and without the testing ground of in-house native speakers or a real translation department in which to test their technologies and MT before release. At times, such companies obtained their “high quality clean data” by bombarding Google Translate and applying cleaning cycles that included manual revision by local, non-native graduates. Many LSPs fall for big marketing campaigns and strong wording; the limelight is always very attractive. Translation Memory technologies are good proof of that.

Bad-mouthing the competition is the worst marketing tool, and I would not recommend it to anybody in sales, in marketing or representing a company. Talk about your strengths. Acknowledge what you cannot do, but explain what you can do to solve the problem. If you cannot match some offerings from the competition, saying that they don’t work is a terrible policy. There are dozens of use cases, applications, conferences and presentations to prove that DIY MT, for example, works and is in good health, and is in use at LSPs, institutions and corporations. As far as I know, automated retraining and Moses packaging are part of at least two EU-funded programs. As organizations such as GALA provide an excellent platform for machine translation webinars, monopolistic attitudes become more and more aggressive.

But I want to minimize self-promotion. What Kirti Vashee seems to forget in his virulent blog entries is that no company will release a tool that doesn’t work, nor install a product that cannot do what it claims it can. I was an industrial engineer for long enough to learn at least the difference between what works and what doesn’t. When it comes to hardware tools, quality may be easy to spot. When it comes to services (and in machine translation this is clear: “my output”, “my clients”, “my productivity” and “my technological independence”), quality is what works best for me. Claiming that in 2013 MT is so complex that only one company fully understands it is presumptuous, to say the least.

Let me cite some translation agencies (the term Language Service Provider being unknown to most people outside the language industry). They are not big companies; they are probably what economists call small and medium-sized enterprises.

Tilde, Apsic, Lexcelera, Pangeanic. I am sure at least four others could make this list. What do these companies have in common? All of them were or are translation companies that have transformed themselves into higher-level solution providers, either by developing software that solved particular problems in translation or by customizing technology into their processes. With the help of EU funds and a clear vision to fill a market need, Tilde led R&D projects aimed at developing machine translation for less-resourced languages. Automated engine creation and re-training were part of the initial EU-funded project.

Apsic is the developer of one of the best consistency-checking tools (XBench), which is a must for any company wanting to ensure terminology consistency and error-free deliveries across hundreds of files.

Pangeanic has developed a management system on top of Moses that manages training sets, automatically cleans some of the data, trains and retrains engines, and creates new engines, with a variety of other customizable features.
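To give a flavour of what "automatically cleans some data" can mean in practice, here is a minimal sketch of a parallel-corpus filter. It is not Pangeanic's code: the function name `clean_parallel` and its thresholds are hypothetical, chosen only for illustration. It applies the kind of checks commonly run before training a statistical engine: normalizing whitespace, dropping empty or overlong segments, dropping pairs whose length ratio is implausible, and removing exact duplicates.

```python
from typing import Iterable, Iterator, Tuple


def clean_parallel(
    pairs: Iterable[Tuple[str, str]],
    min_len: int = 1,        # hypothetical threshold: minimum tokens per side
    max_len: int = 80,       # hypothetical threshold: maximum tokens per side
    max_ratio: float = 9.0,  # hypothetical threshold: longest/shortest side
) -> Iterator[Tuple[str, str]]:
    """Yield (source, target) segment pairs that pass simple sanity checks."""
    seen = set()
    for src, tgt in pairs:
        # Collapse runs of whitespace and trim both segments.
        src = " ".join(src.split())
        tgt = " ".join(tgt.split())
        n_src, n_tgt = len(src.split()), len(tgt.split())

        # Drop pairs where either side is too short (e.g. empty) or too long.
        if not (min_len <= n_src <= max_len and min_len <= n_tgt <= max_len):
            continue

        # Drop pairs whose token counts differ too much; these are often
        # misaligned. max(1, ...) guards against division by zero.
        if max(n_src, n_tgt) / max(1, min(n_src, n_tgt)) > max_ratio:
            continue

        # Drop exact duplicates of a pair already emitted.
        if (src, tgt) in seen:
            continue
        seen.add((src, tgt))
        yield src, tgt


if __name__ == "__main__":
    sample = [
        ("Hello   world", "Hola mundo"),
        ("", "vacío"),
        ("Hello world", "Hola mundo"),
    ]
    for pair in clean_parallel(sample):
        print(pair)
```

A step like this is exactly the sort of repetitive work that is worth scripting once and running on every new training set, rather than repeating by hand.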

As MT customizers, we know that some settings, parameters, weights and features need to be configured carefully at the outset to get a good start. But I do not know of any company in the software business that insists on manual processes and cannot automate what it has to do repetitively.
