DIARY IN A HOUSE TRANSLATOR 3 DECEMBER, 2014

6 Steps to create TMX file from Excel or other formats

Sometimes a professional translator needs to create a TMX file from Excel or other formats in order to reuse bilingual material accumulated over years of work. Or perhaps there is no source material, but the translator would like to build a subject- or field-specific translation memory to use as a reference for a job. This is useful in our profession when clients provide text to translate but no previous reference material and no terminology database. As professional translators, our job is to be as terminologically accurate as possible, and this can only be achieved if we have tools and reference material to check against and to base our work on.

But “Create TMX file from Excel” is not a feature most CAT tools provide out of the box. (I’m citing Excel as the main program because it is readily available on most desktop and laptop PCs and Macs, but really any bilingual table will serve the purpose.) This example comes from a Google document in which one linguist had collected English-German game terminology.

English-German text in Google docs – gaming terms from Proz. Perfect to create TMX file from Excel-like format

Creating a TMX from a bilingual corpus

OK, so we have an aligned corpus in a bilingual, delimited format. Imagine we now have this aligned corpus in Excel (or any other delimited format) and want to use that content in our favorite CAT tool. The process I’m about to show you works for all CAT tools, as they can all import the TMX format. Our goal is to turn an xls file (or similar) into a format our CAT tool will read successfully.
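
Before converting anything, it can help to confirm that the delimited file really is aligned, that is, that every line holds exactly one source segment and one target segment. The following is only a minimal Python sketch of such a check, assuming a tab-delimited file; the function name `check_alignment` and the sample terms are made up for illustration and are not part of any CAT tool.

```python
import csv
import io


def check_alignment(lines):
    """Return (line_number, reason) pairs for rows that are not clean source/target pairs."""
    problems = []
    # QUOTE_NONE: treat quotation marks in segments as ordinary characters.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for number, row in enumerate(reader, start=1):
        if len(row) != 2:
            problems.append((number, f"expected 2 columns, found {len(row)}"))
        elif not row[0].strip() or not row[1].strip():
            problems.append((number, "empty source or target"))
    return problems


# Invented sample data: line 2 has no target column, line 3 has an empty target.
sample = io.StringIO("Save game\tSpiel speichern\nrespawn\nGame over\t\n")
for line_number, reason in check_alignment(sample):
    print(f"line {line_number}: {reason}")
```

Any line the check reports is worth fixing in the spreadsheet first, since a misaligned pair would otherwise end up as a wrong translation unit in the memory.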

Follow these steps to convert a bilingual text file to TMX. We are going to use a very handy, free, open-source tool called Olifant, developed as part of the Okapi Framework.

  1. Ensure you have an aligned corpus in Excel, with the leftmost column containing the source text and the next column containing the target. If your corpus is not perfectly aligned, you may need to check it or even try an alignment tool such as LF-Aligner, for example. Paste the bilingual table into Notepad (text copied from Excel is pasted as tab-delimited text) and save the file, ensuring the encoding is set to UTF-8. Now you have a bilingual source language-target language file.
  2. OK, let’s visit the download page to get Olifant. Unzip it, install and launch it.
  3. Press Ctrl+N to create a new Translation Memory with a name of your choice. Add your language code to the target field. Make sure to use the locale (for example ES-MX for Spanish, Mexico; FR-CA for French, Canada; EN-GB for English, Great Britain; etc.).

    New Translation Memory – Olifant
  4. Go to File > Import. Now choose Tab-delimited files (.txt) from the dropdown menu. Locate the file you created in step 1 and click Open.
  5. In the Destination Field, set the Field Type of Column 1 to Text, Language EN-US (or whatever source language you’re working with), and for Column 2, Text again as Field Type and your target language code in the Language field.
  6. Press OK and hit Save. Your bilingual corpus has been converted to a TMX file!
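
For readers who prefer a script, the same tab-delimited-to-TMX conversion can be roughly approximated in Python. This is a sketch under stated assumptions, not what Olifant does internally: the function names `rows_to_tmx` and `convert`, the `creationtool` value and the default language codes are all hypothetical, and the output is a bare-bones TMX 1.4 file with one `<tu>` per line and no extra metadata.

```python
import csv
import xml.etree.ElementTree as ET

# ElementTree writes this namespaced attribute as xml:lang.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def rows_to_tmx(rows, source_lang, target_lang):
    """Build a minimal TMX 1.4 tree from (source, target) rows."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "tsv2tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "tab-delimited",
        "adminlang": "en-US",
        "srclang": source_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for row in rows:
        # Skip rows that are not a complete source/target pair.
        if len(row) != 2 or not row[0].strip() or not row[1].strip():
            continue
        tu = ET.SubElement(body, "tu")
        for lang, text in ((source_lang, row[0]), (target_lang, row[1])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text.strip()
    return ET.ElementTree(tmx)


def convert(tsv_path, tmx_path, source_lang="EN-US", target_lang="DE-DE"):
    """Read a UTF-8 tab-delimited file and write it out as TMX."""
    # utf-8-sig also accepts the byte-order mark some editors add to UTF-8 files.
    with open(tsv_path, encoding="utf-8-sig", newline="") as handle:
        rows = list(csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE))
    tree = rows_to_tmx(rows, source_lang, target_lang)
    tree.write(tmx_path, encoding="utf-8", xml_declaration=True)
```

Calling `convert("corpus.txt", "corpus.tmx")` (both file names are placeholders) would write a TMX file that can then be imported into a CAT tool. Incomplete rows are silently skipped here, which is a design choice of this sketch; a stricter version might report them instead.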

Olifant comes with powerful editing tools, including advanced Find/Replace. It is extremely useful for cleaning TMs of unwanted segments. It also lets you delete, add, merge and edit segments on the spot. We will discuss these in future posts.

Further reading: Translation Memory
