NEWS 24 AUGUST, 2014

Hindi to Punjabi machine translation software


As reported by the Hindustan Times, the University of Edinburgh in Scotland and the Technology Development for Indian Languages (TDIL) of the Indian government have hosted the first Hindi to Punjabi machine translation software, overcoming several of Moses' initial limitations.

The web-interface program has not been released to the public yet. It is the result of a long effort led by Mr. Ajit Singh, a faculty member from the Punjabi University and an assistant professor at the MM Modi College, with help from the University’s Computer Science Department.

The software was developed recently and tested at both organizations. It took several months to develop because translate.cgi expected many copies of daemon.pl to be running, all listening on different ports, with each one wrapping a different instance of Moses. The web interface had been written before Moses had threads, so a web-based translation system could not be based on the latest versions of Moses, which are all multi-threaded. The program had to be multi-process.
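The multi-process layout described above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the port numbers, the function names and the one-line-in, one-line-out socket protocol are hypothetical and are not taken from translate.cgi or daemon.pl.

```python
import itertools
import socket

# Hypothetical pool: one daemon process per port, each assumed to wrap
# its own single-threaded Moses instance. The port numbers are made up.
WORKER_PORTS = [9001, 9002, 9003]
_port_cycle = itertools.cycle(WORKER_PORTS)


def pick_port():
    """Return the next worker port in round-robin order."""
    return next(_port_cycle)


def translate_line(text, host="localhost", port=None, timeout=10.0):
    """Send one line to a worker and return one line of reply.

    Assumes a simple newline-terminated UTF-8 protocol, which is an
    illustrative guess rather than the documented daemon behaviour.
    """
    if port is None:
        port = pick_port()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(text.encode("utf-8") + b"\n")
        reply = conn.makefile("r", encoding="utf-8").readline()
    return reply.rstrip("\n")
```

Spreading requests over several ports lets a front end serve concurrent users even though each worker handles one sentence at a time, which is the point of a multi-process design.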

Initially, Mr Singh installed the Moses server and the web server on a single Linux machine and tested on localhost that the system worked.
However, when he installed the system on a web server for public use, part of the system worked fine, but most of the input text was not being translated. Instead, the input text was simply transliterated by the post-processing script, transliterate.pl.
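To see why transliterated output differs from translated output, consider the following minimal Python sketch. It is not the logic of transliterate.pl, which the report does not describe. It relies only on the fact that the Unicode Devanagari block (U+0900 to U+097F) and the Gurmukhi block (U+0A00 to U+0A7F) are laid out in parallel, so a naive script conversion can shift each code point by 0x100. The function name is hypothetical.

```python
import unicodedata

DEVANAGARI_START, DEVANAGARI_END = 0x0900, 0x097F
GURMUKHI_OFFSET = 0x0100  # the Gurmukhi block starts at U+0A00


def naive_transliterate(text):
    """Shift Devanagari characters into the Gurmukhi block.

    Characters with no assigned Gurmukhi counterpart, and anything
    outside the Devanagari block, are passed through unchanged.
    """
    out = []
    for ch in text:
        cp = ord(ch)
        if DEVANAGARI_START <= cp <= DEVANAGARI_END:
            candidate = chr(cp + GURMUKHI_OFFSET)
            # unicodedata.name returns "" here for unassigned code points
            out.append(candidate if unicodedata.name(candidate, "") else ch)
        else:
            out.append(ch)
    return "".join(out)
```

For example, `naive_transliterate("कमल")` gives `"ਕਮਲ"`: the letters change script, but the word itself is carried over rather than translated, which matches the symptom described above.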

Hindi to Punjabi Machine Translation System

Although there is no “from English” or “into English” translation, the developers cannot hide their satisfaction. Users of the software are not proficient in English anyway, which makes this development quite unique. Vishal Goyal, assistant professor in the Department of Computer Science at the Punjabi University, confirmed that “the software has been made available online on the servers of the Edinburgh University and the TDIL.” In 2011, over 33 million people spoke Punjabi in India, whereas around 50% of the population in Pakistan (some 78 million) are native Punjabi speakers. Punjabi is written in two scripts: the Gurumukhī script and the Shahmukhī script.

The system provides a Devanagari Typing Pad on the screen, although users can also type from their own system keyboard. In this case, they need to choose a keyboard mapping for typing. In Pakistan, Punjabi is generally written using the Shahmukhī script, created from a modification of the Persian Nastaʿlīq script. In India, Punjabi is most frequently rendered in Gurumukhī, but it is also written in the Devanagari script or Latin script due to influence from Hindi and English, respectively.

When the input text is not encoded in Unicode, the user has to select the font of the input text, and the text is automatically converted from the non-Unicode encoding to Unicode. The software also provides font conversion and can translate several file formats. The right side of the panel provides a website conversion feature so that websites in Hindi can be translated into Punjabi, although at the time of reporting the service was not available.

Hindi font conversion

This Hindi to Punjabi machine translation system is almost 95% accurate according to Prof Vishal Goyal, which is an extremely high rate of accuracy. Both are Indo-Aryan languages: Hindi developed from Khariboli, the vernacular dialect of Delhi and the surrounding area in Uttar Pradesh. In the 1600s, Hindi was known as “Urdu” due to the Persian influences received during the Mughal Empire. Nowadays, Hindi uses the Devanagari script and draws its vocabulary from Sanskrit, whereas Urdu was and is written using the Persian script and uses more Persian words.

Punjabi University vice-chancellor Jaspal Singh congratulated the department and applauded the faculty’s efforts in bringing international recognition to the institution.

 
