Technology and internet services were created to overcome communication barriers across the globe. Even so, we have not been able to fully break down the barriers of language. Scientists and developers all over the world spend countless hours and resources on improving automated language translation. For example, to provide their users with new experiences, Twitter and Facebook recently added translation features that allow messages sent or received through their web pages or mobile applications to be translated into more than 40 languages. Google has also announced the development of ‘AutoML Translation’ as a way of improving the interpretative abilities of its translator in 27 dialects. The latest step forward for artificial intelligence is Neuro-Linguistic Programming, a technology that simulates the human brain and tries to capture the meaning of phrases and words in their context in order to adapt them into a new language (Source: Cultureconnection.com).

We are still working toward 100% accuracy in artificial intelligence. Larger establishments do not fully trust automatic translation tools and still hire professional human translators for the task. There is no real threat to the profession of translators, as the services provided by machines are quite different from manual ones. Both systems have their respective advantages and disadvantages.

Automated vs Manual translation

Our country, India, is home to 122 languages, of which 22 are designated as official languages. The complexity of human languages makes effective language translation a humongous task for tool developers. Daunting as the task is, it matters a great deal, because demand for digitized translated content has increased. Few parts of the globe need translation tools in as many different languages as the Indian subcontinent does. The top six languages spoken in India are Hindi, Bengali, Telugu, Tamil, Marathi and Urdu.

With automated translation platforms changing the way information is spread and allowing for global interaction, there is no doubt that these tools play an integral role in developing cross-country communications, especially as they continue to improve. Online translation tools have multiplied and improved dramatically. In November of 2016, Google revealed a new version of Translate that employs a translation engine called Google Neural Machine Translation (GNMT). This system translates complete sentences using an artificial neural network. It links digital “neurons” in several layers, each one feeding its output to the next layer, a method loosely modeled after the human brain.
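To make the idea of layered “neurons” a little more concrete, here is a minimal sketch in Python. It is not GNMT or anything close to it. The layer sizes, the random weights and the function names are all assumptions made up for illustration. It only shows the one idea described above: each layer takes the previous layer's output as its input.

```python
import math
import random


def make_layer(n_inputs, n_outputs, rng):
    """Build one layer: a row of weights per neuron, plus one bias per neuron.

    The weights are random placeholders; a real system learns them from data.
    """
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs)]
               for _ in range(n_outputs)]
    biases = [0.0] * n_outputs
    return weights, biases


def apply_layer(layer, inputs):
    """Each neuron sums its weighted inputs, adds a bias, and squashes with tanh."""
    weights, biases = layer
    return [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def forward(layers, inputs):
    """Feed the output of each layer into the next one, in order."""
    activations = inputs
    for layer in layers:
        activations = apply_layer(layer, activations)
    return activations


rng = random.Random(0)            # fixed seed so runs are repeatable
layer_sizes = [4, 8, 8, 3]        # hypothetical sizes: 4 inputs, two hidden layers, 3 outputs
layers = [make_layer(n_in, n_out, rng)
          for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]

output = forward(layers, [0.5, -0.2, 0.1, 0.9])
print(output)                     # three numbers, each between -1 and 1
```

A production translation engine adds far more on top of this (word representations, training on millions of sentence pairs, and so on), but the layer-to-layer flow is the same basic pattern.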

For example, the new Google Translate began by translating eight European languages to and from English and then expanded to Chinese, Arabic, Hebrew, Russian, and Vietnamese.

Google has now extended neural translation to include nine Indian languages. Still, no system can perfectly match the efficiency of human translators, so there are weaknesses in the neural translation system that remain to be addressed.

For example, users recently found that typing variations of the Latin placeholder text “Lorem Ipsum” into Google Translate yielded random English phrases. In all lowercase, “lorem ipsum” produced “China.” Capitalized (“Lorem Ipsum”), it became “NATO.” In all lowercase, “lorem lorem” was “China’s Internet,” and so on. Internet users were perplexed by these cryptic messages, and soon after, Google had to correct these erroneous translations.

Let’s take a look at some errors caused by the use of free translation software, from deadly combinations of words to amusing mix-ups.

For larger companies like Coca-Cola, translating and localizing the marketing message helps to build a strong local presence in each market. Coca-Cola ran into a translation fail when the brand decided to mix English and te reo Māori, New Zealand’s indigenous language. The combination translates into the native language as “Hello, Death.”

Google Translate, like Microsoft’s translation tools, can also create hilarious situations, such as receiving 15,000 eggs after placing the wrong order on a local website!

When we compare languages like French or Spanish with our Indian languages, we find that not all languages in India follow a standard script, and that they are very complex to type while staying grammatically correct. Even the Devanagari script, which serves multiple languages, is too complex to standardize easily on an input device like a keyboard. For example, when Amitabh Bachchan on KBC (Kaun Banega Crorepati) asks viewers a question for winning 1 lakh, users are expected to send an SMS with KBCQ followed by option A, B, C or D. He then quickly announces that SMS in local languages will be accepted. How many users do you think will send the answer as (केबीसी क्यू क, ख, ग, घ) or in any other local Indian language?

Challenge of English to Indian Language - Syntactic Divergence

The most important structural difference between English and most Indian languages is word order. English uses subject-verb-object order, whereas most Indian languages primarily use subject-object-verb. Some Indian languages have free word order. In the first example, we can see that the word order “ate mango” becomes “mango ate” [aama khaayaa].

Table 1: Example of Different Word Orders in English and Hindi.

Source: Journal of Intelligent Systems, Volume 28, Issue 3

If you want to read more about the challenges faced by developers in translation of English to Indian languages, you can refer to this journal.
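The word-order difference described above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not how any real translation engine works. The role tags (S, V, O), the function names, the subject “Ram” and the two-word glossary are all hypothetical. The only vocabulary taken from the example above is “ate” → “khaayaa” and “mango” → “aama”.

```python
def svo_to_sov(tagged_words):
    """Reorder (word, role) pairs from subject-verb-object to subject-object-verb.

    Roles are assumed to be supplied by hand: "S", "V" or "O".
    """
    subject = [word for word, role in tagged_words if role == "S"]
    verb = [word for word, role in tagged_words if role == "V"]
    obj = [word for word, role in tagged_words if role == "O"]
    return subject + obj + verb


# Tiny illustrative glossary; words not listed (such as names) pass through unchanged.
GLOSSARY = {"ate": "khaayaa", "mango": "aama"}


def translate_word_by_word(tagged_words):
    """Reorder to SOV, then swap each word for its glossary entry if it has one."""
    return [GLOSSARY.get(word, word) for word in svo_to_sov(tagged_words)]


english = [("Ram", "S"), ("ate", "V"), ("mango", "O")]

print(svo_to_sov(english))              # ['Ram', 'mango', 'ate']
print(translate_word_by_word(english))  # ['Ram', 'aama', 'khaayaa']
```

The output has the right word order but is still not a finished Hindi sentence, because the sketch ignores case markers, agreement and everything else a real system has to handle. That gap is a large part of why this kind of divergence is hard for tool developers.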

Scientists are constantly trying to improve these tools. Because India has so many different languages, researchers had to find a generalized approach that could be easily adapted from one language to another. It was in the news recently that a multipart machine translation architecture, Sampark, is nearing completion as the combined effort of 11 institutions led by the Language Technologies Research Center at the International Institute of Information Technology in Hyderabad (IIIT-H). Sampark combines traditional rule- and dictionary-based algorithms with statistical machine learning, and will be rolled out to the public at http://sampark.iiit.ac.in/.

Despite the improvements in machine learning, these examples show that machine translation cannot yet replace professional human translators when you plan global strategies.