Google's AI translation tool seems to have invented its own language
Google Translate processes more than 140 billion words each day. Image: REUTERS/Pawel Kopczynski
Back in September 2016, Google launched its Neural Machine Translation (GNMT) system, which uses deep learning to deliver more natural translations between languages.
Google Translate supported only a handful of languages when it launched 10 years ago; today that number has risen to 103. The system translates more than 140 billion words each day.
Creating a computer system to translate multiple languages is complex. The people at Google who built it wanted to find out just how clever their system was. So they came up with a challenge. They taught the machine to translate English to Japanese and vice versa. Then they taught it to translate English to Korean and back again. So far, so ordinary. But what followed was truly extraordinary.
Lost in (AI) translation
The researchers discovered GNMT had taught itself to deliver ‘reasonable’ translations of Japanese to Korean – and vice versa – without using English as a bridge. It appears the machine had constructed its own language that reflects the concepts it uses to translate between languages it has been trained to understand.
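The experiment can be sketched in a few lines of Python. This is only an illustration, not Google's code: the language codes and the helper name `zero_shot_directions` are assumptions made for the example. It lists the translation directions the system was trained on and works out which directions it was never shown.

```python
from itertools import permutations

# Translation directions the article says the system was trained on.
# The two-letter codes are an assumption made for this sketch.
TRAINED = {("en", "ja"), ("ja", "en"), ("en", "ko"), ("ko", "en")}


def zero_shot_directions(trained):
    """Return every ordered language pair that never appeared in training."""
    languages = {lang for pair in trained for lang in pair}
    every_direction = set(permutations(languages, 2))
    return sorted(every_direction - set(trained))


print(zero_shot_directions(TRAINED))
# → [('ja', 'ko'), ('ko', 'ja')]
```

The two directions left over, Japanese to Korean and Korean to Japanese, are exactly the ones the researchers found the system could handle without ever having seen an example.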
A visualisation of a single sentence is captured here, representing the system’s memory of multi-directional translation between Japanese, Korean and English:
Research into this apparent ‘interlingua’ is at an early stage, and it is not yet clear whether it is basic or highly sophisticated in its capabilities. In the above graphic, part (a) shows the overall geometry of the translations. Sentences sharing the same meaning – not language – also share the same colour. Part (b) is a close-up of one of the grouped meanings, and finally, part (c) breaks that group down by source language.
The Google Translate team explains that "within a single group, we see a sentence with the same meaning but from three different languages. This means the network must be encoding something about the semantics of the sentence rather than simply memorizing phrase-to-phrase translations. We interpret this as a sign of existence of an interlingua in the network."
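To make the team's reasoning concrete, here is a toy sketch of what "grouping by meaning rather than language" looks like. Everything in it is invented for illustration: the two-dimensional vectors, the meaning labels and the function names are assumptions, and the real network's internal representations have far more dimensions. The check simply asks whether each sentence's closest neighbour shares its meaning.

```python
import math

# Invented 2-D vectors standing in for the network's internal
# representation of each sentence (hypothetical values).
SENTENCES = [
    # (meaning, language, vector)
    ("greeting", "en", (0.9, 0.1)),
    ("greeting", "ja", (1.0, 0.2)),
    ("greeting", "ko", (0.8, 0.0)),
    ("weather", "en", (-0.7, 0.9)),
    ("weather", "ja", (-0.8, 1.0)),
    ("weather", "ko", (-0.6, 0.8)),
]


def nearest_neighbour(index, sentences):
    """Return the entry closest (Euclidean distance) to sentences[index]."""
    target = sentences[index][2]
    others = [s for i, s in enumerate(sentences) if i != index]
    return min(others, key=lambda s: math.dist(target, s[2]))


def clusters_by_meaning(sentences):
    """True if every sentence's nearest neighbour has the same meaning."""
    return all(
        nearest_neighbour(i, sentences)[0] == sentences[i][0]
        for i in range(len(sentences))
    )


print(clusters_by_meaning(SENTENCES))
# → True
```

In this toy data each meaning appears once per language, so a nearest neighbour with the same meaning is always a sentence in a different language. That is the pattern the team reads as evidence that the network encodes semantics rather than memorised phrase pairs.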
In simple terms, the system appears to have created something by itself, with no human direction, to support its understanding of human languages.
License and Republishing
World Economic Forum articles may be republished in accordance with the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International Public License, and in accordance with our Terms of Use.
The views expressed in this article are those of the author alone and not the World Economic Forum.