Most of the translation on the planet is now done by Google Translate

“In a given day we translate roughly as much text as you’d find in 1 million books. To put it another way: what all the professional human translators in the world produce in a year, our system translates in roughly a single day. By this estimate, most of the translation on the planet is now done by Google Translate.”

Pulled from “Breaking Down the Language Barrier” via the Google Translate Blog:

The rise of the web has brought the world’s collective knowledge to the fingertips of more than two billion people. But what happens if it’s in Hindi or Afrikaans or Icelandic, and you speak only English—or vice versa?

In 2001, Google started providing a service that could translate eight languages to and from English. It used what was then state-of-the-art commercial machine translation (MT), but the translation quality wasn’t very good, and it didn’t improve much in those first few years. In 2003, a few Google engineers decided to ramp up the translation quality and tackle more languages. That’s when I got involved. I was working as a researcher on DARPA projects looking at a new approach to machine translation—learning from data—which held the promise of much better translation quality. I got a phone call from those Googlers who convinced me (I was skeptical!) that this data-driven approach might work.

I joined Google, and we started to retool our translation system toward competing in the NIST Machine Translation Evaluation, a “bake-off” among research institutions and companies to build better machine translation. Google’s massive computing infrastructure and ability to crunch vast sets of web data gave us strong results. This was a major turning point: it underscored how effective the data-driven approach could be.

But at that time our system was too slow to run as a practical service—it took us 40 hours and 1,000 machines to translate 1,000 sentences. So we focused on speed, and a year later our system could translate a sentence in under a second, and with better quality. In early 2006, we rolled out our first languages: Chinese, then Arabic.

We announced our statistical MT approach on April 28, 2006, and in the six years since then we’ve focused primarily on core translation quality and language coverage. We can now translate among any of 64 different languages, including many with a small web presence, such as Bengali, Basque, Swahili, Yiddish, even Esperanto.

Today we have more than 200 million monthly active users on translate.google.com (and even more in other places where you can use Translate, such as Chrome, mobile apps, YouTube, etc.). People also seem eager to access Google Translate on the go (the language barrier is never more acute than when you’re traveling)—we’ve seen our mobile traffic more than quadruple year over year. And our users are truly global: more than 92 percent of our traffic comes from outside the United States.

by Franz Och, Distinguished Research Scientist, Google

// Thanks to The Next Web

Join the Conversation

  1. Esperanto is more widely used on the net than some people think. A quick search – using Google – soon makes that clear.

  2. Using the translator at the bottom of the page, which I just installed, I translated Bill’s comment into Esperanto:

    “Esperanto estas pli vaste uzita en la reto ol kelkaj personoj pensas. Rapida serĉo – uzas Google – frue faras ke klara.”

    The translator is pretty powerful (and cool).
