Donald A. DePalma 30 October 2007
Filed under (Translation Technologies)

The Wall Street Journal recently opined that “translation software is at last good enough to help companies do business in other languages,” noting a hoary case study from Ford and posturings from Google, Microsoft, and SDL — and few real examples. But that’s fine. The Journal has just discovered MT, perhaps looking for juicier stories to put on its Page 3 as Rupert Murdoch’s News Corp takes over as dowager queen of the print media.

That said, MT is definitely on the must-review list for many companies and government agencies, but few are paying for it today. The biggest use of automated translation is free online machine translation (OLMT). How widespread? Last year Common Sense Advisory asked 2,430 consumers in non-Anglophone countries whether they had tried free OLMT. More than half said that they sometimes, frequently, or always use machine translation to better understand English-language websites.

So, like it or not, information consumers will get what they want translated by a quick hop over to their favorite free MT site. How good will the translations be? Let’s consider some skeptical lines penned by our resident Carioca as he read our entry about changes in the MT guard. We decided to paste these immortal words into several free OLMT sites to test Portuguese-to-English translation rather than come up with the MT equivalent of “the quick brown fox jumps over the lazy dog.” One of the systems used statistical machine translation (SMT); the other three were rules-based (RBMT) systems (see the full version of Automated Machine Technology for a discussion of the differences, pros, and cons).

“Escrevo estas linhas em português para testar o novo método de tradução automática do Google. Eu não tenho grandes expectativas com relação a este tipo de serviço. Confesso que sou usuário freqüente para ter acesso a informações em idiomas que não domino, como o alemão, chinês e japonês. O Google agora utiliza tradução automática com base estatística. Quer dizer, quanto mais traduções houver de uma frase, mais correta será a solução oferecida pelo computador.”

  • Google (SMT): “I write these lines in Portuguese to test the new method of automatic translation of Google. I do not have great expectations with respect to this type of service. I confess that I am a frequent user to access information in languages that no domain, such as German, Chinese and Japanese. Google now uses machine translation based statistics. That is, the more there translation of a phrase, more correctly is the solution offered by the computer.”
  • PROMT (RBMT): “I write these lines in Portuguese to test the new method of automatic translation of the Google. I have not big expectations regarding this type of service. I confess that I am a frequent user to have access to informations in languages that I do not dominate, like German, Chinese and Japanese. The Google now uses automatic translation with statistical base. It means, how much more translations will be of a sentence, more correct will be the solution offered by the computer.”
  • SDL (RBMT): “I write these lines in Portuguese for quiz the new approach of automatic translation of the Google. I do not have big expectations regarding this kind of service. Confessed that I am user frequent for have access the information in languages that do not dominate, as the German, Chinese and Japanese. The Google now utilizes automatic translation with statistical base. It want to say, specially translations will have of a phrase, more correct will be the solution offered by the computer.”
  • SYSTRAN (RBMT): “I write these lines in Portuguese to test the new method of automatic translation of the Google. I do not have great expectations with regard to this type of service. I confess that I am using frequent to have access the information in languages that I do not dominate, as the German, Chinese and Japanese. The Google now uses automatic translation with base statistics. It wants to say, the more translations will have of a phrase, more correct will be the solution offered for the computer.”

Judge for yourself. While none of these are perfect translations and one is definitely not at the quality level of the others, all four tell us that Senhor Beninatto wasn’t writing a shopping list for “pound pastrami, can kraut, six bagels.” For many people browsing the web, that ability to determine the subject of a communication will be good enough, allowing them to decide whether they want to invest more time in a given piece of information. Obviously, in more complex domains and in printed communications like owner’s manuals for a Porsche 911 GT3 RS (Santa, are you listening?) or instructions on how to adjust the control rods for a nuclear fission reactor, tuning and accuracy will be much more of an issue.