Unbelievable AI

Time for a new Unsolicited Advice column this week! You can follow all Unsolicited Advice columns via Spotify; this edition is available here:

This column on BNR (Dutch)
This column on Spotify (Dutch)

An “incredible” video this week on Twitter/X:

Wow!! With the app HeyGen you can effortlessly speak multiple languages! Alexander Klopping says something in Dutch, and then his own face and voice appear, speaking Italian or Spanish or Chinese. Of course you can see that his lips move a bit weirdly, but it seems very real. Some conclude that all translators in the world can retire, because they are no longer needed.

In response, a piece appeared on nu.nl this weekend in which the NGTV (Dutch Society for Interpreters and Translators) reacts to this new AI capability. They say they are not that afraid of AI yet, because: ‘Translating language is one thing, understanding it is another’.

Funny side note: I tried to translate the above sentence with Google Translate, and it became: ‘Understanding language is one thing, understanding it is another’. Case in point!

It is a nice piece, in which they explain the importance of their own profession: sometimes more context and subject knowledge are needed than an AI can provide, and in some situations (e.g. the medical domain) it is very important that things are really, really correct.

However, they clearly lack the perspective of someone who really understands AI well (understandably, that’s not their job either!). They rightly emphasize their own strengths, but you should also carefully consider what the weaknesses of translation AI are.

What do those weaknesses consist of?

There are a number of important perspectives. A well-known American programmer, Hillel Wayne, has formulated a rule of thumb for the use of AI, which I like to call Wayne’s law:

You should use AI in situations where it is difficult to come up with the information yourself, but where you can verify it yourself.

For example, if you want some nice ideas for a story or an advertisement, you can choose for yourself what you will and will not use, and what is and isn’t nonsense. The same goes for generating computer programs: if the code doesn’t work, it will not run. Translating into a language you do not speak is therefore completely on the wrong side of Wayne’s law: you generate something that you do not understand, but for which you are responsible. Even if the chance of things going wrong is small, do you really want to put something on your site, spoken in your own voice, that you don’t understand?

Ok, but the chance that it is wrong is not great, is it?

The nice examples you see on HeyGen are often simple sentences that translate quite well (as far as we can tell; maybe Klopping says in Chinese that he loves a poop sandwich), and they stick. But if you read the scientific research in this area, you see a much less rosy picture.

I read the paper “Hallucinations in Large Multilingual Translation Models” for our listeners, and I can briefly summarize it. For example, if you translate the sentence “Sharks rarely attack humams” from English into Hungarian, language models say (in Hungarian, that is): “It is necessary to translate this sentence from English into English into English”. Translate some simple information about Luxembourg from English into Vietnamese, and the model outputs: “This is an English sentence, so there is no way to translate it to Vietnamese”. It would be a bit of a shame if the Luxembourg tourist office soon had that on their site…

On a larger scale, this is also evident from the paper “Towards a Comprehensive Evaluation of Large Language Models in Multilingual Learning”. While GPT gets it right 88% of the time on tasks in English, for Thai and Vietnamese this is only 68% and 65%, respectively. A big difference! Things often go very well in Dutch, which may explain why we often think that the technology itself is very good. When I was in Botswana in March and said that I often use Google Translate, they looked at me in surprise. For their language, Setswana, it often doesn’t work at all!

Finally, you are not only responsible for yourself: the future of the entire internet also rests on your shoulders!

A major risk, also called the “enshittification of the internet” by journalist Cory Doctorow, is that AIs will fill the internet with nonsense and that we will no longer be able to distinguish true from false. This of course applies to click farm sites with deliberately incorrect information, but it may still be possible to filter those out. If a reliable site, let’s call it pepsi.com, suddenly has something crazy in a text translated into Dutch, such as the claim that one of the ingredients of Pepsi Cola is motor oil, then soon no one will trust anything anymore and you will have nowhere to turn.

And now the advice:

Translate by hand, even if it costs money for a good translator. If you need multiple languages, have the text translated by hand into a few major languages (English, Chinese, Spanish) that many people can read. And think carefully about what could go wrong. If it’s just for a menu, fine; if it’s about the strategy for your company, then maybe not. I sometimes actually use an AI for the English translations of these posts, but (as evidenced by the failure above) that does not always work, and I *am* fluent in English, so I can actually check what I publish.

And perhaps a little bonus tip for tech enthusiasts: learn a little more about the techniques behind the software you show off. If you call something “incredible”… maybe you shouldn’t believe it either.
