Showing posts with label language. Show all posts

2018-05-21

Reading foreign languages

While competition is coming from new Chinese companies, Japanese model manufacturers are still among the most appreciated in terms of model quality. The English translations of the instructions are not always up to the same high standard, but are usually sufficient for getting the model built. Still, there is obviously more written in Japanese than has been translated, so, while gearing up for building Fine Molds’ Savoia S.21F (i.e. Porco Rosso’s flying boat), I thought it would be interesting to find out what all the text in Japanese actually said. A little exploration found two tools. The first was NewOCR, an online service that converts images to text and handles multiple languages and alphabets. That let me scan the text from the instruction sheet into Japanese text, which I of course still couldn’t read. The next step was to use Google Translate. There is a cool feature in Google Translate, which I discovered then: you can actually draw characters, in this case Japanese ones, in an input window. This let me correct characters that had been corrupted in the scanning. Of course, the characters most likely to have been misread are also the most complex to draw. 翼 (wing) appeared quite a few times, and had to be redrawn manually.

Well, so now I should have a text in English that I could read? Well, of course not. As shown by earlier examples, machine translation is an inexact science, even for languages within the same Indo-European family, but translating Japanese into English tends to render nonsense, like: “Rui is distinguished simply referred to as "F-type" In that, but the repair work Whatever You're in which was whether to follow the Detection Ichiru Suppose that you try.” “In such Ime temporary, please enjoy the Italian machine of power Rahul ma one King between War.”

Interestingly enough, names, which should have been the same string of characters in every place, got translated differently in different sentences – sometimes as the correct Italian name, sometimes as whatever Japanese expression matches the katakana transcription of the name. Presumably this has to do with Google’s algorithms, which sometimes recognise the context and insert the correct name, and sometimes don’t realise there should be a name. I had hoped for at least useful translations of the colours, but “power over key” is the rendition of what we know as “khaki”.

So, still not entirely helpful.

2012-05-11

Word of the week: pronounciation

“Verbing weirds language”, as famously noted by Calvin, and indeed every year there’s a new proliferation of nouns to be verbed. However, other word classes tend to be quite hidebound. Now there’s been a small amount of pronounciation going on in the creation of genderless pronouns, such as “E” and “Xe”. In Swedish the suggestion of “hen” as a gender-neutral third person singular pronoun has created some amount of debate and indeed ridicule. It has however been pointed out that the gendered third person plural pronouns were already dropped in mediæval Swedish, so the depronounciation process started a long time ago.

2012-04-30

Pronunciation

A thing that annoys me in most dictionaries is that they give the pronunciation of people’s last names, but never of their first names, even though these may be a lot trickier. Take for example Evelyn Waugh. The other day I suddenly realised that even though I’ve always pronounced the first name [ˈevlɪn], maybe that’s just my Finnish accent, and probably the real pronunciation is [ˈiːvlɪn]. Of course even Wikipedia only bothers to give the pronunciation of his last name ([ˈwɔː]), which wasn’t that hard to figure out anyway.

Googling about I found Forvo, which contains audio pronunciation samples of about a million words in a few hundred languages. Yay! So, what about Evelyn then? Well, four samples give four different pronunciations, including both of the above, so apparently native speakers don’t know either.

2009-12-05

Misguided striving for perfection

When people say that something “works like a machine” the implication is that this work is not only ceaseless but flawless. In particular this seems to apply to machine intelligence: intelligent robots and computers are, not only in fiction, assumed to have all information, and correct information only, available and then to proceed flawlessly to the correct conclusion. Certainly these are often evil conclusions, but still the only possible ones.

Well, of course real software doesn't work that way. I have not worked with AI per se, but any interactive system should be as “intelligent” as possible, which in practice means that one studies users, figures out what they want done most of the time, and then tries to make the interface anticipate what the user wants in every given situation. In many cases the anticipated action turns out not to be what the user wanted, and bad user interfaces tend to do their anticipation in such a way as to annoy the user. Good user interfaces, on the other hand, are unobtrusive and smoothly let the user continue with whatever was actually intended, silently withdrawing whatever suggestion might have been proposed.

Machine translation has always been an important task for AI, and it seems the applications I have tried go for the ideal of the all-knowing computer. Thus if you submit a text for translation, you get the output all at once, unalterable, regardless of how bizarre it ends up. Shouldn't it be possible, in this day and age, to have an interactive translation application which presents alternative interpretations of the input and lets the user guide the translation? Certainly, even when I as a person translate text I end up having to make notes in the output, stating that a particular interpretation is dependent on a previous term having meant this and not that.
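
As a rough illustration of the kind of interactive translation asked for above, here is a minimal Python sketch. Everything in it is hypothetical: the Segment and InteractiveTranslation names, the hand-written candidate lists, and the bracket notation for unresolved alternatives are assumptions made for illustration, not how any existing translation service works. The point is only that ambiguous phrases stay visible in the output until the user settles them, and that a choice made once is reused wherever the same source phrase occurs.

```python
from dataclasses import dataclass, field


@dataclass
class Segment:
    """One source phrase with its candidate translations (hypothetical data)."""
    source: str
    candidates: list[str]


@dataclass
class InteractiveTranslation:
    segments: list[Segment]
    # Maps a source phrase to the candidate the user picked for it.
    choices: dict[str, str] = field(default_factory=dict)

    def choose(self, source: str, candidate: str) -> None:
        """Record the user's preferred reading of a source phrase."""
        for seg in self.segments:
            if seg.source == source and candidate in seg.candidates:
                self.choices[source] = candidate
                return
        raise ValueError(f"{candidate!r} is not a candidate for {source!r}")

    def render(self) -> str:
        """Build the output, leaving unresolved alternatives visible."""
        parts = []
        for seg in self.segments:
            if seg.source in self.choices:
                # The same choice applies to every occurrence of the phrase.
                parts.append(self.choices[seg.source])
            elif len(seg.candidates) == 1:
                parts.append(seg.candidates[0])
            else:
                parts.append("[" + " | ".join(seg.candidates) + "]")
        return " ".join(parts)


# Hypothetical input: one unambiguous phrase, one ambiguous phrase used twice.
draft = InteractiveTranslation([
    Segment("翼", ["wing"]),
    Segment("カーキ", ["khaki", "power over key"]),
    Segment("カーキ", ["khaki", "power over key"]),
])
print(draft.render())
draft.choose("カーキ", "khaki")
print(draft.render())
```

Before the call to choose, the rendered text still shows both readings of the ambiguous phrase side by side; afterwards both occurrences use the chosen one, which is the consistency that was missing for the names in the instruction sheet.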