I agree in general (long-term Japanese translator here), but ChatGPT 4 can still...

knighthack · on May 1, 2024

I'm more than willing to forgive ChatGPT for wrong readings of Japanese names.

Even common names can have non-standard readings. Native Japanese speakers mix up name readings too, and sometimes need kana to be sure. Personally, apart from the extremely common ones, I've never encountered a Japanese name where I didn't at least wonder whether I was reading it correctly.

So I don't think ChatGPT's bad in that regard, if it can at least offer a few possible suggestions. The readings of names can be so damn arbitrary.

tkgally · on May 1, 2024

You're right about Japanese names in general, but current LLMs' mistakes can go far beyond the range of possible readings. I've seen errors on the level of 田中 (Tanaka) being rendered as Suzuki--in other words, one common name being replaced with another.

I'm sure this problem can be solved. The linked article suggests a promising approach--more and better Japanese data.

glandium · on May 1, 2024

Arbitrary example: 上川. It can be either うえかわ or かみかわ (or even other variants). Which one is it? Depends where the family is originally from, I guess.

It's not only people's names, it's also place names. Example: https://ja.m.wikipedia.org/wiki/%E5%85%AB%E5%B9%A1
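
To make the point concrete, here's a rough Python sketch of treating a reading as a list of candidates rather than a single answer. The table and function names (`CANDIDATE_READINGS`, `candidate_readings`, `is_ambiguous`) are made up for illustration, and the table only holds the two examples above with a few of their readings; a real one would need far more entries and variants.

```python
# Hypothetical lookup table: each written form maps to the readings we know of.
# Deliberately incomplete; other variants exist for both entries.
CANDIDATE_READINGS = {
    "上川": ["うえかわ", "かみかわ"],
    "八幡": ["やはた", "やわた", "はちまん"],
}


def candidate_readings(name: str) -> list[str]:
    """Return every known reading for a name, or an empty list if it is unknown."""
    return list(CANDIDATE_READINGS.get(name, []))


def is_ambiguous(name: str) -> bool:
    """True when the written form alone cannot settle the reading."""
    return len(candidate_readings(name)) > 1


if __name__ == "__main__":
    for name in ("上川", "八幡"):
        print(name, candidate_readings(name), is_ambiguous(name))
```

The characters alone only narrow things down to a set; picking the right entry takes outside information like kana or knowing where the family or place is.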

pjc50 · on May 1, 2024

> giving blatantly wrong readings for Japanese names

Native speakers can do this as well. See the "Yagoo" meme: Motoaki Tanigo, CEO of Hololive, is universally known to fans by the wrong name because one of the vtubers read 谷 as "ya", as in 渋谷 "Shibuya", rather than as "tani". https://en.wiktionary.org/wiki/%E8%B0%B7#Japanese

Japanese is phonetic except when it isn't.