Autonyms for kk-arab and kk-cn seem to be wrong
Open, Medium, Public

Description

The autonyms in Names.php and langdb for the two Arabic-script Kazakh entries, kk-arab and kk-cn, seem to be wrong.

kk-arab:

  • In Names.php, the part in brackets is "تٴوتە" (teh, high hamza, waw, teh, ae)
  • In langdb, the part in brackets is "تٶتە" (teh, high hamza waw, teh, ae)

According to https://en.wikipedia.org/wiki/Kazakh_alphabets#Use_of_Hamza, the hamza can only come at the beginning of a word, which would imply these spellings are not possible. It also says it is not used in words containing "e". "e" is written with the character "ae", which would mean this word should not have a hamza.

kk.json in core has "توتە" (teh waw teh ae) without a hamza. That is also the only spelling I've found on other websites.
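The three spellings are hard to tell apart by eye, so a small script that prints the Unicode name of each character can help when comparing them. This is only a sketch: the code points below are assumptions inferred from the letter names quoted in this task, not values read from Names.php, langdb, or kk.json, and `SPELLINGS` and `describe` are hypothetical names.

```python
import unicodedata

# Assumed code points, inferred from the letter names given in the task.
SPELLINGS = {
    "Names.php": "\u062A\u0674\u0648\u062A\u06D5",  # teh, high hamza, waw, teh, ae
    "langdb": "\u062A\u0676\u062A\u06D5",           # teh, high hamza waw, teh, ae
    "kk.json": "\u062A\u0648\u062A\u06D5",          # teh, waw, teh, ae
}


def describe(text):
    """Return a (code point, Unicode name) pair for each character in text."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


for source, text in SPELLINGS.items():
    print(source)
    for code_point, name in describe(text):
        print(f"  {code_point} {name}")
```

Running the same `describe` helper on the strings copied directly from each file would confirm whether they really contain the characters assumed here.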

kk-cn:

  • The part in brackets is "جۇنگو" (jeem, u, noon, gaf, waw)

A Google search for it finds almost zero results which aren't linked to MediaWiki. There are two from kazakh.people.com.cn but on those pages it's part of someone's name. DuckDuckGo only finds pages on Commons.

The word is a transliteration of 中国 (Zhōngguó) and the only spelling I've found elsewhere (including the Chinese Wiktionary) is "جۇڭگو" (jeem u ng gaf waw).
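To make the difference between the two kk-cn spellings explicit, the following sketch compares them character by character. The code points are assumptions based on the letter names quoted above, and `IN_MEDIAWIKI`, `FOUND_ELSEWHERE`, and `differences` are hypothetical names used only for illustration.

```python
import unicodedata

# Assumed code points, inferred from the letter names given in the task.
IN_MEDIAWIKI = "\u062C\u06C7\u0646\u06AF\u0648"     # jeem, u, noon, gaf, waw
FOUND_ELSEWHERE = "\u062C\u06C7\u06AD\u06AF\u0648"  # jeem, u, ng, gaf, waw


def differences(a, b):
    """List (index, name in a, name in b) for each position where a and b differ.

    Only meaningful for strings of equal length, as is the case here.
    """
    return [
        (i, unicodedata.name(x), unicodedata.name(y))
        for i, (x, y) in enumerate(zip(a, b))
        if x != y
    ]


for index, current, other in differences(IN_MEDIAWIKI, FOUND_ELSEWHERE):
    print(f"position {index}: {current} vs {other}")
```

Under these assumptions the two strings differ in a single letter, the third one, where noon appears in place of ng.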


Based on a recent discussion with @Amire80, the ideal next step here would be to remove the two language codes. Adding steps below:

Draft plan, may change:

Event Timeline

Restricted Application added a subscriber: alaa. Apr 8 2025, 6:10 PM
MaryMunyoki triaged this task as Medium priority. Apr 8 2025, 6:10 PM
MaryMunyoki subscribed.

Might need to revisit this later, hence removing the tag.