Wikipedia talk:Language recognition chart

From Wikipedia, the free encyclopedia

English

How exactly does "DFGLNQRUVWYZ... and no other" = English? Brooklyn Nellie (Nricardo) 02:31, Mar 17, 2004 (UTC)

The author meant that English uses the Latin alphabet, with no further letters or diacritics. I'll change that to be the entire Latin (uppercase) alphabet. Lisa Paul 07:16, 27 Jun 2004 (UTC)
It was my intention to include only characteristic letters (I excluded letters of the Latin alphabet which look the same as letters of the Cyrillic or Greek alphabets). Nikola 04:30, 12 Jul 2004 (UTC)
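
The selection rule described above (keep only the Latin capitals that have no Cyrillic or Greek lookalike) can be sketched in Python. The lookalike set is an assumption, inferred as the complement of the quoted string "DFGLNQRUVWYZ"; it is not taken from the chart itself, and the function name is hypothetical.

```python
import string

# Assumption: Latin capitals that resemble a Cyrillic or Greek capital.
# Inferred from the letters missing in "DFGLNQRUVWYZ", not from the chart.
LOOKALIKES = set("ABCEHIJKMOPSTX")


def characteristic_letters(alphabet: str = string.ascii_uppercase) -> str:
    """Return the letters of `alphabet` that have no assumed lookalike."""
    return "".join(ch for ch in alphabet if ch not in LOOKALIKES)


print(characteristic_letters())  # DFGLNQRUVWYZ
```

Under that assumption, the twelve remaining letters are exactly the ones the chart listed for English.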

A resource

This page [1] attempts something similar to the Wikipedia chart and may have info not already on Wikipedia, if someone wants to compare. Also, this discussion [2] in a LiveJournal community lists corrections and additions to the aforementioned web page.

Devanagari and other Indic scripts?

I don't know a darned thing about Asian alphabets, but maybe somebody could put in a sample of Hindi, Nepali, etc. Also nice would be a sample of Thai. -Lisa Paul 07:52, 27 Jun 2004 (UTC)

You may be interested in omniglot [3] --SS 20:05, 22 Aug 2004 (UTC)

Klingon? Are you serious?

Come on, people. Does Wikipedia have to start to look like a dumping ground for losers? (Yeah, I know I'm one, but still, I try to hide it from time to time.) Nelson Ricardo 01:30, Aug 23, 2004 (UTC)

Klingon is being taken more seriously today than you may realise. It wouldn't surprise me if, in a few years, it became more widely spoken than Esperanto. AdmN 01:52, 23 Aug 2004 (UTC)

The purpose of this page was originally to enable users to see which language a new article is written in. As someone might write an article in Klingon... Nikola 08:16, 23 Aug 2004 (UTC)

Well, at the top it says "document" rather than "Wikipedia article"... and there are certainly documents written in Klingon. I wouldn't expect anyone to write an en.wikipedia article in Klingon, but then I wouldn't expect anyone to write an en.wikipedia article in French either. Perhaps this should be moved out of the Wikipedia namespace and mingled with the general articles. It's certainly fascinating, and it may be generally useful. And no, I did not add Klingon as a joke. --SS 16:08, 23 Aug 2004 (UTC)

Namespace

I moved this page to the main namespace. I understand the initial motivation, but I don't see why we should hide this useful information from the general public. --Taku 05:17, Dec 5, 2004 (UTC)

I second the utility of this page, although it needs work. --babbage 13:08, 16 January 2007 (UTC)

Armenian/Georgian Alphabets

All I see is a bunch of question marks for the Armenian/Georgian alphabets. MonsterOfTheLake 17:15, 28 Dec 2004 (UTC)

Your computer doesn't have support for these fonts. Mikkalai 20:24, 28 Dec 2004 (UTC)

Languages using Arabic or Arabic-derived script

How does one tell apart Arabic, Farsi, and Urdu?

My totally non-expert observations indicate that Persian (Farsi) writing is more 'broken up' than Arabic, if that makes any sense. --Trilobite (Talk) 06:34, 8 Mar 2005 (UTC)
See Persian alphabet. Basically, Farsi, Kurdish (when not written in the Latin alphabet), Pashto, and other languages use an "extended" Arabic alphabet, including letters for sounds which are present in their language but not in Arabic. Urdu, well, just look at it, because I can't describe that nearly as well. -Fsotrain09 01:04, 28 December 2006 (UTC)

Although both use the Arabic script, Arabic writing is much flatter, in that all letters sit upon the line, e.g. اللغة العربية (the Arabic language).

That's one style of Arabic writing; it's not universal. --Tamfang (talk) 18:39, 20 October 2012 (UTC)

I think the article spends too much time describing different ways of writing Greek, and not enough on the grammar and vocabulary. If the purpose here is to quickly figure out which language something is written in, so a translator can be contacted, there is no need to get into the details of monotonic vs. polytonic. Like Hebrew, Greek is instantly recognizable; no other alphabet is sufficiently close for confusion. (Most Greek that shows up here is in the Greek alphabet, not in "greeklish".)

However, it would be worth recognising the differences between Ancient and Modern Greek. While an educated native speaker of Greek can usually understand the ancient language, an ancient text would be better translated by a classicist. On the other hand, a modern text would be better translated by a native speaker, who is more aware of current cultural references.

Segv11 (talk/contribs) 22:53, 14 January 2006 (UTC)

Missing Greek characters

I think some of the upper-case Greek letters are missing from the list. Could someone fix that? I'm not good at generating non-Latin characters on my keyboard. Truthanado 02:40 23 March 2007

Done. Also added a lowercase "o" that was missing. Should we perhaps also add the accented characters (ά etc.), as we are after characters and not letters?

Arabic

The easiest way to tell that a text in Arabic script isn't Arabic is to look for the extra letters that are found in Persian. Arabic has no P, and no jhe either, though jhe is a rare character in Persian (Farsi). It's hard to use words as a reference, apart from the obvious ones such as the Arabic article "al" and the pronouns in each language.

The four letters:

پ pe ژ jhe چ che گ gaf

are missing from Arabic and found in Farsi. Pe, che, and gaf are very common.

Arabic uses the article "al" ("the") very, very often, and it is the easiest way to confirm that something is an Arabic text.

It can appear in different forms, though:

ال الا

Looking for these letters at the beginning of a word (on its right side) is the easiest method of recognizing Arabic words.

examples

Also, personal pronouns in Arabic:

انا انت هو هي نحن انتم انتن هم هن

And common pronouns in Farsi:

من تو شما انها

Common verb endings in Farsi:

کرد شد است ام

This should be enough to give you an insight into recognizing Farsi and Arabic texts. The challenging one to identify is Urdu; someone should figure out a method of recognizing it, because it's so different from all other languages written with the Arabic script. About half of the vocabulary of any language written with the Arabic script is Arabic, EXCEPT Urdu, which is probably 70% Hindi.
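
The two checks described above (the four extra letters, then the "al" prefix) could be sketched roughly as follows in Python. This is only a minimal illustration of the heuristic as stated in this comment: the function name and return labels are hypothetical, and since other languages written in the Arabic script (Urdu, for instance) also use these letters, the first check can only say "not Arabic", not "Persian".

```python
# The four letters named above as found in Farsi but missing from Arabic.
NON_ARABIC_LETTERS = {
    "\u067e",  # pe
    "\u0698",  # jhe
    "\u0686",  # che
    "\u06af",  # gaf
}
ARABIC_ARTICLE = "\u0627\u0644"  # alef + lam, the "al" prefix


def guess_script_language(text: str) -> str:
    """Hypothetical helper: return 'not arabic', 'arabic' or 'unknown'."""
    # Check 1: any of the extra letters rules out Arabic.
    if any(ch in NON_ARABIC_LETTERS for ch in text):
        return "not arabic"
    # Check 2: a word that starts with the article suggests Arabic.
    for word in text.split():
        if word.startswith(ARABIC_ARTICLE) and len(word) > len(ARABIC_ARTICLE):
            return "arabic"
    return "unknown"
```

The letter check runs first on purpose, because a Persian text may still contain Arabic loanwords that begin with the article. The pronoun and verb-ending lists above could be added as further word-level checks in the same way.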

Afrikaans

I'd say that this article is wrong about Afrikaans using no other letters. For example, the Afrikaans word for 'morning' is môre, the word for 'bird' is voël, 'world' is wêreld and 'bridges' are brûe.

encyclopedic?

While the page is certainly useful, it's highly unencyclopedic, and I don't know if Wikipedia is a good place for it. If it is, it should at least start with a paragraph of discussion... Also note that recognising languages by squiggles is not always useful online, as many people still write in squiggleless ASCII due to their local technical limitations or a wish to communicate with others who might be encodingly-challenged.

This page was never intended to be encyclopedic, but was intended almost as a help page to accompany Wikipedia:Pages needing translation into English, to assist people in identifying what language an article was written in. It was originally created in the Wikipedia namespace and moved to the main encyclopedia as per the discussion above, and therefore I'm going to remove the unencyclopedic template. --Sepa 17:46, 18 April 2006 (UTC)

can only be Vietnamese?

Although the letters with a dot below mainly occur in Vietnamese, ẹ and ọ are also used in the Yorùbá language, ị and ụ are used in the Igbo language, and ạ is used in the Rotuman language. -Hello World! 03:27, 28 October 2007 (UTC)
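
To illustrate why a single dot-below letter is not conclusive, here is a rough Python sketch that narrows the candidate languages by intersecting the sets for each dot-below letter seen. The table contains only the letter-to-language pairs mentioned in this comment (it is not a complete inventory of any of these orthographies), and the function name is hypothetical.

```python
import unicodedata

# Only the pairs named in the comment above; not a complete inventory.
DOT_BELOW_LANGUAGES = {
    "\u1ea1": {"Vietnamese", "Rotuman"},  # a with dot below
    "\u1eb9": {"Vietnamese", "Yoruba"},   # e with dot below
    "\u1ecb": {"Vietnamese", "Igbo"},     # i with dot below
    "\u1ecd": {"Vietnamese", "Yoruba"},   # o with dot below
    "\u1ee5": {"Vietnamese", "Igbo"},     # u with dot below
}


def candidate_languages(text: str) -> set:
    """Intersect the candidate sets of every dot-below letter in `text`."""
    # NFC turns "letter + combining dot below" into one precomposed character.
    normalized = unicodedata.normalize("NFC", text.lower())
    candidates = None
    for ch in normalized:
        langs = DOT_BELOW_LANGUAGES.get(ch)
        if langs is None:
            continue
        candidates = set(langs) if candidates is None else candidates & langs
    # An empty set means no dot-below letter was found (or no overlap).
    return candidates if candidates is not None else set()
```

With this table, a text containing only ọ stays ambiguous between Vietnamese and Yoruba, while one containing both ẹ and ị is left with Vietnamese alone.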

Suggested French and Spanish additions

French: W only used in loan words [can only think of whisky and proper nouns]. Spanish: ll common at the starts of words.

SimonTrew (talk) 15:17, 4 March 2009 (UTC)

Hawaii

I added a link to Hawaiian language under basic Latin alphabet. I now see it is referenced, but not linked, a little farther down, where it says it "may" have overscores in texts. I am happy to revert, undo, whatever, though I hope it would be referenced in one place or the other, but I'm not sure which should go (or I'd have just undone it myself). Advice please. SimonTrew (talk) 02:46, 15 April 2009 (UTC)

Namespace, 2nd

The page currently lies in the WP namespace, but according to the discussion a few items above, it should actually be in the main namespace (which I'd also prefer). For some reason, it was moved back here with the strange reason that "it causes some problems as an article". Well, it's not an article after all; it's a table or a list anyway. Besides, Language recognition chart still links here, which is quite a no-no. I suggest moving it back to main as in the discussion above. --PaterMcFly (talk) 10:22, 24 June 2009 (UTC)

Farsi?

I do see Persian, but no mention iirc of Farsi. What about changing to "Persian (Farsi)", or "Farsi (Persian)"? Nikevich (talk) 05:48, 11 October 2010 (UTC)

It's called Persian in English. Choyoołʼįįhí:Seb az86556 > haneʼ 05:52, 11 October 2010 (UTC)
Surely, but I also see "Farsi" in news stories and the like. "Persian" seems to be more traditional. However, I'm only suggesting, not passionate at all. (^_^) Regards, Nikevich (talk) 06:36, 11 October 2010 (UTC)

Maltese?

Should Maltese be included, or does it not have enough speakers? Nikevich (talk) 05:48, 11 October 2010 (UTC)

Alphabet in article with no associated language

There's a bullet in the Characters section with A, Ą, Ã, B, C, D, E, É, Ë, F, G, H, I, J, K, L, Ł, M, N, Ń, O, Ò, Ó, Ô, P, R, S, T, U, Ù, W, Y, Z, Ż that has no language mentioned. If no one knows which language it is supposed to indicate, I'll take it out. --Wikimedes (talk) 19:58, 22 July 2013 (UTC)

I have identified it as the Kashubian alphabet.
Wavelength (talk) 20:54, 22 July 2013 (UTC)
Very good. Thanks. --Wikimedes (talk) 05:06, 23 July 2013 (UTC)

Le

Is there a reason why a section is titled "French (le français)" and not "French (français)"? We don't give the article for any other language name. Apokrif (talk) 19:17, 18 November 2015 (UTC)

Syriac

I have added the Syriac alphabet. Bulahyatain (talk) 23:21, 22 January 2019 (UTC) bulahyatain

Added Maltese

I added a Maltese section. Bulahyatain (talk) 23:38, 22 January 2019 (UTC) bulahyatain

Language recognition chart

Hello sir. I want to add my language to Wikipedia:Language recognition chart and I don't know how. Will you do it for me? The language is Tamazight (Berber). Please 😟 Massinissa014 (talk) 12:28, 29 September 2021 (UTC)

Wikipedia:Language recognition chart

Hello sir. I want to add my language to Wikipedia:Language recognition chart and I don't know how. Can you do it for me, please? The language is Tamazight (ⵜⴰⵎⴰⵣⵉⵖⵜ); it's called Berber too. Please add it, man :) and thank you. Massinissa014 (talk) 12:32, 29 September 2021 (UTC)

I'll look into taking care of it, at least to present the alphabet, probably in the next few days. Largoplazo (talk) 14:26, 29 September 2021 (UTC)
@Massinissa014: Sorry for the delay, but I just added it to the end of the Characters section. Please check to see whether I got it right. Largoplazo (talk) 23:16, 29 April 2022 (UTC)

Add more languages that use Cyrillic

Here is a list with more languages that use Cyrillic and how to distinguish them; maybe someone could help me add them: https:// quora /How-do-I-tell-the-difference-between-languages-that-use-cyrillic-script -- Preceding unsigned comment added by Vloxxity (talk • contribs) 21:48, 15 January 2023 (UTC)

Removal of extraneous information in French section

It was pointed out to me that the French section contains a fair bit of phonological and historical orthographic information, which isn't really useful for a page whose scope is to "help [...] determine the language in which a text is written."

I've gone ahead (per WP:BOLD) and removed this information, to refocus this section on describing more "naïvely" how to identify French. While the removed information is in the history, for convenience, please find below the points that I've significantly modified:

  • Common digrams for vowels, that either were historically diphthongs or long (au, ai, ei, ou, or final -ez), or are nasalized (an, en, in, on and more rarely un, where the n is muted to an m before b, p or m), possibly surrounded by mute letters for longer polygrams (e.g. eau, ein, ain, but oin is a common diphthong).
  • Common digrams as well for some consonants (ch, rarely sh, gu-, gu-) or semi-consonants (-ill-)
  • Final consonants of words are generally mute (notably s), except to form vocalic digrams.

Laogeodritt [Talk | Contribs] 00:04, 23 January 2023 (UTC)

French à

  • Accented letters: [...] à only in the word à and at end of words.

What French word (other than à) has final à? --Tamfang (talk) 04:01, 9 February 2023 (UTC)

, çà (as in çà et là), delà, deçà, déjà, holà, voilà. Largoplazo (talk) 04:16, 9 February 2023 (UTC)
D'oh, I thought of just as I clicked the notice of a change to this page. --Tamfang (talk) 04:37, 9 February 2023 (UTC)

Chakma and Burmese

I'd like to suggest that the Chakma and Burmese scripts be added to this chart. They are similar scripts that are currently missing from it. I don't know if there are any guidelines to determine whether languages should be put in this list, but if so, they should be put under "Brahmic family of scripts" in "Characters". I also don't know in what order to put the characters. Alpha514 (talk) 15:26, 25 July 2023 (UTC)