Alphabetical order

Alphabetical order is a system whereby character strings are placed in order based on the position of the characters in the conventional ordering of an alphabet. It is one of the methods of collation. In mathematics, a lexicographical order is the generalization of the alphabetical order to other data types, such as sequences of numbers or other ordered mathematical objects.

When applied to strings or sequences that may contain digits, numbers or more elaborate types of elements, in addition to alphabetical characters, the alphabetical order is generally called a lexicographical order.

To determine which of two strings of characters comes first when arranging in alphabetical order, their first letters are compared. If they differ, then the string whose first letter comes earlier in the alphabet comes before the other string. If the first letters are the same, then the second letters are compared, and so on. If a position is reached where one string has no more letters to compare while the other does, then the shorter string is deemed to come first in alphabetical order.
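The comparison described above can be sketched in a few lines of Python. This is only an illustration: the function name `comes_before` is hypothetical, and the sketch assumes unaccented basic Latin letters and ignores case.

```python
def comes_before(a: str, b: str) -> bool:
    """Return True if string a sorts strictly before string b."""
    # Assumption of this sketch: case is ignored entirely.
    a, b = a.lower(), b.lower()
    for x, y in zip(a, b):
        if x != y:
            # The first position where the letters differ decides the order.
            return x < y
    # One string is a prefix of the other: the shorter one comes first.
    return len(a) < len(b)
```

For instance, `comes_before("As", "Aster")` holds because As runs out of letters first, while `comes_before("At", "Astronomy")` does not, because t comes after s.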

Capital or upper case letters are generally considered to be identical to their corresponding lower case letters for the purposes of alphabetical ordering, although conventions may be adopted to handle situations where two strings differ only in capitalization. Various conventions also exist for the handling of strings containing spaces, modified letters, such as those with diacritics, and non-letter characters such as marks of punctuation.

The result of placing a set of words or strings in alphabetical order is that all of the strings beginning with the same letter are grouped together; within that grouping all words beginning with the same two-letter sequence are grouped together; and so on. The system thus tends to maximize the number of common initial letters between adjacent words.

History

Alphabetical order was first used in the 1st millennium BCE by Northwest Semitic scribes using the abjad system.^[1] However, a range of other methods of classifying and ordering material, including geographical, chronological, hierarchical and by category, were preferred over alphabetical order for centuries.^[2]

Parts of the Bible are dated to the 7th–6th centuries BCE. In the Book of Jeremiah, the prophet utilizes the Atbash substitution cipher, based on alphabetical order. Similarly, biblical authors used acrostics based on the (ordered) Hebrew alphabet.^[3]

The first effective use of alphabetical order as a cataloging device among scholars may have been in ancient Alexandria,^[4] in the Great Library of Alexandria, which was founded around 300 BCE. The poet and scholar Callimachus, who worked there, is thought to have created the world's first library catalog, known as the Pinakes, with scrolls shelved in alphabetical order of the first letter of authors' names.^[2]

In the 1st century BCE, Roman writer Varro compiled alphabetic lists of authors and titles.^[5] In the 2nd century CE, Sextus Pompeius Festus wrote an encyclopedic epitome of the works of Verrius Flaccus, De verborum significatu, with entries in alphabetic order.^[6] In the 3rd century CE, Harpocration wrote a Homeric lexicon alphabetized by all letters.^[7] In the 10th century, the author of the Suda used alphabetic order with phonetic variations.

Alphabetical order as an aid to consultation started to enter the mainstream of Western European intellectual life in the second half of the 12th century, when alphabetical tools were developed to help preachers analyse biblical vocabulary. This led to the compilation of alphabetical concordances of the Bible by the Dominican friars in Paris in the 13th century, under Hugh of Saint Cher. Older reference works such as St. Jerome's Interpretations of Hebrew Names were alphabetized for ease of consultation. The use of alphabetical order was initially resisted by scholars, who expected their students to master their area of study according to its own rational structures; its success was driven by such tools as Robert Kilwardby's index to the works of St. Augustine, which helped readers access the full original text instead of depending on the compilations of excerpts which had become prominent in 12th century scholasticism. The adoption of alphabetical order was part of the transition from the primacy of memory to that of written works.^[8] The idea of ordering information by the order of the alphabet also met resistance from the compilers of encyclopaedias in the 12th and 13th centuries, who were all devout churchmen. They preferred to organise their material theologically – in the order of God's creation, starting with Deus (meaning God).^[2]

In 1604 Robert Cawdrey had to explain in Table Alphabeticall, the first monolingual English dictionary, "Nowe if the word, which thou art desirous to finde, begin with (a) then looke in the beginning of this Table, but if with (v) looke towards the end".^[9] Although as late as 1803 Samuel Taylor Coleridge condemned encyclopedias with "an arrangement determined by the accident of initial letters",^[10] many lists are today based on this principle.

Ordering in the Latin script

Basic order and examples

The standard order of the modern ISO basic Latin alphabet is:

A-B-C-D-E-F-G-H-I-J-K-L-M-N-O-P-Q-R-S-T-U-V-W-X-Y-Z

An example of straightforward alphabetical ordering follows:

As; Aster; Astrolabe; Astronomy; Astrophysics; At; Ataman; Attack; Baa

Another example:

Barnacle; Be; Been; Benefit; Bent

The words in the first example above are ordered alphabetically. As comes before Aster because they begin with the same two letters and As has no more letters after that whereas Aster does. The next three words come after Aster because their fourth letter (the first one that differs) is r, which comes after e (the fourth letter of Aster) in the alphabet. Those words themselves are ordered based on their sixth letters (l, n and p respectively). Then comes At, which differs from the preceding words in the second letter (t comes after s). Ataman comes after At for the same reason that Aster came after As. Attack follows Ataman based on comparison of their third letters, and Baa comes after all of the others because it has a different first letter.

Treatment of multiword strings

When some of the strings being ordered consist of more than one word, i.e., they contain spaces or other separators such as hyphens, then two basic approaches may be taken. In the first approach, all strings are ordered initially according to their first word, as in the sequence:

Oak; Oak Hill; Oak Ridge; Oakley Park; Oakley River
where all strings beginning with the separate word Oak precede all those beginning with Oakley, because Oak precedes Oakley in alphabetical order.

In the second approach, strings are alphabetized as if they had no spaces, giving the sequence:

Oak; Oak Hill; Oakley Park; Oakley River; Oak Ridge
where Oak Ridge now comes after the Oakley strings, as it would if it were written "Oakridge".

The second approach is the one usually taken in dictionaries,^[citation needed] and it is thus often called dictionary order by publishers. The first approach has often been used in book indexes, although each publisher traditionally set its own standards for which approach to use therein; there was no ISO standard for book indexes (ISO 999) before 1975.
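The two approaches differ only in the sort key that is derived from each string. The following Python sketch reproduces both sequences given above; the function names are hypothetical, and case is ignored as a simplifying assumption.

```python
import re

names = ["Oakley River", "Oak Ridge", "Oak", "Oakley Park", "Oak Hill"]

def word_by_word_key(s: str) -> list[str]:
    # First approach: compare the first words, then the second words, and so on.
    return re.split(r"[\s-]+", s.lower())

def letter_by_letter_key(s: str) -> str:
    # Second approach: alphabetize as if spaces and hyphens were not there.
    return re.sub(r"[\s-]+", "", s.lower())

word_by_word = sorted(names, key=word_by_word_key)
# ['Oak', 'Oak Hill', 'Oak Ridge', 'Oakley Park', 'Oakley River']

letter_by_letter = sorted(names, key=letter_by_letter_key)
# ['Oak', 'Oak Hill', 'Oakley Park', 'Oakley River', 'Oak Ridge']
```

The first key works because Python compares lists of words element by element, so the single-word entry Oak precedes every entry whose first word is Oak.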

Special cases

Modified letters

In French, modified letters (such as those with diacritics) are treated the same as the base letter for alphabetical ordering purposes. For example, rôle comes between rock and rose, as if it were written role. However, languages that use such letters systematically generally have their own ordering rules. See § Language-specific conventions below.
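A common way to treat a modified letter as its base letter in software is to decompose the string and discard the combining marks. The sketch below uses Python's standard `unicodedata` module; the helper name `base_letters` is hypothetical, and the approach only covers letters that Unicode decomposes into a base letter plus a mark.

```python
import unicodedata

def base_letters(s: str) -> str:
    # NFD splits "ô" into "o" plus a combining circumflex; the marks are then dropped.
    decomposed = unicodedata.normalize("NFD", s)
    return "".join(c for c in decomposed if not unicodedata.combining(c))

words = ["rose", "rôle", "rock"]

ordered = sorted(words, key=lambda w: base_letters(w).lower())
# ['rock', 'rôle', 'rose']

by_code_point = sorted(words)
# ['rock', 'rose', 'rôle']  (ô has a higher code point than any basic letter)
```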

Ordering by surname

In most cultures where family names are written after given names, it is still desired to sort lists of names (as in telephone directories) by family name first. In this case, names need to be reordered to be sorted correctly. For example, Juan Hernandes and Brian O'Leary should be sorted as "Hernandes, Juan" and "O'Leary, Brian" even if they are not written this way. Capturing this rule in a computer collation algorithm is complex, and simple attempts will fail. For example, unless the algorithm has at its disposal an extensive list of family names, there is no way to decide if "Gillian Lucille van der Waal" is "van der Waal, Gillian Lucille", "Waal, Gillian Lucille van der", or even "Lucille van der Waal, Gillian".

Ordering by surname is frequently encountered in academic contexts. Within a single multi-author paper, ordering the authors alphabetically by surname, rather than by other methods such as reverse seniority or subjective degree of contribution to the paper, is seen as a way of "acknowledg[ing] similar contributions" or "avoid[ing] disharmony in collaborating groups".^[11] The practice in certain fields of ordering citations in bibliographies by the surnames of their authors has been found to create bias in favour of authors with surnames which appear earlier in the alphabet, while this effect does not appear in fields in which bibliographies are ordered chronologically.^[12]

The and other common words

If a phrase begins with a very common word (such as "the", "a" or "an", called articles in grammar), that word is sometimes ignored or moved to the end of the phrase, but this is not always the case. For example, the book "The Shining" might be treated as "Shining", or "Shining, The" and therefore before the book title "Summer of Sam". However, it may also be treated as simply "The Shining" and after "Summer of Sam". Similarly, "A Wrinkle in Time" might be treated as "Wrinkle in Time", "Wrinkle in Time, A", or "A Wrinkle in Time". All three alphabetization methods are fairly easy to create by algorithm, but many programs rely on simple lexicographic ordering instead.
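As a rough illustration of how easily these treatments can be automated, the Python sketch below implements the "ignore the article" and "move the article to the end" variants next to plain ordering. The article list and the function names are assumptions made for this example and cover English only.

```python
ARTICLES = ("the", "a", "an")  # assumed English-only list for this sketch

def drop_article_key(title: str) -> str:
    # Sort key that ignores a leading article.
    first, _, rest = title.partition(" ")
    if rest and first.lower() in ARTICLES:
        return rest.lower()
    return title.lower()

def move_article(title: str) -> str:
    # Display form with the leading article moved to the end.
    first, _, rest = title.partition(" ")
    if rest and first.lower() in ARTICLES:
        return f"{rest}, {first}"
    return title

titles = ["Summer of Sam", "The Shining", "A Wrinkle in Time"]

ignoring_articles = sorted(titles, key=drop_article_key)
# ['The Shining', 'Summer of Sam', 'A Wrinkle in Time']

plain = sorted(titles, key=str.lower)
# ['A Wrinkle in Time', 'Summer of Sam', 'The Shining']
```

Here `move_article("The Shining")` yields "Shining, The", which sorts in the same place as the article-free key.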

Mac prefixes

The prefixes M and Mc in Irish and Scottish surnames are abbreviations for Mac and are sometimes alphabetized as if the spelling is Mac in full. Thus McKinley might be listed before Mackintosh (as it would be if it had been spelled out as "MacKinley"). Since the advent of computer-sorted lists, this type of alphabetization is less frequently encountered, though it is still used in British telephone directories.

St prefix

The prefix St or St. is an abbreviation of "Saint", and is traditionally alphabetized as if the spelling is Saint in full. Thus in a gazetteer St John's might be listed before Salem (as it would be if it had been spelled out as "Saint John's"). Since the advent of computer-sorted lists, this type of alphabetization is less frequently encountered, though it is still sometimes used.
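Both the Mc and the St conventions amount to spelling out the abbreviation before comparing. A minimal Python sketch follows, assuming the prefix always appears at the very start of the string; the helper name `expand_prefix_key` is hypothetical, and real directories apply more detailed rules.

```python
import re

def expand_prefix_key(name: str) -> str:
    # Hypothetical helper: spell out abbreviated prefixes before comparing.
    key = re.sub(r"^Mc", "Mac", name)
    key = re.sub(r"^St\.? ", "Saint ", key)
    return key.lower()

surnames = sorted(["Mackintosh", "McKinley"], key=expand_prefix_key)
# ['McKinley', 'Mackintosh']

places = sorted(["Salem", "St John's"], key=expand_prefix_key)
# ["St John's", 'Salem']
```

Without the key, ordering by character codes puts Mackintosh before McKinley and Salem before St John's.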

Ligatures

Ligatures (two or more letters merged into one symbol) which are not considered distinct letters, such as Æ and Œ in English, are typically collated as if the letters were separate: "æther" and "aether" would be ordered the same relative to all other words. This is true even when the ligature is not purely stylistic, such as in loanwords and brand names.

Special rules may need to be adopted to sort strings which vary only by whether two letters are joined by a ligature.
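In code, this treatment can be approximated by expanding each ligature before comparison and adding a tie-break for strings that differ only by a ligature. The sketch below is illustrative: the mapping covers only Æ and Œ, and the tie-break (the original spelling, by character code) is an arbitrary assumption and not an established rule.

```python
LIGATURES = {"æ": "ae", "Æ": "AE", "œ": "oe", "Œ": "OE"}

def expand_ligatures(s: str) -> str:
    return "".join(LIGATURES.get(c, c) for c in s)

def ligature_key(s: str) -> tuple[str, str]:
    # Primary: the expanded spelling. Tie-break: the original string,
    # so that "aether" and "æther" get a deterministic relative order.
    return (expand_ligatures(s).lower(), s)

words = sorted(["aeon", "æther", "aether", "aegis"], key=ligature_key)
# ['aegis', 'aeon', 'aether', 'æther']
```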

Treatment of numerals

When some of the strings contain numerals (or other non-letter characters), various approaches are possible. Sometimes such characters are treated as if they came before or after all the letters of the alphabet. Another method is for numbers to be sorted alphabetically as they would be spelled: for example 1776 would be sorted as if spelled out "seventeen seventy-six", and 24 heures du Mans as if spelled "vingt-quatre..." (French for "twenty-four"). When numerals or other symbols are used as special graphical forms of letters, as 1337 for leet or the movie Seven (which was stylised as Se7en), they may be sorted as if they were those letters. Natural sort order orders strings alphabetically, except that multi-digit numbers are treated as a single character and ordered by the value of the number encoded by the digits.
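Natural sort order is usually implemented by splitting each string into runs of digits and runs of other characters. A minimal Python sketch, assuming only ASCII digits and ignoring case (the name `natural_key` is hypothetical):

```python
import re

def natural_key(s: str) -> list:
    # re.split with a capturing group alternates: text, digits, text, ...
    parts = re.split(r"([0-9]+)", s.lower())
    # Odd positions hold the digit runs; those are compared by numeric value.
    return [int(p) if i % 2 else p for i, p in enumerate(parts)]

files = ["file10", "file2", "file1"]

natural = sorted(files, key=natural_key)
# ['file1', 'file2', 'file10']

by_character = sorted(files)
# ['file1', 'file10', 'file2']
```

Because text and numbers always occupy the same positions in every key, Python never has to compare a string with an integer.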

In the case of monarchs and popes, although their numbers are in Roman numerals and resemble letters, they are normally arranged in numerical order: so, for example, even though V comes after I, the Danish king Christian IX comes after his predecessor Christian VIII.
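One way to obtain this ordering is to convert the trailing Roman numeral to a number before sorting. The sketch below assumes that every name ends in a space followed by a well-formed Roman numeral; `roman_to_int` and `regnal_key` are hypothetical helper names.

```python
ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral: str) -> int:
    total = 0
    # Pair each symbol with its successor (a space after the last symbol).
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        value = ROMAN[ch]
        # Subtractive notation: a smaller value before a larger one is subtracted.
        total += -value if ROMAN.get(nxt, 0) > value else value
    return total

def regnal_key(name: str) -> tuple[str, int]:
    base, _, numeral = name.rpartition(" ")
    return (base.lower(), roman_to_int(numeral))

kings = ["Christian IX", "Christian X", "Christian VIII"]

numerical = sorted(kings, key=regnal_key)
# ['Christian VIII', 'Christian IX', 'Christian X']

alphabetical = sorted(kings)
# ['Christian IX', 'Christian VIII', 'Christian X']
```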

Language-specific conventions

Languages which use an extended Latin alphabet generally have their own conventions for treatment of the extra letters. Also in some languages certain digraphs are treated as single letters for collation purposes. For example, the Spanish alphabet treats ñ as a basic letter following n, and formerly treated the digraphs ch and ll as basic letters following c and l, respectively. Now ch and ll are alphabetized as two-letter combinations. The new alphabetization rule was issued by the Royal Spanish Academy in 1994. These digraphs were still formally designated as letters after that, but have not been since 2010. On the other hand, the digraph rr follows rqu as expected (and did so even before the 1994 alphabetization rule), while vowels with acute accents (á, é, í, ó, ú) have always been ordered in parallel with their base letters, as has the letter ü.

In a few cases, such as Arabic and Kiowa, the alphabet has been completely reordered.

Alphabetization rules applied in various languages are listed below.

In Arabic, there are two main orders of the 28 letter alphabet used today. The standard and most commonly used is the hijāʾī order, which was created by the early Arab linguist Nasr ibn 'Asim al-Laythi and features a visual ordering method where letters are ordered based on their shapes. For example bāʾ (ب), tāʾ (ت), thāʾ (ث) are grouped as they have the same base shape or rasm (ٮ) and are differentiated only by consonant pointing known as iʻjām. The original ʾabjadī order, which phonetically resembles that of other Semitic languages as well as Latin, is still in use today, usually limited to ordering lists in a document, analogous to Roman numerals. When the ʾabjadī order is used in numbering, letters are written in a modified form to distinguish them from letters used in words and from numerals. For example, ʾalif (ا), which looks identical to the Eastern Arabic numeral one (١), is written with a small oval loop extending clockwise from the letter's bottom, followed by a short tail (𞺀).^[citation needed] Although these characters are rarely used digitally they are encoded in Unicode under Arabic Mathematical Alphabetic Symbols.^[13] A less common order, the ṣawtī order, is collated phonetically and was created by al-Khalil ibn Ahmad al-Farahidi.
In Azerbaijani, there are eight additional letters to the standard Latin alphabet. Five of them are vowels: i, ı, ö, ü, ə and three are consonants: ç, ş, ğ. The alphabet is the same as the Turkish, with the same sounds written with the same letters, except for three additional letters: q, x and ə for sounds that do not exist in Turkish. Although all the "Turkish letters" are collated in their "normal" alphabetical order like in Turkish, the three extra letters are collated arbitrarily after letters whose sounds approach theirs. So, q is collated just after k, x (pronounced like a German ch) is collated just after h and ə (pronounced roughly like an English short a) is collated just after e.
In Breton, there is no "c", "q", "x" but there are the digraphs "ch" and "c'h", which are collated between "b" and "d". For example: « buzhugenn, chug, c'hoar, daeraouenn » (earthworm, juice, sister, teardrop).
In Czech and Slovak, accented vowels have secondary collating weight – compared to other letters, they are treated as their unaccented forms (in Czech, A-Á, E-É-Ě, I-Í, O-Ó, U-Ú-Ů, Y-Ý, and in Slovak, A-Á-Ä, E-É, I-Í, O-Ó-Ô, U-Ú, Y-Ý), but then they are sorted after the unaccented letters (for example, the correct lexicographic order is baa, baá, báa, báá, bab, báb, bac, bác, bač, báč [in Czech] and baa, baá, baä, báa, báá, báä, bäa, bäá, bää, bab, báb, bäb, bac, bác, bäc, bač, báč, bäč [in Slovak]). Accented consonants have primary collating weight and are collated immediately after their unaccented counterparts, with exception of Ď, Ň and Ť (in Czech) and Ď, Ĺ, Ľ, Ň, Ŕ and Ť (in Slovak), which have again secondary weight. CH is considered to be a separate letter and goes between H and I. In Slovak, DZ and DŽ are also considered separate letters and are positioned between Ď and E.
In the Danish and Norwegian alphabets, the same extra vowels as in Swedish (see below) are also present but in a different order and with different glyphs (..., X, Y, Z, Æ, Ø, Å). Also, "Aa" collates as an equivalent to "Å". The Danish alphabet has traditionally seen "W" as a variant of "V", but today "W" is considered a separate letter.
In Dutch the combination IJ (representing Ĳ) was formerly to be collated as Y (or sometimes as a separate letter: Y < IJ < Z), but is currently mostly collated as 2 letters (II < IJ < IK). Exceptions are phone directories; IJ is always collated as Y here because in many Dutch family names Y is used where modern spelling would require IJ. Note that a word starting with ij that is written with a capital I is also written with a capital J, for example, the town IJmuiden, the river IJssel and the country IJsland (Iceland).
In Esperanto, consonants with circumflex accents (ĉ, ĝ, ĥ, ĵ, ŝ), as well as ŭ (u with breve), are counted as separate letters and collated separately (c, ĉ, d, e, f, g, ĝ, h, ĥ, i, j, ĵ... s, ŝ, t, u, ŭ, v, z).
In Estonian õ, ä, ö and ü are considered separate letters and collate after w. Letters š, z and ž appear in loanwords and foreign proper names only and follow the letter s in the Estonian alphabet, which otherwise does not differ from the basic Latin alphabet.
The Faroese alphabet also has some of the Danish, Norwegian, and Swedish extra letters, namely Æ and Ø. Furthermore, the Faroese alphabet uses the Icelandic eth, which follows the D. Five of the six vowels A, I, O, U and Y can get accents and are after that considered separate letters. The consonants C, Q, X, W and Z are not found. Therefore, the first five letters are A, Á, B, D and Ð, and the last five are V, Y, Ý, Æ, Ø.
In Filipino (Tagalog) and other Philippine languages, the letter Ng is treated as a separate letter. It is pronounced as in sing, ping-pong, etc. By itself, it is pronounced nang, but in general Filipino orthography, it is spelled as if it were two separate letters (n and g). Also, letter derivatives (such as Ñ) immediately follow the base letter. Filipino also is written with diacritics, but their use is very rare (except the tilde).
The Finnish alphabet and collating rules are the same as those of Swedish.
For French, the last accent in a given word determines the order.^[14] For example, in French, the following four words would be sorted this way: cote < côte < coté < côté. The variants of the letter e are ordered e, é, è, ê, ë (with œ treated as oe), and similarly o is followed by ô and ö.
In German letters with umlaut (Ä, Ö, Ü) are treated generally just like their non-umlauted versions; ß is always sorted as ss. This makes the alphabetic order Arbeit, Arg, Ärgerlich, Argument, Arm, Assistant, Aßlar, Assoziation. For phone directories and similar lists of names, the umlauts are to be collated like the letter combinations "ae", "oe", "ue" because a number of German surnames appear both with umlaut and in the non-umlauted form with "e" (Müller/Mueller). This makes the alphabetic order Udet, Übelacker, Uell, Ülle, Ueve, Üxküll, Uffenbach.
The Hungarian vowels have accents, umlauts, and double accents, while consonants are written with single, double (digraphs) or triple (trigraph) characters. In collating, accented vowels are equivalent with their non-accented counterparts and double and triple characters follow their single originals. Hungarian alphabetic order is: A=Á, B, C, Cs, D, Dz, Dzs, E=É, F, G, Gy, H, I=Í, J, K, L, Ly, M, N, Ny, O=Ó, Ö=Ő, P, Q, R, S, Sz, T, Ty, U=Ú, Ü=Ű, V, W, X, Y, Z, Zs. (Before 1984, dz and dzs were not considered single letters for collation, but two letters each, d+z and d+zs instead.) It means that e.g. nádcukor should precede nádcsomó (even though s normally precedes u), since c precedes cs in the collation. Difference in vowel length should only be taken into consideration if the two words are otherwise identical (e.g. egér, éger). Spaces and hyphens within phrases are ignored in collation. Ch also occurs as a digraph in certain words but it is not considered as a grapheme in its own right in terms of collation.
A particular feature of Hungarian collation is that contracted forms of double di- and trigraphs (such as ggy from gy + gy or ddzs from dzs + dzs) should be collated as if they were written in full (independently of the fact of the contraction and the elements of the di- or trigraphs). For example, kaszinó should precede kassza (even though the fourth character z would normally come after s in the alphabet), because the fourth "character" (grapheme) of the word kassza is considered a second sz (decomposing ssz into sz + sz), which does follow i (in kaszinó).
In Icelandic, Þ is added, and D is followed by Ð. Each vowel (A, E, I, O, U, Y) is followed by its correspondent with acute: Á, É, Í, Ó, Ú, Ý. There is no Z, so the alphabet ends: ... X, Y, Ý, Þ, Æ, Ö.
- Both letters were also used by Anglo-Saxon scribes who also used the Runic letter Wynn to represent /w/.
- Þ (called thorn; lowercase þ) is also a Runic letter.
- Ð (called eth; lowercase ð) is the letter D with an added stroke.
Kiowa is ordered on phonetic principles, like the Brahmic scripts, rather than on the historical Latin order. Vowels come first, then stop consonants ordered from the front to the back of the mouth, and from negative to positive voice-onset time, then the affricates, fricatives, liquids, and nasals:

A, AU, E, I, O, U, B, F, P, V, D, J, T, TH, G, C, K, Q, CH, X, S, Z, L, Y, W, H, M, N

In Lithuanian, specifically Lithuanian letters go after their Latin originals. Another change is that Y comes just before J: ... G, H, I, Į, Y, J, K...
In Polish, specifically Polish letters derived from the Latin alphabet are collated after their originals: A, Ą, B, C, Ć, D, E, Ę,..., L, Ł, M, N, Ń, O, Ó, P,..., S, Ś, T,..., Z, Ź, Ż. The digraphs for collation purposes are treated as if they were two separate letters.
In Pinyin alphabetical order, where words have the same basic letters in pinyin and differ only in modifying diacritics, the unmodified letter comes before the modified letter. For example, ⟨e⟩ comes before ⟨ê⟩ (ngạch (è) before ai (ê̄)), and ⟨u⟩ comes before ⟨ü⟩ (lộ (lù) before lư (lǘ) and nỗ (nǔ) before nữ (nǚ)). Characters with the same pinyin letters (including modified letters ⟨ê⟩ and ⟨ü⟩) are arranged according to their tones in the order of "first tone (i.e., "flat tone"), second tone (rising tone), third tone (falling-rising tone), fourth tone (falling tone), fifth tone (neutral tone)", for example "Mụ (mā), ma (má), mã (mǎ), mạ (mà), mạ (ma)".^[a]
In Portuguese, the collating order is just like in English: A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z. Digraphs and letters with diacritics are not included in the alphabet.
In Romanian, special characters derived from the Latin alphabet are collated after their originals: A, Ă, Â,..., I, Î,..., S, Ș, T, Ț,..., Z.
In Serbo-Croatian and other related South Slavic languages, the five accented characters and three conjoined characters are sorted after the originals:..., C, Č, Ć, D, DŽ, Đ, E,..., L, LJ, M, N, NJ, O,..., S, Š, T,..., Z, Ž.
Spanish treated (until 1994) "CH" and "LL" as single letters, giving an ordering of cinco, credo, chispa and lomo, luz, llama. This is no longer the case: in 1994 the RAE adopted the more conventional usage, and now LL is collated between LK and LM, and CH between CG and CI. The six characters with diacritics Á, É, Í, Ó, Ú, Ü are treated as the original letters A, E, I, O, U, for example: radio, ráfaga, rana, rápido, rastrillo. The only Spanish-specific collating question is Ñ (eñe) as a different letter collated after N.
In the Swedish alphabet, there are three extra vowels placed at its end (..., X, Y, Z, Å, Ä, Ö), similar to the Danish and Norwegian alphabet, but with different glyphs and a different collating order. The letter "W" has been treated as a variant of "V", but in the 13th edition of Svenska Akademiens ordlista (2006) "W" was considered a separate letter.
In the Turkish alphabet there are six additional letters: ç, ğ, ı, ö, ş, and ü (but no q, w, and x). They are collated with ç after c, ğ after g, ı before i, ö after o, ş after s, and ü after u. Originally, when the alphabet was introduced in 1928, ı was collated after i, but the order was changed later so that letters having shapes containing dots, cedillas or other adorning marks always follow the letters with corresponding bare shapes. Note that in Turkish orthography the letter I is the majuscule of dotless ı, whereas İ is the majuscule of dotted i.
In many Turkic languages (such as Azeri or the Jaꞑalif orthography for Tatar), there used to be the letter Gha (Ƣƣ), which came between G and H. It is now in disuse.
In Vietnamese, there are seven additional letters: ă, â, đ, ê, ô, ơ, ư while f, j, w, z are absent, even though they still see some use (for example in Internet addresses and foreign loanwords). "f" is replaced by the combination "ph", and similarly "w" by "qu".
In Volapük ä, ö and ü are counted as separate letters and collated separately (a, ä, b... o, ö, p... u, ü, v) while q and w are absent.^[15]
In Welsh the digraphs CH, DD, FF, NG, LL, PH, RH, and TH are treated as single letters, and each is listed after the first character of the pair (except for NG which is listed after G), producing the order A, B, C, CH, D, DD, E, F, FF, G, NG, H, and so on. It can sometimes happen, however, that word compounding results in the juxtaposition of two letters which do not form a digraph. An example is the word LLONGYFARCH (composed from LLON + GYFARCH). This results in such an ordering as, for example, LAWR, LWCUS, LLONG, LLOM, LLONGYFARCH (NG is a digraph in LLONG, but not in LLONGYFARCH). The letter combination R+H (as distinct from the digraph RH) may similarly arise by juxtaposition in compounds, although this tends not to produce any pairs in which misidentification could affect the ordering. For the other potentially confusing letter combinations that may occur – namely, D+D and L+L – a hyphen is used in the spelling (e.g. AD-DAL, CHWIL-LYS).
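Many of the conventions listed above amount to sorting by a custom alphabet in which certain digraphs count as single letters. The Python sketch below illustrates the idea with the traditional (pre-1994) Spanish ordering of ch and ll described earlier. The names `ALPHABET`, `RANK`, `TOKEN` and `traditional_key` are hypothetical, and the sketch assumes unaccented, lower-case-able words.

```python
import re

# Traditional (pre-1994) Spanish alphabet, with "ch" and "ll" as single letters.
ALPHABET = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l",
            "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w",
            "x", "y", "z"]
RANK = {letter: i for i, letter in enumerate(ALPHABET)}

# Longer tokens are tried first, so "ch" is matched before "c".
TOKEN = re.compile("|".join(sorted(ALPHABET, key=len, reverse=True)))

def traditional_key(word: str) -> list[int]:
    # Replace each letter or digraph by its position in the custom alphabet.
    return [RANK[t] for t in TOKEN.findall(word.lower())]

c_words = sorted(["chispa", "cinco", "credo"], key=traditional_key)
# ['cinco', 'credo', 'chispa']

l_words = sorted(["llama", "luz", "lomo"], key=traditional_key)
# ['lomo', 'luz', 'llama']
```

Greedy matching of this kind cannot tell a true digraph from two letters that merely meet at a compound boundary, which is exactly the difficulty the Welsh example LLONGYFARCH illustrates; handling such cases requires a word list or morphological information.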

Automation

Collation algorithms (in combination with sorting algorithms) are used in computer programming to place strings in alphabetical order. A standard example is the Unicode Collation Algorithm, which can be used to put strings containing any Unicode symbols into (an extension of) alphabetical order.^[14] It can be made to conform to most of the language-specific conventions described above by tailoring its default collation table. Several such tailorings are collected in the Common Locale Data Repository.
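The Unicode Collation Algorithm compares strings on several levels: base letters first, then accents, then case. The Python sketch below imitates only that general idea with a three-part sort key built from the standard `unicodedata` module. It is not an implementation of the algorithm, it uses no collation table, and the function name `multilevel_key` is hypothetical.

```python
import unicodedata

def multilevel_key(s: str) -> tuple[str, str, str]:
    decomposed = unicodedata.normalize("NFD", s)
    # Level 1: base letters only, case ignored.
    primary = "".join(
        c for c in decomposed if not unicodedata.combining(c)
    ).casefold()
    # Level 2: accents restored (case still ignored), so "role" precedes "rôle".
    secondary = decomposed.casefold()
    # Level 3: case; swapcase() makes lower case sort before upper case.
    tertiary = decomposed.swapcase()
    return (primary, secondary, tertiary)

words = sorted(["rose", "Role", "rôle", "role", "rock"], key=multilevel_key)
# ['rock', 'role', 'Role', 'rôle', 'rose']
```

Accent and case differences only matter here when the base letters are identical, which is the behaviour that language-specific tailorings then adjust.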

Similar orderings

The principle behind alphabetical ordering can still be applied in languages that do not strictly speaking use an alphabet – for example, they may be written using a syllabary or abugida – provided the symbols used have an established ordering.

For logographic writing systems, such as Chinese hanzi or Japanese kanji, the method of radical-and-stroke sorting is frequently used as a way of defining an ordering on the symbols. Japanese sometimes uses pronunciation order, most commonly with the Gojūon order but sometimes with the older Iroha ordering.

In mathematics, lexicographical order is a means of ordering sequences in a manner analogous to that used to produce alphabetical order.^[16]
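Many programming languages compare sequences in exactly this way. In Python, for example, tuples of numbers are ordered lexicographically: elements are compared position by position, and a sequence that is a prefix of another comes first. The sample data below is invented for illustration.

```python
sequences = [(1, 10), (1, 2), (1,), (0, 9, 9)]

ordered = sorted(sequences)
# [(0, 9, 9), (1,), (1, 2), (1, 10)]
```

Note that 2 precedes 10 because the elements are compared as numbers, not as strings of digits.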

Some computer applications use a version of alphabetical order that can be achieved using a very simple algorithm, based purely on the ASCII or Unicode codes for characters. This may have non-standard effects such as placing all capital letters before lower-case ones. See ASCIIbetical order.
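Python's default string comparison is an example of such code-based ordering, which makes the effect easy to demonstrate. The word list below is invented for illustration.

```python
words = ["banana", "Cherry", "apple", "Apple"]

by_code_point = sorted(words)
# ['Apple', 'Cherry', 'apple', 'banana']  (all capitals first)

case_insensitive = sorted(words, key=str.casefold)
# ['apple', 'Apple', 'banana', 'Cherry']
```

In the second result, "apple" precedes "Apple" only because Python's sort is stable and keeps equal keys in their input order.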

A rhyming dictionary is based on sorting words in alphabetical order starting from the last to the first letter of the word.
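This reverse ordering can be obtained by sorting on each word spelled backwards, as in the following Python sketch (the word list is invented for illustration):

```python
words = ["nation", "cat", "station", "hat", "ration"]

# Sort on the reversed spelling so that words with the same ending are adjacent.
rhyming_order = sorted(words, key=lambda w: w[::-1])
# ['nation', 'ration', 'station', 'cat', 'hat']
```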

Notes

^ There is an exception: in the ABC Chinese–English Dictionary the tone order is "zero tone (neutral tone), first tone (flat tone), second tone (rising tone), third tone (falling-rising tone) and fourth tone (falling tone)".

References

^ Reinhard G. Lehmann: "27-30-22-26. How Many Letters Needs an Alphabet? The Case of Semitic", in: The idea of writing: Writing across borders, edited by Alex de Voogt and Joachim Friedrich Quack, Leiden: Brill 2012, pp. 11–52.
^ Street, Julie (10 June 2020). "From A to Z - the surprising history of alphabetical order" (text and audio). ABC News (ABC Radio National). Australian Broadcasting Corporation. Archived from the original on 2 July 2020. Retrieved 6 July 2020.
^ e.g. Psalms 25, 34, 37, 111, 112, 119 and 145 of the Hebrew Bible
^ Daly, Lloyd. Contributions to the History of Alphabetization in Antiquity and the Middle Ages. Brussels, 1967. p. 25.
^ O'Hara, James (1989). "Messapus, Cycnus, and the Alphabetical Order of Vergil's Catalogue of Italian Heroes". Phoenix. 43 (1): 35–38. doi:10.2307/1088539. JSTOR 1088539.
^ LIVRE XI – texte latin – traduction + commentaires. Archived from the original on 9 June 2012. Retrieved 8 May 2012.
^ Gibson, Craig (2002). Interpreting a classic: Demosthenes and his ancient commentators.
^ Rouse, Mary A.; Rouse, Richard M. (1991), "Statim invenire: Schools, Preachers and New Attitudes to the Page", Authentic Witnesses: Approaches to Medieval Texts and Manuscripts, University of Notre Dame Press, pp. 201–219, ISBN 0-268-00622-9
^ Cawdrey, Robert (1604). A Table Alphabeticall. London. p. [A4]v.
^ Coleridge's Letters, No. 507.
^ Tscharntke, Teja; Hochberg, Michael E; Rand, Tatyana A; Resh, Vincent H; Krauss, Jochen (January 2007). "Author Sequence and Credit for Contributions in Multiauthored Publications". PLOS Biol. 5 (1): e18. doi:10.1371/journal.pbio.0050018. PMC 1769438. PMID 17227141.
^ Stevens, Jeffrey R.; Duque, Juan F. (2018). "Order Matters: Alphabetizing In-Text Citations Biases Citation Rates" (PDF). Psychonomic Bulletin & Review. 26 (3): 1020–1026. doi:10.3758/s13423-018-1532-8. PMID 30288671. S2CID 52922399. Archived (PDF) from the original on 10 November 2018. Retrieved 10 November 2018.
- Lay summary in: Colleen Flaherty (22 October 2018). "The Case Against Alphabetical Naming of Authors". Inside Higher Ed.
^ "Arabic Mathematical Alphabetic Symbols" (PDF). The Unicode Standard. Archived (PDF) from the original on 30 October 2022. Retrieved 26 November 2022.
^ "Unicode Technical Standard #10: Unicode collation algorithm". Unicode, Inc. (unicode.org). 20 March 2008. Archived from the original on 27 August 2008. Retrieved 27 August 2008.
^ Midgley, Ralph. "Volapük to English dictionary" (PDF). Archived from the original (PDF) on 1 September 2012. Retrieved 24 September 2019.
^ Franz Baader; Tobias Nipkow (1999). Term Rewriting and All That. Cambridge University Press. pp. 18–19. ISBN 978-0-521-77920-3.