A digraph (from Ancient Greek δίς (dís) 'double' and γράφω (gráphō) 'to write') or digram is a pair of characters used in the orthography of a language to write either a single phoneme (distinct sound), or a sequence of phonemes that does not correspond to the normal values of the two characters combined.

In Welsh, the digraph ⟨ll⟩ fused for a time into a ligature.

Some digraphs represent phonemes that cannot be represented with a single character in the writing system of a language, like ⟨ch⟩ in Spanish chico and ocho. Other digraphs represent phonemes that can also be represented by single characters. A digraph that shares its pronunciation with a single character may be a relic from an earlier period of the language when the digraph had a different pronunciation, or may represent a distinction that is made only in certain dialects, like the English ⟨wh⟩. Some such digraphs are used for purely etymological reasons, like ⟨ph⟩ in French.

In some orthographies, digraphs (and occasionally trigraphs) are considered individual letters, which means that they have their own place in the alphabet and cannot be separated into their constituent graphemes when sorting, abbreviating, or hyphenating words. Digraphs are used in some romanization schemes, e.g. ⟨zh⟩ as a romanisation of Russian ⟨ж⟩.

The capitalisation of digraphs can vary, e.g. ⟨sz⟩ in Polish is capitalized ⟨Sz⟩ and ⟨kj⟩ in Norwegian is capitalized ⟨Kj⟩, while ⟨ij⟩ in Dutch is capitalized ⟨IJ⟩ and word-initial ⟨dt⟩ in Irish is capitalized ⟨dT⟩.
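The capitalisation patterns above can be expressed as a small lookup. The following is a minimal sketch only: the `capitalize_word` helper and the `INITIAL_DIGRAPH_CAPS` table are hypothetical names introduced for illustration, the table covers just the four examples mentioned here, and it assumes the listed digraph always occurs at the start of the word.

```python
# Hypothetical table: (language, word-initial digraph) -> capitalised form.
# Only the four examples mentioned in the text are covered.
INITIAL_DIGRAPH_CAPS = {
    ("polish", "sz"): "Sz",
    ("norwegian", "kj"): "Kj",
    ("dutch", "ij"): "IJ",
    ("irish", "dt"): "dT",
}


def capitalize_word(word: str, language: str) -> str:
    """Capitalise a lowercase word, honouring a listed word-initial digraph."""
    special = INITIAL_DIGRAPH_CAPS.get((language, word[:2]))
    if special is not None:
        return special + word[2:]
    # Fallback: ordinary capitalisation of the first character only.
    return word[:1].upper() + word[1:]


print(capitalize_word("ijmuiden", "dutch"))   # IJmuiden
print(capitalize_word("kjole", "norwegian"))  # Kjole
```

A real implementation would need language-specific rules for when a letter pair is actually functioning as a digraph; this sketch simply trusts the table.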

Digraphs may develop into ligatures, but this is a distinct concept: a ligature involves the graphical fusion of two characters into one, e.g. when ⟨o⟩ and ⟨e⟩ become ⟨œ⟩, as in French cœur "heart".

Double letters

Digraphs may consist of two different characters (heterogeneous digraphs) or two instances of the same character (homogeneous digraphs). In the latter case, they are generally called double (or doubled) letters.

Doubled vowel letters are commonly used to indicate a long vowel sound. This is the case in Finnish and Estonian, for instance, where ⟨uu⟩ represents a longer version of the vowel denoted by ⟨u⟩, ⟨ää⟩ represents a longer version of the vowel denoted by ⟨ä⟩, and so on. In Middle English, the sequences ⟨ee⟩ and ⟨oo⟩ were used in a similar way, to represent lengthened "e" and "o" sounds respectively; both spellings have been retained in modern English orthography, but the Great Vowel Shift and other historical sound changes mean that the modern pronunciations are quite different from the original ones.

Doubled consonant letters can also be used to indicate a long or geminated consonant sound. In Italian, for example, consonants written double are pronounced longer than single ones. This was the original use of doubled consonant letters in Old English, but during the Middle English and Early Modern English period, phonemic consonant length was lost and a spelling convention developed in which a doubled consonant serves to indicate that a preceding vowel is to be pronounced short. In modern English, for example, the ⟨pp⟩ of tapping differentiates the first vowel sound from that of taping. In rare cases, doubled consonant letters represent a true geminate consonant in modern English; this may occur when two instances of the same consonant come from different morphemes, for example ⟨nn⟩ in unnatural (un + natural) or ⟨tt⟩ in cattail (cat + tail).

In some cases, the sound represented by a doubled consonant letter is distinguished in some other way than length from the sound of the corresponding single consonant letter:

  • In Welsh and Greenlandic, ⟨ll⟩ stands for a voiceless lateral consonant, while in Spanish and Catalan it stands for a palatal consonant.
  • In several languages of western Europe, including English, French, Portuguese and Catalan, the digraph ⟨ss⟩ is used between vowels to represent the voiceless sibilant /s/, since an ⟨s⟩ alone between vowels normally represents the voiced sibilant /z/.
  • In Spanish, Portuguese, Catalan and Basque, ⟨rr⟩ is used between vowels for the alveolar trill /r/, since an ⟨r⟩ alone between vowels represents an alveolar flap /ɾ/ (the two are different phonemes in those languages).
  • In Spanish, the digraph ⟨nn⟩ formerly indicated /ɲ/ (a palatal nasal); it developed into the letter ñ.
  • In Basque, double consonant letters generally mark palatalized versions of the single consonant letter, as in ⟨dd⟩, ⟨ll⟩, ⟨tt⟩. However, ⟨rr⟩ is a trill that contrasts with the single-letter flap, as in Spanish, and the palatal version of ⟨n⟩ is written ⟨ñ⟩.

In several European writing systems, including the English one, the doubling of the letter ⟨c⟩ or ⟨k⟩ is represented as the heterogeneous digraph ⟨ck⟩ instead of ⟨cc⟩ or ⟨kk⟩ respectively. In native German words, the doubling of ⟨z⟩, which corresponds to /ts/, is replaced by the digraph ⟨tz⟩.

Pan-dialectical digraphs

Some languages have a unified orthography with digraphs that represent distinct pronunciations in different dialects (diaphonemes). For example, in Breton there is a digraph ⟨zh⟩ that represents [z] in most dialects, but [h] in Vannetais. Similarly, the Saintongeais dialect of French has a digraph ⟨jh⟩ that represents [h] in words that correspond to [ʒ] in standard French. Likewise, Catalan has a digraph ⟨ix⟩ that represents [ʃ] in Eastern Catalan, but [jʃ] or [js] in Western Catalan/Valencian.

Split digraphs

The pair of letters representing a phoneme is not always adjacent. This is the case with English silent e. For example, the sequence ⟨a_e⟩ has the sound /eɪ/ in English cake. This is the result of three historical sound changes: cake was originally /kakə/, the open syllable /ka/ came to be pronounced with a long vowel, and later the final schwa dropped off, leaving /kaːk/. Later still, the vowel /aː/ became /eɪ/. There are six such digraphs in English, ⟨a_e, e_e, i_e, o_e, u_e, y_e⟩.[1]

However, alphabets may also be designed with discontinuous digraphs. In the Tatar Cyrillic alphabet, for example, the letter ю is used to write both /ju/ and /jy/. Usually the difference is evident from the rest of the word, but when it is not, the sequence ю...ь is used for /jy/, as in юнь /jyn/ 'cheap'.

The Indic alphabets are distinctive for their discontinuous vowels, such as Thai เ...อ /ɤː/ in เกอ /kɤː/. Technically, however, they may be considered diacritics, not full letters; whether they are digraphs is thus a matter of definition.
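As a rough illustration of the six English split digraphs, the spelling pattern "vowel letter, one consonant letter, final e" can be matched with a regular expression. This is only a sketch under stated assumptions: `find_split_digraph` is a hypothetical helper, it looks at spelling alone, and it will also match words such as have, where the letters fit the pattern but the vowel is not pronounced long.

```python
import re

# One of a, e, i, o, u, y, then exactly one consonant letter, then a final "e".
# This is a spelling heuristic only; it knows nothing about pronunciation.
SPLIT_DIGRAPH = re.compile(r"([aeiouy])([b-df-hj-np-tv-xz])e$")


def find_split_digraph(word: str):
    """Return a label such as 'a_e' if the word ends in vowel + consonant + e."""
    match = SPLIT_DIGRAPH.search(word.lower())
    if match is None:
        return None
    return f"{match.group(1)}_e"


for word in ("cake", "theme", "type", "cat"):
    print(word, find_split_digraph(word))
```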

Ambiguous letter sequences

Some letter pairs should not be interpreted as digraphs but appear because of compounding: hogshead and cooperate. They are often not marked in any way and so must be memorized as exceptions. Some authors, however, indicate the boundary either by breaking up the apparent digraph with a hyphen, as in hogs-head, co-operate, or with a trema mark, as in coöperate, but the use of the diaeresis has declined in English within the last century. When such a sequence occurs in names such as Clapham, Townshend, and Hartshorne, it is never marked in any way. Positional alternative glyphs may help to disambiguate in certain cases: when round ⟨s⟩ was used as a final variant of long ⟨ſ⟩, the English digraph for /ʃ/ would always be ⟨ſh⟩.

In romanization of Japanese, the constituent sounds (morae) are usually indicated by digraphs, but some are indicated by a single letter, and some with a trigraph. The case of ambiguity is the syllabic ん, which is written as n (or sometimes m), except before vowels or y, where it is followed by an apostrophe as n’. For example, the given name じゅんいちろう is romanized as Jun’ichirō, so that it is parsed as "Jun-i-chi-rou", rather than as "Ju-ni-chi-rou". A similar use of the apostrophe is seen in pinyin, where 嫦娥 is written Chang'e because the g belongs to the final (-ang) of the first syllable, not to the initial of the second syllable. Without the apostrophe, Change would be understood as the syllable chan (final -an) followed by the syllable ge (initial g-).
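The apostrophe rule for the Japanese syllabic n can be sketched as follows. This is a simplified illustration, not a full romanizer: `join_morae` is a hypothetical helper that takes morae already romanized one by one (with the syllabic n given as a bare "n"), and it uses the ASCII apostrophe and unmarked long vowels.

```python
VOWELS_AND_Y = set("aiueoy")


def join_morae(morae):
    """Join romanized morae, adding an apostrophe after syllabic n when needed."""
    pieces = []
    for index, mora in enumerate(morae):
        pieces.append(mora)
        next_mora = morae[index + 1] if index + 1 < len(morae) else ""
        # A bare "n" followed by a vowel or y would otherwise be misread
        # as the start of na, ni, nu, ne, no, nya, and so on.
        if mora == "n" and next_mora[:1] in VOWELS_AND_Y:
            pieces.append("'")
    return "".join(pieces)


print(join_morae(["ju", "n", "i", "chi", "ro", "u"]))  # jun'ichirou
print(join_morae(["ju", "ni", "chi", "ro", "u"]))      # junichirou
```

The two calls differ only in how the input is divided into morae, which is exactly the distinction the apostrophe preserves in the written form.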

In alphabetization

In some languages, certain digraphs and trigraphs are counted as distinct letters in themselves, and assigned to a specific place in the alphabet, separate from that of the sequence of characters that composes them, for purposes of orthography and collation:

  • In Gaj's Latin alphabet used to write Serbo-Croatian, the digraphs dž, lj and nj, which correspond to the single Cyrillic letters ⟨џ⟩, ⟨љ⟩, ⟨њ⟩, are treated as distinct letters.
  • In the Czech and Slovak alphabets, ch is treated as a distinct letter, coming after h in the alphabet. Also, in the Slovak alphabet the relatively rare digraphs dz and dž are treated as distinct letters.
  • In the Danish and Norwegian alphabet, the former digraph aa, where it appears in older names, is sorted as if it were the letter å, which replaced it.
  • In the Norwegian alphabet, there are several digraphs and letter combinations representing an isolated sound.
  • In the Dutch alphabet, the digraph ij is sometimes written as a ligature and may be sorted with y (in the Netherlands, though not usually in Belgium); however, regardless of where it is used, when a Dutch word starting with ⟨ij⟩ is capitalized, the entire digraph is capitalized (IJmeer, IJmuiden). Other Dutch digraphs are never treated as single letters.
  • In Hungarian, the digraphs cs, dz, gy, ly, ny, sz, ty, zs, and the trigraph dzs, have their own places in the alphabet (where e.g. ⟨ny⟩ comes right after ⟨n⟩).
  • In Spanish, the digraphs ch and ll were formerly treated as distinct letters, but are now split into their constituent letters.
  • In Welsh, the alphabet includes the digraphs ch, dd, ff, ll, ng, ph, rh, th. However, mh, nh and ngh, which represent mutated voiceless consonants, are not treated as distinct letters.
  • In the romanization of languages of several Slavic countries that use the Cyrillic script, letters like ш, ж, and ю might be written as sh, zh and yu; some romanization schemes, however, use a letter with a diacritic instead of a digraph.
  • In Maltese, two digraphs are used: għ, which comes right after ⟨g⟩, and ie, which comes right after ⟨i⟩.

Most other languages, including most of the Romance languages, treat digraphs as combinations of separate letters for alphabetization purposes.
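Treating a digraph as its own letter changes sort order. The sketch below illustrates this with a simplified Czech-like alphabet in which ch follows h. It is an assumption-laden toy: `LETTERS`, `tokenize` and `sort_key` are hypothetical names, letters with diacritics are omitted, and the greedy longest-match tokenizer cannot tell a true digraph from a chance letter sequence across a compound boundary.

```python
# Simplified Czech-like ordering: plain a-z with "ch" inserted after "h".
# Letters with diacritics are left out to keep the sketch short.
LETTERS = list("abcdefgh") + ["ch"] + list("ijklmnopqrstuvwxyz")
RANK = {letter: position for position, letter in enumerate(LETTERS)}
MAX_LEN = max(len(letter) for letter in LETTERS)


def tokenize(word):
    """Split a word into alphabet letters, preferring the longest match."""
    tokens, i = [], 0
    while i < len(word):
        for size in range(MAX_LEN, 0, -1):
            chunk = word[i:i + size]
            if chunk in RANK:
                tokens.append(chunk)
                i += len(chunk)
                break
        else:
            raise ValueError(f"no letter matches at position {i} in {word!r}")
    return tokens


def sort_key(word):
    """Map a word to the list of alphabet positions of its letters."""
    return [RANK[token] for token in tokenize(word.lower())]


words = ["chata", "cihla", "hrad", "jelen"]
print(sorted(words))                # ['chata', 'cihla', 'hrad', 'jelen']
print(sorted(words, key=sort_key))  # ['cihla', 'hrad', 'chata', 'jelen']
```

Production collation is normally handled by locale-aware libraries rather than a hand-written table; the point here is only that the digraph is compared as one unit.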

Examples

Latin script

English

English has both homogeneous digraphs (doubled letters) and heterogeneous digraphs (digraphs consisting of two different letters). Those of the latter type include the following:

Digraphs may also be composed of vowels. Some letters ⟨a, e, o⟩ are preferred for the first position, others for the second ⟨i, u⟩. The latter have allographs ⟨y, w⟩ in English orthography.

English vocalic digraphs

| first letter ↓ / second letter → | ⟨...e⟩ | ⟨...i⟩ ¦ ⟨...y⟩ | ⟨...u⟩ ¦ ⟨...w⟩ | ⟨...a⟩ | ⟨...o⟩ |
|---|---|---|---|---|---|
| ⟨o...⟩ | ⟨oe¦œ⟩ > ⟨e⟩ /i/ | ⟨oi¦oy⟩ /ɔɪ/ | ⟨ou¦ow⟩ /aʊ¦uː¦oʊ/ | ⟨oa⟩ /oʊ¦ɔː/ | ⟨oo⟩ /uː¦ʊ(¦ʌ)/ |
| ⟨a...⟩ | ⟨ae¦æ⟩ > ⟨e⟩ /i/ | ⟨ai¦ay⟩ /eɪ¦ɛ/ | ⟨au¦aw⟩ /ɔː/ (in loanwords: /aʊ/) | (in loanwords and proper nouns: ⟨aa⟩ /ə¦ɔː¦ɔl/) | (in loanwords from Chinese: ⟨ao⟩ /aʊ/) |
| ⟨e...⟩ | ⟨ee⟩ /iː/ | ⟨ei¦ey⟩ /aɪ¦eɪ¦(iː)/ | ⟨eu¦ew⟩ /juː¦uː/ | ⟨ea⟩ /iː¦ɛ¦(eɪ¦ɪə)/ | |
| ⟨u...⟩ | ⟨ue⟩ /uː¦u/ | ⟨ui⟩ /ɪ¦uː/ | | | |
| ⟨i...⟩ | ⟨ie⟩ /iː(¦aɪ)/ | | | | |

Other languages using the Latin alphabet

In Serbo-Croatian:

Note that in the Cyrillic orthography, those sounds are represented by single letters (љ, њ, џ).

In Czech and Slovak:

In Danish and Norwegian:

  • The digraph aa represented /ɔ/ until 1917 in Norway and 1948 in Denmark, but is today spelt å. The digraph is still used in older names, but sorted as if it were the letter with the diacritic mark.

In Norwegian, several sounds can be represented only by a digraph or a combination of letters. They are the most common combinations, but extreme regional differences exist, especially those of the eastern dialects. A noteworthy difference is the aspiration of ⟨rs⟩ in eastern dialects, where it corresponds to ⟨skj⟩ and ⟨sj⟩. Among many young people, especially in the western regions of Norway and in or around the major cities, the difference between /ç/ and /ʃ/ has been lost completely, and the two are now pronounced the same.

  • ⟨kj⟩ represents /ç/.
  • ⟨tj⟩ represents /ç/.
  • ⟨skj⟩ represents /ʃ/.
  • ⟨sj⟩ represents /ʃ/.
  • ⟨sk⟩ represents /ʃ/ (before i or y).
  • ⟨ng⟩ represents /ŋ/ as in ng in English thing.

In Catalan:

In Dutch:

In French:

French vocalic digraphs

| | ⟨...i⟩ | ⟨...u⟩ |
|---|---|---|
| ⟨a...⟩ | ⟨ai⟩ /ɛ¦e/ | ⟨au⟩ /o/ |
| ⟨e...⟩ | ⟨ei⟩ /ɛ/ | ⟨eu⟩ /œ¦ø/ |
| ⟨o...⟩ | ⟨oi⟩ /wa/ | ⟨ou⟩ /u(¦w)/ |

See also French phonology.

In German:

In Hungarian:

In Italian:

In Manx Gaelic, ⟨ch⟩ represents /χ/, but ⟨çh⟩ represents /tʃ/.

In Polish:

In Portuguese:

In Spanish:

  • ⟨ll⟩ is traditionally pronounced /ʎ/, but in dialects with yeísmo is pronounced /ʝ/.
  • ⟨ch⟩ represents /tʃ/ (voiceless postalveolar affricate). Since 2010, neither is considered part of the alphabet. They used to be sorted as separate letters, but a reform in 1994 by the Spanish Royal Academy has allowed that they be split into their constituent letters for collation. The digraph ⟨rr⟩, pronounced as a distinct alveolar trill, was never officially considered to be a letter in the Spanish alphabet, and the same is true of ⟨gu⟩ and ⟨qu⟩ (for /ɡ/ and /k/ respectively before ⟨e⟩ or ⟨i⟩).

In Welsh:

The digraphs listed above represent distinct phonemes and are treated as separate letters for collation purposes. On the other hand, the digraphs mh, nh, and the trigraph ngh, which stand for voiceless consonants but occur only at the beginning of words as a result of the nasal mutation, are not treated as separate letters, and thus are not included in the alphabet.

Daighi tongiong pingim, a transcription system used for Taiwanese Hokkien, includes ⟨or⟩ that represents /ə/ (mid central vowel) or /o/ (close-mid back rounded vowel), as well as other digraphs.

In Yoruba, ⟨gb⟩ is a letter that represents a plosive most accurately pronounced by trying to say /g/ and /b/ at the same time.

Cyrillic

Modern Slavic languages written in the Cyrillic alphabet make little use of digraphs apart from ⟨дж⟩ for /dʐ/, ⟨дз⟩ for /dz/ (in Ukrainian, Belarusian, and Bulgarian), and ⟨жж⟩ and ⟨зж⟩ for the uncommon Russian phoneme /ʑː/. In Russian, the sequences ⟨дж⟩ and ⟨дз⟩ do occur (mainly in loanwords) but are pronounced as combinations of an implosive (sometimes treated as an affricate) and a fricative; implosives are treated as allophones of the plosive /d̪/ and so those sequences are not considered to be digraphs. Cyrillic has few digraphs unless it is used to write non-Slavic languages, especially Caucasian languages.

Arabic script

Because vowels are not generally written, digraphs are rare in abjads like Arabic. For example, if sh were used for š, then the sequence sh could mean either ša or saha. However, digraphs are used for the aspirated and murmured consonants (those spelled with h-digraphs in Latin transcription) in languages of South Asia such as Urdu that are written in the Arabic script by a special form of the letter h, which is used only for aspiration digraphs, as can be seen with the following connecting (kh) and non-connecting (ḍh) consonants:

| Urdu | connecting | non-connecting |
|---|---|---|
| digraph: | کھا /kʰɑː/ | ڈھا /ɖʱɑː/ |
| sequence: | کہا /kəɦɑː/ | ڈہا /ɖəɦɑː/ |
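In Unicode, the special aspiration form of h and the ordinary h are separate characters, so the digraph and the plain sequence differ at the code-point level. The sketch below assumes the table's words are spelled with U+06BE (heh doachashmee) for the digraph and U+06C1 (heh goal) for the sequence, and builds the strings from explicit escapes rather than copying them from the table; `describe` is a hypothetical helper.

```python
import unicodedata

# Strings are built from explicit code points so the letters are unambiguous.
KAF, ALEF = "\u06A9", "\u0627"
HEH_DOACHASHMEE = "\u06BE"  # assumed: the h form used for aspiration digraphs
HEH_GOAL = "\u06C1"         # assumed: the ordinary Urdu h


def describe(text):
    """List the Unicode character names that make up a string."""
    return [unicodedata.name(ch) for ch in text]


digraph_word = KAF + HEH_DOACHASHMEE + ALEF   # aspirated kh + long a
sequence_word = KAF + HEH_GOAL + ALEF         # k, then h, then long a

print(describe(digraph_word))
print(describe(sequence_word))
```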

Armenian

In the Armenian language, the digraph ու ⟨ou⟩ transcribes /u/, a convention that comes from Greek.

Georgian

The Georgian alphabet uses a few digraphs to write other languages. For example, in Svan, /ø/ is written ჳე ⟨we⟩, and /y/ as ჳი ⟨wi⟩.

Greek

Modern Greek has the following digraphs:

  • αι (ai) represents /e̞/
  • ει (ei) represents /i/
  • οι (oi) represents /i/
  • ου (oy) represents /u/
  • υι (yi) represents /i/

They are called "diphthongs" in Greek; in classical times, most of them represented diphthongs, and the name has stuck.

  • γγ (gg) represents /ŋɡ/ or /ɡ/
  • τσ (ts) represents the affricate /ts/
  • τζ (tz) represents the affricate /dz/
  • Initial γκ (gk) represents /ɡ/
  • Initial μπ (mp) represents /b/
  • Initial ντ (nt) represents /d/

Ancient Greek also had the "diphthongs" listed above, although their pronunciation in ancient times is disputed. In addition, Ancient Greek used the letter γ combined with a velar stop to produce the following digraphs:

  • γγ (gg) represents /ŋɡ/
  • γκ (gk) represents /ŋk/
  • γχ (gkh) represents /ŋkʰ/

Tsakonian has a few additional digraphs:

  • ρζ (rz) represents /ʒ/ (historically perhaps a fricative trill)
  • κχ (kkh) represents /kʰ/
  • τθ (tth) represents /tʰ/
  • πφ (pph) represents /pʰ/
  • σχ (skh) represents /ʃ/

In addition, palatal consonants are indicated with the vowel letter ι, which is, however, largely predictable. When /n/ and /l/ are not palatalized before ι, they are written νν and λλ.

In Bactrian, the digraphs ββ, δδ, and γγ were used for /b/, /d/, and /ŋg/ respectively.

Hebrew

In the Hebrew alphabet, תס and תש may sometimes be found for צ /ts/. Modern Hebrew also uses digraphs made with the ׳ symbol for non-native sounds: ג׳ /dʒ/, ז׳ /ʒ/, צ׳ /tʃ/; and other digraphs of letters when it is written without vowels: וו for a consonantal letter ו in the middle of a word, and יי for /aj/ or /aji/, etc., that is, a consonantal letter י in places where it might not have been expected. Yiddish has its own tradition of transcription and so uses different digraphs for some of the same sounds: דז /dz/, זש /ʒ/, טש /tʃ/, and דזש (literally dzš) for /dʒ/; וו /v/, also available as a single Unicode character װ; וי, or as a single character in Unicode ױ, for /oj/; יי or ײ for /ej/; and ײַ for /aj/. The single-character digraphs are called "ligatures" in Unicode. י may also be used following a consonant to indicate palatalization in Slavic loanwords.

Indic

Most Indic scripts have compound vowel diacritics that cannot be predicted from their individual elements. That can be illustrated with Thai, in which the diacritic เ, pronounced alone /eː/, modifies the pronunciation of other vowels:

single vowel sign: กา /kaː/, เก /keː/, กอ /kɔː/
vowel sign plus เ: เกา /kaw/, แก /kɛː/, เกอ /kɤː/

In addition, the combination รร is pronounced /a/ or /an/; there are some words in which the combinations ทร and ศร stand for /s/; and the letter ห, as a prefix to a consonant, changes its tonic class to high, modifying the tone of the syllable.

Inuit

Inuktitut syllabics adds two digraphs to Cree:

rk for q: ᙯ qai, ᕿ qi, ᖁ qu, ᖃ qa, ᖅ q

and

ng for ŋ: ᖕ ng

The latter forms trigraphs and tetragraphs.

CJK Characters

Chinese

Several combinations of Chinese characters (Hanzi), formed from two or more different characters, are known as digraphs.

Japanese

Two kana may be combined into a CV syllable by subscripting the second; the convention cancels the vowel of the first. That is commonly done for CyV syllables called yōon, as in ひょ hyo ⟨hiyo⟩. They are not digraphs since they retain the normal sequential reading of the two glyphs. However, some obsolete sequences no longer retain that reading, as in くゎ kwa, ぐゎ gwa, and むゎ mwa, now pronounced ka, ga, ma. In addition, non-sequenceable digraphs are used for foreign loans that do not follow normal Japanese assibilation patterns, such as ティ ti, トゥ tu, チェ tye / che, スェ swe, ウィ wi, ツォ tso, ズィ zi. (See katakana and transcription into Japanese for complete tables.)

Long vowels are written by adding the kana for that vowel, in effect doubling it. However, long ō may be written either oo or ou, as in とうきょう toukyou [toːkʲoː] 'Tōkyō'. For dialects that do not distinguish ē and ei, the latter spelling is used for a long e, as in へいせい heisei [heːseː] 'Heisei'. In loanwords, long vowels are written with the chōonpu, a line following the direction of the text, as in ビール bīru [biːru] 'beer'. With the exception of syllables starting with n, doubled consonant sounds are written by prefixing a smaller version of tsu (written っ and ッ in hiragana and katakana respectively), as in きって kitte 'stamp'. Syllables beginning with n use the kana n character (written ん or ン) as a prefix instead.

There are several conventions of Okinawan kana that involve subscript digraphs or ligatures. For instance, in the University of the Ryukyus' system, ウ is /ʔu/, ヲ is /o/, but ヲゥ is /u/.

Korean

As was the case in Greek, Korean has vowels descended from diphthongs that are still written with two letters. Those digraphs, ㅐ /ɛ/ and ㅔ /e/ (also ㅒ /jɛ/, ㅖ /je/), and in some dialects ㅚ /ø/ and ㅟ /y/, all end in historical ㅣ /i/.

Hangul was designed with a digraph series to represent the "muddy" consonants: ㅃ *[b], ㄸ *[d], ㅉ *[dz], ㄲ *[ɡ], ㅆ *[z], ㆅ *[ɣ]; also ᅇ, with an uncertain value. Those values are now obsolete, but most of the doubled letters were resurrected in the 19th century to write consonants that did not exist when hangul was devised: ㅃ /p͈/, ㄸ /t͈/, ㅉ /t͈ɕ/, ㄲ /k͈/, ㅆ /s͈/.

Ligatures and new letters

Digraphs sometimes come to be written as a single ligature. Over time, the ligatures may evolve into new letters or letters with diacritics. For example, ⟨sz⟩ became ⟨ß⟩ in German, and ⟨nn⟩ became ⟨ñ⟩ in Spanish.

In Unicode

Generally, a digraph is simply represented using two characters in Unicode.[2] However, for various reasons, Unicode sometimes provides a separate code point for a digraph, encoded as a single character.

The DZ and IJ digraphs and the Serbian/Croatian digraphs DŽ, LJ, and NJ have separate code points in Unicode.

| Two glyphs | Digraph | Unicode code point | HTML |
|---|---|---|---|
| DZ, Dz, dz | Ǳ, ǲ, ǳ | U+01F1 U+01F2 U+01F3 | &#x1F1; &#x1F2; &#x1F3; |
| DŽ, Dž, dž | Ǆ, ǅ, ǆ | U+01C4 U+01C5 U+01C6 | &#x1C4; &#x1C5; &#x1C6; |
| IJ, ij | Ĳ, ĳ | U+0132 U+0133 | &#x132; &#x133; |
| LJ, Lj, lj | Ǉ, ǈ, ǉ | U+01C7 U+01C8 U+01C9 | &#x1C7; &#x1C8; &#x1C9; |
| NJ, Nj, nj | Ǌ, ǋ, ǌ | U+01CA U+01CB U+01CC | &#x1CA; &#x1CB; &#x1CC; |
| th | ᵺ | U+1D7A[3] | |
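The relationship between the single-character digraphs and their two-letter spellings can be inspected with Python's standard `unicodedata` module. The sketch below is illustrative: `expand_digraph` is a hypothetical helper, and it relies on the compatibility decompositions that the Unicode Character Database defines for the DZ, DŽ, IJ, LJ and NJ characters (it makes no claim about U+1D7A).

```python
import unicodedata


def expand_digraph(ch: str) -> str:
    """Return the compatibility (NFKC) expansion of a single character."""
    return unicodedata.normalize("NFKC", ch)


for code_point in (0x01F1, 0x01C5, 0x0133, 0x01C9, 0x01CC):
    ch = chr(code_point)
    print(f"U+{code_point:04X} {unicodedata.name(ch)} -> {expand_digraph(ch)!r}")
```

Because NFKC replaces these characters with ordinary letter pairs, text that has passed through compatibility normalization no longer distinguishes the single-character digraph from the two-letter spelling.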

See also Ligatures in Unicode.

References

  1. Brooks (2015). Dictionary of the British English Spelling System, p. 460ff.
  2. "FAQ – Ligatures, Digraphs and Presentation Forms". The Unicode Consortium: Home Page. Unicode Inc. 1991–2009. Retrieved 2009-05-11.
  3. "The Unicode Standard, Version 15.1" (PDF). Unicode. Retrieved 2023-12-20.