User:Robbiemuffin/L2 Presentation Patterns/Sounds and Writing/Writing system

Part 1: Syllabaries
One might expect a syllabary to have thousands of characters; in fact, Chinese has some 47,000. But native speakers need only about 4,000; the rest are specialized, limited-use terms. There are also morphographic mappings among the characters, so that a native speaker who is familiar with the basic symbols can approximate what an unknown related symbol would be. Even 4,000 is a high-water mark among the syllabary languages: katakana has only 142 characters in use, of which only 103 are for non-loanwords. In hiragana, only 69 make up the primary-school table, and only about 46 are used in introductory texts for students of the language.

Looking at these numbers (just over a hundred for the Japanese syllabaries, and still only 47,000 for Chinese), one notices a common thread among syllabaries: a restricted use of syllables. (Compare the more than 1.6 million possible syllables in English.) This restriction comes either from the isolating, analytic nature of the language or from the dominance of consonant-vowel pairing, which is often characterized by rapid, machine-gun-like speech (Spanish is a non-syllabary language with this character). There are very many syllables and very many exceptions, but there is a small core of simple, atomic syllables, in much the same way that the simplified alphabet below would serve almost as a drop-in system for English speakers.

Japanese as a whole makes a very sticky example. It is the most complicated written form of language, commonly wedding three distinct and complete writing systems (one of which, kanji, is basically a ligaturization of all of Chinese, in the sense that it comprises some 50,000 symbols that show morphological divergence, of which only 1,000 are often used), along with liberal and frequent transliteration (romaji) and the borrowing of loanwords in their own native writing systems.

The English alphabet
As used in modern English, the Latin alphabet consists of the following 26 characters: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

In addition, the ligatures Æ (A with E, e.g. "encyclopædia") and Œ (O with E, e.g. "cœlom") may optionally be used in words derived from Latin or Greek. The diaeresis mark is sometimes placed on a letter, for example on the second o in "coöperate", to indicate that oo is pronounced as two distinct vowels rather than one long one.

Outside of professional papers on specific subjects that traditionally use ligatures in loanwords, however, ligatures and diaereses are seldom used in modern English. Also, any letter from the extended Latin alphabet (that is, the Latin alphabet as used across all the languages that employ it) will sometimes appear when a word is, or is used as, a loanword, such as naïve.

Simplifying
The extended Latin alphabet has a good many more letters than this: it comprises 53 distinct alphabets, some with diacritics (naïve) and others with ligatures (beißen). There is no one-to-one correspondence between phonemes and letters: there are about 50 distinct sounds in English alone, or about two for every letter (though they are not distributed quite that evenly). And just as we group sounds together under letters, other languages make their own groupings. In Japanese, "l" and "r" are really the same sound; it is English speakers who draw the distinction. Likewise, to English speakers the Spanish elle "LL" is (in a neutral accent) really just the same sound as our "y", and the Icelandic eth "ð" is really just "th". It therefore makes sense to group letters into abstracted families when transferring from one language to another.
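The idea of grouping extended letters into families can be sketched in code. The following is a minimal Python illustration, not a complete transliteration scheme: the function name `fold_to_basic_latin` and the small table `LIGATURE_FAMILIES` are assumptions made for this example, and the table covers only the ligatures and letters mentioned in this module.

```python
import unicodedata

# Hypothetical hand-picked table (an assumption for this sketch): letters that
# Unicode does not decompose, mapped to a basic-letter family by hand.
LIGATURE_FAMILIES = {
    "æ": "ae", "Æ": "AE",
    "œ": "oe", "Œ": "OE",
    "ß": "ss",
    "ð": "th", "Ð": "Th",
}

def fold_to_basic_latin(word: str) -> str:
    """Group extended Latin letters into plain English letter families."""
    # Step 1: replace ligatures and special letters using the table above.
    expanded = "".join(LIGATURE_FAMILIES.get(ch, ch) for ch in word)
    # Step 2: split accented letters into base letter + combining mark (NFD),
    # then drop the combining marks so that "ï" falls into the "i" family.
    decomposed = unicodedata.normalize("NFD", expanded)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(fold_to_basic_latin("naïve"))         # naive
print(fold_to_basic_latin("beißen"))        # beissen
print(fold_to_basic_latin("encyclopædia"))  # encyclopaedia
```

A learner coming from a different writing system would likely want a different table; the point is only that each extended letter is assigned to a family of simpler letters.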

The letters can be grouped together in more than one way, and someone learning English is likely to choose a grouping that maps well onto their own writing system. A transfer from English to English corresponds to how we group letters when teaching our own children, something like the following:

'' * Though not officially ligatures, these characters as used in English are in fact single glyphs that each represent a combination of two distinct letters: the "X" stands for the digraph "ks", and the "Q" for the digraph "ku". In the case of Q, of course, the trailing "u" is almost always explicit in the spelling of the word. ''
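The note on "X" and "Q" can be made concrete with a short Python sketch. The function name `expand_digraph_letters` is hypothetical, and the handling of an explicit trailing "u" after "q" is one possible reading of the note, not a rule stated elsewhere in this module.

```python
def expand_digraph_letters(word: str) -> str:
    """Spell out the two-letter combinations hidden in "x" and "q"."""
    lower = word.lower()
    out = []
    for i, ch in enumerate(lower):
        if ch == "x":
            out.append("ks")
        elif ch == "q":
            # The "u" is usually already written after "q"; in that case
            # emit only "k" so the "u" is not doubled (assumed behaviour).
            next_is_u = i + 1 < len(lower) and lower[i + 1] == "u"
            out.append("k" if next_is_u else "ku")
        else:
            out.append(ch)
    return "".join(out)

print(expand_digraph_letters("box"))    # boks
print(expand_digraph_letters("queen"))  # kueen
```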

Of particular interest to technicians (and probably very natural to mothers, family and teachers of young children) is the use of "C" instead of "k" and "E" in place of "i". This sort of selection reflects the letter-choice bias of English; if the target language used another Latin alphabet, historic symbols like "k" would likely be a better choice. To review what we just did: we took our alphabet in a standard, complete form and abstracted out the simple letters of our language.

Mapping

 * This reads, for example: the J is like the G, the W is like UI, et cetera.
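A mapping of this kind is naturally stored as a lookup table. The Python sketch below includes only the two pairs named in the note above (J to G, W to UI); the names `LETTER_MAPPING` and `map_to_simplified` are hypothetical, and every letter without an entry is passed through unchanged as an assumption of this example.

```python
# Only the two pairs given in the note above; the rest of the mapping
# is not reproduced here, so other letters are left as they are.
LETTER_MAPPING = {
    "J": "G",
    "W": "UI",
}

def map_to_simplified(word: str) -> str:
    """Rewrite a word using the simplified letters from the mapping."""
    return "".join(LETTER_MAPPING.get(ch, ch) for ch in word.upper())

print(map_to_simplified("jaw"))  # GAUI
```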

Recap
In this module we detailed how to generate the content to use with the writing system presentation pattern: