If we could add one additional consonant letter to English orthography, it should be TH, which conveniently already has candidate characters: thorn (þ) or Greek theta (θ). If we could add another, it should be NG, which also already has a character: eng (ŋ), an n with a hook. Further consonant digraphs are listed below, with frequency relative to TH, though they may have been formed by merging the final and initial consonants of adjacent syllables. The high frequency of LL is due to contractions of pronoun + will.
Consider recomputing English Huffman 0 with the two extra consonants.
1.00 th ; 0.34 ng ; 0.30 st ; 0.27 nd ; 0.24 nt ; 0.22 ll ; 0.14 wh ; 0.13 ch ; 0.11 pr ; 0.11 ns ; 0.10 ss ; 0.10 ct ; 0.10 rs ; 0.10 rt ; 0.09 ld ; 0.09 tr ; 0.08 pl ; 0.08 ts ; 0.08 gh ; 0.08 sh ; 0.08 nc ; 0.07 bl ; 0.06 tt ; 0.06 ck ; 0.05 rd ; 0.05 fr ; 0.05 mp ; 0.05 cl ; 0.05 ht ; 0.05 nk ; 0.05 sp ; 0.05 ff ; 0.04 pp ; 0.04 gr ; 0.04 cr ; 0.04 rr ; 0.04 ls ; 0.04 sc ; 0.03 rn ; 0.03 ds ; 0.03 rm ; 0.03 tl ; 0.03 rk ; 0.03 mm ; 0.03 nn ; 0.03 rc ; 0.03 mb
Counting only consonant clusters at the beginnings or ends of words, thereby avoiding central consonant clusters formed by adjacent syllables:
1.00 th ; 0.31 ng ; 0.21 st ; 0.20 nd ; 0.14 wh ; 0.12 ll ; 0.11 nt ; 0.11 ch ; 0.10 pr ; 0.08 ts ; 0.08 ld ; 0.07 sh ; 0.06 rs ; 0.06 ns ; 0.05 fr ; 0.05 ss ; 0.04 ht ; 0.04 tr ; 0.04 nk ; 0.03 rt ; 0.03 ck ; 0.03 ds ; 0.03 gh ; 0.03 cl ; 0.03 rd ; 0.03 sp ; 0.03 ct ; 0.03 sn ; 0.03 pl
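A count of this kind could be reproduced roughly as follows. This is only a sketch, not the method actually used for the lists above: the function names `digraph_counts` and `relative_to` are made up for illustration, Y is assumed to be a vowel, every adjacent pair of consonant letters is counted (so "str" contributes both ST and TR), and words are split on anything that is not a letter, which means the "ll" of "I'll" counts as a word of its own.

```python
import re
from collections import Counter

# Assumption: y is treated as a vowel; the original lists may have differed.
CONSONANTS = set("bcdfghjklmnpqrstvwxz")


def digraph_counts(text, edges_only=False):
    """Count adjacent consonant-letter pairs in text.

    With edges_only=True, only the first two and the last two letters
    of each word are examined, skipping clusters in the middle of words.
    """
    counts = Counter()
    for word in re.findall(r"[a-z]+", text.lower()):
        if len(word) < 2:
            continue
        if edges_only:
            # A set, so a two-letter word is not counted twice.
            positions = sorted({0, len(word) - 2})
        else:
            positions = range(len(word) - 1)
        for i in positions:
            pair = word[i:i + 2]
            if pair[0] in CONSONANTS and pair[1] in CONSONANTS:
                counts[pair] += 1
    return counts


def relative_to(counts, reference="th"):
    """Scale counts so that the reference digraph has frequency 1.00."""
    base = counts[reference]
    if base == 0:
        raise ValueError(f"reference digraph {reference!r} never occurs")
    return {pair: n / base for pair, n in counts.most_common()}


# Toy usage; a real run would read a large corpus instead.
sample = "the thing with the string"
everywhere = relative_to(digraph_counts(sample))
at_edges = relative_to(digraph_counts(sample, edges_only=True))
```

On this toy sentence TR appears in the first result but not the second, because it sits in the middle of the initial cluster of "string" rather than at the very edge of the word.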
Replacing "and" with the symbol & would decrease the frequency of ND to 0.06, suggesting & as the third already-existing character worth adding.
Scraping near the bottom of the barrel, SS has a character in German (ß).