Comparing the phonetic content of various texts with VMs "words"


The procedure is to take each word in a body of plaintext and convert it to a phonetic code using the "Soundex"  and Double Metaphone algorithms. Then accumulate the frequency distribution of the phonetic codes in the plaintext and compare with the frequency distribution of word counts in the VMs.

If the VMs "words" are phonetic abbreviations of plaintext words, then one might expect the frequency distributions to match in form and level.

(Of course, Soundex is designed for English, so may not make much practical sense when applied to other languages. However, it is one means of expressing phonetic content: and thus allows us to generate a phonetic description for any foreign language word, albeit in an English pronounciation! Double Metaphone is designed for several different languages.)

Example Soundex

To illustrate the Soundex compression, here is a Latin phrase and its Soundex equivalent:

et   post non  multas elapsi temporis horas
E300 P230 N500 M432   E412   T516     H620

See below for the Soundex results

(there are a total of 26x666 = 17316 Soundex codes possible i.e. A000 ... Z666)

Example Double Metaphone

With the Double Metaphone encoding, the nice feature is that the encoded words look a little like the plaintext word, so you get a better feel for what is seen.

For example:

Plain:    The    quick brown fox  jumps over the  lazy dog
Soundex:  T000   Q200  B650  F200 J512  O160 T000 L200 D200
Dbl.Meta: 0      KK    PRN   FKS  JMPS  AFR  0    LS   DK


Also, one is able to specify a maximum encoding length, so that longer words like "manuscript" come out as "MNSKPT" in Double Metaphone, rather than the "M526" you would get in Soundex.

Producing the word frequency tables using Double Metaphone (comparing the Herbal Folios with a Latin text describing a herb garden)


8am     0.038182747     AT      0.029438822    e.g "et"
1oe     0.020316487     S       0.014719411        "se"
1oy     0.014599285     HK      0.01425943         "huc"
s       0.011638591     K       0.012419503        "quo"
oy      0.010107198     AN      0.012419503        "in"
19      0.010107198     TM      0.00873965         "tum"
am      0.009596733     NN      0.008279668        "non"
8ay     0.00949464      KM      0.008279668        "quam"
89      0.0089841755    KT      0.006439742        "quot"
K9      0.008882083     FRT     0.0059797605       "forti"

(numbers are the relative frequencies)

Compare with Soundex:

8am     0.038182747     E300    0.021619136
1oe     0.020316487     H200    0.019319227
1oy     0.014599285     S000    0.014719411
s       0.011638591     I500    0.013339466
oy      0.010107198     Q000    0.01149954
19      0.010107198     N500    0.008279668
am      0.009596733     Q200    0.006439742
8ay     0.00949464      Q300    0.006439742
89      0.0089841755    V620    0.0059797605
K9      0.008882083     D500    0.0059797605



The phonetic distribution is *very* sensitive to what is used for the plaintext, (surprisingly?) which makes it unreasonable to draw any conclusions by comparison to the VMs.

Comparison Tables: Soundex

Herbal Folios

Augustinus (Latin):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     E300    0.064491205
1oe     0.020316487     N500    0.026331432
1oy     0.014599285     M000    0.025974799
s       0.011638591     I500    0.025737043
oy      0.010107198     T000    0.02318117
19      0.010107198     Q000    0.021932954
am      0.009596733     E200    0.02044698
8ay     0.00949464      E230    0.013789824
89      0.0089841755    A300    0.012719925
K9      0.008882083     Q300    0.012719925
2oe     0.0077590607    S300    0.011174513
8an     0.0073506893    N200    0.010639563
1c9     0.0071465033    I400    0.009807418
oe      0.006738132     S000    0.008737518
ay      0.006636039     U300    0.008321445
2o      0.006533946     C500    0.00808369
oham    0.005104645     E550    0.007964812
oh9     0.0047983667    H200    0.007964812
4ok19   0.0047983667    D200    0.00754874
9       0.0046962737    D000    0.0067760344
8ae     0.0044920878    Q200    0.0067760344
2oy     0.004389995     P600    0.0065977178
Koe     0.004287902     Q500    0.00647884
4oh19   0.004287902     M200    0.005884451
7am     0.0041858093    S500    0.0057061343
Koy     0.0040837163    A000    0.0055278176
8oy     0.0039816233    A320    0.005171184
29      0.0039816233    I120    0.0048145507
1H9     0.0039816233    E000    0.0045173564
ok9     0.0039816233    E630    0.0042796005
1o      0.0038795304    I300    0.0041607227
8oe     0.0034711587    D550    0.003982406
1o89    0.003369066     A350    0.0039229672
ohae    0.003369066     E350    0.0039229672
1am     0.003266973     T550    0.0038635284
1c79    0.003266973     A550    0.0038635284
y       0.00316488      S530    0.003625773
1ay     0.0029606943    M500    0.003566334
okoe    0.0028586013    S200    0.0033285783
sam     0.0028586013    E500    0.0032097006


Latin Herb Garden (Latin):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     E300    0.021619136
1oe     0.020316487     H200    0.019319227
1oy     0.014599285     S000    0.014719411
s       0.011638591     I500    0.013339466
oy      0.010107198     Q000    0.01149954
19      0.010107198     N500    0.008279668
am      0.009596733     Q200    0.006439742
8ay     0.00949464      Q300    0.006439742
89      0.0089841755    V620    0.0059797605
K9      0.008882083     D500    0.0059797605
2oe     0.0077590607    P600    0.0059797605
8an     0.0073506893    P632    0.0059797605
1c9     0.0071465033    F630    0.005519779
oe      0.006738132     I400    0.0050597973
ay      0.006636039     N200    0.0050597973
2o      0.006533946     I536    0.0050597973
oham    0.005104645     Q500    0.0050597973
oh9     0.0047983667    C616    0.0045998157
4ok19   0.0047983667    N550    0.0045998157
9       0.0046962737    S100    0.0045998157
8ae     0.0044920878    S300    0.0045998157
2oy     0.004389995     E230    0.004139834
Koe     0.004287902     O360    0.004139834
4oh19   0.004287902     C523    0.004139834
7am     0.0041858093    V400    0.004139834
Koy     0.0040837163    C500    0.004139834
8oy     0.0039816233    T500    0.004139834
29      0.0039816233    L300    0.004139834
1H9     0.0039816233    F653    0.0036798527
ok9     0.0039816233    U300    0.0036798527
1o      0.0038795304    A300    0.0036798527
8oe     0.0034711587    G520    0.0036798527
1o89    0.003369066     T100    0.0036798527
ohae    0.003369066     N236    0.0036798527
1am     0.003266973     S200    0.0036798527
1c79    0.003266973     I525    0.0036798527
y       0.00316488      S162    0.003219871
1ay     0.0029606943    T550    0.003219871
okoe    0.0028586013    V536    0.003219871
sam     0.0028586013    V633    0.003219871
Culpeper (Old English):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     T000    0.10639342
1oe     0.020316487     A530    0.054539897
1oy     0.014599285     O100    0.035268757
s       0.011638591     I500    0.024507163
oy      0.010107198     O600    0.018764427
19      0.010107198     I300    0.017799262
am      0.009596733     A000    0.017147776
8ay     0.00949464      I200    0.01615848
89      0.0089841755    W300    0.013842083
K9      0.008882083     T500    0.01140504
2oe     0.0077590607    B000    0.010568563
8an     0.0073506893    T200    0.010464003
1c9     0.0071465033    S300    0.0094907945
oe      0.006738132     F600    0.009394278
ay      0.006636039     B300    0.008783007
2o      0.006533946     T300    0.008533672
oham    0.005104645     A200    0.00839694
oh9     0.0047983667    A600    0.007793712
4ok19   0.0047983667    W200    0.0068124603
9       0.0046962737    H300    0.00673203
8ae     0.0044920878    L120    0.0065068244
2oy     0.004389995     S500    0.0060081556
Koe     0.004287902     A500    0.005895553
4oh19   0.004287902     A420    0.005662305
7am     0.0041858093    A400    0.00551753
Koy     0.0040837163    B520    0.005356669
8oy     0.0039816233    T600    0.005348626
29      0.0039816233    O500    0.0049706027
1H9     0.0039816233    G300    0.004874086
ok9     0.0039816233    M500    0.0047856127
1o      0.0038795304    H610    0.004624752
8oe     0.0034711587    L200    0.0045764935
1o89    0.003369066     W400    0.0041019535
ohae    0.003369066     B430    0.0040536956
1am     0.003266973     A300    0.0040295664
1c79    0.003266973     R300    0.003965222
y       0.00316488      M300    0.0039089206
1ay     0.0029606943    S540    0.0039008774
okoe    0.0028586013    S000    0.003836533
sam     0.0028586013    P420    0.003820447


German Cook Book 1553 (Old German):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     V530    0.06859404
1oe     0.020316487     A500    0.06320765
1oy     0.014599285     J500    0.030449597
s       0.011638591     D200    0.026547212
oy      0.010107198     S000    0.026437286
19      0.010107198     D000    0.02423876
am      0.009596733     D500    0.01802792
8ay     0.00949464      D650    0.01593932
89      0.0089841755    D652    0.014949983
K9      0.008882083     E200    0.014345388
2oe     0.0077590607    N500    0.01264153
8an     0.0073506893    Z000    0.011542266
1c9     0.0071465033    L200    0.0113773765
oe      0.006738132     W400    0.01038804
ay      0.006636039     M200    0.010278113
2o      0.006533946     T000    0.009948335
oham    0.005104645     M300    0.009563592
oh9     0.0047983667    W500    0.008849071
4ok19   0.0047983667    T200    0.0087941075
9       0.0046962737    M250    0.008739145
8ae     0.0044920878    A100    0.008244476
2oy     0.004389995     A550    0.0081345495
Koe     0.004287902     V500    0.008024624
4oh19   0.004287902     J300    0.0078047705
7am     0.0041858093    A420    0.0077498076
Koy     0.0040837163    A200    0.007694844
8oy     0.0039816233    S350    0.007255139
29      0.0039816233    W520    0.007255139
1H9     0.0039816233    D600    0.0071452125
ok9     0.0039816233    W000    0.006650544
1o      0.0038795304    N300    0.006430691
8oe     0.0034711587    Z260    0.006375728
1o89    0.003369066     O360    0.006265802
ohae    0.003369066     M500    0.006265802
1am     0.003266973     N513    0.0058260965
1c79    0.003266973     G300    0.00571617
y       0.00316488      B200    0.0054963175
1ay     0.0029606943    W630    0.0054963175
okoe    0.0028586013    S432    0.005386391
sam     0.0028586013    W430    0.005386391


Thomas Hardy (English):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     T000    0.093254805
1oe     0.020316487     A000    0.036285233
1oy     0.014599285     A530    0.030430516
s       0.011638591     O100    0.029694293
oy      0.010107198     S000    0.02099986
19      0.010107198     W200    0.019352125
am      0.009596733     I500    0.01731875
8ay     0.00949464      H000    0.014128454
89      0.0089841755    I000    0.013357174
K9      0.008882083     B000    0.012235311
2oe     0.0077590607    H200    0.01167438
8an     0.0073506893    S300    0.011428973
1c9     0.0071465033    W300    0.011393914
oe      0.006738132     T300    0.010517459
ay      0.006636039     A200    0.010236993
2o      0.006533946     H600    0.010131819
oham    0.005104645     H300    0.009956528
oh9     0.0047983667    F600    0.009465713
4ok19   0.0047983667    I300    0.009430655
9       0.0046962737    T200    0.009255365
8ae     0.0044920878    Y000    0.009255365
2oy     0.004389995     O500    0.008974899
Koe     0.004287902     T500    0.008063385
4oh19   0.004287902     M500    0.0076426873
7am     0.0041858093    A300    0.007081756
Koy     0.0040837163    N000    0.006976581
8oy     0.0039816233    M000    0.006731174
29      0.0039816233    B300    0.006625999
1H9     0.0039816233    T600    0.006590941
ok9     0.0039816233    A500    0.0063455338
1o      0.0038795304    F650    0.005994952
8oe     0.0034711587    W600    0.005539195
1o89    0.003369066     W000    0.005188613
ohae    0.003369066     N300    0.0050834385
1am     0.003266973     W400    0.0049081477
1c79    0.003266973     O200    0.0048380313
y       0.00316488      I200    0.0048380313
1ay     0.0029606943    H100    0.0048380313
okoe    0.0028586013    S500    0.0047328565
sam     0.0028586013    G164    0.004592624


Spanish 1543 (Spanish):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     T000    0.05038168
1oe     0.020316487     Q000    0.044711012
1oy     0.014599285     N000    0.044274807
s       0.011638591     D000    0.03838604
oy      0.010107198     Y000    0.036423117
19      0.010107198     L000    0.03598691
am      0.009596733     S000    0.02791712
8ay     0.00949464      E500    0.021374045
89      0.0089841755    H200    0.020719739
K9      0.008882083     A000    0.018974917
2oe     0.0077590607    E400    0.017230097
8an     0.0073506893    L200    0.016793894
1c9     0.0071465033    P600    0.016575791
oe      0.006738132     C500    0.015485277
ay      0.006636039     B500    0.014176663
2o      0.006533946     M200    0.01308615
oham    0.005104645     C000    0.01308615
oh9     0.0047983667    Q620    0.012868048
4ok19   0.0047983667    E200    0.011123228
9       0.0046962737    S200    0.009378408
8ae     0.0044920878    D120    0.009378408
2oy     0.004389995     M000    0.0082878955
Koe     0.004287902     H516    0.008069793
4oh19   0.004287902     P620    0.007197383
7am     0.0041858093    D200    0.0063249725
Koy     0.0040837163    S500    0.0061068702
8oy     0.0039816233    G650    0.0056706653
29      0.0039816233    Q530    0.0054525626
1H9     0.0039816233    S516    0.0054525626
ok9     0.0039816233    A520    0.0052344603
1o      0.0038795304    M260    0.0050163576
8oe     0.0034711587    M400    0.0050163576
1o89    0.003369066     C200    0.0050163576
ohae    0.003369066     D400    0.004798255
1am     0.003266973     S600    0.004798255
1c79    0.003266973     T300    0.004798255
y       0.00316488      A400    0.0045801527
1ay     0.0029606943    P300    0.0045801527
okoe    0.0028586013    P360    0.00436205
sam     0.0028586013    S160    0.00436205


French 1367 (Mediaeval French):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     L000    0.05783751
1oe     0.020316487     D000    0.05557233
1oy     0.014599285     E300    0.049682874
s       0.011638591     Q000    0.030806402
oy      0.010107198     A000    0.025218965
19      0.010107198     P600    0.025218965
am      0.009596733     L200    0.024161883
8ay     0.00949464      E500    0.022802778
89      0.0089841755    O000    0.018272424
K9      0.008882083     S300    0.017366353
2oe     0.0077590607    &000    0.015252189
8an     0.0073506893    S530    0.014799153
1c9     0.0071465033    R000    0.0135910595
oe      0.006738132     N000    0.0135910595
ay      0.006636039     S000    0.011476895
2o      0.006533946     L600    0.0110238595
oham    0.005104645     N400    0.009664753
oh9     0.0047983667    S256    0.009513741
4ok19   0.0047983667    E524    0.009211718
9       0.0046962737    C000    0.009211718
8ae     0.0044920878    P625    0.009060706
2oy     0.004389995     D300    0.0084566595
Koe     0.004287902     D200    0.008003624
4oh19   0.004287902     C200    0.008003624
7am     0.0041858093    D320    0.008003624
Koy     0.0040837163    N236    0.007852612
8oy     0.0039816233    E230    0.0070975535
29      0.0039816233    C500    0.0057384474
1H9     0.0039816233    O635    0.0057384474
ok9     0.0039816233    P200    0.0057384474
1o      0.0038795304    T600    0.0054364237
8oe     0.0034711587    I620    0.0051344004
1o89    0.003369066     I350    0.0051344004
ohae    0.003369066     V253    0.0049833884
1am     0.003266973     S600    0.0048323767
1c79    0.003266973     F652    0.004681365
y       0.00316488      T520    0.004681365
1ay     0.0029606943    F600    0.004530353
okoe    0.0028586013    G600    0.0043793414
sam     0.0028586013    E200    0.0040773177


Flores Filosofia (Mediaeval Spanish):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     Q000    0.07295496
1oe     0.020316487     E000    0.069623165
1oy     0.014599285     L000    0.04779412
s       0.011638591     E400    0.03952206
oy      0.010107198     S000    0.037798714
19      0.010107198     E200    0.030101104
am      0.009596733     D000    0.026424633
8ay     0.00949464      N500    0.02366728
89      0.0089841755    C500    0.017922794
K9      0.008882083     P600    0.017233456
2oe     0.0077590607    L200    0.017118566
8an     0.0073506893    D400    0.015165442
1c9     0.0071465033    B500    0.014246323
oe      0.006738132     E500    0.013901655
ay      0.006636039     O500    0.013556985
2o      0.006533946     D200    0.013212317
oham    0.005104645     C000    0.012293198
oh9     0.0047983667    M200    0.010914522
4ok19   0.0047983667    P200    0.009995405
9       0.0046962737    H000    0.008157169
8ae     0.0044920878    S130    0.00804228
2oy     0.004389995     F200    0.007582721
Koe     0.004287902     S200    0.007467831
4oh19   0.004287902     M260    0.007467831
7am     0.0041858093    A400    0.0065487134
Koy     0.0040837163    M400    0.0064338236
8oy     0.0039816233    S500    0.006204044
29      0.0039816233    Q530    0.0060891546
1H9     0.0039816233    S600    0.005974265
ok9     0.0039816233    C220    0.005859375
1o      0.0038795304    S160    0.005514706
8oe     0.0034711587    E530    0.0051700366
1o89    0.003369066     T000    0.0051700366
ohae    0.003369066     F260    0.0051700366
1am     0.003266973     A200    0.0048253676
1c79    0.003266973     A420    0.0048253676
y       0.00316488      R000    0.0045955884
1ay     0.0029606943    D420    0.004365809
okoe    0.0028586013    M530    0.004250919
sam     0.0028586013    V500    0.004250919


Vietnamese Bible (Vietnamese):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     N200    0.107542574
1oe     0.020316487     C000    0.09951338
1oy     0.014599285     I000    0.08856448
s       0.011638591     T000    0.086618006
oy      0.010107198     N000    0.083211675
19      0.010107198     V000    0.045012165
am      0.009596733     S000    0.042092457
8ay     0.00949464      L000    0.034793187
89      0.0089841755    H000    0.034306567
K9      0.008882083     M000    0.029927006
2oe     0.0077590607    G000    0.023357663
8an     0.0073506893    A000    0.020924574
1c9     0.0071465033    Y000    0.019708028
oe      0.006738132     K000    0.01922141
ay      0.006636039     U000    0.016301703
2o      0.006533946     L500    0.016058395
oham    0.005104645     P000    0.015815085
oh9     0.0047983667    B000    0.015815085
4ok19   0.0047983667    C520    0.01459854
9       0.0046962737    T600    0.0124087585
8ae     0.0044920878    R000    0.01216545
2oy     0.004389995     C200    0.011435523
Koe     0.004287902     ╨000    0.011192214
4oh19   0.004287902     D000    0.011192214
7am     0.0041858093    K520    0.010948905
Koy     0.0040837163    B500    0.009489051
8oy     0.0039816233    M500    0.008029197
29      0.0039816233    ├000    0.007542579
1H9     0.0039816233    T652    0.005839416
ok9     0.0039816233    T500    0.005596107
1o      0.0038795304    C500    0.005596107
8oe     0.0034711587    D500    0.005109489
1o89    0.003369066     L520    0.004379562
ohae    0.003369066     N500    0.004379562
1am     0.003266973     X000    0.003892944
1c79    0.003266973     ╙000    0.003649635
y       0.00316488      T650    0.003649635
1ay     0.0029606943    P500    0.003649635
okoe    0.0028586013    ┴200    0.003406326
sam     0.0028586013    D520    0.002676399
Book of the Courtier 1561 (Mediaeval English):

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     T000    0.08259616
1oe     0.020316487     A530    0.04111586
1oy     0.014599285     O100    0.03379386
s       0.011638591     I500    0.02981811
oy      0.010107198     T300    0.026770037
19      0.010107198     A000    0.022396713
am      0.009596733     B000    0.015902992
8ay     0.00949464      T500    0.015306629
89      0.0089841755    S000    0.01424643
K9      0.008882083     M500    0.013948249
2oe     0.0077590607    I300    0.013948249
8an     0.0073506893    T200    0.013517543
1c9     0.0071465033    H000    0.013086837
oe      0.006738132     N300    0.012755524
ay      0.006636039     W300    0.012324818
2o      0.006533946     F600    0.012291688
oham    0.005104645     W200    0.0120929
oh9     0.0047983667    I000    0.011132094
4ok19   0.0047983667    I200    0.010535732
9       0.0046962737    B300    0.010138158
8ae     0.0044920878    M000    0.009475533
2oy     0.004389995     A200    0.009077958
Koe     0.004287902     H100    0.008216546
4oh19   0.004287902     H200    0.007918364
7am     0.0041858093    T520    0.0077195773
Koy     0.0040837163    A500    0.007388265
8oy     0.0039816233    W400    0.0072888713
29      0.0039816233    A400    0.0070569525
1H9     0.0039816233    Y000    0.0070238216
ok9     0.0039816233    O360    0.0067587714
1o      0.0038795304    H300    0.0063611967
8oe     0.0034711587    T600    0.006096147
1o89    0.003369066     H500    0.0060630157
ohae    0.003369066     O600    0.005930491
1am     0.003266973     W600    0.005864228
1c79    0.003266973     M230    0.0054666535
y       0.00316488      W000    0.0054666535
1ay     0.0029606943    M200    0.005135341
okoe    0.0028586013    S300    0.0050359475
sam     0.0028586013    S500    0.004969685

Recipes Folios

The Recipes section with the Latin Herb Garden:

Top 40 words in source and target
Source          Target
------          ------
am      0.018943263     E300    0.021619136
ay      0.014393166     H200    0.019319227
ae      0.0136502925    S000    0.014719411
1c89    0.012257406     I500    0.013339466
4ohC9   0.01179311      Q000    0.01149954
1c9     0.010493082     N500    0.008279668
oe      0.010400223     Q200    0.006439742
4oham   0.010307364     Q300    0.006439742
8am     0.009843068     V620    0.0059797605
4ohan   0.009564491     D500    0.0059797605
oham    0.007800167     P600    0.0059797605
okam    0.0070572942    P632    0.0059797605
oy      0.006314421     F630    0.005519779
an      0.0058501256    I400    0.0050597973
ohan    0.0058501256    N200    0.0050597973
e       0.0058501256    I536    0.0050597973
2c89    0.005757266     Q500    0.0050597973
1c79    0.0056644073    C616    0.0045998157
ohC9    0.0056644073    N550    0.0045998157
okay    0.0052929707    S100    0.0045998157
1oe     0.0052929707    S300    0.0045998157
2c9     0.0051072524    E230    0.004139834
okae    0.004921534     O360    0.004139834
4ohC89  0.004921534     C523    0.004139834
okan    0.0048286747    V400    0.004139834
1coe    0.004735816     C500    0.004139834
1C9     0.0042715203    T500    0.004139834
ohae    0.004085802     L300    0.004139834
8an     0.0039929426    F653    0.0036798527
4okam   0.0038072243    U300    0.0036798527
8ay     0.0036215063    A300    0.0036798527
4ohay   0.0036215063    G520    0.0036798527
1co     0.0036215063    T100    0.0036798527
4ohae   0.0035286471    N236    0.0036798527
okC9    0.0035286471    S200    0.0036798527
4ohc9   0.003435788     I525    0.0036798527
eham    0.003435788     S162    0.003219871
9       0.003435788     T550    0.003219871
okc89   0.003435788     V536    0.003219871
y       0.0033429288    V633    0.003219871

Astrological Folios

The Astrological section with the Latin Herb Garden:

Top 40 words in source and target
Source          Target
------          ------
ay      0.016070843     E300    0.021619136
am      0.011479174     H200    0.019319227
ae      0.009839292     S000    0.014719411
8am     0.008527386     I500    0.013339466
s       0.007543457     Q000    0.01149954
8ay     0.007215481     N500    0.008279668
8ae     0.006559528     Q200    0.006439742
89      0.0062315515    Q300    0.006439742
okc9    0.0062315515    V620    0.0059797605
okcos   0.005903575     D500    0.0059797605
okC9    0.005903575     P600    0.0059797605
ohC9    0.0055755987    P632    0.0059797605
okay    0.0052476223    F630    0.005519779
1c9     0.0052476223    I400    0.0050597973
okam    0.004919646     N200    0.0050597973
okcc9   0.004919646     I536    0.0050597973
ok9     0.0045916694    Q500    0.0050597973
ap      0.0045916694    C616    0.0045998157
o       0.0045916694    N550    0.0045998157
okco89  0.0045916694    S100    0.0045998157
oe      0.004263693     S300    0.0045998157
1oe     0.004263693     E230    0.004139834
okae    0.0039357166    O360    0.004139834
2c9     0.0039357166    C523    0.004139834
1coy    0.0039357166    V400    0.004139834
ohae    0.0039357166    C500    0.004139834
9       0.0036077404    T500    0.004139834
say     0.0036077404    L300    0.004139834
19      0.0036077404    F653    0.0036798527
ohay    0.0036077404    U300    0.0036798527
oy      0.003279764     A300    0.0036798527
7ay     0.003279764     G520    0.0036798527
1ch9    0.003279764     T100    0.0036798527
oham    0.003279764     N236    0.0036798527
ohcos   0.003279764     S200    0.0036798527
1co89   0.0029517876    I525    0.0036798527
okoe    0.0029517876    S162    0.003219871
ae9     0.0029517876    T550    0.003219871
ohcoe   0.0026238111    V536    0.003219871
2oe     0.0026238111    V633    0.003219871

Biological Folios

The Biological section with the Latin Herb Garden:

Top 40 words in source and target
Source          Target
------          ------
oe      0.03410959      E300    0.021619136
4ohan   0.021643836     H200    0.019319227
1c89    0.02109589      S000    0.014719411
2c89    0.015616438     I500    0.013339466
4ohc89  0.015479452     Q000    0.01149954
4oe     0.015342466     N500    0.008279668
4ohae   0.014109589     Q200    0.006439742
1c9     0.013287671     Q300    0.006439742
4oham   0.011506849     V620    0.0059797605
4ohC89  0.0113698635    D500    0.0059797605
8am     0.010410959     P600    0.0059797605
oy      0.010136986     P632    0.0059797605
4ohC9   0.009589042     F630    0.005519779
8ay     0.009178082     I400    0.0050597973
2c9     0.009041096     N200    0.0050597973
4oh9    0.007945205     I536    0.0050597973
1c79    0.007671233     Q500    0.0050597973
8ae     0.007671233     C616    0.0045998157
am      0.007260274     N550    0.0045998157
4ohay   0.0064383564    S100    0.0045998157
y       0.00630137      S300    0.0045998157
8an     0.0060273972    E230    0.004139834
ohan    0.0060273972    O360    0.004139834
89      0.005890411     C523    0.004139834
4ohc79  0.0057534245    V400    0.004139834
1H9     0.0056164386    C500    0.004139834
4ohc9   0.005479452     T500    0.004139834
e1c89   0.005068493     L300    0.004139834
soe     0.005068493     F653    0.0036798527
4ohcc89 0.005068493     U300    0.0036798527
4okc89  0.005068493     A300    0.0036798527
okc89   0.0049315067    G520    0.0036798527
oham    0.004794521     T100    0.0036798527
2c79    0.004794521     N236    0.0036798527
s       0.0046575344    S200    0.0036798527
ay      0.004520548     I525    0.0036798527
sam     0.0043835617    S162    0.003219871
san     0.004109589     T550    0.003219871
oe9     0.0039726025    V536    0.003219871
5c89    0.0039726025    V633    0.003219871

Frequency Table for Plaintext Words

As a comparison, here is a frequency table for plaintext words in the "Latin Herb Garden" compared with the Herbal Folios

Top 40 words in source and target
Source          Target
------          ------
8am     0.038182747     et      0.021159153
1oe     0.020316487     in      0.0096596135
1oy     0.014599285     si      0.0096596135
s       0.011638591     non     0.008279668
oy      0.010107198     quae    0.006899724
19      0.010107198     hoc     0.005519779
am      0.009596733     dum     0.0045998157
8ay     0.00949464      huius   0.004139834
89      0.0089841755    quam    0.004139834
K9      0.008882083     haec    0.004139834
2oe     0.0077590607    quod    0.0036798527
8an     0.0073506893    ut      0.0036798527
1c9     0.0071465033    per     0.0036798527
oe      0.006738132     tibi    0.0036798527
ay      0.006636039     cum     0.0036798527
2o      0.006533946     tum     0.0036798527
oham    0.005104645     tamen   0.003219871
oh9     0.0047983667    nec     0.003219871
4ok19   0.0047983667    forte   0.003219871
9       0.0046962737    est     0.003219871
8ae     0.0044920878    illa    0.0027598895
2oy     0.004389995     sub     0.0027598895
Koe     0.004287902     inter   0.0027598895
4oh19   0.004287902     vires   0.0027598895
7am     0.0041858093    genus   0.0027598895
Koy     0.0040837163    sed     0.0027598895
8oy     0.0039816233    lilia   0.0027598895
29      0.0039816233    quoque  0.0022999079
1H9     0.0039816233    se      0.0022999079
ok9     0.0039816233    iam     0.0022999079
1o      0.0038795304    undique 0.0022999079
8oe     0.0034711587    quis    0.0022999079
1o89    0.003369066     aut     0.0022999079
ohae    0.003369066     etiam   0.0022999079
1am     0.003266973     satis   0.0022999079
1c79    0.003266973     viscera 0.0022999079
y       0.00316488      odore   0.0022999079
1ay     0.0029606943    de      0.0022999079
okoe    0.0028586013    quo     0.0022999079
sam     0.0028586013    ore     0.0018399263