The procedure is to take each word in a body of plaintext and convert it to a
phonetic code using the Soundex and Double Metaphone algorithms, then
accumulate the frequency distribution of the phonetic codes in the plaintext
and compare it with the frequency distribution of words in the VMs.
If the VMs "words" are phonetic abbreviations of plaintext words, then
one might expect the two frequency distributions to match in form and level.
(Of course, Soundex is designed for English, so it may not make much practical sense when applied to other languages. However, it is one means of expressing phonetic content, and thus allows us to generate a phonetic description for any foreign-language word, albeit in an English pronunciation! Double Metaphone is designed for several different languages.)
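As a rough sketch of the accumulation step described above, the following Python shows one way to turn a list of words into a relative-frequency table of codes. The names `relative_frequencies` and `top_n` and the `encode` parameter are illustrative assumptions, not the program actually used for the tables on this page; any Soundex or Double Metaphone encoder could be passed in as `encode`.

```python
from collections import Counter


def relative_frequencies(words, encode=lambda w: w):
    """Map each word to a code and return {code: relative frequency}.

    `encode` is a placeholder for a phonetic encoder (assumption);
    the default leaves the word unchanged.
    """
    counts = Counter(encode(w) for w in words)
    total = sum(counts.values())
    return {code: n / total for code, n in counts.items()}


def top_n(freqs, n=40):
    """Return the n most frequent (code, frequency) pairs, highest first.

    Ties are broken alphabetically so the ordering is reproducible.
    """
    return sorted(freqs.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


# Tiny made-up word list, only to show the shape of the result.
words = "et in et non et".split()
print(top_n(relative_frequencies(words), n=2))  # [('et', 0.6), ('in', 0.2)]
```

The same two functions can be applied to the VMs transcription (with no encoder) and to the encoded plaintext, giving two ranked lists that can be set side by side.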
To illustrate the Soundex compression, here is a Latin phrase and its Soundex equivalent:
et   post non  multas elapsi temporis horas
E300 P230 N500 M432   E412   T516     H620
See below for the Soundex results.
(A Soundex code is a letter followed by three digits, each from 0 to 6, so at most 26 x 7 x 7 x 7 = 8918 codes of the form A000 ... Z666 are possible.)
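For readers who want to reproduce the Soundex codes shown above, here is a minimal sketch of the standard American Soundex rules in Python. It is an assumption that these rules match the implementation used for the tables on this page; in particular, this sketch simply drops characters outside a-z, whereas the tables below contain codes such as "&000", so the original program evidently handled such characters differently.

```python
# Digit assigned to each consonant group; vowels, h, w and y get no digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(word, length=4):
    """Return the Soundex code of `word` (first letter plus digits, zero padded)."""
    letters = [ch for ch in word.lower() if "a" <= ch <= "z"]
    if not letters:
        return ""
    out = [letters[0].upper()]
    prev = CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "hw":
            continue  # h and w do not separate two letters with the same digit
        code = CODES.get(ch, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so a repeated digit may follow it
    return "".join(out)[:length].ljust(length, "0")


phrase = "et post non multas elapsi temporis horas"
print(" ".join(soundex(w) for w in phrase.split()))
# E300 P230 N500 M432 E412 T516 H620
```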
A nice feature of the Double Metaphone encoding
is that the encoded words look
a little like the plaintext words, so you get a better
feel for what you are looking at.
For example:
Plain:     The  quick brown fox  jumps over the  lazy dog
Soundex:   T000 Q200  B650  F200 J512  O160 T000 L200 D200
Dbl.Meta:  0    KK    PRN   FKS  JMPS  AFR  0    LS   DK
Also, one is able to specify a maximum encoding length, so
that longer words like "manuscript" come out as "MNSKPT"
in Double Metaphone, rather than the "M526" you would get
in Soundex.
Here are the top entries of the word frequency tables produced
using Double Metaphone, comparing the Herbal Folios with a Latin
text describing a herb garden:
8am      0.038182747   AT    0.029438822   e.g. "et"
1oe      0.020316487   S     0.014719411   "se"
1oy      0.014599285   HK    0.01425943    "huc"
s        0.011638591   K     0.012419503   "quo"
oy       0.010107198   AN    0.012419503   "in"
19       0.010107198   TM    0.00873965    "tum"
am       0.009596733   NN    0.008279668   "non"
8ay      0.00949464    KM    0.008279668   "quam"
89       0.0089841755  KT    0.006439742   "quot"
K9       0.008882083   FRT   0.0059797605  "forti"

(numbers are the relative frequencies)

Compare with Soundex:

8am      0.038182747   E300  0.021619136
1oe      0.020316487   H200  0.019319227
1oy      0.014599285   S000  0.014719411
s        0.011638591   I500  0.013339466
oy       0.010107198   Q000  0.01149954
19       0.010107198   N500  0.008279668
am       0.009596733   Q200  0.006439742
8ay      0.00949464    Q300  0.006439742
89       0.0089841755  V620  0.0059797605
K9       0.008882083   D500  0.0059797605
The phonetic distribution is *very* sensitive to the choice of plaintext
(surprisingly?), which makes it unreasonable to draw any conclusions from a
comparison with the VMs.
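One possible way to put a number on how well two distributions "match in form and level" is to compare the relative frequencies rank by rank. The mean absolute gap used below is an illustrative choice, not a measure used in the original analysis, and the function name `rank_gap` is hypothetical. The sample values are the top five relative frequencies from the Herbal Folios, Latin Herb Garden and Augustinus columns of the Soundex tables on this page.

```python
def rank_gap(a, b):
    """Mean absolute difference between two frequency lists, rank by rank.

    Both lists are sorted in descending order first; if they differ in
    length, only the ranks present in both are compared.
    """
    a = sorted(a, reverse=True)
    b = sorted(b, reverse=True)
    n = min(len(a), len(b))
    if n == 0:
        raise ValueError("both lists must be non-empty")
    return sum(abs(x - y) for x, y in zip(a, b)) / n


# Top five relative frequencies copied from the tables on this page.
herbal_folios = [0.038182747, 0.020316487, 0.014599285, 0.011638591, 0.010107198]
latin_herb_garden = [0.021619136, 0.019319227, 0.014719411, 0.013339466, 0.01149954]
augustinus = [0.064491205, 0.026331432, 0.025974799, 0.025737043, 0.02318117]

for name, target in [("Latin Herb Garden", latin_herb_garden),
                     ("Augustinus", augustinus)]:
    print(f"{name}: {rank_gap(herbal_folios, target):.5f}")
```

With these values the two Latin plaintexts give noticeably different gaps against the same VMs column, which is the sensitivity noted above.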
Augustinus (Latin): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   E300  0.064491205
1oe      0.020316487   N500  0.026331432
1oy      0.014599285   M000  0.025974799
s        0.011638591   I500  0.025737043
oy       0.010107198   T000  0.02318117
19       0.010107198   Q000  0.021932954
am       0.009596733   E200  0.02044698
8ay      0.00949464    E230  0.013789824
89       0.0089841755  A300  0.012719925
K9       0.008882083   Q300  0.012719925
2oe      0.0077590607  S300  0.011174513
8an      0.0073506893  N200  0.010639563
1c9      0.0071465033  I400  0.009807418
oe       0.006738132   S000  0.008737518
ay       0.006636039   U300  0.008321445
2o       0.006533946   C500  0.00808369
oham     0.005104645   E550  0.007964812
oh9      0.0047983667  H200  0.007964812
4ok19    0.0047983667  D200  0.00754874
9        0.0046962737  D000  0.0067760344
8ae      0.0044920878  Q200  0.0067760344
2oy      0.004389995   P600  0.0065977178
Koe      0.004287902   Q500  0.00647884
4oh19    0.004287902   M200  0.005884451
7am      0.0041858093  S500  0.0057061343
Koy      0.0040837163  A000  0.0055278176
8oy      0.0039816233  A320  0.005171184
29       0.0039816233  I120  0.0048145507
1H9      0.0039816233  E000  0.0045173564
ok9      0.0039816233  E630  0.0042796005
1o       0.0038795304  I300  0.0041607227
8oe      0.0034711587  D550  0.003982406
1o89     0.003369066   A350  0.0039229672
ohae     0.003369066   E350  0.0039229672
1am      0.003266973   T550  0.0038635284
1c79     0.003266973   A550  0.0038635284
y        0.00316488    S530  0.003625773
1ay      0.0029606943  M500  0.003566334
okoe     0.0028586013  S200  0.0033285783
sam      0.0028586013  E500  0.0032097006

Latin Herb Garden (Latin): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   E300  0.021619136
1oe      0.020316487   H200  0.019319227
1oy      0.014599285   S000  0.014719411
s        0.011638591   I500  0.013339466
oy       0.010107198   Q000  0.01149954
19       0.010107198   N500  0.008279668
am       0.009596733   Q200  0.006439742
8ay      0.00949464    Q300  0.006439742
89       0.0089841755  V620  0.0059797605
K9       0.008882083   D500  0.0059797605
2oe      0.0077590607  P600  0.0059797605
8an      0.0073506893  P632  0.0059797605
1c9      0.0071465033  F630  0.005519779
oe       0.006738132   I400  0.0050597973
ay       0.006636039   N200  0.0050597973
2o       0.006533946   I536  0.0050597973
oham     0.005104645   Q500  0.0050597973
oh9      0.0047983667  C616  0.0045998157
4ok19    0.0047983667  N550  0.0045998157
9        0.0046962737  S100  0.0045998157
8ae      0.0044920878  S300  0.0045998157
2oy      0.004389995   E230  0.004139834
Koe      0.004287902   O360  0.004139834
4oh19    0.004287902   C523  0.004139834
7am      0.0041858093  V400  0.004139834
Koy      0.0040837163  C500  0.004139834
8oy      0.0039816233  T500  0.004139834
29       0.0039816233  L300  0.004139834
1H9      0.0039816233  F653  0.0036798527
ok9      0.0039816233  U300  0.0036798527
1o       0.0038795304  A300  0.0036798527
8oe      0.0034711587  G520  0.0036798527
1o89     0.003369066   T100  0.0036798527
ohae     0.003369066   N236  0.0036798527
1am      0.003266973   S200  0.0036798527
1c79     0.003266973   I525  0.0036798527
y        0.00316488    S162  0.003219871
1ay      0.0029606943  T550  0.003219871
okoe     0.0028586013  V536  0.003219871
sam      0.0028586013  V633  0.003219871

Culpeper (Old English): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   T000  0.10639342
1oe      0.020316487   A530  0.054539897
1oy      0.014599285   O100  0.035268757
s        0.011638591   I500  0.024507163
oy       0.010107198   O600  0.018764427
19       0.010107198   I300  0.017799262
am       0.009596733   A000  0.017147776
8ay      0.00949464    I200  0.01615848
89       0.0089841755  W300  0.013842083
K9       0.008882083   T500  0.01140504
2oe      0.0077590607  B000  0.010568563
8an      0.0073506893  T200  0.010464003
1c9      0.0071465033  S300  0.0094907945
oe       0.006738132   F600  0.009394278
ay       0.006636039   B300  0.008783007
2o       0.006533946   T300  0.008533672
oham     0.005104645   A200  0.00839694
oh9      0.0047983667  A600  0.007793712
4ok19    0.0047983667  W200  0.0068124603
9        0.0046962737  H300  0.00673203
8ae      0.0044920878  L120  0.0065068244
2oy      0.004389995   S500  0.0060081556
Koe      0.004287902   A500  0.005895553
4oh19    0.004287902   A420  0.005662305
7am      0.0041858093  A400  0.00551753
Koy      0.0040837163  B520  0.005356669
8oy      0.0039816233  T600  0.005348626
29       0.0039816233  O500  0.0049706027
1H9      0.0039816233  G300  0.004874086
ok9      0.0039816233  M500  0.0047856127
1o       0.0038795304  H610  0.004624752
8oe      0.0034711587  L200  0.0045764935
1o89     0.003369066   W400  0.0041019535
ohae     0.003369066   B430  0.0040536956
1am      0.003266973   A300  0.0040295664
1c79     0.003266973   R300  0.003965222
y        0.00316488    M300  0.0039089206
1ay      0.0029606943  S540  0.0039008774
okoe     0.0028586013  S000  0.003836533
sam      0.0028586013  P420  0.003820447

German Cook Book 1553 (Old German): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   V530  0.06859404
1oe      0.020316487   A500  0.06320765
1oy      0.014599285   J500  0.030449597
s        0.011638591   D200  0.026547212
oy       0.010107198   S000  0.026437286
19       0.010107198   D000  0.02423876
am       0.009596733   D500  0.01802792
8ay      0.00949464    D650  0.01593932
89       0.0089841755  D652  0.014949983
K9       0.008882083   E200  0.014345388
2oe      0.0077590607  N500  0.01264153
8an      0.0073506893  Z000  0.011542266
1c9      0.0071465033  L200  0.0113773765
oe       0.006738132   W400  0.01038804
ay       0.006636039   M200  0.010278113
2o       0.006533946   T000  0.009948335
oham     0.005104645   M300  0.009563592
oh9      0.0047983667  W500  0.008849071
4ok19    0.0047983667  T200  0.0087941075
9        0.0046962737  M250  0.008739145
8ae      0.0044920878  A100  0.008244476
2oy      0.004389995   A550  0.0081345495
Koe      0.004287902   V500  0.008024624
4oh19    0.004287902   J300  0.0078047705
7am      0.0041858093  A420  0.0077498076
Koy      0.0040837163  A200  0.007694844
8oy      0.0039816233  S350  0.007255139
29       0.0039816233  W520  0.007255139
1H9      0.0039816233  D600  0.0071452125
ok9      0.0039816233  W000  0.006650544
1o       0.0038795304  N300  0.006430691
8oe      0.0034711587  Z260  0.006375728
1o89     0.003369066   O360  0.006265802
ohae     0.003369066   M500  0.006265802
1am      0.003266973   N513  0.0058260965
1c79     0.003266973   G300  0.00571617
y        0.00316488    B200  0.0054963175
1ay      0.0029606943  W630  0.0054963175
okoe     0.0028586013  S432  0.005386391
sam      0.0028586013  W430  0.005386391

Thomas Hardy (English): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   T000  0.093254805
1oe      0.020316487   A000  0.036285233
1oy      0.014599285   A530  0.030430516
s        0.011638591   O100  0.029694293
oy       0.010107198   S000  0.02099986
19       0.010107198   W200  0.019352125
am       0.009596733   I500  0.01731875
8ay      0.00949464    H000  0.014128454
89       0.0089841755  I000  0.013357174
K9       0.008882083   B000  0.012235311
2oe      0.0077590607  H200  0.01167438
8an      0.0073506893  S300  0.011428973
1c9      0.0071465033  W300  0.011393914
oe       0.006738132   T300  0.010517459
ay       0.006636039   A200  0.010236993
2o       0.006533946   H600  0.010131819
oham     0.005104645   H300  0.009956528
oh9      0.0047983667  F600  0.009465713
4ok19    0.0047983667  I300  0.009430655
9        0.0046962737  T200  0.009255365
8ae      0.0044920878  Y000  0.009255365
2oy      0.004389995   O500  0.008974899
Koe      0.004287902   T500  0.008063385
4oh19    0.004287902   M500  0.0076426873
7am      0.0041858093  A300  0.007081756
Koy      0.0040837163  N000  0.006976581
8oy      0.0039816233  M000  0.006731174
29       0.0039816233  B300  0.006625999
1H9      0.0039816233  T600  0.006590941
ok9      0.0039816233  A500  0.0063455338
1o       0.0038795304  F650  0.005994952
8oe      0.0034711587  W600  0.005539195
1o89     0.003369066   W000  0.005188613
ohae     0.003369066   N300  0.0050834385
1am      0.003266973   W400  0.0049081477
1c79     0.003266973   O200  0.0048380313
y        0.00316488    I200  0.0048380313
1ay      0.0029606943  H100  0.0048380313
okoe     0.0028586013  S500  0.0047328565
sam      0.0028586013  G164  0.004592624

Spanish 1543 (Spanish): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   T000  0.05038168
1oe      0.020316487   Q000  0.044711012
1oy      0.014599285   N000  0.044274807
s        0.011638591   D000  0.03838604
oy       0.010107198   Y000  0.036423117
19       0.010107198   L000  0.03598691
am       0.009596733   S000  0.02791712
8ay      0.00949464    E500  0.021374045
89       0.0089841755  H200  0.020719739
K9       0.008882083   A000  0.018974917
2oe      0.0077590607  E400  0.017230097
8an      0.0073506893  L200  0.016793894
1c9      0.0071465033  P600  0.016575791
oe       0.006738132   C500  0.015485277
ay       0.006636039   B500  0.014176663
2o       0.006533946   M200  0.01308615
oham     0.005104645   C000  0.01308615
oh9      0.0047983667  Q620  0.012868048
4ok19    0.0047983667  E200  0.011123228
9        0.0046962737  S200  0.009378408
8ae      0.0044920878  D120  0.009378408
2oy      0.004389995   M000  0.0082878955
Koe      0.004287902   H516  0.008069793
4oh19    0.004287902   P620  0.007197383
7am      0.0041858093  D200  0.0063249725
Koy      0.0040837163  S500  0.0061068702
8oy      0.0039816233  G650  0.0056706653
29       0.0039816233  Q530  0.0054525626
1H9      0.0039816233  S516  0.0054525626
ok9      0.0039816233  A520  0.0052344603
1o       0.0038795304  M260  0.0050163576
8oe      0.0034711587  M400  0.0050163576
1o89     0.003369066   C200  0.0050163576
ohae     0.003369066   D400  0.004798255
1am      0.003266973   S600  0.004798255
1c79     0.003266973   T300  0.004798255
y        0.00316488    A400  0.0045801527
1ay      0.0029606943  P300  0.0045801527
okoe     0.0028586013  P360  0.00436205
sam      0.0028586013  S160  0.00436205

French 1367 (Mediaeval French): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   L000  0.05783751
1oe      0.020316487   D000  0.05557233
1oy      0.014599285   E300  0.049682874
s        0.011638591   Q000  0.030806402
oy       0.010107198   A000  0.025218965
19       0.010107198   P600  0.025218965
am       0.009596733   L200  0.024161883
8ay      0.00949464    E500  0.022802778
89       0.0089841755  O000  0.018272424
K9       0.008882083   S300  0.017366353
2oe      0.0077590607  &000  0.015252189
8an      0.0073506893  S530  0.014799153
1c9      0.0071465033  R000  0.0135910595
oe       0.006738132   N000  0.0135910595
ay       0.006636039   S000  0.011476895
2o       0.006533946   L600  0.0110238595
oham     0.005104645   N400  0.009664753
oh9      0.0047983667  S256  0.009513741
4ok19    0.0047983667  E524  0.009211718
9        0.0046962737  C000  0.009211718
8ae      0.0044920878  P625  0.009060706
2oy      0.004389995   D300  0.0084566595
Koe      0.004287902   D200  0.008003624
4oh19    0.004287902   C200  0.008003624
7am      0.0041858093  D320  0.008003624
Koy      0.0040837163  N236  0.007852612
8oy      0.0039816233  E230  0.0070975535
29       0.0039816233  C500  0.0057384474
1H9      0.0039816233  O635  0.0057384474
ok9      0.0039816233  P200  0.0057384474
1o       0.0038795304  T600  0.0054364237
8oe      0.0034711587  I620  0.0051344004
1o89     0.003369066   I350  0.0051344004
ohae     0.003369066   V253  0.0049833884
1am      0.003266973   S600  0.0048323767
1c79     0.003266973   F652  0.004681365
y        0.00316488    T520  0.004681365
1ay      0.0029606943  F600  0.004530353
okoe     0.0028586013  G600  0.0043793414
sam      0.0028586013  E200  0.0040773177

Flores Filosofia (Mediaeval Spanish): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   Q000  0.07295496
1oe      0.020316487   E000  0.069623165
1oy      0.014599285   L000  0.04779412
s        0.011638591   E400  0.03952206
oy       0.010107198   S000  0.037798714
19       0.010107198   E200  0.030101104
am       0.009596733   D000  0.026424633
8ay      0.00949464    N500  0.02366728
89       0.0089841755  C500  0.017922794
K9       0.008882083   P600  0.017233456
2oe      0.0077590607  L200  0.017118566
8an      0.0073506893  D400  0.015165442
1c9      0.0071465033  B500  0.014246323
oe       0.006738132   E500  0.013901655
ay       0.006636039   O500  0.013556985
2o       0.006533946   D200  0.013212317
oham     0.005104645   C000  0.012293198
oh9      0.0047983667  M200  0.010914522
4ok19    0.0047983667  P200  0.009995405
9        0.0046962737  H000  0.008157169
8ae      0.0044920878  S130  0.00804228
2oy      0.004389995   F200  0.007582721
Koe      0.004287902   S200  0.007467831
4oh19    0.004287902   M260  0.007467831
7am      0.0041858093  A400  0.0065487134
Koy      0.0040837163  M400  0.0064338236
8oy      0.0039816233  S500  0.006204044
29       0.0039816233  Q530  0.0060891546
1H9      0.0039816233  S600  0.005974265
ok9      0.0039816233  C220  0.005859375
1o       0.0038795304  S160  0.005514706
8oe      0.0034711587  E530  0.0051700366
1o89     0.003369066   T000  0.0051700366
ohae     0.003369066   F260  0.0051700366
1am      0.003266973   A200  0.0048253676
1c79     0.003266973   A420  0.0048253676
y        0.00316488    R000  0.0045955884
1ay      0.0029606943  D420  0.004365809
okoe     0.0028586013  M530  0.004250919
sam      0.0028586013  V500  0.004250919

Vietnamese Bible (Vietnamese): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   N200  0.107542574
1oe      0.020316487   C000  0.09951338
1oy      0.014599285   I000  0.08856448
s        0.011638591   T000  0.086618006
oy       0.010107198   N000  0.083211675
19       0.010107198   V000  0.045012165
am       0.009596733   S000  0.042092457
8ay      0.00949464    L000  0.034793187
89       0.0089841755  H000  0.034306567
K9       0.008882083   M000  0.029927006
2oe      0.0077590607  G000  0.023357663
8an      0.0073506893  A000  0.020924574
1c9      0.0071465033  Y000  0.019708028
oe       0.006738132   K000  0.01922141
ay       0.006636039   U000  0.016301703
2o       0.006533946   L500  0.016058395
oham     0.005104645   P000  0.015815085
oh9      0.0047983667  B000  0.015815085
4ok19    0.0047983667  C520  0.01459854
9        0.0046962737  T600  0.0124087585
8ae      0.0044920878  R000  0.01216545
2oy      0.004389995   C200  0.011435523
Koe      0.004287902   ╨000  0.011192214
4oh19    0.004287902   D000  0.011192214
7am      0.0041858093  K520  0.010948905
Koy      0.0040837163  B500  0.009489051
8oy      0.0039816233  M500  0.008029197
29       0.0039816233  ├000  0.007542579
1H9      0.0039816233  T652  0.005839416
ok9      0.0039816233  T500  0.005596107
1o       0.0038795304  C500  0.005596107
8oe      0.0034711587  D500  0.005109489
1o89     0.003369066   L520  0.004379562
ohae     0.003369066   N500  0.004379562
1am      0.003266973   X000  0.003892944
1c79     0.003266973   ╙000  0.003649635
y        0.00316488    T650  0.003649635
1ay      0.0029606943  P500  0.003649635
okoe     0.0028586013  ┴200  0.003406326
sam      0.0028586013  D520  0.002676399
Book of the Courtier 1561 (Mediaeval English): Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   T000  0.08259616
1oe      0.020316487   A530  0.04111586
1oy      0.014599285   O100  0.03379386
s        0.011638591   I500  0.02981811
oy       0.010107198   T300  0.026770037
19       0.010107198   A000  0.022396713
am       0.009596733   B000  0.015902992
8ay      0.00949464    T500  0.015306629
89       0.0089841755  S000  0.01424643
K9       0.008882083   M500  0.013948249
2oe      0.0077590607  I300  0.013948249
8an      0.0073506893  T200  0.013517543
1c9      0.0071465033  H000  0.013086837
oe       0.006738132   N300  0.012755524
ay       0.006636039   W300  0.012324818
2o       0.006533946   F600  0.012291688
oham     0.005104645   W200  0.0120929
oh9      0.0047983667  I000  0.011132094
4ok19    0.0047983667  I200  0.010535732
9        0.0046962737  B300  0.010138158
8ae      0.0044920878  M000  0.009475533
2oy      0.004389995   A200  0.009077958
Koe      0.004287902   H100  0.008216546
4oh19    0.004287902   H200  0.007918364
7am      0.0041858093  T520  0.0077195773
Koy      0.0040837163  A500  0.007388265
8oy      0.0039816233  W400  0.0072888713
29       0.0039816233  A400  0.0070569525
1H9      0.0039816233  Y000  0.0070238216
ok9      0.0039816233  O360  0.0067587714
1o       0.0038795304  H300  0.0063611967
8oe      0.0034711587  T600  0.006096147
1o89     0.003369066   H500  0.0060630157
ohae     0.003369066   O600  0.005930491
1am      0.003266973   W600  0.005864228
1c79     0.003266973   M230  0.0054666535
y        0.00316488    W000  0.0054666535
1ay      0.0029606943  M200  0.005135341
okoe     0.0028586013  S300  0.0050359475
sam      0.0028586013  S500  0.004969685
The Recipes section with the Latin Herb Garden: Top 40 words in source and target

Source                 Target
------                 ------
am       0.018943263   E300  0.021619136
ay       0.014393166   H200  0.019319227
ae       0.0136502925  S000  0.014719411
1c89     0.012257406   I500  0.013339466
4ohC9    0.01179311    Q000  0.01149954
1c9      0.010493082   N500  0.008279668
oe       0.010400223   Q200  0.006439742
4oham    0.010307364   Q300  0.006439742
8am      0.009843068   V620  0.0059797605
4ohan    0.009564491   D500  0.0059797605
oham     0.007800167   P600  0.0059797605
okam     0.0070572942  P632  0.0059797605
oy       0.006314421   F630  0.005519779
an       0.0058501256  I400  0.0050597973
ohan     0.0058501256  N200  0.0050597973
e        0.0058501256  I536  0.0050597973
2c89     0.005757266   Q500  0.0050597973
1c79     0.0056644073  C616  0.0045998157
ohC9     0.0056644073  N550  0.0045998157
okay     0.0052929707  S100  0.0045998157
1oe      0.0052929707  S300  0.0045998157
2c9      0.0051072524  E230  0.004139834
okae     0.004921534   O360  0.004139834
4ohC89   0.004921534   C523  0.004139834
okan     0.0048286747  V400  0.004139834
1coe     0.004735816   C500  0.004139834
1C9      0.0042715203  T500  0.004139834
ohae     0.004085802   L300  0.004139834
8an      0.0039929426  F653  0.0036798527
4okam    0.0038072243  U300  0.0036798527
8ay      0.0036215063  A300  0.0036798527
4ohay    0.0036215063  G520  0.0036798527
1co      0.0036215063  T100  0.0036798527
4ohae    0.0035286471  N236  0.0036798527
okC9     0.0035286471  S200  0.0036798527
4ohc9    0.003435788   I525  0.0036798527
eham     0.003435788   S162  0.003219871
9        0.003435788   T550  0.003219871
okc89    0.003435788   V536  0.003219871
y        0.0033429288  V633  0.003219871
The Astrological section with the Latin Herb Garden: Top 40 words in source and target

Source                 Target
------                 ------
ay       0.016070843   E300  0.021619136
am       0.011479174   H200  0.019319227
ae       0.009839292   S000  0.014719411
8am      0.008527386   I500  0.013339466
s        0.007543457   Q000  0.01149954
8ay      0.007215481   N500  0.008279668
8ae      0.006559528   Q200  0.006439742
89       0.0062315515  Q300  0.006439742
okc9     0.0062315515  V620  0.0059797605
okcos    0.005903575   D500  0.0059797605
okC9     0.005903575   P600  0.0059797605
ohC9     0.0055755987  P632  0.0059797605
okay     0.0052476223  F630  0.005519779
1c9      0.0052476223  I400  0.0050597973
okam     0.004919646   N200  0.0050597973
okcc9    0.004919646   I536  0.0050597973
ok9      0.0045916694  Q500  0.0050597973
ap       0.0045916694  C616  0.0045998157
o        0.0045916694  N550  0.0045998157
okco89   0.0045916694  S100  0.0045998157
oe       0.004263693   S300  0.0045998157
1oe      0.004263693   E230  0.004139834
okae     0.0039357166  O360  0.004139834
2c9      0.0039357166  C523  0.004139834
1coy     0.0039357166  V400  0.004139834
ohae     0.0039357166  C500  0.004139834
9        0.0036077404  T500  0.004139834
say      0.0036077404  L300  0.004139834
19       0.0036077404  F653  0.0036798527
ohay     0.0036077404  U300  0.0036798527
oy       0.003279764   A300  0.0036798527
7ay      0.003279764   G520  0.0036798527
1ch9     0.003279764   T100  0.0036798527
oham     0.003279764   N236  0.0036798527
ohcos    0.003279764   S200  0.0036798527
1co89    0.0029517876  I525  0.0036798527
okoe     0.0029517876  S162  0.003219871
ae9      0.0029517876  T550  0.003219871
ohcoe    0.0026238111  V536  0.003219871
2oe      0.0026238111  V633  0.003219871
The Biological section with the Latin Herb Garden: Top 40 words in source and target

Source                 Target
------                 ------
oe       0.03410959    E300  0.021619136
4ohan    0.021643836   H200  0.019319227
1c89     0.02109589    S000  0.014719411
2c89     0.015616438   I500  0.013339466
4ohc89   0.015479452   Q000  0.01149954
4oe      0.015342466   N500  0.008279668
4ohae    0.014109589   Q200  0.006439742
1c9      0.013287671   Q300  0.006439742
4oham    0.011506849   V620  0.0059797605
4ohC89   0.0113698635  D500  0.0059797605
8am      0.010410959   P600  0.0059797605
oy       0.010136986   P632  0.0059797605
4ohC9    0.009589042   F630  0.005519779
8ay      0.009178082   I400  0.0050597973
2c9      0.009041096   N200  0.0050597973
4oh9     0.007945205   I536  0.0050597973
1c79     0.007671233   Q500  0.0050597973
8ae      0.007671233   C616  0.0045998157
am       0.007260274   N550  0.0045998157
4ohay    0.0064383564  S100  0.0045998157
y        0.00630137    S300  0.0045998157
8an      0.0060273972  E230  0.004139834
ohan     0.0060273972  O360  0.004139834
89       0.005890411   C523  0.004139834
4ohc79   0.0057534245  V400  0.004139834
1H9      0.0056164386  C500  0.004139834
4ohc9    0.005479452   T500  0.004139834
e1c89    0.005068493   L300  0.004139834
soe      0.005068493   F653  0.0036798527
4ohcc89  0.005068493   U300  0.0036798527
4okc89   0.005068493   A300  0.0036798527
okc89    0.0049315067  G520  0.0036798527
oham     0.004794521   T100  0.0036798527
2c79     0.004794521   N236  0.0036798527
s        0.0046575344  S200  0.0036798527
ay       0.004520548   I525  0.0036798527
sam      0.0043835617  S162  0.003219871
san      0.004109589   T550  0.003219871
oe9      0.0039726025  V536  0.003219871
5c89     0.0039726025  V633  0.003219871
As a comparison, here is a frequency table for plaintext words in the "Latin Herb Garden" compared with the Herbal Folios:

Top 40 words in source and target

Source                 Target
------                 ------
8am      0.038182747   et       0.021159153
1oe      0.020316487   in       0.0096596135
1oy      0.014599285   si       0.0096596135
s        0.011638591   non      0.008279668
oy       0.010107198   quae     0.006899724
19       0.010107198   hoc      0.005519779
am       0.009596733   dum      0.0045998157
8ay      0.00949464    huius    0.004139834
89       0.0089841755  quam     0.004139834
K9       0.008882083   haec     0.004139834
2oe      0.0077590607  quod     0.0036798527
8an      0.0073506893  ut       0.0036798527
1c9      0.0071465033  per      0.0036798527
oe       0.006738132   tibi     0.0036798527
ay       0.006636039   cum      0.0036798527
2o       0.006533946   tum      0.0036798527
oham     0.005104645   tamen    0.003219871
oh9      0.0047983667  nec      0.003219871
4ok19    0.0047983667  forte    0.003219871
9        0.0046962737  est      0.003219871
8ae      0.0044920878  illa     0.0027598895
2oy      0.004389995   sub      0.0027598895
Koe      0.004287902   inter    0.0027598895
4oh19    0.004287902   vires    0.0027598895
7am      0.0041858093  genus    0.0027598895
Koy      0.0040837163  sed      0.0027598895
8oy      0.0039816233  lilia    0.0027598895
29       0.0039816233  quoque   0.0022999079
1H9      0.0039816233  se       0.0022999079
ok9      0.0039816233  iam      0.0022999079
1o       0.0038795304  undique  0.0022999079
8oe      0.0034711587  quis     0.0022999079
1o89     0.003369066   aut      0.0022999079
ohae     0.003369066   etiam    0.0022999079
1am      0.003266973   satis    0.0022999079
1c79     0.003266973   viscera  0.0022999079
y        0.00316488    odore    0.0022999079
1ay      0.0029606943  de       0.0022999079
okoe     0.0028586013  quo      0.0022999079
sam      0.0028586013  ore      0.0018399263