The Voynich text generator
Description of the table:
The table is a set of 168 individual elements. For convenience it is set out in 14
rows of 12 elements. Each element contains several items. The first item is the count of the
occurrences of the pair throughout the sample. Below this number is the unique pair
as found in the EVA sample. There are 168 of these in the sample. For the purposes of this
table I have omitted lines with 'weirdo' characters to avoid complicating the table unduly.
Below the pair is a set of 1 to 3 flags. These indicate whether the pair can start a VMS word,
be in mid word position or end a VMS word. These are S, M and X respectively. Below this is
a list of the letters that are found to follow the pair to form initial, mid
or terminating triplets. These letters also carry flags, drawn from a set of 3. These are I if
the triplet can initiate a word, C if it can continue a word or X if it can terminate a word.
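As an illustration only, one element of the table could be held in code roughly as
follows. This is a minimal Python sketch and not part of the original table: the names
PairEntry, TABLE, can_start and letters_with are hypothetical, and only a few letters from
the 'ch' and 'in' elements have been copied in.

```python
from dataclasses import dataclass, field


@dataclass
class PairEntry:
    count: int      # occurrences of the pair in the sample
    flags: str      # pair flags: S (start), M (mid word), X (end)
    letters: dict = field(default_factory=dict)  # next letter -> triplet flags (I, C, X)


# Partial excerpt: the letter lists are deliberately truncated.
TABLE = {
    "ch": PairEntry(1075, "SMX", {"o": "ICX", "m": "X", "p": "I"}),
    "in": PairEntry(441, "MX", {"d": "CX", "y": "X"}),
}


def can_start(pair):
    """True if the pair carries the S flag."""
    return "S" in TABLE[pair].flags


def letters_with(pair, flag):
    """Letters under `pair` whose triplet flags include `flag`."""
    return [l for l, f in TABLE[pair].letters.items() if flag in f]


print(can_start("ch"), can_start("in"))
print(letters_with("ch", "I"))
```

Keeping the flags as short strings mirrors the way they are written in the table, so an
entry can be compared with its element by eye.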
Use of the table:
If we take the example of the pair 'in' we find that the pair never starts a word in the
EVA sample and so cannot be used to start a word. If we now look at the pair 'ch' it has
all three flags and so we can use this as the start of an example word.
Using 'ch' we scan the list of letters in the column beneath it to find a letter with the I
flag that can form an initial triplet. There are quite a few for 'ch'. In this case let us
choose 'o'.
Now we have the sequence 'cho'. To find the next letter we now take the 'ho' from the end
and use this as a lookup to that element. We find that the flags for 'ho' show that it
cannot start a word but can be a mid word pair or a terminator. Also all the letter flags
indicate this being either C for continue word or X for terminating a word.
Let us choose 'l' for our next selection, giving us 'chol'. The pair is now 'ol', and from
its list we choose 'd', giving us 'chold'. Now we continue with the pair
'ld'. We find that 'ld' has 3 continuing letters. In this case I will choose 'a' as it
forms the beginning of the popular word terminator 'aiin'.
Now we have 'cholda'. Looking up 'da' we find only 3 continuation letters. These are 'l',
'r' and 'i'. This restricts the options whenever this path through the table is reached,
and would account for a high percentage of word endings through this route,
particularly in the case of 'l' and 'r', as these are also flagged as word terminating by
the X flag. So with these we would have words ending 'dal' and 'dar'.
In this case let us choose the 'i' as indicated above. This gives us 'choldai'. Looking
up 'ai' we now find that we have 4 terminators. Two of these, 'n' and 'r', could provide
the words 'choldain' or 'choldair'. They could however also continue a word if necessary.
Let us select 'i' however. Here it can only continue the word, as its other flag marks
it as an initiator and not a terminator. This gives us 'choldaii'. Moving to the entry for
'ii' we find only one continuator, which would append another 'i'. All other letters can
only terminate. We now have four word ending choices, which would produce the words
'choldaiim', 'choldaiin', 'choldaiir' and 'choldaiis'.
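The walk above can be checked mechanically. The following is a minimal Python sketch under
stated assumptions, not a definitive implementation: TABLE holds only the entries needed
for this example, with flags as I read them from the table in this document, and
check_word is a hypothetical helper. It also assumes that a three letter word needs both
I and X on its single triplet, which the text does not state, and it does not test the M
flag of the pairs.

```python
# pair -> (pair flags, {next letter: triplet flags}); partial excerpt only
TABLE = {
    "ch": ("SMX", {"o": "ICX"}),
    "ho": ("MX", {"l": "CX"}),
    "ol": ("SMX", {"d": "ICX"}),
    "ld": ("MX", {"a": "C"}),
    "da": ("SMX", {"l": "ICX", "r": "ICX", "i": "IC"}),
    "ai": ("SM", {"m": "X", "n": "CX", "r": "CX", "s": "X", "i": "IC"}),
    "ii": ("SM", {"m": "X", "n": "X", "r": "X", "s": "X", "i": "IC"}),
}


def check_word(word):
    """Return True if every step of `word` is permitted by TABLE."""
    if len(word) < 3:
        return False
    start = word[:2]
    if start not in TABLE or "S" not in TABLE[start][0]:
        return False  # the first pair must be able to start a word
    for i in range(2, len(word)):
        pair, letter = word[i - 2:i], word[i]
        if pair not in TABLE:
            return False
        flags = TABLE[pair][1].get(letter)
        if flags is None:
            return False  # this letter never follows this pair
        needed = set()
        if i == 2:
            needed.add("I")  # first triplet must be able to initiate
        if i == len(word) - 1:
            needed.add("X")  # last triplet must be able to terminate
        if not needed:
            needed.add("C")  # any triplet in between must be able to continue
        if not needed.issubset(flags):
            return False
    return True


print(check_word("choldaiin"))  # True
print(check_word("cholda"))     # False: 'a' after 'ld' can only continue
```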
NOTE: We need not generate a triplet at all. If the current pair has the X flag it can
terminate a word without the need for selecting a letter from its list. This does not
explain single letters as words in the VMS.
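The whole procedure, including the NOTE, could be sketched as a random walk through the
table. This is a hedged Python illustration rather than the method used to build the
table: generate_word, stop_prob and max_steps are hypothetical names, TABLE is again only
a partial excerpt with flags as I read them from the table in this document, and the
choice between stopping and continuing is made with a plain coin toss, which is my
assumption. The NOTE is read here as applying to the start pair, so that a start pair
with the X flag may stand alone as a two letter word.

```python
import random

# pair -> (pair flags, {next letter: triplet flags}); partial excerpt only
TABLE = {
    "ch": ("SMX", {"o": "ICX"}),
    "ho": ("MX", {"l": "CX"}),
    "ol": ("SMX", {"d": "ICX"}),
    "ld": ("MX", {"a": "C"}),
    "da": ("SMX", {"l": "ICX", "r": "ICX", "i": "IC"}),
    "ai": ("SM", {"m": "X", "n": "CX", "r": "CX", "s": "X", "i": "IC"}),
    "ii": ("SM", {"m": "X", "n": "X", "r": "X", "s": "X", "i": "IC"}),
}


def generate_word(table, rng, stop_prob=0.5, max_steps=50):
    """Build one word by walking the pair table (assumed stopping rule)."""
    starts = [p for p, (flags, _) in table.items() if "S" in flags]
    word = rng.choice(starts)
    # NOTE rule: a start pair with the X flag may end the word with no triplet.
    if "X" in table[word][0] and rng.random() < stop_prob:
        return word
    for step in range(max_steps):
        letters = table[word[-2:]][1]
        wanted = "I" if step == 0 else "CX"
        # Keep letters with a suitable flag that either terminate (X) or
        # lead to a pair that has an entry in this table.
        usable = [l for l, f in letters.items()
                  if any(w in f for w in wanted)
                  and ("X" in f or word[-1] + l in table)]
        if not usable:
            raise ValueError(f"dead end at {word!r}")
        letter = rng.choice(usable)
        flags = letters[letter]
        word += letter
        may_continue = (step == 0 or "C" in flags) and word[-2:] in table
        if "X" in flags and (not may_continue or rng.random() < stop_prob):
            return word
    raise RuntimeError("no terminator reached within max_steps")


rng = random.Random(1)
print([generate_word(TABLE, rng) for _ in range(5)])
```

With only this excerpt loaded the output is limited to variations on the example word.
A walk with the full table loaded would also have to cope with pairs that have no usable
letter, which this sketch simply reports as an error.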
The question to ask here is: what property links one continuator to another more often
than through other routes? It is these links between continuators that I believe are the
way forward.
Now for the table itself:
1075 818 454 450 441 415 405 403 366 353 251 240
ch ho ai ol in hy ii or da sh ok he
SMX MX SM SMX MX MX SM SMX SMX SMX SMX MX
e ICX d CX m X d ICX d CX d CX m X l X l ICX e ICX o ICX o CX
l IX k CX n CX o ICX y X s CX n X y X m IX o ICX y ICX s CX
m X l CX r CX s ICX c C a C r X c IC n IX y IX a IC t CX
o ICX m X s X y ICX k C s X o IC r ICX a IC c IC y CX
s ICX p CX i IC a IC p C i IC s IC s IX c I e IC a C
y ICX r CX c C c IC t C a C d I d I s IC c C
a IC s CX t IC d C i IC f I h C e C
c IC t CX e C k I k I k C
d IC y X y I t I
k IC a C
p I c C
r IC e C
t IC f C
i C
o C
233 229 212 208 196 194 163 162 151 149 149 136
ot qo ct th dy od tc kc ar ey ha ee
SMX SMX SM MX SMX SMX SM SM SMX MX MX SMX
o ICX a I h CX d X d ICX g X h ICX h ICX y X r X d CX e ICX
y ICX c I y X l X c IC l IX e I k C l CX g X
a IC d I c I o CX k I o ICX a C m X n X
c IC e I h I y CX o I y IX c C n X o ICX
e I f I a C p I a IC o C r CX s ICX
l I k I e C t I s IC i C y ICX
s IC l I r C c C p C a IC
e C o I e C k C
p I
s I
t IC
y I
121 94 87 87 63 79 78 77 76 67 62 59
eo ko al to do yt ka yk ky ty ck kh
MX SMX MX SMX SMX SM SMX SM SMX SMX SM MX
d CX d ICX a X d ICX o X y IX i ICX y IX d IX d IC h ICX o CX
l CX l ICX d CX l ICX a IC a IC l CX a IC t IC c C y IX y X
m X m X g X m X d I c IC m X c IC c C o C c I a C
r CX r ICX s X r ICX i I d I n X e IC k C s C o I e C
s X s IX y X y IX k I e I r IX o I t C h C
a C y X c C e IC l I o IC y X s IC
c C a IC o C i IC r IC s I d C
e C c I s I t I
k C i IC a C
t C k I c C
t I k C
o C
59 57 53 52 46 45 42 42 41 41 41 39
oc ke ta pc am yd dc om hc ir yc ld
SM SM SM SM MX SMX SM X M MX SM MX
h IC g X l ICX h ICX o X s X h IC f C d CX h IC g X
k IC o IC m X o X y IC t C h C o X p C s X
t IC y X n X y X a IC k C c C y X
c IC r IX l IC p C i C a C
e IC i IC o I t C c C
s C o I e C o C
39 37 36 35 35 35 34 33 33 31 30 28
oi ks so ea oa os oy cp ph lo op an
SM SM SMX M SM SMX SMX SM M SMX SMX X
n X h ICX c IC l CX l ICX h ICX k IC h IC a C d ICX c IC
i IC d IC m X n X s X t I e C l ICX o IC
r I e I n X r ICX a IC d C h C m X y IC
i I r X i IC c IC o C r IX a C
k I s X o I o I c IC
l I y X k I
o I i C i C
r I
s I
t I
y I
27 26 26 23 22 21 21 20 19 18 17 17
es oe sa te ro hk ry ly ds ts ls sc
MX SM SM SMX SMX M SMX MX SMX SM MX SM
e CX p X i IC y IX l IX y X c I s C f X h IC y X h IC
y X e IC r I e IC m X a C t C h IC a C k I
o C k IC t I o IC d IC c C h C
o C k I e C s C
t I o C
16 14 14 14 14 13 13 12 12 11 10 9
ra ek ht lc po rc sy cf fh yp de oo
SM SM M SM SM SM SX SM S SM SM SM
l X y X y X h IC d X h IC a I h IC y X c IC y IX r X
m X s I a C f C l IX k I a C o I e IC s X
r X a C c C p C c I h C o I c IC
i IC c C o C e I o C a C
e C s C i I i C
h C r IC k C
o C t I
8 8 8 7 7 7 7 7 6 6 5 5
fc py ys dl en hd hs rd fo hl ad et
SM SMX SX SMX X M MX M SM MX MX SM
h IC d IC h IC o IX s X y X y X l IX o C y X y IX
c C s I y X e C a I a C a I
a C h C c I o I
d IC
5 5 5 5 5 4 4 4 4 4 3 3
im la of se yo ay dk hp lt ps as dd
MX MX SM SMX SM MX S M M S MX SM
m X l X c IC d X a I t C o I a C y X h I a C o I
r X o C y X d I s I c C a C y I
e C y C e I r I y I c C
i C k C o C
3 3 3 3 3 3 3 3 2 2 2 2
dg ec hh hr is pa ri ya ak ao at dt
X M M MX X SMX M SM M SM SMX SM
h C y X a C l X n X i IC y X r X a I c I
t C e C i I o I
d C
2 2 2 2 2 2 2 2 2 2 2 2
ed eg fy ic il le mo nd ny pd re rs
X X M M M M X MX X S MX M
d C h C s X r X y X r I s X h C
d C y I
i C
l C
n C
o C
r C
2 2 2 2 2 2 2 1 1 1 1 1
sk ss td ye ac ae ap co cs di dp dr
S M SM S M M M S S SM S M
a I h C g X e I h C n X c C y I h I i I c I a C
y I a I
1 1 1 1 1 1 1 1 1 1 1 1
el ep er fa fs hf hm id ik io lg lk
X X X S S M X M M M X M
c I h I y C y X a C d C o I
1 1 1 1 1 1 1 1 1 1 1 1
mm nc no oh oq qk rg rl sf tl yf yr
X M M S S S X X X M S X
h C l X e I o I o I y X o I