Numerical Coding of Word Sections

Imagine for a moment a code which allows an encoder to make free, practically unlimited choices when encoding, but which can only be decoded one way… into one, clear plain-text… at the receiver’s end. The interim coded text would represent the same characters and words in multiple ways, so that any attempt at decipherment, by counting the elements and comparing the counts to various plain texts in various languages, would fail.

Augustus of Brunswick-Lüneburg

There is at least one system I know of which has this ability, and which at the same time is easy and fast to encode and decode. This is the numerical coding system which appears in the 1624 Gustavus Selenus book, “Cryptomenytices et Cryptographiae”. Gustavus Selenus is the playful Latinized pseudonym of August II, Duke of Brunswick-Lüneburg. He used the root “selene” (goddess of the moon) for his name, because of the “moon-root” in the name of his dukedom, and “Gustavus” being an anagram of “Augustus”. He wrote this book on cipher, and a book on chess, under this pseudonym. Cryptomenytices is a compilation of dozens of codes and ciphers, with detailed explanations and examples. Many of them are from the work of Trithemius, or based on it. But many of the systems are unique to Cryptomenytices, or adaptations of previous work of others, with improvements and additions.

The code with which this post is concerned is found on page 360, in book seven of this work. I do not read much Latin, but from what I can deduce from the preface, this code is partially based on, or refers to work by, one Jacobus Silvestri. This man wrote a work on cipher and codes in the early 16th century, and is surprisingly little documented or discussed. In fact the only copy of his book I could find is in the NSA library. I have a copy of this code, and so can see that it is related to the code on page 360 of Cryptomenytices. But the page 360 code is much more elaborate and ingenious, and expands well beyond the simple points of Silvestri.

The code works like this: Both the sender and receiver have a code chart, which assigns progressing numbers first to single letters, then to combinations of letters. “A” is simply “1”. “B” is simply “6”. These can be written as either an Arabic 1 & 6, or Roman “I” and “VI”, or of course, any way one would like.

After the single letters, the code moves on to numbering “consonants before vowels”… starting with “BA” (which is 22 on the code chart). So let’s stop here, and examine the letter string “BA” encoded. When the encoder comes across “BA”, they have a choice: They can simply write it as “22”, in which case the receiver looks up “22” on the chart (which is laid out very clearly, and in order, and so all numbers are quick and easy to find). They immediately know it is “BA” in the plain text. But the encoder had a second choice… they could have used the numerical codes for the B and A separately, and so written it as 6-1. Again, the decoder looks up 6 & 1 on the chart, and again, knows it is the plain text BA. But the choice of encoding means that BA can appear in two different ways in the coded text, confusing any character counts. Even if the use of numbers were suspected, and even if some values were known, how would an investigator relate 22 and 6-1? And if they determined that “E” was “2” (which it is in this code), then they may think 22 is EE, and not BA.
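
The two routes to “BA” can be sketched in a few lines of Python. This is only an illustration: the table holds just the four values quoted in this post (A=1, E=2, B=6, BA=22), and the names CHART and decode are my own, not anything from Selenus.

```python
# Excerpt of the code chart: only the four values quoted in this post.
CHART = {1: "A", 2: "E", 6: "B", 22: "BA"}

def decode(numbers):
    """Look each number up on the chart and join the resulting chunks."""
    return "".join(CHART[n] for n in numbers)

print(decode([22]))    # BA
print(decode([6, 1]))  # BA
print(decode([2, 2]))  # EE, which is a different message from 22
```

Decoding is a plain lookup in both cases; the only freedom is on the encoder’s side.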

214/213= “Intent”, because “int”=214, & “ent”=213

But this is a simple case. The code allows many more variables, as the plain text increases in complexity. For after “vowels after consonants” comes “vowels preceding consonants”, such as AB, AC, and so on. Then strings of three: “vowels preceding two consonants”, such as ABS and so on. The list goes like this:

  1. Individual vowels.
  2. Individual consonants.
  3. One vowel after one consonant.
  4. One vowel before one consonant.
  5. One vowel before two consonants.
  6. Two consonants before one vowel.
  7. One consonant before and after one vowel (3 letters).
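
The seven shapes in this list can be written as vowel/consonant patterns (V, C, CV, VC, VCC, CCV, CVC). Here is a small sketch that sorts a chunk into its category. It assumes A, E, I, O and U are the vowels and every other letter is a consonant, which is my simplification and may not match how the 1624 chart treats letters such as Y, or I/J and U/V. The function names are hypothetical.

```python
VOWELS = set("AEIOU")  # assumption: every other letter counts as a consonant

def pattern(chunk):
    """Return the vowel/consonant shape of a chunk, e.g. 'HER' -> 'CVC'."""
    return "".join("V" if ch in VOWELS else "C" for ch in chunk.upper())

# The seven categories, numbered in the order of the list above.
CATEGORIES = {"V": 1, "C": 2, "CV": 3, "VC": 4, "VCC": 5, "CCV": 6, "CVC": 7}

def category(chunk):
    """Category number from the list above, or None if the shape is not listed."""
    return CATEGORIES.get(pattern(chunk))

for chunk in ["A", "B", "BA", "AB", "ABS", "INT", "ZUT"]:
    print(chunk, pattern(chunk), category(chunk))
```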

All of these combinations are shown in alphabetical order, so an encoder can quickly look up their chosen string of letters. The list is numbered from 1 (for “A”) to 1,521 (for “ZUT”). The encoder takes their strings, looks them up on the chart, and writes down their number value. Elmar Vogt coined the word “chunk” for these strings of letters, as they do not have to follow syllabic or phonetic breaks… they can be chosen at the whim of the encoder.

So let’s look at a sentence, and the choices and the results of encoding it with this system. I previously used “I am here” as an example for the biliteral, so I’ll stick with that as a control. One choice for an encoder might be to break this up: I-AM-HE-RE. Looking up the numbers for these chunks on the code list, we have: 3-147-48-83. But if broken down as IA-M-HER-E, it would change drastically: 257-13-808-2. Do you see the problem for someone out of the loop, who is trying to decode a text? They are presented with 3-147-48-83 and 257-13-808-2, and would have absolutely no way to relate the two. The counts of individual numerals would not make any sense of it. But to a receiver of the code, it is a simple matter of running their finger down the code list, and substituting the appropriate word chunk for each number. In either case it is fast and easy, and the plain text absolutely unambiguous. Not subjective, no anagramming involved, no choices on the decoder’s part at all.
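
To make the encoder’s freedom concrete, here is a minimal Python sketch that lists every way “I am here” can be chunked using only the chart values quoted in this post. The full chart has 1,521 entries, so the real number of choices is larger. The names CODES, DECODE and encodings are hypothetical, and the recursive search is simply my way of enumerating the options, not a procedure from Selenus.

```python
# Excerpt of the chart: only values quoted in this post.
CODES = {"I": 3, "AM": 147, "HE": 48, "RE": 83,
         "IA": 257, "M": 13, "HER": 808, "E": 2, "A": 1}

def encodings(text):
    """Yield every way to split text into chunks that appear in CODES."""
    if not text:
        yield []
        return
    for size in (1, 2, 3):  # chunks on the chart are one to three letters long
        chunk = text[:size]
        if len(chunk) == size and chunk in CODES:
            for rest in encodings(text[size:]):
                yield [CODES[chunk]] + rest

# The receiver's side: one number, one chunk, no choices.
DECODE = {number: chunk for chunk, number in CODES.items()}

for numbers in encodings("IAMHERE"):
    plain = "".join(DECODE[n] for n in numbers)
    print("-".join(map(str, numbers)), "->", plain)
```

Even with only these nine entries there are six different number strings, including 3-147-48-83 and 257-13-808-2, and every one of them decodes back to the same IAMHERE.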

As for the breaks between the number strings, as Selenus points out, these can be written different ways. One choice is nulls, such as “+’s”. The use of crosses such as this was a common Christian habit in the past, to emphasize text. But any null could work. Our code string might be 3a147b48c83, or if Roman, IIIaCXLVIIbXLVIIIcLXXXIII, and so on. There can be one null, multiple nulls, or spaces for nulls. In any case it would not confuse or complicate the system for either the encoder or decoder.
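
A short sketch of why nulls cost the receiver nothing: anything that is not part of a number is simply skipped. It assumes the nulls are never digits (for the Arabic form) or upper-case Roman numeral letters (for the Roman form), which holds for the two example strings above. The function names are my own.

```python
import re

ROMAN = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral):
    """Convert an upper-case Roman numeral such as 'CXLVII' to an integer."""
    values = [ROMAN[ch] for ch in numeral]
    total = 0
    for i, value in enumerate(values):
        # A smaller value written before a larger one is subtracted (IV, XL).
        if i + 1 < len(values) and values[i + 1] > value:
            total -= value
        else:
            total += value
    return total

def strip_nulls_arabic(text):
    """Keep the runs of digits; every other character is treated as a null."""
    return [int(run) for run in re.findall(r"\d+", text)]

def strip_nulls_roman(text):
    """Keep the runs of upper-case Roman letters; the rest are nulls."""
    return [roman_to_int(run) for run in re.findall(r"[IVXLCDM]+", text)]

print(strip_nulls_arabic("3a147b48c83"))
print(strip_nulls_roman("IIIaCXLVIIbXLVIIIcLXXXIII"))
```

Both calls give back 3, 147, 48, 83, the I-AM-HE-RE numbers from the example.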

I consider it a top contender for the code in the Voynich Manuscript. For one thing, as I explained, it would frustrate counting attempts, as the counts of individual characters would bear little or no relationship to the plain text. Also, as I understand it, the Voynich character frequencies do roughly coincide with what one would expect with a core of the numbers 1 through 9, plus some nulls. I’ve seen this core of frequent characters counted as 17 in at least one case, although there are of course a smattering of rare characters, bringing the total to several times this. But these could easily be accounted for as shorthand or some other symbolic representations. And the often occurring “9” tail character, which was popularly a plural suffix (in Latin shorthand, and also Middle and Old Dutch), would make sense both in frequency and placement, in such a scheme. I also like the fact that it would help explain the large number of recurring Voynichese “words”, as any plain text, in any language, can be broken down into often repeating parts. For instance look at this paragraph, and count the number of times “ER” appears… and “AN”, and so on. And lastly, look at the well-discussed “key page” notation, with crosses, and almost-Roman numerals in part:

And compare it to this, as an example from the Selenus code:

Although I feel it is possible to use the Voynich characters to encode with this method, I did look into the possibility that there were complications which could account for some other Voynichese features. I experimented with those to some extent. For one, I had wondered if the gallows could be a “modifier” or “multiplier” of some kind, for the larger numbers in the code. Here is an example of this, from my notes:

As you see, it explores the “what if” the gallows, in two variations, were 1,000 and 100 multipliers for a character (enciphered number) just before. This would help write out some of the larger numbers in different ways. But it is just part of the process of trying out the system, and does not imply that this complication is favored by me.
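
As a rough sketch of that “what if”, here is how such multipliers might expand back into plain numbers. Everything specific here is an assumption for illustration: the letters T and H stand in for the two gallows variations (×1,000 and ×100), and I let the multiplier apply to the digit or digits written just before it. It is not a claim about how the Voynich gallows actually behave.

```python
# Hypothetical stand-ins: 'T' for the x1,000 gallows, 'H' for the x100 gallows.
MULTIPLIERS = {"T": 1000, "H": 100}

def expand(token):
    """Expand e.g. '1T5H21' to 1521.

    Assumes every multiplier is preceded by at least one digit.
    """
    total = 0
    pending = ""  # digits read so far that no multiplier has claimed yet
    for ch in token:
        if ch in MULTIPLIERS:
            total += int(pending) * MULTIPLIERS[ch]
            pending = ""
        else:
            pending += ch
    return total + (int(pending) if pending else 0)

print(expand("1T5H21"))  # 1521, the number for ZUT
print(expand("8H8"))     # 808, the number for HER
print(expand("808"))     # 808, written without any multiplier
```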

I do feel that “working backward” would be a valuable way to explore the Voynich code/cipher. I do feel that with some work, it would be possible to encode in Voynichese with this numerical system, and I have actually succeeded in a small way, in the limited time I applied to it. I explained my ideas to Julian Bunn, and he was working with the idea for a time. He wrote a clever conversion program, and was able to encode some very impressive sections with the method. Below is the text from Francis Bacon’s New Atlantis, “…we have also glasses and means to see small and minute bodies, perfectly and distinctly; as the shapes and colors of small flies and worms, grains, and flaws in gems which cannot otherwise be seen.”, encoded by Julian in Gabriel Landini’s Voynich 101 font:

Well of course this would not follow the same patterns or counts found in the Voynich, but that was not the point at that stage. It was an exercise to discover if this Selenus code variation could encode in Voynich characters. It can. Almost any code or cipher can, really. But after that, one has a starting point, to see how the resulting Voynich-like strings are affected by various choices in the numbering lists, plain text content, breakdown of plain text, and so on. The point would be to see if strings of Voynichese, with the same resulting character and word counts, and other patterns, could be generated with this system or a variation of it, while containing meaningful plain text. And this of course would be the next step: applying the necessary time and effort to do so, and answering the question.

Selenus in His Library-“oh to be a fly on the wall”
