Bacon’s Cipher in Python
Bacon's Cipher is a very simple and very old method of encoding a message and is now only of interest as a historical relic, but it also provides an interesting little programming project. In this article I will code it in Python.
Bacon's Cipher
Bacon's Cipher was created in 1605 by Francis Bacon, a very interesting figure who, amongst other things, made valuable contributions to "natural philosophy" which evolved into what we call science. His Wikipedia article is worth reading but I'll just concentrate on his eponymous cipher.
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dd7bc2b-4f40-46f3-8972-1ded06818925_391x480.jpeg)
To encipher a message each letter is first represented by a sequence of 5 As and Bs according to the following table. This of course can be considered to be a form of binary so I have also shown the corresponding sequences of 0s and 1s. (The original form of the cipher combined the letter pairs I,J and U,V so only had 24 encodings. I will use the full 26 encodings shown below.)
For this project I will use the binary encodings because they form a sequence of 26 consecutive numbers which we can easily convert to and from ASCII using addition and subtraction.
After creating a list of either As and Bs or 0s and 1s we need another piece of text the same length or longer. In the original form of the cipher two typefaces were used, one for letters corresponding to A and another for letters corresponding to B. In practice any two types of distinct formatting can be used, and I will use lower case for 0 and upper case for 1.
To decipher the text the binary pattern (or As and Bs) is recreated according to the case or other formatting of the text, and each block of 5 characters in the pattern is converted back to the corresponding letter. The deciphering process creates a string of uppercase letters as the punctuation, whitespace and case of the original plaintext is lost.
Let's look at a brief example, using "Bacon." as plaintext and "Francis Bacon was a statesman and philosopher" as the target text.
As you can see the target text is truncated to the necessary length but it could be used in full with the superfluous letters all using the same formatting.
The Project
This project consists of the following files:
baconscipher.py
baconscipherdemo.py
which you can grab from the Github repository.
Let's first look at baconscipher.py.
def encipher(plaintext, target_text):
"""
Encipher plaintext to target text with binary 0
as lower case and binary 1 as upper case
"""
# remove all non-alphabetic characters
# from plaintext and convert to upper case
plaintext = ''.join([c for c in plaintext if c.isalpha()]).upper()
# get a string of 0s and 1s representing the
# formatting of the enciphered message letters
binary_string = _string_to_bit_pattern(plaintext)
# format the target text letters as upper or lower case
# according to the bit pattern
enciphered = _cased_text_from_bit_pattern(target_text, binary_string)
return enciphered
def decipher(enciphered):
"""
Decipher enciphered text assuming lower case letters
represent binary 0 and upper case letters represent binary 1
"""
print("\nDeciphering\n===========\n")
# remove everything except letters
enciphered = ''.join([c for c in enciphered if c.isalpha()])
length = len(enciphered)
letter_quintet = ""
bit_pattern = ""
deciphered = []
for i in range(0, length, 5):
# grab next 5 letters
letter_quintet = enciphered[i: i+5]
# get corresponding string of 5 bits
bit_pattern = _letter_quintet_to_bit_pattern(letter_quintet)
# get letter corresponding to bit pattern
letter = chr(int(bit_pattern, 2) + 65)
print(f"{letter_quintet} {bit_pattern} {letter}")
deciphered.append(letter)
return "".join(deciphered)
#------------------------------------------------------------
# "PRIVATE" FUNCTIONS
#------------------------------------------------------------
def _string_to_bit_pattern(string):
"""
Convert string of letters to string of
corresponding 5-bit patterns
"""
binary_list = []
bit_pattern = ""
for letter in string:
# get ASCII code, subtract 65 and format as 5 bit string
bit_pattern = format(ord(letter) - 65, '05b')
binary_list.append(bit_pattern)
return "".join(binary_list)
def _cased_text_from_bit_pattern(target_text, binary_pattern):
"""
Set case of target text according to string of 5-bit patterns
0 => lower case
1 => upper case
Non-alpha characters are skipped
"""
cased_text = []
index = 0
for bit in binary_pattern:
while not target_text[index].isalpha():
cased_text.append(target_text[index])
index += 1
if bit == "0":
cased_text.append(target_text[index].lower())
else:
cased_text.append(target_text[index].upper())
index += 1
return "".join(cased_text)
def _letter_quintet_to_bit_pattern(letter_quintet):
"""
Convert string of 5 letters to corresponding bit pattern
Lower case => 0
Upper case => 1
"""
bit_pattern = []
for c in letter_quintet:
if(c >= "a" and c <= "z"):
bit_pattern.append("0")
elif(c >= "A" and c <= "Z"):
bit_pattern.append("1")
return "".join(bit_pattern)
encipher
The first line actually carries out three separate tasks. The list comprehension selects only alphabetical characters, the resulting list is joined into a string, and the string is then converted to upper case.
Then we call a function to create a string of bit patterns corresponding to the letter/bits mappings shown in the table above.
Finally we call another function to set the case of the individual letters in the target text according to the bit pattern string.
I'll describe the functions used here in more detail as we get to them.
decipher
Firstly we strip out everything from enciphered which is not a letter, and then declare a few variables for use within the for loop.
Next is the for loop itself: note that we iterate from 0 to the length of the enciphered text, and in particular note that we have a step of 5 as within the loop we process 5 characters at a time.
Within each iteration we grab the next 5 letters and then pass them to a function to create the corresponding sequence of 0s and 1s based on the case of the letters. The next line of code converts the string of bits to an integer, adds 65 to get the ASCII code, and then gets the actual letter using the chr function. This letter is then added to the deciphered list.
As you can see I have printed the variables so we can see the decipherement process in action.
Finally we return the deciphered list joined into a string.
_string_to_bit_pattern
This is used by encipher as we saw earlier, and to quote my own docstring this function will "convert string of letters to string of corresponding 5-bit patterns". To do this it iterates the letters in the string and for each one grabs the ASCII code using ord, subtracts 65 (because A = 65 in ASCII, whereas Bacon's encodings start at 0) and formats it as a 5-character binary string. This is then added to a list which we finally join and return.
_cased_text_from_bit_pattern
This is also used by the encipher function, and takes the target text and string generated by _string_to_bit_pattern. The letters in the target text are set to lower case or upper case depending on the value of the corresponding bit. Note that any non-alpha characters are skipped.
To do this we iterate the binary pattern, and within the loop firstly use a while loop to skim non-alpha characters, adding them unchanged to the final result list. When we hit a letter it is converted to upper or lower depending on the value of the current bit, and this is added to the result list. This list is then joined and returned.
_letter_quintet_to_bit_pattern
The last function is used by decipher and creates a string of 5 0s and 1s depending on the case of the letters in the input string. Note that in Python we can use comparison operators on letters without first converting them to numbers. Nice!
Too Many Functions?
No, not IMHO. The three pseudo-private functions (prefixed with _) are only called once so the code in them could be embedded within the functions that call them. However, each of these three functions carries out one specific and separate task and I feel the overall code is neater, better organised and easier to read, test and maintain if separated out into several functions.
Validation (Lack Of)
If this were a piece of production code for serious use it would be necessary to carry out some validation of the parameters, specifically checking that the target text is at least 5 times longer than the plain text, and that the enciphered text is formatted in a way that can be deciphered using Bacon's Cipher. As this is only a simple programming exercise I have not bothered doing so.
Now let's move on to baconscipherdemo.py, a simple bit of code to try out the previous module.
baconscipherdemo.py
import baconscipher
def main():
print("------------------")
print("| codedrome.com |")
print("| Bacon's Cipher |")
print("------------------\n")
plaintext = "Knowledge and human power are synonymous."
target_text = "There were under the law, excellent King, both daily sacrifices and freewill offerings; the one proceeding upon ordinary observance, the other upon a devout cheerfulness: in like manner there belongeth to kings from their servants both tribute of duty and presents of affection. In the former of these I hope I shall not live to be wanting, according to my most humble duty and the good pleasure of your Majestyundefineds employments: for the latter, I thought it more respective to make choice of some oblation which might rather refer to the propriety and excellency of your individual person, than to the business of your crown and state."
enciphered = baconscipher.encipher(plaintext, target_text)
print("Enciphered\n==========")
print(enciphered)
deciphered = baconscipher.decipher(enciphered)
print("\nDeciphered\n==========")
print(deciphered)
if __name__ == "__main__":
main()
The plaintext and target_text strings are both quotes from works by Francis Bacon. The plaintext comes from
Novum Organum; Or, True Suggestions for the Interpretation of Nature
The target_text is from
In main we simply pass these to baconscipher.encipher and print the result, which is then passed to baconscipher.decipher, again printing the result.
Now we can run the program with this command.
python3 baconscipherdemo.py
This is the end result.
Obviously the spaces and any punctuation are lost in the decipered text, which is also all upper-case. I hope you found this little project interesting. Please let me know.