Tuesday, July 8, 2008

Monoalphabetic Ciphers

With only 25 possible keys the Caesar cipher is far from secure.

If, instead, the key can be any permutation of the 26 alphabetic characters, then there are 26!, or more than 4 x 10^26, possible keys. This is about 10 orders of magnitude greater than the key space for DES and would seem to eliminate brute-force techniques for cryptanalysis. Such an approach is referred to as a monoalphabetic substitution cipher.
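To make the key-space comparison concrete, here is a small sketch that computes the numbers directly. It assumes the standard 56-bit DES key (2^56 keys); the variable names are just for illustration.

```python
import math

# Number of usable keys for each cipher.
caesar_keys = 25                 # shifts 1..25
mono_keys = math.factorial(26)   # every permutation of the 26 letters
des_keys = 2 ** 56               # assumes the standard 56-bit DES key

# How many orders of magnitude larger is the monoalphabetic key space?
orders = math.log10(mono_keys / des_keys)

print(f"Caesar keys:         {caesar_keys}")
print(f"Monoalphabetic keys: {mono_keys:.2e}")
print(f"DES keys:            {des_keys:.2e}")
print(f"Orders of magnitude over DES: {orders:.1f}")
```

The last line comes out a little under 10, which is where the "10 orders of magnitude" figure comes from.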


Put simply: instead of encrypting the data by shifting the alphabet, we replace the alphabet with a jumbled (permuted) alphabet and then map each plaintext letter to its counterpart to get the ciphertext.
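The mapping described above can be sketched in a few lines of Python. This is only an illustration: the function names (make_key, encrypt, decrypt) are made up for this post, and the key is produced with Python's random module, which is fine for a demo but is not a cryptographically secure source of randomness.

```python
import random
import string

ALPHABET = string.ascii_lowercase

def make_key(seed=None):
    """Return a jumbled alphabet: one of the 26! possible keys."""
    rng = random.Random(seed)  # demo only, not cryptographically secure
    letters = list(ALPHABET)
    rng.shuffle(letters)
    return "".join(letters)

def encrypt(plaintext, key):
    """Map each plaintext letter to the letter at the same position in the key."""
    table = str.maketrans(ALPHABET, key)
    return plaintext.lower().translate(table)

def decrypt(ciphertext, key):
    """Invert the mapping: key letter back to the plain alphabet letter."""
    table = str.maketrans(key, ALPHABET)
    return ciphertext.lower().translate(table)

key = make_key(seed=1)
secret = encrypt("attack at dawn", key)
print(key)
print(secret)
print(decrypt(secret, key))
```

Characters outside a-z (spaces, digits, punctuation) pass through unchanged, since the translation table only covers the 26 letters.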


Since this method has so many keys, we might assume that the algorithm is strong, but that would be wrong.

This is because of the redundancy in the English language. If the cryptanalyst knows the nature of the plaintext, then the analyst can exploit the regularities of the language.

As a first step, the relative frequency of the letters in the ciphertext can be determined and compared to a standard frequency distribution for English text.



If the message were long enough, this technique might be sufficient to get the plain text out of the cipher message.
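Counting relative letter frequencies is easy to automate. Here is a minimal sketch; the function name letter_frequencies is just an illustrative choice, and it ignores anything that is not a letter.

```python
from collections import Counter

def letter_frequencies(text):
    """Return {letter: percentage}, ordered from most to least frequent."""
    letters = [c for c in text.upper() if c.isalpha()]
    counts = Counter(letters)
    total = len(letters)
    return {letter: 100 * n / total for letter, n in counts.most_common()}

for letter, pct in letter_frequencies("it was disclosed yesterday").items():
    print(f"{letter} {pct:.2f}%")
```

The resulting percentages can then be lined up against a reference distribution for the language the plaintext is believed to be written in.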


The following table shows the letter frequencies in several other languages:

Letter French German Spanish Esperanto Italian Turkish Swedish
a 7.636% 6.51% 12.53% 12.12% 11.74% 11.68% 9.3%
b 0.901% 1.89% 1.42% 0.98% 0.92% 2.95% 1.3%
c 3.260% 3.06% 4.68% 0.78% 4.5% 0.97% 1.3%
d 3.669% 5.08% 5.86% 3.04% 3.73% 4.87% 4.5%
e 14.715% 17.40% 13.68% 8.99% 11.79% 9.01% 9.9%
f 1.066% 1.66% 0.69% 1.03% 0.95% 0.44% 2.0%
g 0.866% 3.01% 1.01% 1.17% 1.64% 1.34% 3.3%
h 0.737% 4.76% 0.70% 0.38% 1.54% 1.14% 2.1%
i 7.529% 7.55% 6.25% 10.01% 11.28% 8.27% 5.1%
j 0.545% 0.27% 0.44% 3.50% 0.00% 0.01% 0.7%
k 0.049% 1.21% 0.00% 4.16% 0.00% 4.71% 3.2%
l 5.456% 3.44% 4.97% 6.14% 6.51% 5.75% 5.2%
m 2.968% 2.53% 3.15% 2.99% 2.51% 3.74% 3.5%
n 7.095% 9.78% 6.71% 7.96% 6.88% 7.23% 8.8%
o 5.378% 2.51% 8.68% 8.78% 9.83% 2.45% 4.1%
p 3.021% 0.79% 2.51% 2.74% 3.05% 0.79% 1.7%
q 1.362% 0.02% 0.88% 0.00% 0.51% 0 0.007%
r 6.553% 7.00% 6.87% 5.91% 6.37% 6.95% 8.3%
s 7.948% 7.27% 7.98% 6.09% 4.98% 2.95% 6.3%
t 7.244% 6.15% 4.63% 5.27% 5.62% 3.09% 8.7%
u 6.311% 4.35% 3.93% 3.18% 3.01% 3.43% 1.8%
v 1.628% 0.67% 0.90% 1.90% 2.10% 0.98% 2.4%
w 0.114% 1.89% 0.02% 0.00% 0.00% 0 0.03%
x 0.387% 0.03% 0.22% 0.00% 0.00% 0 0.1%
y 0.308% 0.04% 0.90% 0.00% 0.00% 3.37% 0.6%
z 0.136% 1.13% 0.52% 0.50% 0.49% 1.50% 0.02%
à 0.486% 0 0 0 11.74% 0 0.0%
å 0 0 0 0 0 0 1.6%
ä 0 0 0 0 0 0 2.1%
œ 0.018% 0 0 0 0 0 0
ç 0.085% 0 0 0 0 1.26% 0
ĉ 0 0 0 0.66% 0 0 0
è 0.271% 0 0 0 11.79% 0 0.0%
é 1.904% 0 0 0 11.79% 0 0.0%
ê 0.225% 0 0 0 0 0 0
ë 0.000% 0 0 0 0 0 0
ĝ 0 0 0 0.69% 0 0 0
ğ 0 0 0 0 0 1.13% 0
ĥ 0 0 0 0.02% 0 0 0
î 0.045% 0 0 0 0 0 0
ì 0 0 0 0 11.28% 0 0
ï 0.005% 0 0 0 0 0 0
ı 0 0 0 0 0 5.20%* 0
ĵ 0 0 0 0.12% 0 0 0
ñ 0 0 0.03 0 0 0 0
ò 0 0 0 0 9.83% 0 0
ö 0 0 0 0 0 0.87% 1.5%
ŝ 0 0 0 0.38% 0 0 0
ş 0 0 0 0 0 1.94% 0
ß 0 0.31% 0 0 0 0 0
ù 0.058% 0 0 0 3.01% 0 0
ŭ 0 0 0 0.52% 0 0 0
ü 0 0 0 0 0 1.99% 0



An example:

Given the ciphertext:

UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ


1. Count the relative letter frequencies (see text).

2. Guess that P and Z, the two most frequent ciphertext letters, are e and t.

3. Guess that ZW is th, and hence that ZWP is the.

4. Proceeding by trial and error, we finally get:

it was disclosed yesterday that several informal but
direct contacts have been made with political
representatives of the viet cong in moscow
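The first three steps of this example can be reproduced with a short script. This is a sketch, not a full solver: it only counts letters and applies the three guesses from the steps above (P = e, Z = t, W = h), leaving every other ciphertext letter in upper case. The helper name apply_guesses is made up for illustration.

```python
from collections import Counter

CIPHERTEXT = (
    "UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ"
    "VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX"
    "EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ"
)

# Step 1: count how often each ciphertext letter occurs.
counts = Counter(CIPHERTEXT)
top_two = counts.most_common(2)
print(top_two)  # [('P', 16), ('Z', 14)]

# Steps 2 and 3: the guesses so far (ciphertext letter -> plaintext letter).
guesses = {"P": "e", "Z": "t", "W": "h"}

def apply_guesses(ciphertext, mapping):
    """Substitute guessed letters in lower case; leave unknown letters as-is."""
    return "".join(mapping.get(c, c) for c in ciphertext)

partial = apply_guesses(CIPHERTEXT, guesses)
print(partial)
```

In the partial output, lower-case letters are guesses and upper-case letters are still unknown. Step 4 is then a matter of adding more entries to guesses and rerunning until readable words appear.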



