Polyalphabetic Ciphers
Another way to improve
on the simple monoalphabetic technique is to use different monoalphabetic
substitutions as one proceeds through the plaintext message. The general name for
this approach is polyalphabetic substitution cipher. All these techniques
have the following features in common:
1. A set of related
monoalphabetic substitution rules is used.
2. A key determines which
particular rule is chosen for a given transformation.
The best known, and one
of the simplest, such algorithm is referred to as the Vigenère cipher. In
this scheme, the set of related monoalphabetic substitution rules consists of
the 26 Caesar ciphers, with shifts of 0 through 25. Each cipher is denoted by a
key letter, which is the ciphertext letter that substitutes for the plaintext
letter a. Thus, a Caesar cipher with a shift of 3 is denoted by the key
value d.
To aid in understanding
the scheme and to aid in its use, a matrix known as the Vigenère tableau is
constructed (Table 2.3). Each of the
26 ciphers is laid out horizontally, with the key letter for each cipher to its
left. A normal alphabet for the plaintext runs across the top. The process of
encryption is simple: Given a key letter x and a plaintext letter y,
the ciphertext letter is at the intersection of the row labeled x and
the column labeled y; in this case the ciphertext is V.
To encrypt a message, a
key is needed that is as long as the message. Usually, the key is a repeating
keyword. For example, if the keyword is deceptive, the message "we
are discovered save yourself" is encrypted as follows:
key: deceptivedeceptivedeceptive
plaintext: wearediscoveredsaveyourself
ciphertext: ZICVTWQNGRZGVTWAVZHCQYGLMGJ
Decryption is equally
simple. The key letter again identifies the row. The position of the ciphertext
letter in that row determines the column, and the plaintext letter is at the
top of that column.
The strength of this
cipher is that there are multiple ciphertext letters for each plaintext letter,
one for each unique letter of the keyword. Thus, the letter frequency information
is obscured. However, not all knowledge of the plaintext structure is lost. For
example, Figure 2.6 shows
the frequency distribution for a Vigenère cipher with a keyword of length 9. An
improvement is achieved over the Playfair cipher, but considerable frequency
information remains.
It is instructive to
sketch a method of breaking this cipher, because the method reveals some of the
mathematical principles that apply in cryptanalysis.
First, suppose that the
opponent believes that the ciphertext was encrypted using either monoalphabetic
substitution or a Vigenère cipher. A simple test can be made to make a
determination. If a monoalphabetic substitution is used, then the statistical
properties of the ciphertext should be the same as that of the language of the
plaintext. Thus, referring to Figure 2.5, there
should be one cipher letter with a relative frequency of occurrence of about
12.7%, one with about 9.06%, and so on. If only a single message is available
for analysis, we would not expect an exact match of this small sample with the
statistical profile of the plaintext language. Nevertheless, if the
correspondence is close, we can assume a monoalphabetic substitution.
If, on the other hand, a
Vigenère cipher is suspected, then progress depends on determining the length
of the keyword, as will be seen in a moment. For now, let us concentrate on how
the keyword length can be determined. The important insight that leads to a
solution is the following: If two identical sequences of plaintext letters
occur at a distance that is an integer multiple of the keyword length, they
will generate identical ciphertext sequences. In the foregoing example, two
instances of the sequence "red" are separated by nine character
positions. Consequently, in both cases, r is encrypted using key letter e,
e is encrypted using key letter p, and d is encrypted using key letter t.
Thus, in both cases the ciphertext sequence is VTW.
An analyst looking at
only the ciphertext would detect the repeated sequences VTW at a displacement
of 9 and make the assumption that the keyword is either three or nine letters
in length. The appearance of VTW twice could be by chance and not reflect
identical plaintext letters encrypted with identical key letters. However, if
the message is long enough, there will be a number of such repeated ciphertext
sequences. By looking for common factors in the displacements of the various
sequences, the analyst should be able to make a good guess of the keyword
length.
Solution of the cipher
now depends on an important insight. If the keyword length is N, then the
cipher, in effect, consists of N monoalphabetic substitution ciphers.
For example, with the keyword DECEPTIVE, the letters in positions 1, 10, 19,
and so on are all encrypted with the same monoalphabetic cipher. Thus, we can
use the known frequency characteristics of the plaintext language to attack each
of the monoalphabetic ciphers separately.
The periodic nature of
the keyword can be eliminated by using a nonrepeating keyword that is as long
as the message itself. Vigenère proposed what is referred to as an autokey
system, in which a keyword is concatenated with the plaintext itself to provide
a running key. For our example,
key: deceptivewearediscoveredsav
plaintext: wearediscoveredsaveyourself
ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
Even this scheme is
vulnerable to cryptanalysis. Because the key and the plaintext share the same
frequency distribution of letters, a statistical technique can be applied. For
example, e enciphered by e, by Figure 2.5, can be
expected to occur with a frequency of (0.127)2 0.016, whereas t enciphered
by t would occur only about half as often. These regularities can be
exploited to achieve successful cryptanalysis.[8]
[8] Although the techniques for breaking a Vigenère cipher are
by no means complex, a 1917 issue of Scientific
American characterized this system as "impossible of
translation." This is a point worth remembering when similar claims are
made for modern algorithms.
The ultimate defense against
such a cryptanalysis is to choose a keyword that is as long as the plaintext
and has no statistical relationship to it. Such a system was introduced by an
AT&T engineer named Gilbert Vernam in 1918. His system works on binary data
rather than letters. The system can be expressed succinctly as follows:
ci = pi ki
where
pi
|
= ith binary digit of
plaintext
|
ki
|
= ith binary digit of key
|
ci
|
= ith binary digit of
ciphertext
|
|
= exclusive-or (XOR) operation
|
Thus, the ciphertext is
generated by performing the bitwise XOR of the plaintext and the key. Because
of the properties of the XOR, decryption simply involves the same bitwise
operation:
pi = ci ki
The essence of this
technique is the means of construction of the key. Vernam proposed the use of a
running loop of tape that eventually repeated the key, so that in fact the
system worked with a very long but repeating keyword. Although such a scheme,
with a long key, presents formidable cryptanalytic difficulties, it can be
broken with sufficient ciphertext, the use of known or probable plaintext
sequences, or both.
No comments:
Post a Comment