Thursday 14 March 2013

Substitution Techniques


2.2. Substitution Techniques

In this section and the next, we examine a sampling of what might be called classical encryption techniques. A study of these techniques enables us to illustrate the basic approaches to symmetric encryption used today and the types of cryptanalytic attacks that must be anticipated.
The two basic building blocks of all encryption techniques are substitution and transposition. We examine these in the next two sections. Finally, we discuss a system that combines both substitution and transposition.
A substitution technique is one in which the letters of plaintext are replaced by other letters or by numbers or symbols.[1] If the plaintext is viewed as a sequence of bits, then substitution involves replacing plaintext bit patterns with ciphertext bit patterns.

Caesar Cipher

The earliest known use of a substitution cipher, and the simplest, was by Julius Caesar. The Caesar cipher involves replacing each letter of the alphabet with the letter standing three places further down the alphabet. For example,
plain:  meet me after the toga party
cipher: PHHW PH DIWHU WKH WRJD SDUWB

Note that the alphabet is wrapped around, so that the letter following Z is A. We can define the transformation by listing all possibilities, as follows:
plain:  a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C

Let us assign a numerical equivalent to each letter:
a
b
c
d
e
f
g
h
i
j
k
l
m
0
1
2
3
4
5
6
7
8
9
10
11
12


n
o
p
q
r
s
t
u
v
w
x
y
z
13
14
15
16
17
18
19
20
21
22
23
24
25


Then the algorithm can be expressed as follows. For each plaintext letter p, substitute the ciphertext letter C:[2]
[2] We define a mod n to be the remainder when a is divided by n. For example, 11 mod 7 = 4. See Chapter 4 for a further discussion of modular arithmetic.
C = E(3, p) = (p + 3) mod 26
A shift may be of any amount, so that the general Caesar algorithm is
C = E(k, p) = (p + k) mod 26
where k takes on a value in the range 1 to 25. The decryption algorithm is simply
p = D(k, C) = (C k) mod 26
If it is known that a given ciphertext is a Caesar cipher, then a brute-force cryptanalysis is easily performed: Simply try all the 25 possible keys. Figure 2.3 shows the results of applying this strategy to the example ciphertext. In this case, the plaintext leaps out as occupying the third line.



Three important characteristics of this problem enabled us to use a brute-force cryptanalysis:
  1. The encryption and decryption algorithms are known.
  2. There are only 25 keys to try.
  3. The language of the plaintext is known and easily recognizable.


In most networking situations, we can assume that the algorithms are known. What generally makes brute-force cryptanalysis impractical is the use of an algorithm that employs a large number of keys. For example, the triple DES algorithm, examined in Chapter 6, makes use of a 168-bit key, giving a key space of 2168 or greater than 3.7 x 1050 possible keys.
The third characteristic is also significant. If the language of the plaintext is unknown, then plaintext output may not be recognizable. Furthermore, the input may be abbreviated or compressed in some fashion, again making recognition difficult. For example, Figure 2.4 shows a portion of a text file compressed using an algorithm called ZIP. If this file is then encrypted with a simple substitution cipher (expanded to include more than just 26 alphabetic characters), then the plaintext may not be recognized when it is uncovered in the brute-force cryptanalysis.




No comments:

Post a Comment