The Laymen's Guide to the Entrails of RSA

This demonstration assumes that you already know what the RSA algorithm does, just not how it does it. In addition, I would like to say that every computation shown in this page has been generated for you based upon your selection of p, q, and the phrase to encrypt. No calculation here is faked (specifically, the de-encryption is not just regurgitating your original phrase). And now without further ado...

The Laymen's Guide to the Entrails of RSA:

p = 11, q = 13, and n = 143:

The RSA algorithm starts with the random selection of two large prime numbers. A prime number is a number divisible only by itself, and 1. For the sake of illustration, you were restricted to selecting somewhat small prime numbers. Your choice for p was 11, and for q was 13. The variable n is the product of p and q, which is 143.

PHI(n) = 120:

Leonhard Euler (1707-1783) discovered what is now referred to as Euler's PHI function (also known as his totient function). This function returns the number of numbers less than n that are relatively prime to n. Two numbers are relatively prime when they have no common factors except 1. For example, PHI(15) = count(1,2,4,7,8,11,13,14) = 8. More often than not, two randomly selected integers will be relatively prime to each other (around a 61% chance, see reference #3, p301). When n is a prime number, then any number less than n is a relative prime to n. Therefore PHI(n) = n-1 when n is prime.

If we used the PHI function on a product of two primes, such as our n variable, we can show that PHI(pq) = (p-1)(q-1). The product pq contains p factors of q, and q factors of p. Subtract these factor counts then add 1, because one of the factors (pq) was removed twice. In other words, pq - q - p + 1 = (p-1)(q-1). In our case, PHI(n) = PHI(143) = (11-1)(13-1) = 120. We will see why this value is of interest later.

e = 11:

The value e is our encrypting exponent. e is a small value chosen as a relative prime to phi(n). (e must be relatively prime to phi(n) so that we are later assured of finding our decrypting exponent d.) Given our PHI(n), potential values for e are 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, etc. Let's randomly select the value of 11 for e.

d = 11:

The value d is our decrypting exponent. To understand d, we must introduce the modulus operator. You can think of the modulus operator returning the remainder when one integer is divided by another. For example, 14 mod 5 = 4, because 14 divided by 5 has a remainder of 4. x mod y is the same as x - int (x / y) * y, where int() is a function that drops any digits to the right of the decimal point. Public key algorithms (such as RSA) routinely perform calculations on very large numbers (numbers often greater than the number of atoms in the universe). Modular arithmetic becomes useful because it keeps large values down to a reasonable size (0 <= x mod y < y).

Now, given our understanding of the modulus operator, we define d as the modular multiplicative inverse of e. A number is a multiplicative inverse of another number if their product is 1. (For example, 7 * 1/7 = 1.) For d to be a modular multiplicative inverse of e, we must find d such that e*d mod PHI(n) = 1. If ed mod PHI(n) = 1, then we know that ed is one more than some multiple of PHI(n), as shown:

    ed mod PHI(n) = 1   (therefore...)

    ed = PHI(n)*k + 1   (Assume that k is some positive integer)

These kind of equations (in our case, e*d - PHI(n)*k = 1) can be solved using the extended Euclidean algorithm. This is an extension to the greatest common denominator algorithm popularized by Euclid (365BC - 300BC) in Volume 7 of his book Elements. Given ax + by = gcd(a,b), where gcd(a,b) means the greatest common denominator for a and b, the extended Euclidean algorithm can find x and y. (The details of the extended Euclidean algorithm are beyond the scope of this discussion. See reference #2 p247 with the ERRATA 2.0 corrections, or reference #3 p302.) In our case, the greatest common denominator is 1, because we made sure that gcd(e,PHI(n)) = 1 when we selected e to be relatively prime to PHI(n).

    a * x + b * y = gcd(a,b)           (Map this to our equation...)

    e * d + PHI(n) * k = gcd(e,PHI(n)) (the gcd is 1, therefore...)

    11 * d + 120 * k = 1

Using the extended Euclidean algorithm, we find d and k such that:

    (11)(11)+(120)(-1) = (121) + (-120) = 1

So d = 11 (the k is discarded information).

If e wasn't relatively prime to phi(n), then there may have not been a d to find, or there may have been more than one value for d (See reference #1, p).

In this specific case, e = d, which is generally considered weak and is a crypto "no no".

Public Key (11, 143), Private Key (11,143):

We have now generated our key pair, where the public key is (e,n) = (11,143), and the private key is (d,n) = (11,143). It takes 8 bits to represent 143, which means we just generated 8 bit keys. Normally RSA key lengths are 768 bits or higher.

Now that the key pair has been generated, we should throw out p and q. We don't need them, and the private key could easily be deduced if either of them were to be discovered.

Either key can be used to encrypt, and the other key can decrypt. We will be using the public key to encrypt. Normally the data to be encrypted is compressed, then broken up into chunks (what most people call blocks) that are smaller than n, each of which gets encrypted. The chunks must be smaller than n because the encrypting/decrypting operations are an exponentiation, mod n. Since we have a relatively small n, we will be using small chunks. For demonstration purposes, we will encrypt on a character by character basis, each chunk simply being the character's ASCII value. (Not real world, but that's OK.)

Here is the message you selected to encrypt, along with each character's ASCII value:

T	h	i	s		i	s		a		t	e	s	t	.
84	104	105	115	32	105	115	32	97	32	116	101	115	116	46

Given the public key (e, n), which in this case is (11, 143), we can encrypt each ASCII value by raising it to the e power, mod n.

c	=	m^e	mod n

c1	=	84^11	mod 143	= 128
c2	=	104^11	mod 143	= 104
c3	=	105^11	mod 143	= 105
c4	=	115^11	mod 143	= 71
c5	=	32^11	mod 143	= 76
c6	=	105^11	mod 143	= 105
c7	=	115^11	mod 143	= 71
c8	=	32^11	mod 143	= 76
c9	=	97^11	mod 143	= 141
c10	=	32^11	mod 143	= 76
c11	=	116^11	mod 143	= 116
c12	=	101^11	mod 143	= 134
c13	=	115^11	mod 143	= 71
c14	=	116^11	mod 143	= 116
c15	=	46^11	mod 143	= 2

Given the private key (11,143), we can decrypt each ASCII value by raising it to the d power, mod n:

m	=	c^d	mod n

m1	=	128^11	mod 143	= 84	= 'T'
m2	=	104^11	mod 143	= 104	= 'h'
m3	=	105^11	mod 143	= 105	= 'i'
m4	=	71^11	mod 143	= 115	= 's'
m5	=	76^11	mod 143	= 32	= ' '
m6	=	105^11	mod 143	= 105	= 'i'
m7	=	71^11	mod 143	= 115	= 's'
m8	=	76^11	mod 143	= 32	= ' '
m9	=	141^11	mod 143	= 97	= 'a'
m10	=	76^11	mod 143	= 32	= ' '
m11	=	116^11	mod 143	= 116	= 't'
m12	=	134^11	mod 143	= 101	= 'e'
m13	=	71^11	mod 143	= 115	= 's'
m14	=	116^11	mod 143	= 116	= 't'
m15	=	2^11	mod 143	= 46	= '.'

(Note: n (143) must be larger than the numbers you wish to encrypt. Any value mod 143 will be less than 143. This is why the smallest primes you were allowed to choose were 11 and 13. 11*13 is 143, which is large enough to represent any 7-bit ASCII character (0 to 127)).

Why?

Why does this exponentiation work? It works because (m^e)^d mod n is equivalent to m mod n. The decrypting exponent d "cancels out" the encrypting exponent e.

Let's dive a little deeper into the details by first explaining why we need this PHI thing. Euler demonstrated that when a number that is less than and relatively prime to n is multiplied by itself phi(n) times, the result is always one more than some multiple of n. So if we take the variable a to the power of phi(n), then mod n, we get 1. This looks like:

    a^phi(n) mod n = 1, where gcd(a,n) = 1

Now if we multiply by another a, we get:

    a^(phi(n)+1) mod n = a, where gcd(a,n) = 1

For us the variable n is the product pq, so the variable a could take on any of the n - phi(n) values that would make it not relatively prime to n. But we want a to be able to take on any value less than n. Fortunately, it can be shown (See Reference #1, p835) that (given the above equation) the variable a doesn't have to be relatively prime to n, as long as n is the product of any number of distinct (no repeats) primes.

    a^(phi(n)+1) mod n = a,
    where n is the product of any number of distinct primes.

(Note that the following is NOT true: a^phi(n) mod n = 1, where n is the product of any number of distinct primes. We can't just change the assumptions behind Euler's theorem that easily!)

Now remember from earlier that ed = PHI(n)*k + 1?

    (m^e)^d mod n  ->  m^(ed) mod n

    ->  m^(k*PHI(n)+1) mod n  ->  m mod n

References:

1. Thomas H. Cormen, Charles E. Leiserson, and Ronald L. Rivest, Introduction To Algorithms, McGraw-Hill, 1991.

2. Bruce Schneier, Applied Cryptography: Protocols, Algorithms, and Source Code in C, 2nd ed, John Wiley & Sons, Inc., 1996.

3. Donald Knuth, The Art of Computer Programming: Volume 2, Seminumerical Algorithms, 2nd ed, Addison-Wesley, 1981.

Document content / Perl code by David Youd (youd1@llnl.gov)
Lawrence Livermore National Laboratories
Copyright 1997