Saturday, August 9, 2014

RSA Encryption Explained


RSA is a common encryption algorithm that I wanted to understand. Bruce Schneier's excellent Applied Cryptography gives a good explanation but I found a few steps missing. Building on his derivation, I am going flesh it out.

What you need to know

Vocabulary
  • Relatively prime (or mutually prime or coprime): two integers are relatively prime when their only common divisor is 1.
  • Congruent: if two integers are have the same value when mod-ed with a third integer, they are congruent.
Notation

There is a lot of modulus arithmetic in cryptography and the notation looks like this:

a ≡ b (mod c)

This should not be read in the fashion of a programmer, like a = b mod c (or a = b % c in most languages). Rather, the mod applies to the whole equation so you should read it more like:

(a ≡ b) (mod c)

Modulus Arithmetic Formulas

You also need to know some formulas.

[f1]    a p - 1 ≡ 1 (mod p)  iff p is prime

This is Fermat's little theorem and the best proof I found is here.

General Maths Formulas

As any school child can tell you:

[f2]    abc = (ab)c

and:

[f3]    ab + c = ab ac

Also, the result of a mod is just the remainder of a division so it should be obvious:

[f4]    a mod b = c a = kb + c

where k is just some integer.

The Derivation

Choose two large prime numbers, p and q. Let n be the product of the two.

[d1]    n = pq

Why they must be prime will become apparent but it involves [f1] above.

Choose any integer e that is relatively prime to n.

Then, calculate d such that:

(d2)    ed  1 (mod ((p-1)(q-1)))

And that's it. Numbers n an e make the public key and d is the private key.

Now, let's take a message, m. If it's too big, we can break it down and if it's too small we can pad it. But that's not of interest at the moment, so let's just say m is a number we want to encrypt.

Encryption looks like this:

[d3]    c  me (mod n)

where c is the encrypted result.

Decryption looks like this:

[d4]    m  cd (mod n)

Easy, right?

Yes, but why does it work?

Why is equation [d4] true?

Ok, substitute c in [d4] with c in [d3] and use [f2] to trivially simplify:

[d5]    m  med (mod n)

now substitute [d1] into this:

[d6]    m  med (mod pq)

Now, re-express [d2] in the format of [f4]:

[d7]    ed = k ((p-1)(q-1) + 1

and substitute [d7] for ed in [d6]:

[d8]    m  m(k(p-1)(q-1) + 1) (mod pq)

Using [f3], this is the same as:

[d9]    m  m . m k(p-1)(q-1) (mod pq)

which, because of f3 is the same as:

[d10]   m  m . mk(p-1) m(q-1) (mod pq)

Now, let's invoke [f1] for (q-1):

[d11]   m  m . (mq-1)m(p-1) (mod pq)
           m . 1k     m(p-1) (mod pq)
           m .        m(p-1) (mod pq)


and again for (p-1):

[d12]   m  m (mod pq)

which is obviously true, so what we started with [d4] must also be true.

No comments:

Post a Comment