Introduction

PIn previous Notebooks, we saw how to create an asymmetrical key (using RSA) and a symmetrical one (using Elliptical Curve Diffie-Hellman protocol). Now we will assume that we have the key between Alice and Bob and we will check how to encrypt / decrypt a file.

According to the NIST, there is 2 algorithm considered as safe (Publication and Block Cipher Techniques) which are AES and Triple DES. This is the one we will use in this exercice.

Principle

In encryption / decryption, the principle is to take a piece of code (fixed and determined by the key length and the algorithm used), encrypt it and do it for every block. This is mainly what I did in the notebook regarding RSA by taking the value of each character.

Padding

However, what happen with the last block. there is few chances that it has exaclty the size of the block. As a result, we have to apply some padding which have to be known. It exists several ones explained in Wikipedia. Based on some other videos I saw, I applied here the ANSI X9.23.

Mode Operation

On what I did on the notebook regarding RSA is not recommended because 2 identical input block will have 2 identical output block. It's the case with "l" letter in "Hello". However on this exemple, our block is far too small (it was just for learning purposes) but the result is still valid. It exist several alternative well explained also on Wikipedia. In this notebook, we will use CBC and CTR modes which are the 2 block cipher modes recommended by Niels Ferguson and Bruce Schneier (2 experts in cryptography). NIST also recommand them in the publication presented previously.

Now we have everythin explained, let's do some practical test with AES / Triple DES using those 2 modes and the padding ANSI X9.23

Implementation

Unfortunately, at this stage, it's still too complicated for me to implement those algorithms. As a reuslt I'll use a library called cryptography

In [2]:
import os
import re
import random
import hashlib

import cryptography
import lorem

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes
from cryptography.hazmat.backends import default_backend

AES (CBC)

let's create random keys to start. For this algorithm we need a 32 bit key as encryption key (but also another key commnly called IV which is used as a starting point for the Mode Operation). This key cqn be publicly transfered. It's just to avoid having repeating clock of encrypted message used by attacker to decypher it. This keyt must be 16 bit long.

In [3]:
letter = "0123456789ABCDEF"
random.seed(42)
key = "".join([random.choice(letter) for i in range(32)])
key = bytes(key, 'utf-8')

iv = "".join([random.choice(letter) for i in range(16)])
iv = bytes(iv, 'utf-8')

the module os also provides a generator easy to use :

In [66]:
# key = os.urandom(32)
# iv = os.urandom(16)

Let's now declare the encryption/decryption algorithm and the mode (the padding will be done manually)

In [4]:
algo = algorithms.AES(key)
In [5]:
mode = modes.CBC(iv)
In [6]:
cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())

For the exercice, let's create a text file with lorem text.

In [70]:
# with open("lorem.txt", "w") as f:
#     for i in range(20):
#         f.write(lorem.paragraph())

now we can create a new file and write each cyphered block in

In [71]:
encryptor = cipher.encryptor()
with open("encoded_CBC.txt", "wb") as f_out, open("lorem.txt", "rb") as f_in:
    while True:
        b = f_in.read(64)
        if not b:
            break

#         ANSI X9.23
        if len(b) < 64:
            padding_size = 64-len(b)
            padding_str = "0"*(padding_size-2) + "{:02d}".format(padding_size)
            b += bytes(padding_str, 'utf-8')
        ct = encryptor.update(b)
        f_out.write(ct)
encryptor.finalize()
Out[71]:
b''

This new file can be decrypted with the same principle

In [72]:
decryptor = cipher.decryptor()
with open("decoded_CBC.txt", "w") as f_out, open("encoded_CBC.txt", "rb") as f_in:
    while True:
        b = f_in.read(64)
        if not b:
            break
        ct = decryptor.update(b)
        ct=ct.decode("utf-8")

#         ANSI X9.23
        if ct[-2:].isnumeric():
            padding_size = int(ct[-2:])
            if ct[-padding_size:-2] == "0"*(padding_size-2):
                ct = ct[:-padding_size]
        
        f_out.write(ct)
decryptor.finalize()
Out[72]:
b''

To check that both files (initial and final) are identical, let's look at the hash of both files.

In [7]:
def get_md5(path_file):
    hash_md5 = hashlib.md5()
    with open(path_file, "rb") as f:
        for chunk in iter(lambda: f.read(4096), b""):
            hash_md5.update(chunk)
    return hash_md5.hexdigest()
In [8]:
get_md5("decoded_CBC.txt")
Out[8]:
'438b07c6a2bb6625b75947ce43e81510'
In [9]:
get_md5("lorem.txt")
Out[9]:
'438b07c6a2bb6625b75947ce43e81510'

AES (CTR)

Let's do the same but twith the Mode Operation (CTR) using the same keys

In [14]:
algo = algorithms.AES(key)
mode = modes.CTR(iv)
cipher = Cipher(algorithms.AES(key), modes.CTR(iv), backend=default_backend())
In [15]:
encryptor = cipher.encryptor()
with open("encoded_CTR.txt", "wb") as f_out, open("lorem.txt", "rb") as f_in:
    while True:
        b = f_in.read(64)
        if not b:
            break

#         ANSI X9.23
        if len(b) < 64:
            padding_size = 64-len(b)
            padding_str = "0"*(padding_size-2) + "{:02d}".format(padding_size)
            b += bytes(padding_str, 'utf-8')
        ct = encryptor.update(b)
        f_out.write(ct)
encryptor.finalize()
Out[15]:
b''
In [16]:
decryptor = cipher.decryptor()
with open("decoded_CTR.txt", "w") as f_out, open("encoded_CTR.txt", "rb") as f_in:
    while True:
        b = f_in.read(64)
        if not b:
            break
        ct = decryptor.update(b)
        ct=ct.decode("utf-8")

#         ANSI X9.23
        if ct[-2:].isnumeric():
            padding_size = int(ct[-2:])
            if ct[-padding_size:-2] == "0"*(padding_size-2):
                ct = ct[:-padding_size]
        
        f_out.write(ct)
decryptor.finalize()
Out[16]:
b''
In [ ]:
let's have a look at the hash of this new decrypted file.
In [17]:
get_md5("decoded_CTR.txt")
Out[17]:
'438b07c6a2bb6625b75947ce43e81510'

it's the same so everythin gis fine. We can also compare both encrypted file isn term of hash. We will see that even if they have the same key, the cfile will have different hash

In [18]:
get_md5("encoded_CBC.txt")
Out[18]:
'3751b303a9228b590473aaac8a425d0f'
In [82]:
get_md5("encoded_CTR.txt")
Out[82]:
'953f03bb3701383e47d570d5f3c2412c'

Perfect ! Let's not do the same with Triple DES in CBC

Triple DES (CBC)

This model requires a 24 bits key. Let's create new one.

In [19]:
letter = "0123456789ABCDEF"
random.seed(42)
key = "".join([random.choice(letter) for i in range(24)])
key = bytes(key, 'utf-8')

iv = "".join([random.choice(letter) for i in range(16)])
iv = bytes(iv, 'utf-8')

All the rest is similar

In [21]:
algo = algorithms.TripleDES(key)
mode = modes.CBC(iv)
cipher = Cipher(algorithms.AES(key), modes.CBC(iv), backend=default_backend())
In [22]:
encryptor = cipher.encryptor()
with open("encoded.txt", "wb") as f_out, open("lorem.txt", "rb") as f_in:
    while True:
        b = f_in.read(64)
        if not b:
            break

#         ANSI X9.23
        if len(b) < 64:
            padding_size = 64-len(b)
            padding_str = "0"*(padding_size-2) + "{:02d}".format(padding_size)
            b += bytes(padding_str, 'utf-8')
        ct = encryptor.update(b)
        f_out.write(ct)
encryptor.finalize()
Out[22]:
b''
In [23]:
decryptor = cipher.decryptor()
with open("decoded.txt", "w") as f_out, open("encoded.txt", "rb") as f_in:
    while True:
        b = f_in.read(64)
        if not b:
            break
        ct = decryptor.update(b)
        ct=ct.decode("utf-8")

#         ANSI X9.23
        if ct[-2:].isnumeric():
            padding_size = int(ct[-2:])
            if ct[-padding_size:-2] == "0"*(padding_size-2):
                ct = ct[:-padding_size]
        
        f_out.write(ct)
decryptor.finalize()
Out[23]:
b''

And once more, we can look at hashes. the final file has the same hash as the initial one.

In [88]:
get_md5("decoded.txt")
Out[88]:
'438b07c6a2bb6625b75947ce43e81510'
In [89]:
get_md5("encoded.txt")
Out[89]:
'5e1d256a8b3d21a050ef32de02cfadf9'

Conclusion

In this notebook, we saw the next block in the encryption algorithm. We now know :

  • how to generate a key
  • how to encrypt / decrypt messages with a symmetrical key
    • Using different Operation Modes
    • Using different Padding System
    • Using different Algorithms

The only remaining point was partially discussed in the first Notebook regarding encryption / decryption with asymmetrical key. This was discussed in the first notebook and all the rest from this notebook remain true. We have to take longer block and apply some padding. The Operation Mode can also be applied.

That means we are done for now on this domain. I bought a book about it and if there is some other interesting topics, I'll go thru on new Notebooks.