Cryptopals Set 2

2020/02/25

This blog now supports annotations! Highlight a field to comment, and if I'm wrong, I will pay you.

Been a while since I wrote about Set 1, but frankly it takes probably twice as long to explain something than it does to actually do it.

Anyway, here’s Cryptopals Set 2.

Set 2 Challenge 1 (Number 9): Implement PKCS#7 padding

Puzzle

A block cipher transforms a fixed-sized block (usually 8 or 16 bytes) of plaintext into ciphertext. But we almost never want to transform a single block; we encrypt irregularly-sized messages.

One way we account for irregularly-sized messages is by padding, creating a plaintext that is an even multiple of the blocksize. The most popular padding scheme is called PKCS#7.

So: pad any block to a specific block length, by appending the number of bytes of padding to the end of the block. For instance,

"YELLOW SUBMARINE"

… padded to 20 bytes would be:

"YELLOW SUBMARINE\x04\x04\x04\x04"

Solution

Instructions are pretty clear; just do it really.

One small 2-minute hiccup was what to do when I encountered a message with a full block size. After a quick google search it was pretty clear.

I should also mention that I had initially written this to process bytes. I ran into numerous complaints about utf-8 encoding, so I had to refactor this slightly to deal in hex instead of bytes.

def pkcs7_pad(msg, block_sz):
    msg = bytes.fromhex(msg)
    msg_blocks = [ msg[i:(i+block_sz)] for i in range(0, len(msg), block_sz) ]
    pad_len = block_sz - (len(msg) % block_sz)
    if len(msg_blocks[-1]) == block_sz: pad_len = block_sz
    return (msg + bytes([pad_len]*pad_len)).hex()

Set 2 Challenge 2 (Number 10): Implement CBC mode

Puzzle

CBC mode is a block cipher mode that allows us to encrypt irregularly-sized messages, despite the fact that a block cipher natively only transforms individual blocks.

In CBC mode, each ciphertext block is added to the next plaintext block before the next call to the cipher core.

The first plaintext block, which has no associated previous ciphertext block, is added to a “fake 0th ciphertext block” called the initialization vector, or IV.

Implement CBC mode by hand by taking the ECB function you wrote earlier, making it encrypt instead of decrypt (verify this by decrypting whatever you encrypt to test), and using your XOR function from the previous exercise to combine them.

The file here is intelligible (somewhat) when CBC decrypted against “YELLOW SUBMARINE” with an IV of all ASCII 0 (\x00\x00\x00 &c)

Solution

The Wikipedia article on CBC mode provides a pretty nice visual on how to encrypt:

and how to decrypt:

From these wonderful images we can just reuse the function we wrote for ECB and do a little bit extra computation to create ECB, simply chaining every block together.

def cbc_encrypt(msg, key, iv):
 
     padded_msg = pkcs7_pad(msg, block_size)
     msg_blocks = [ padded_msg[i:(i+block_size*2)] for i in range(0, len(padded_msg), block_size*2) ]
 
     iv_first_block = xor_hex_strings(iv, msg_blocks[0])
 
     (block_0, encryptor) = aes_ecb_encrypt_with_key(bytes.fromhex(iv_first_block), bytes.fromhex(key))
     block_i = block_0
     ctext = [block_i.hex()]
     for msg_block in msg_blocks:
         iv_i = xor_hex_strings(block_i.hex(), msg_block)
         (block_i, encryptor) = aes_ecb_encrypt_with_key(bytes.fromhex(iv_i), bytes.fromhex(key))
         ctext.append(block_i.hex())
 
     return ''.join(ctext)

Decryption is straightfoward as well.

def cbc_decrypt(ciphertext, key, iv):
     ctext_blocks = [ ciphertext[i:(i+block_size*2)] for i in range(0, len(ciphertext), block_size*2) ]

     CipherObj = Cipher(algorithms.AES(bytes.fromhex(key)), modes.ECB(), backend=default_backend() )
     iv_i = iv

     decrypted_text = []
     for ctext in ctext_blocks:
         d_i = aes_ecb_decrypt(CipherObj, bytes.fromhex(ctext)).hex()

         decrypted_text.append(xor_hex_strings(iv_i, d_i))
         iv_i = ctext

     return ''.join(decrypted_text[1:])

Running these two functions with the specified key, plaintext, and IV yields some fun beats 😊

...
Say -- Play that funky music Say, go white boy, go white boy go
play that funky music Go white boy, go white boy, go
Lay down and boogie and play that funky music till you die.

Play that funky music Come on, Come on, let me hear
Play that funky music white boy you say it, say it
Play that funky music A little louder now
Play that funky music, white boy Come on, Come on, Come on
...

Set 2 Challenge 3 (Number 11): An ECB/CBC detection oracle

Puzzle

Now that you have ECB and CBC working:

Write a function to generate a random AES key; that’s just 16 random bytes.

Write a function that encrypts data under an unknown key — that is, a function that generates a random key and encrypts under it.

The function should look like:

encryption_oracle(your-input) => [MEANINGLESS JIBBER JABBER]

Under the hood, have the function append 5-10 bytes (count chosen randomly) before the plaintext and 5-10 bytes after the plaintext.

Now, have the function choose to encrypt under ECB 1/2 the time, and under CBC the other half (just use random IVs each time for CBC). Use rand(2) to decide which to use.

Detect the block cipher mode the function is using each time. You should end up with a piece of code that, pointed at a block box that might be encrypting ECB or CBC, tells you which one is happening.

My immediate approach to this was to just reuse similar ideas that were implemented in Set 1 Challenge 8: simply partition the ciphertext and search for a duplicate block in the ciphertext. The only difference here is that instead of being given some text, I can control the ciphertext by providing crafted plaintext.

We can create duplicate blocks in ECB mode by also crafting plaintext with duplicate blocks; for example, encrypting “A”*BLOCK_SZ*2 will have the same encrypted blocks. However, in this case,the actual data going into the cipher has some prefix and suffixes, so just sending 2 blocks won’t be enough.

We do know the upper and lower bounds of the prefixes and suffixes though, so we can still craft our plaintext to account for the upper and lower bounds (or rather, just the prefix bounds, since the suffix won’t impact our “block cutoffs”). So thus, our message can just be “A”*(BLOCK_SZ*2 + 10).

def encryption_oracle(msg):
    rand_count = random.randint(5, 10)
    prepad_msg = get_random_int(rand_count*8) + msg.encode().hex() + get_random_int(rand_count*8)

    padded_msg = pkcs7_pad(prepad_msg.encode().hex(), 16)
    ciphertext = ''
    key = get_random_int(128)
    iv = get_random_int(128)

    if random.getrandbits(1):
        ciphertext = cbc_encrypt(padded_msg, key, iv)
    else:
        (ciphertext, CipherObj) = aes_ecb_encrypt_with_key(bytes.fromhex(padded_msg), bytes.fromhex(key))
        ciphertext = ciphertext.hex()

    return ciphertext

def detect_ecb(ciphertext):
    hexbytes = bytes.fromhex(ciphertext)
    blocks = [hexbytes[i:(i+16)].hex() for i in range(0, len(hexbytes), 16)]
    set_blocks = set(blocks)
    if len(set_blocks) != len(blocks):
        dup_arr = []
        for block in blocks:
            if block in set_blocks:
                set_blocks.remove(block)
            elif block not in set_blocks:
                if block not in dup_arr:
                    dup_arr.append(block)
        return dup_arr
    else: return []

def chall_three():
    msg = "A"*(16*2+10)
    ciphertext = encryption_oracle(msg)
    return detect_ecb(ciphertext)

Repeatedly running this should tell you whether or not ECB was used.


Set 2 Challenge 4 (Number 12): Byte-at-a-time ECB decryption (Simple)

Puzzle

Copy your oracle function to a new function that encrypts buffers under ECB mode using a consistent but unknown key (for instance, assign a single random key, once, to a global variable).

Now take that same function and have it append to the plaintext, BEFORE ENCRYPTING, the following string:

Um9sbGluJyBpbiBteSA1LjAKV2l0aCBteSByYWctdG9wIGRvd24gc28gbXkg aGFpciBjYW4gYmxvdwpUaGUgZ2lybGllcyBvbiBzdGFuZGJ5IHdhdmluZyBq dXN0IHRvIHNheSBoaQpEaWQgeW91IHN0b3A/IE5vLCBJIGp1c3QgZHJvdmUg YnkK

Base64 decode the string before appending it. Do not base64 decode the string by hand; make your code do it. The point is that you don’t know its contents.

What you have now is a function that produces:

AES-128-ECB(your-string || unknown-string, random-key)

It turns out: you can decrypt “unknown-string” with repeated calls to the oracle function!

Here’s roughly how:

  1. Feed identical bytes of your-string to the function 1 at a time — start with 1 byte (“A”), then “AA”, then “AAA” and so on. Discover the block size of the cipher. You know it, but do this step anyway.
  1. Detect that the function is using ECB. You already know, but do this step anyways.
  1. Knowing the block size, craft an input block that is exactly 1 byte short (for instance, if the block size is 8 bytes, make “AAAAAAA”). Think about what the oracle function is going to put in that last byte position.
  1. Make a dictionary of every possible last byte by feeding different strings to the oracle; for instance, “AAAAAAAA”, “AAAAAAAB”, “AAAAAAAC”, remembering the first block of each invocation.
  1. Match the output of the one-byte-short input to one of the entries in your dictionary. You’ve now discovered the first byte of unknown-string.
  1. Repeat for the next byte.

Solution

Conveniently, the attack is broken down into several steps, but of course as the last few sentences state, implementing is very different than “knowing”.

Finding the block size of the cipher is simple. Our pyca function will simply throw an exception any time an improperly formatted string is processed, thus we can just use a try-catch and brute force some message lengths.

Next, we can use our function we wrote in the previous challenge to detect ECB.

Before going further, let’s explain what the “steps” are conveying. For simplicity, like the problem mentions, suppose our block size is 4. Suppose also our controlled input is “AAAA” and the “unknown string” (i.e. secret) is “YELLOWSUBMARINES”. Then the oracle processes something like this:

AAAA | YELL | OWSU | BMAR | INES | ... 
       ^
       // super secret message!

Which produces a ciphertext as such where we know our own crafted input’s ciphertext:

OOOO | WQPO | 1JQI | a$%0| xl1s | ... 
       ^
       // just random gibberish

Ignore the other two “ciphertexts” (I honestly just typed in random gibberish) but keep the first ciphertext in mind– it’s important.

Now suppose our crafted input has 3 “A”’s instead of 4? Well then the oracle processes something like this:

AAAY | ELLO | WSUB | MARI | NES<\x01> | ...
   ^
   //we don't know this byte, but the oracle formats it as such

And the resulting ciphertext looks something like this:

OOOD | WPIO | JQLK | QPIO | JSSD | ... 
       // Again, random junk past this second block
   ^
   // But this "D" byte seems interesting

So now we know the ciphertext of “AAAY”. But of course, we don’t actually know that the last byte is a “Y”, so instead we can brute force it– Compare “AAAB” and see if that’s equal to “OOOD”, then “AAAC”, then “AAAD”, and so forth.

There are only 256 possibilities, so the brute forcing is trivial, and voila! We know the first encrypted byte is a “Y”!

We can then keep repeating the process. Send in the plaintext “AA” and obtain the output the ciphertext of “AAYE” (say, OODI) and brute force it– encrypt “AAYA” and compare with “OODI”, then “AAYB” and compare, and repeat the process until the entire string is decrypted.

The major “decrypting” code is below, but the rest of the supporting code will be on my Github.

def padding_oracle(unknown_hex, block_size, prefix):
    # added "prefix" arg to make challenge six easier

    decrypted_string = ''
    for i in range(0, block_size):
        my_string = ("A"*(block_size - 1- i)).encode().hex()
        ciphertext = generic_encrypt_ecb(prefix + my_string + unknown_hex, key, block_size)
        ciphertext_model = ciphertext[len(prefix):len(prefix)+block_size]
        for j in range(0, 255):
            my_string_padded = my_string + decrypted_string.encode().hex() + chr(j).encode().hex()
            ciphertext = generic_encrypt_ecb(prefix + my_string_padded + unknown_hex, key, block_size)
            if ciphertext[len(prefix):len(prefix)+block_size] == ciphertext_model:
                decrypted_string += chr(j)
                break

    return decrypted_string

As I mentioned in my last post, this just goes to show again that you should not have actual worthwhile data inside of ECB; if our data was truly random and just a nonce, it doesn’t matter what the decrypted text would be.

This challenge was probably the first one that required more thought– perhaps 2 hours worth of iterating and debugging. But dang was it fun.


Set 2 Challenge 5 (Problem 13): ECB cut-and-paste

Puzzle

Write a k=v parsing routine, as if for a structured cookie. The routine should take:

foo=bar&baz=qux&zap=zazzle

… and produce:

{
    foo: 'bar',
    baz: 'qux',
    zap: 'zazzle'
}

(you know, the object; I don’t care if you convert it to JSON).

Now write a function that encodes a user profile in that format, given an email address. You should have something like:

profile_for("foo@bar.com")

… and it should produce:

{
    email: 'foo@bar.com',
    uid: 10,
    role: 'user'
}

… encoded as:

email=foo@bar.com&uid=10&role=user

Your “profile_for” function should not allow encoding metacharacters (& and =). Eat them, quote them, whatever you want to do, but don’t let people set their email address to “foo@bar.com&role=admin”.

Now, two more easy functions. Generate a random AES key, then:

A) Encrypt the encoded user profile under the key; “provide” that to the “attacker”. Decrypt the encoded user profile and parse it.

B) Using only the user input to profile_for() (as an oracle to generate “valid” ciphertexts) and the ciphertexts themselves, make a role=admin profile.

Solution

For the profile_for function I’ve hardcoded it to always add a “user” role for any email, and to further ensure nobody can directly forge an admin role (a la “email=foo@bar&role=admin”) I’ve just put in an assert for the “admin” keyword.

Thus, the general format of my cookie/what might be stored in an actual database is this:

{"email": "<email>", "uid": "10", "role": "user"}

Obviously this isn’t a production grade function but I still believe it’s enough to prove I can bypass/modify ciphertext to have an admin role.

So, because ECB mode ciphertext blocks have zero dependencies on each other (i.e. it has weak confusion and diffusion), if we have access to an oracle, we can craft and concatenate ciphertexts without it “being wrong”.

Going back to the example in the previous challenge, suppose our block size is 4, and our input text is something like “foo=user”. The oracle would process this information as follows:

MESSAGE: foo= | user
CIPHER : a{{s | s26x    // gibberish; but I bet it's actual ciphertext for a key out there!

There’s no dependency between the plaintext “foo=” and “user” (and thus “a{{s” and “s26x”). This also means that if we were to change “s26x” to “s26y” we would decrypt something entirely different and entirely valid (say, I dunno, “useP” as a decrypted text).

Suppose we send in “admin”? Our oracle might return the following ciphertext:

MESSAGE: admi | n<padding here>
CIPHER : Qq%x | pMMM

And now suppose we take these two outputs, and craft a new ciphertext to send to the oracle, as below:

CIPHER : a{{s | Qq%x | pMMM
DECRYPT: foo= | admi | n<padding here>

We get “foo=admin” back, just from modifying the ciphertexts! We can just apply the same “arts and crafts” concept and pick and choose what texts we want. We do have to be slightly craftier though and embed the “admin” ciphertext in other cookie parameters according to the format above (e.g. in the email!) but that just requires some trial and error.

Below is the core of the “cutting”; other supporting code (e.g. profile_for) can be found on its respective folder.

email = "foo@bar.comAAAAAAAAAA" + bytes.fromhex(pkcs7_pad('"admin"}'.encode().hex(), 16)).decode()
profile_for(email, my_dict)
(ciphertext, CipherObj) = encrypt_profile(json.dumps(my_dict))

# cut off first two bytes (email) to get admin ciphertext
admin_ciphertext = ciphertext[32:48]

# 2) pad "admin" ciphertext onto encrypted cookie such that ciphertext ends in "role="

my_dict = {}
email = "foo@barLOL.co"
profile_for(email, my_dict)
(ciphertext, CipherObj) = encrypt_profile(json.dumps(my_dict))

# cut first 48 bytes and append with step 1 ciphertext; 48 = length of formatted json
plaintext = decrypt_profile(ciphertext[:48] + admin_ciphertext, CipherObj)

Set 2 Challenge 6 (Number 14): Byte-at-a-time ECB decryption (Harder)

Puzzle

Take your oracle function from #12. Now generate a random count of random bytes and prepend this string to every plaintext. You are now doing:

AES-128-ECB(random-prefix || attacker-controlled || target-bytes, random-key)

Same goal: decrypt the target-bytes.

Solution

I am assuming here that the random prefix is generated only once per instance of an oracle, and an attacker will only use this single instance.

Compared to Challenge 4 (Number 12) this challenge is indeed slightly more difficult, but not majorly. The vulnerability in Number 12 was that we were able to “force” the “target bytes” to start at a block boundary and “remove” bytes from our attacker controlled bytes from there.

Here, the vulnerability is no different– We are able to distinguish the “random-prefix” from the “target” and force it on a block boundary. It just requires a little more automation and some more inquiries to the oracle.

So what additional complexity does adding the prefix add? Let’s use an example to get our head around it. Suppose our block size is 8 bytes, our random prefix has 3 bytes and we apply the same prefix as Challenge 12 (which was just 1 block size worth bytes, or 8 bytes in this example):

OIWAAAAA | AAAazx22 | 7aZQz7sV | ...
   |---------|
   // Our input

5 bytes of our input were used to complete the “random-prefix”’s block. The solution for this is obvious, just input 13 bytes instead of 8.

OIWAAAAA | AAAAAAAA | azx227aZ | Qz7sv...

But wait! Clearly this only works if we know the prefix length, since we derived “5 bytes” from the knowledge of “3 bytes”, so the “obvious” solution doesn’t work. How else can we derive “3 bytes” though?

My solution was to keep adding bytes under my control and use my detect_ecb function on the resulting ciphertext until ECB was detected. In other words, if we keep on adding bytes as such:

OIWAAAAA | AAAAAAAA | Aazx227a | ZQz7sv...
OIWAAAAA | AAAAAAAA | AAazx227 | aZQz7s...
...
OIWAAAAA | AAAAAAAA | AAAAAAAA | azx227a..

We would eventually “obviously” see that the oracle was using ECB mode, and then know that the moment we “detect” it is the point the target-bytes are on a block boundary, and we know that the prefix_size is 3 bytes.

One optimization (which, in a real system, would save on network requests and thus potential detection by a sysadmin) is to start the size of the input bytes at 2*BLOCK_SZ since our method of detection is to create two identical blocks.

From here the process is simply identical to the one from challenge 12; in fact, I just used the same “padding oracle” function. Again, below is the core core and rest is on my Github. I definitely could’ve just used my find_blocksize function I wrote earlier instead of hardcoding the blocksize, but that’s not a major factor.

prefix_len = 0
target = "attack at dawnBBBBBBBBBBBBBBBBB".encode().hex()
my_string = ''
for i in range(32, 48):
    my_string = ("A"*i).encode().hex()
    ciphertext = generic_encrypt_ecb_wrapper(rand_prefix, my_string, target, 16)
    dup_blocks = detect_ecb(ciphertext)
    if dup_blocks:
        prefix_len = i - 32
        print(f"Padded {prefix_len} bytes to prefix with duplicate blocks {dup_blocks}")
        break

# from here, should be just the same as 12 
block_size = 16 
msg = rand_prefix + ("A"*prefix_len).encode().hex() 
decrypted_blocks = ''.join([ padding_oracle(target[i:i+block_size*2], block_size, '') for i in range(0, len(target), block_size*2) ]) 

padding = ord(decrypted_blocks[-1]) 
decrypted_blocks = decrypted_blocks[:-padding] 
 
print(decrypted_blocks) 
assert decrypted_blocks.encode().hex() == target 
NOTE

After I solved this, I retroactively searched online and was very surprised to find that the most popular solution was not this (my measure of most popular was the first 5 results on Google, ha). This section is an explanation of another similar solution of other people’s work.

Many people’s solutions was to keep track of a “null” input and keep adding bytes. On the first added byte, all blocks except the block with an attacker byte would be the same.

On the second added byte, they would compare the detected indices of the first ciphertext’s different with the same indices of the second ciphertext’s block, and keep repeating the process until this block was the same between the Nth and N+1 added byte.

More visually,

STEP 0: ... | OIW7aZQz | 7sV....      // Record ciphertext here
STEP 1: ... | OIWA7aZQ | z7sV...      // Keep track of first block location 
STEP 2: ... | OIWAA7aZ | Qz7sV..      // Compare block locations; if not equal, keep adding bytes
...
STEP N: ... | OIWAAAAA | 7aZQz7sV | ... 
STEP N+1: ..| OIWAAAAA | A7aZQz7s | ... // Compare blocks: locations are equal, so prefix is 3 bytes long 

I do still prefer my approach because reuses a lot of already written code and less index tracking than this other approach, but I am insanely biased. As long as the concepts make sense, it really doesn’t matter!

One person did do something similar to me, modifying his detect_ecb function itself.


Set 2 Challenge 7 (Number 15): PKCS#7 padding validation

Puzzle

Write a function that takes a plaintext, determines if it has valid PKCS#7 padding, and strips the padding off.

The string:

"ICE ICE BABY\x04\x04\x04\x04"

… has valid padding, and produces the result “ICE ICE BABY”.

The string:

"ICE ICE BABY\x05\x05\x05\x05"

… does not have valid padding, nor does:

"ICE ICE BABY\x01\x02\x03\x04"

If you are writing in a language with exceptions, like Python or Ruby, make your function throw an exception on bad padding.

Crypto nerds know where we’re going with this. Bear with us.

Solution

Incredibly straightforward. Just look at the last byte and sequentially check from the end of the string based off of that value. Definitely a softball, but I don’t mind.

def pkcs7_unpad(msg):
    padding = msg[-1]
    for c in range(len(msg) - padding, len(msg)-1):
        if msg[c] != padding:
            raise ValueError(f"Incorrect padding of message-- interpreted padding of {ord(padding)} but found {ord(msg[c])} in message")
    return msg[:-padding]

Set 2 Challenge 8 (Number 16): CBC bitflipping attacks

Puzzle

Generate a random AES key.

Combine your padding code and CBC code to write two functions.

The first function should take an arbitrary input string, prepend the string:

"comment1=cooking%20MCs;userdata="

.. and append the string:

";comment2=%20like%20a%20pound%20of%20bacon"

The function should quote out the “;” and “=” characters.

The function should then pad out the input to the 16-byte AES block length and encrypt it under the random AES key.

The second function should decrypt the string and look for the characters “;admin=true;” (or, equivalently, decrypt, split the string on “;”, convert each resulting string into 2-tuples, and look for the “admin” tuple).

Return true or false based on whether the string exists.

If you’ve written the first function properly, it should not be possible to provide user input to it that will generate the string the second function is looking for. We’ll have to break the crypto to do that.

Instead, modify the ciphertext (without knowledge of the AES key) to accomplish this.

You’re relying on the fact that in CBC mode, a 1-bit error in a ciphertext block:

STOP AND THINK

Before you implement this attack, answer this question: why does CBC mode have this property?

Solution

Firstly, here is the code for my oracle, the first function, just to demonstrate we properly escape the special characters “;” and “=”. This means, as in challenge 14, the only way to “get admin” is the modify the ciphertext itself.

def encryption_oracle(input_string):
    input_string = urllib.parse.quote(input_string)
    plaintext = '"comment1"="cooking%20MCs";"userdata"="' + input_string + '";"comment2"="%20like%20a%20pound%20of%20bacon"'
    return cbc_encrypt(plaintext.encode().hex(), key, iv)

Now let’s indeed stop and think as the puzzle wants us to do– what makes CBC exhibit the last two characteristics?

Again, Wikipedia is an amazing resource

As the picture demonstrates, during CBC decryption we directly use the ciphertext of block N and XOR this block with the cipher output to obtain the plaintext of block N+1. Or, put another way,

PN+1[i] = CN[i] ⊕ ON+1[i]

Where i is the respective byte for block N of the resulting plaintext P, ciphertext C, and cipher output O.

Suppose that I want to force the plaintext to a specific byte– say, the byte “A”. I could run the oracle to obtain the ciphertext, and simply XOR the byte that I want to evoke the same change on the plaintext. Or in other words:

(PN+1[i] ⊕ INJECTED-BYTE) = (CN[i] ⊕ INJECTED-BYTE) ⊕ ON+1[i]

Keeping this general formula in mind, consider the format of our plaintext/ciphertext:

"comment1=cooking%20MCs;userdata=<ATTACKER-BYTES>;comment2=%20like%20a%20pound%20of%20bacon"

As the challenge states, the goal is to inject the string ;admin=true; somewhere in this plaintext by modifying the ciphertext. Because we only control the “ATTACKER-BYTES” region/block, and affecting this region creates a predictable effect on the next block, we can simply insert enough bytes to form another “block” to play with.

For example– suppose “userdata=” is the start of one block of AES-CBC. If we pad our plaintext to look something like this, which outputs some ciphertext:

PLAINTEXT : ... | userdata=AAAAA | AAAA...AA | ;comment2=%20lik | ... // rest of cookie
                                  // "A"*16
CIPHERTEXT: ... | p42-xz02la05_Q | 2-qM...[0 | ...
                    // Block 1      Block 2

Under normal decryption, we obviously simply get the original plaintext back. But suppose I inject/edit the last byte of Block 1 with the byte “0x01”. Then the ciphertext and resulting plaintext would be:

CIPHERTEXT: ... | p42-xz02la05_P | 2-qM...[0 | ...
                            // ^ here, Q ⊕ 0x1 == P 
PLAINTEXT : ... | 4z-1@!X-q0z*2` | AAAA...A@ | ...  
                                        // ^ here, A ⊕ 0x1 == @
                                        // But 0x1 is arbitrary, it can be anything!

Using this information and ability to edit the next block’s plaintext, we can continue to edit every byte of the ciphertext to forge the “admin” plaintext during decryption. Going back to the equation above, we know our desired plaintext, and can derive the ciphertext by running the oracle once, and finding the injected byte is trivial.

Below is an example of the input userdata, and the resulting decrypted text, as well as the decryption oracle and the error injection function.

INPUT : 
b'"comment1"="cooking%20MCs";"userdata"="testAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA";"comment2"="%20like%20a%20pound%20of%20bacon"\x04\x04\x04\x04'

DECRYPTED OUTPUT:
b'"comment1"="cooking%20MCs";"userdata"="testAAAAA\xdb\x86\x8b\x81w:Y\x9c\xd1ca1c\xffji";admin=true;";"comment2"="%20like%20a%20pound%20of%20bacon"\x04\x04\x04\x04'

As you may see, I simply do a basic check for if ‘admin’ is within the decrypted output. A real parser will likely do more advanced checks, such as seeing that “userdata” is heavily corrupted.

def inject_string_cbc(ciphertext, injection_string, start_pos):
    ret_cipher = ciphertext
    for i in range(0, len(injection_string)):
        value = ord(injection_string[i]) ^ 65       
        ret_cipher = xor_single_byte(ret_cipher, value, start_pos + i)
    return ret_cipher

injection_string = '";admin=true;'
input_string = "testAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA"
ciphertext = encryption_oracle(input_string)
print(decryption_oracle(ciphertext))

injected_ciphertext = inject_string_cbc(ciphertext, injection_string, 64)
decrypted_text = decryption_oracle(injected_ciphertext)

assert b'admin' in decrypted_text

As always, other supporting code like the oracles are on my Github!


Summary

Set 2 was still beginner’s stuff, in my opinion, but still involved a decent amount of real crypto, going over some very classical attacks every security engineer should know about.

Set 2 was definitely more fun than Set 1, but also required twice the amount of words to explain things to my satisfaction. One other minor purpose of this series (other than to motivate me to finish it) is to get me writing and explaining certain technical concepts.

Frankly, writing this post took about the same time as completing the puzzles– I had finished this set two weeks prior but spent quite some time rewriting or rethinking explanations, or taking screenshots that I decided to scrap. If there is any criticism or desire for more details, do email me!

Looking forward to completing Set 3!