Hide

Problem C
Not-So-Straightforward Stream Cipher

The reverse engineering report noted the presence of the RC4 algorithm, but utilized in an unusual way.

Stream Cipher

RC4 is a type of stream cipher, well known for its simplicity and speed. Stream ciphers can be used to encrypt data of any length, since they use a keystream that is the same length as the plaintext, which are then XOR’ed together to produce ciphertext. Given a specific key as input, the RC4 key generation algorithm always outputs the same keystream. Encryption and decryption is performed using the same key, which produces the same output each time and is applied via XOR to the plaintext or ciphertext, respectively. The key scheduling and output algorithm are implemented as follows:

\includegraphics[width=0.9\textwidth ]{rc4_alg}

Here are some test vectors for RC4 to test your implementation. The columns are key, offset in the stream (to check further into your keystream) and the output bytes of the keystream. So for the second row, you are checking $16$ bytes starting at byte $256$ in the output.

KEY

OFFSET

OUTPUT

0x0102030405060708

0

0x97ab8a1bf0afb96132f2f67258da15a8

0x0102030405060708

256

0xd0a990ff2c05fef5b90373c9ff4b870a

0x0102030405060708

512

0x163b319923a6bdb4527c626126703c0f

0x4578616d706c65204b6579

0

0xc6d01f550a5eca49cee9ca4128969af2

0x4578616d706c65204b6579

256

0x463e0a5bff1fa8888a8451abdd75dda5

0x4578616d706c65204b6579

512

0xfb1126010ebc8e232c3c5f60aee185ee

Modified RC4

The ransomware developers believe in “security through obscurity”, which is universally viewed as poor practice amongst actual security professionals. They have taken this common algorithm and modified it for their purposes.

They attempted to initialize 4 separate instances of the algorithm using variations on their key. The output is then created after dropping different amounts of the initial key in each instance. Lets define $R(k,n)$ to be the RC4 algorithm initialized with a 16-byte key $k$ and dropping the first $n$ bytes of output. The developers attempted to create their stream cipher in the following way:

\[ S^{*}(k) = R(k,256) \oplus R(k1,1437) \oplus R(k2,512) \oplus R(k3,1352) \]

where $\oplus $ is the XOR operation, $k$ must be 16 bytes and $k1$, $k2$, and $k3$ are defined as:

\[ k1 = k \oplus {\tt 0x55555555555555555555555555555555} \]\[ k2 = k \oplus {\tt 0xaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa} \]\[ k3 = k \oplus {\tt 0x01010101010101010101010101010101} \]

Another way to say dropping the first $n$ bytes of output is that the routine does not start to output bytes until it has generated the $n$-th byte. For this reason, $R(k,0)$ would just be the normal RC4 algorithm (dropping $0$ output bytes), but $R(k,1)$ would start on the second output byte (dropping $1$ output byte).

While this is what the developers attempted to do, they actually initialized different variants of RC4 based on some typos in their code. They probably did not catch these mistakes because the algorithm still produces consistent output even with the typos.

Variants

The cyber security professionals noted $4$ different variants on the standard RC4 algorithm. They could determine how the variants were different, but failed to note which variants were actually in use in the stream cipher. Of the $4$ different versions, only one is actually properly coded RC4 and the other three include typos in the following manner (line numbers reference code above):

Variant

Typos in RC4 algorithm

1

Line 20: $K = S[S[(S[i] + S[j]) \mod 256]]$

2

Line 7: $j = (j + S[i] + K[j \mod len(K)]) \mod 256$

3

Line 7: $j = (j + S[j] + K[i \mod len(K)]) \mod 256$

The actual stream cipher you need to create code for is

\[ S(k) = R_1(k,256) \oplus R_2(k1,1437) \oplus R_3(k2,512) \oplus R_4(k3,1352) \]

where $R_ i(k,n)$ is now one of the $4$ instances of RC4 and the other terms are defined as before.

The stream cipher does not vary once it is defined, so once you figure out which variant each one is, you will have the correct stream cipher for future use. The report does not mention whether all four of the variants are used or if some are repeated, so you will need to determine this.

Summary

Determine which of the $256$ combinations of variants are instantiated and write a routine that creates output as defined above.

You can use the first sample input and output below as a test vector to confirm when you have the right combination:

Input

A 16 byte key encoded in HEX.

Output

Generate 128 bytes of output of $S(k)$ and encode it in HEX

Sample Input 1 Sample Output 1
e35223d1de0bfb55eb8f52fd2fc5cfda
ead8d509bbddc976c0c75e3fce1c48e05d9f8a82b6fb646caf962243f5a01cd4ec743f7d0a861800113c45f7259128c4c24d0e26d6cd9250fa7db1e70c9af861160da46c671642b5444b64cdb104e0b471f6d9de4458ff3795c23403040ae3f38d315543c1bfbce5a3b00bdb76a7409ac2b2f5a174c667c0ba07fc8d85706dc8
Sample Input 2 Sample Output 2
12ebad1a10daa4e344bd9733b6c40bfc
ab912dd33d340740be63d5237e4403be85ddaad679dc52edda465a1a852dc680958c910a25db6e9efc21aa7dd0d5452529c9602242e2dc5e012024936d84c342f7bba77b79028b78d94a58a04ce1e89ba669dabd0e0bac626a251e5d2fb58a374291534b4d57eb30039cfe2f7c91dcfc8ddcc8d09fe9a9d4f0dd703019fe3e9d
Sample Input 3 Sample Output 3
68193ee3992a8fb5826890047296d373
6f959d6795b0fee89b2b701d18baf9f223ba245411178721f9653ad3c31e17906a1b23781e279722ea8bb08fce86596b17efd4c73bee678c635603ca22ec2f2b0ed5c81eaa5890ac8cbf3d66b3a496638096f787ac64e2bd89b8139f039b1eceb820e7f58c18452e35f2ab863d1330dadd5f105694683ac057583cae060e18bf
Sample Input 4 Sample Output 4
300ccd949ee6fe6faecc52b75d7f7ed1
2ce4af3faff01021805476058d8f4cb6ad4dec35bec3eac4b1f344e95d6ea2bbb3604d5bd079cfb597d67775449e80a441770d180af70327f937cfc172acb9ad4084d0c8fa6e8c920636de1361df078205c632337ff99d3857cc63b48a0bc067adb0088ab3f7b218e6cb7a3ed37a21ad9baac46476ae13d56ffc34eb27fa7b3b

Please log in to submit a solution to this problem

Log in