Problem C
Not-So-Straightforward Stream Cipher
The reverse engineering report noted the presence of the RC4 algorithm, but utilized in an unusual way.
Stream Cipher
RC4 is a type of stream cipher, well known for its simplicity and speed. Stream ciphers can be used to encrypt data of any length, since they use a keystream that is the same length as the plaintext, which are then XOR’ed together to produce ciphertext. Given a specific key as input, the RC4 key generation algorithm always outputs the same keystream. Encryption and decryption is performed using the same key, which produces the same output each time and is applied via XOR to the plaintext or ciphertext, respectively. The key scheduling and output algorithm are implemented as follows:
Here are some test vectors for RC4 to test your implementation. The columns are key, offset in the stream (to check further into your keystream) and the output bytes of the keystream. So for the second row, you are checking $16$ bytes starting at byte $256$ in the output.
KEY |
OFFSET |
OUTPUT |
0x0102030405060708 |
0 |
0x97ab8a1bf0afb96132f2f67258da15a8 |
0x0102030405060708 |
256 |
0xd0a990ff2c05fef5b90373c9ff4b870a |
0x0102030405060708 |
512 |
0x163b319923a6bdb4527c626126703c0f |
0x4578616d706c65204b6579 |
0 |
0xc6d01f550a5eca49cee9ca4128969af2 |
0x4578616d706c65204b6579 |
256 |
0x463e0a5bff1fa8888a8451abdd75dda5 |
0x4578616d706c65204b6579 |
512 |
0xfb1126010ebc8e232c3c5f60aee185ee |
Modified RC4
The ransomware developers believe in “security through obscurity”, which is universally viewed as poor practice amongst actual security professionals. They have taken this common algorithm and modified it for their purposes.
They attempted to initialize 4 separate instances of the algorithm using variations on their key. The output is then created after dropping different amounts of the initial key in each instance. Lets define $R(k,n)$ to be the RC4 algorithm initialized with a 16-byte key $k$ and dropping the first $n$ bytes of output. The developers attempted to create their stream cipher in the following way:
\[ S^{*}(k) = R(k,256) \oplus R(k1,1437) \oplus R(k2,512) \oplus R(k3,1352) \]where $\oplus $ is the
XOR operation, $k$ must be
16 bytes and $k1$,
$k2$, and $k3$ are defined
as:
Another way to say dropping the first $n$ bytes of output is that the routine does not start to output bytes until it has generated the $n$-th byte. For this reason, $R(k,0)$ would just be the normal RC4 algorithm (dropping $0$ output bytes), but $R(k,1)$ would start on the second output byte (dropping $1$ output byte).
While this is what the developers attempted to do, they actually initialized different variants of RC4 based on some typos in their code. They probably did not catch these mistakes because the algorithm still produces consistent output even with the typos.
Variants
The cyber security professionals noted $4$ different variants on the standard RC4 algorithm. They could determine how the variants were different, but failed to note which variants were actually in use in the stream cipher. Of the $4$ different versions, only one is actually properly coded RC4 and the other three include typos in the following manner (line numbers reference code above):
Variant |
Typos in RC4 algorithm |
1 |
Line 20: $K = S[S[(S[i] + S[j]) \mod 256]]$ |
2 |
Line 7: $j = (j + S[i] + K[j \mod len(K)]) \mod 256$ |
3 |
Line 7: $j = (j + S[j] + K[i \mod len(K)]) \mod 256$ |
The actual stream cipher you need to create code for is
\[ S(k) = R_1(k,256) \oplus R_2(k1,1437) \oplus R_3(k2,512) \oplus R_4(k3,1352) \]where $R_ i(k,n)$ is
now one of the $4$
instances of RC4 and the other terms are defined as
before.
The stream cipher does not vary once it is defined, so once you figure out which variant each one is, you will have the correct stream cipher for future use. The report does not mention whether all four of the variants are used or if some are repeated, so you will need to determine this.
Summary
Determine which of the $256$ combinations of variants are
instantiated and write a routine that creates output as defined
above.
You can use the first sample input and output below as a
test vector to confirm when you have the right
combination:
Input
A 16 byte key encoded in HEX.
Output
Generate 128 bytes of output of $S(k)$ and encode it in HEX
Sample Input 1 | Sample Output 1 |
---|---|
e35223d1de0bfb55eb8f52fd2fc5cfda |
ead8d509bbddc976c0c75e3fce1c48e05d9f8a82b6fb646caf962243f5a01cd4ec743f7d0a861800113c45f7259128c4c24d0e26d6cd9250fa7db1e70c9af861160da46c671642b5444b64cdb104e0b471f6d9de4458ff3795c23403040ae3f38d315543c1bfbce5a3b00bdb76a7409ac2b2f5a174c667c0ba07fc8d85706dc8 |
Sample Input 2 | Sample Output 2 |
---|---|
12ebad1a10daa4e344bd9733b6c40bfc |
ab912dd33d340740be63d5237e4403be85ddaad679dc52edda465a1a852dc680958c910a25db6e9efc21aa7dd0d5452529c9602242e2dc5e012024936d84c342f7bba77b79028b78d94a58a04ce1e89ba669dabd0e0bac626a251e5d2fb58a374291534b4d57eb30039cfe2f7c91dcfc8ddcc8d09fe9a9d4f0dd703019fe3e9d |
Sample Input 3 | Sample Output 3 |
---|---|
68193ee3992a8fb5826890047296d373 |
6f959d6795b0fee89b2b701d18baf9f223ba245411178721f9653ad3c31e17906a1b23781e279722ea8bb08fce86596b17efd4c73bee678c635603ca22ec2f2b0ed5c81eaa5890ac8cbf3d66b3a496638096f787ac64e2bd89b8139f039b1eceb820e7f58c18452e35f2ab863d1330dadd5f105694683ac057583cae060e18bf |
Sample Input 4 | Sample Output 4 |
---|---|
300ccd949ee6fe6faecc52b75d7f7ed1 |
2ce4af3faff01021805476058d8f4cb6ad4dec35bec3eac4b1f344e95d6ea2bbb3604d5bd079cfb597d67775449e80a441770d180af70327f937cfc172acb9ad4084d0c8fa6e8c920636de1361df078205c632337ff99d3857cc63b48a0bc067adb0088ab3f7b218e6cb7a3ed37a21ad9baac46476ae13d56ffc34eb27fa7b3b |