Problem B
Parse The Data
The malware report mentions a serialized data structure
under one of the many layers. This structure contains metadata
about each of the files and includes important information like
original filename, key data, and the actual encrypted data.
This information will be important to extract from the data
structure. Review the analysis below and write a decoding
routine to parse this data.
Tag-Length-Value
According to the report, the serialized structure always
begins with a header that names the version of the
tag-length-value (TLV) format used: “tlv_format_6”. This is not
a standard of any kind, so it must be the ransomware developers’
own format numbering. This type of structure is also often
called type-length-value encoding.
After the header, the structure is a series of objects, each encoded as a single-byte TAG, followed immediately by a LENGTH of one or more bytes, and then the raw data VALUE of LENGTH bytes. For each encoded object, the TAG byte corresponds to a named entity. Those identified in the report make up the following table:
BYTE | TAG | TYPE |
---|---|---|
0x01 | VICTIM_ID | ASCII STRING |
0x02 | VERSION | ASCII STRING |
0x06 | FILENAME | ASCII STRING |
0x08 | IV | RAW DATA |
0x09 | NONCE | RAW DATA |
0x13 | KEY_HASH | ASCII STRING |
0x27 | ENC_DATA | RAW DATA |
The length is encoded as a variable-length quantity (VLQ), a way of encoding a value in multiple bytes without needing to know ahead of time how many bytes are required. For this structure, each length byte is made up of the $8$ bits ${\tt fbbbbbbb}$, where ${\tt f}$ is a bit flag that is $1$ if more length bytes follow and $0$ if not. This flag is sometimes called a continuation bit. The remaining $7$ bits (the ${\tt b}$s) carry the length value, with the highest-order bits in the first byte, so the $7$-bit groups are concatenated in the order they are read. As an example, $127$ is encoded as the single byte ${\tt 0x7F}$ and $128$ as the $2$ bytes ${\tt 0x81}$ ${\tt 0x00}$. Refer to the following table for a few detailed examples:
LENGTH | BITS | ENCODED (BITS) | ENCODED (BYTES) |
---|---|---|---|
0 | 00000000 | 0-0000000 | 0x00 |
127 | 01111111 | 0-1111111 | 0x7F |
128 | 10000000 | 1-0000001 0-0000000 | 0x81 0x00 |
8192 | 00100000 00000000 | 1-1000000 0-0000000 | 0xC0 0x00 |
16383 | 00111111 11111111 | 1-1111111 0-1111111 | 0xFF 0x7F |
16384 | 01000000 00000000 | 1-0000001 1-0000000 0-0000000 | 0x81 0x80 0x00 |
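The continuation-bit rule described above can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function name `decode_vlq` and its `(value, next_pos)` return convention are assumptions made for illustration, and truncated input is not handled (it surfaces as an `IndexError`).

```python
def decode_vlq(data: bytes, pos: int = 0):
    """Decode one VLQ length starting at data[pos].

    Returns (value, next_pos), where next_pos is the index just past
    the last length byte. The name and return shape are illustrative.
    """
    value = 0
    while True:
        byte = data[pos]
        pos += 1
        # Shift in the 7 payload bits; earlier bytes hold higher-order bits.
        value = (value << 7) | (byte & 0x7F)
        # A clear top bit (the continuation flag) marks the last length byte.
        if not byte & 0x80:
            return value, pos


# Decode the encodings listed in the table above.
for encoded in ("00", "7f", "8100", "c000", "ff7f", "818000"):
    length, _ = decode_vlq(bytes.fromhex(encoded))
    print(encoded, "->", length)
```

Run on the six encodings from the table, the loop reproduces the LENGTH column: 0, 127, 128, 8192, 16383, and 16384.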
The VALUE that follows consists of exactly LENGTH raw, uncompressed bytes and is followed by either another TLV entry or the end of the stream. Here is an illustration of the data:
"tlv_format_6" (12 bytes) | TAG (1 byte) | LENGTH (varies) | VALUE (varies) | TAG (1 byte) | … |
---|---|---|---|---|---|
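The layout illustrated above suggests a simple loop: check the header, then repeatedly read a tag byte, a VLQ length, and that many value bytes until the stream is exhausted. The following is one possible sketch. The names `HEADER` and `parse_tlv` and the choice to raise `ValueError` on malformed input are assumptions for illustration, since the problem statement does not say how bad data should be handled.

```python
HEADER = b"tlv_format_6"  # the 12-byte ASCII header named in the report


def parse_tlv(blob: bytes):
    """Yield (tag, value) pairs in the order they appear in the stream."""
    if not blob.startswith(HEADER):
        raise ValueError("missing tlv_format_6 header")
    pos = len(HEADER)
    while pos < len(blob):
        tag = blob[pos]
        pos += 1
        # VLQ length: 7 bits per byte, top bit set while more bytes follow.
        length = 0
        while True:
            byte = blob[pos]
            pos += 1
            length = (length << 7) | (byte & 0x7F)
            if byte < 0x80:
                break
        value = blob[pos:pos + length]
        if len(value) != length:
            raise ValueError("value is shorter than its declared length")
        pos += length
        yield tag, value


# The first 33 bytes of Sample Input 3: header, VERSION, FILENAME.
prefix = bytes.fromhex(
    "746c765f666f726d61745f36"
    "0203322e30"
    "060e456d61696c5f6c6973742e747874"
)
for tag, value in parse_tlv(prefix):
    print(hex(tag), value)
```

The generator leaves tag naming and output formatting to the caller, so the same loop works for both ASCII and raw entries.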
Input
Each line of input is the HEX encoding of some sample serialized data in TLV format.
Output
Parse the data completely and print out the contents as TAG:VALUE on individual lines in the order the data is read from the structure. If the value type is an ASCII STRING, print it out raw. If the value type is RAW DATA, print it out as HEX. If you come across an unknown tag (one that is not listed in the table), use UNKNOWN for the tag and encode the value in HEX.
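The output rules can be expressed as a lookup table plus one formatting function. The sketch below assumes an entry has already been split into a tag byte and its value bytes. The names `TAG_TABLE` and `format_entry` are hypothetical, and lowercase HEX is an assumption based on the sample outputs.

```python
# Tag byte -> (name, is_ascii), transcribed from the tag table.
TAG_TABLE = {
    0x01: ("VICTIM_ID", True),
    0x02: ("VERSION", True),
    0x06: ("FILENAME", True),
    0x08: ("IV", False),
    0x09: ("NONCE", False),
    0x13: ("KEY_HASH", True),
    0x27: ("ENC_DATA", False),
}


def format_entry(tag: int, value: bytes) -> str:
    """Render one entry as TAG:VALUE following the output rules."""
    # Tags missing from the table are reported as UNKNOWN and hex-encoded.
    name, is_ascii = TAG_TABLE.get(tag, ("UNKNOWN", False))
    text = value.decode("ascii") if is_ascii else value.hex()
    return f"{name}:{text}"


# Three entries taken from Sample Input 3.
print(format_entry(0x02, b"2.0"))
print(format_entry(0x08, bytes.fromhex("a5fe4980be9528fc")))
print(format_entry(0x32, b"1"))
```

These calls print `VERSION:2.0`, `IV:a5fe4980be9528fc`, and `UNKNOWN:31`, matching the corresponding lines of Sample Output 3.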
Sample Input 1 | Sample Output 1 |
---|---|
746c765f666f726d61745f360203312e300613436f72706f7261746520496e666f2e786c7378e42032316634363461346131336334636262633635393638636638386135313038331340313138333934333230343a3830363233373338363a323435333637383137383a313632313838353936303a333730333131393536323a33363833333532353436012437646165356332362d633439622d343963662d626131662d326138313838396630333535320131278107b63a716730b185c2c159600d71760153631f687e7e621a4a75153a38b75f2b524d86cbb165bb4ab8e97e9f636f4fa21f288a1d55d7fa22b682a96d26418f943be79552ed7d392435eb7ad116c39cc01af235ab6dcb4c31fe860491171a8e008a3533072be51ce358c94e4204f7ff1459f0250be7472be10a9b91d98bfa45d97eb84c18dcbc1bd3 |
VERSION:1.0 FILENAME:Corporate Info.xlsx UNKNOWN:3231663436346134613133633463626263363539363863663838613531303833 KEY_HASH:1183943204:806237386:2453678178:1621885960:3703119562:3683352546 VICTIM_ID:7dae5c26-c49b-49cf-ba1f-2a81889f0355 UNKNOWN:31 ENC_DATA:b63a716730b185c2c159600d71760153631f687e7e621a4a75153a38b75f2b524d86cbb165bb4ab8e97e9f636f4fa21f288a1d55d7fa22b682a96d26418f943be79552ed7d392435eb7ad116c39cc01af235ab6dcb4c31fe860491171a8e008a3533072be51ce358c94e4204f7ff1459f0250be7472be10a9b91d98bfa45d97eb84c18dcbc1bd3 |
Sample Input 2 | Sample Output 2 |
---|---|
746c765f666f726d61745f360203312e32060e446174612049523235302e706466e42039376236626435383934306130356232373630323466343837626533616433613201310908c61d5ed666ca471e012465383332613538392d366432652d343664392d626265612d613631646434666331383233133b3833393432303033343a313139363034313a3131313637353537323a333232363733313532303a3434353736363639353a3138393932303132333727208f57fd8b26bf78cf7bd7e79d0e9e348f0fb1204365997f8d0d0efd40bde36616 |
VERSION:1.2 FILENAME:Data IR250.pdf UNKNOWN:3937623662643538393430613035623237363032346634383762653361643361 UNKNOWN:31 NONCE:c61d5ed666ca471e VICTIM_ID:e832a589-6d2e-46d9-bbea-a61dd4fc1823 KEY_HASH:839420034:1196041:111675572:3226731520:445766695:1899201237 ENC_DATA:8f57fd8b26bf78cf7bd7e79d0e9e348f0fb1204365997f8d0d0efd40bde36616 |
Sample Input 3 | Sample Output 3 |
---|---|
746c765f666f726d61745f360203322e30060e456d61696c5f6c6973742e747874272127d0c6b1cefdc88f8a6b84dd9cd6fc2ab39b6158e44d4e2f3f34bfe279562c3821133e3230313332373631363a313038373232383830343a3133353438303136323a3834333631383331363a333032393437303634393a333139343236303530310808a5fe4980be9528fce4203562663432363163643764393337303530383835643434373738353430616139320131012433633637363963652d313637372d343364382d383639342d636434353637353064353237 |
VERSION:2.0 FILENAME:Email_list.txt ENC_DATA:27d0c6b1cefdc88f8a6b84dd9cd6fc2ab39b6158e44d4e2f3f34bfe279562c3821 KEY_HASH:201327616:1087228804:135480162:843618316:3029470649:3194260501 IV:a5fe4980be9528fc UNKNOWN:3562663432363163643764393337303530383835643434373738353430616139 UNKNOWN:31 VICTIM_ID:3c6769ce-1677-43d8-8694-cd456750d527 |
Sample Input 4 | Sample Output 4 |
---|---|
746c765f666f726d61745f36060b5f5f696e69745f5f2e707908082777725220229201012430653065306464332d333134342d343238612d616633332d653761353661313538616636278136cf7d0a1cccf8fc8da7acdb71f7a54d5708195f59a744aae87f588457d926af1124c963c1341ff29bc4d87671da85aec96ecb20d96e1e886960721a67c977d1011bfbe47c7ab7cf0593ea65559db56bcb31c2b25e5c5a405c61a02d5d397432057c689bf6a8acb886a15af61e96422fd6ffde67be2b98c7b3f80031d8bf2778c50c726488ed0ff647c211b2c25a20bcc5744c48cd991f2d1a6b1be0eedf12fd039b26d14a0905b1838fc928948e9e68a9a857e2199daee42031653262663631636437343333333136633263363565373461643731373139353201310203322e30133c33373838303036343a3238373430323031363a32353239373331343a313631343838393034303a333735303139393231363a34323033393638333139 |
FILENAME:__init__.py IV:2777725220229201 VICTIM_ID:0e0e0dd3-3144-428a-af33-e7a56a158af6 ENC_DATA:cf7d0a1cccf8fc8da7acdb71f7a54d5708195f59a744aae87f588457d926af1124c963c1341ff29bc4d87671da85aec96ecb20d96e1e886960721a67c977d1011bfbe47c7ab7cf0593ea65559db56bcb31c2b25e5c5a405c61a02d5d397432057c689bf6a8acb886a15af61e96422fd6ffde67be2b98c7b3f80031d8bf2778c50c726488ed0ff647c211b2c25a20bcc5744c48cd991f2d1a6b1be0eedf12fd039b26d14a0905b1838fc928948e9e68a9a857e2199dae UNKNOWN:3165326266363163643734333333313663326336356537346164373137313935 UNKNOWN:31 VERSION:2.0 KEY_HASH:37880064:287402016:25297314:1614889040:3750199216:4203968319 |