Viper


Analyzing and mining PE32 files

In this brief blog post we'll look at what Viper currently provides to dissect PE32 files and mine your repository to identify similar binaries and possibly reconstruct campaigns and cluster malware attacks.

As with the previous posts, the accompanying screencast shows example usage of the modules and commands available in this specific scenario. For the sake of demonstration, I'm working on the APT1 samples pack published by Mila.

First we create a new project, in this case named "apt1". Then we import all the PE32 files contained in the source directory, as shown at 00:12, through store --folder /tmp/apt1 --file-type PE32. You can apply additional filters to the store command if needed:

usage: store [-h] [-d] [-f <path>] [-s <size>] [-y <type>] [-n <name>] [-t]

Options:
    --help (-h) Show this help message
    --delete (-d)   Delete the original file
    --folder (-f)   Specify a folder to import
    --file-size (-s)    Specify a maximum file size
    --file-type (-y)    Specify a file type pattern
    --file-name (-n)    Specify a file name pattern 
    --tags (-t) Specify a list of comma-separated tags

In the cast I then launch yara scan --tag to tag all the previously imported malware samples as an initial sorting pass, and then select and open one sample that was classified as thequickbrow_apt1.

At 1:32 we look for similar samples using fuzzy hashing and already get some hits:

apt1 viper NEWSREELS_sample_BAABD9B76BFF84ED27FD432CFC6DF241 > fuzzy
[*] 5 relevant matches found
+-------+---------------------------------------------------+------------------------------------------------------------------+
| Score | Name                                              | SHA256                                                           |
+-------+---------------------------------------------------+------------------------------------------------------------------+
| 54%   | NEWSREELS_sample_2C49F47C98203B110799AB622265F4EF | 7e732e41d93b613cac1ba979d7f7c98c8603f65a50bbf6b6198f1ee396dc7174 |
| 91%   | NEWSREELS_sample_AF2F7B070245C90BD2A0A0845314173A | f737829e9ad9a025945ad9ce803641677ae0fe3abf43b1984a7c8ab994923178 |
| 97%   | NEWSREELS_sample_A639F598D4C0B9AA7A4691D05F27D977 | 693f08996d40c0c2bdb25ae5457d44f9df694a8972a70fe989312753c7fe9ab4 |
| 54%   | NEWSREELS_sample_0496E3B17CF40C45F495188A368C203A | dbbd5d7944b1791027762a40a70b3c74772a9d31b5c67b6519394a1705edabcc |
| 54%   | NEWSREELS_sample_B8277CCE81E0A372BC35D33A0C9483C2 | 383f0d2cbf8914c3ecb23ea82bff38e1c048980806e37d75e3539362d105675c |
+-------+---------------------------------------------------+------------------------------------------------------------------+

The pe module provides several options; we'll try out a few of them:

usage: pe <command>

Options:
    help        Show this help message
    imports     List PE imports
    exports     List PE exports
    resources   List PE resources
    imphash     Get and scan for imphash
    compiletime Show the compiletime
    peid        Show the PEiD signatures
    security    Show digital signature
    language    Guess PE language
    sections    List PE Sections

At 1:42 we use pe imports to list the import table from the PE headers. Nothing particularly interesting in there.

At 1:50 we retrieve all available PE resources:

apt1 viper NEWSREELS_sample_BAABD9B76BFF84ED27FD432CFC6DF241 > pe resources
+---+------------+--------+----------------------------------+-------+-----------+--------------+----------------------------+
| # | Name       | Offset | MD5                              | Size  | File Type | Language     | Sublanguage                |
+---+------------+--------+----------------------------------+-------+-----------+--------------+----------------------------+
| 1 | RT_STRING  | 0x73b8 | 5d15b213e8ad8482ac6a7b7e86a3dbea | 0x98  | data      | LANG_CHINESE | SUBLANG_CHINESE_SIMPLIFIED |
| 2 | RT_VERSION | 0x70a0 | e343766d1fa3577f9006387578cace3b | 0x318 | data      | LANG_CHINESE | SUBLANG_CHINESE_SIMPLIFIED |
+---+------------+--------+----------------------------------+-------+-----------+--------------+----------------------------+

We can then inspect single resources by opening them in a new Viper session:

apt1 viper NEWSREELS_sample_BAABD9B76BFF84ED27FD432CFC6DF241 > pe resources --open 1
[*] Session opened on /tmp/tmprK0758/baabd9b76bff84ed27fd432cfc6df241_0x73b8_RT_STRING
apt1 viper baabd9b76bff84ed27fd432cfc6df241_0x73b8_RT_STRING >

apt1 viper baabd9b76bff84ed27fd432cfc6df241_0x73b8_RT_STRING > !hexdump -C $self
00000000  00 00 3c 00 47 00 6f 00  76 00 64 00 4a 00 6c 00  |..<.G.o.v.d.J.l.|
00000010  44 00 53 00 6d 00 68 00  45 00 6d 00 44 00 51 00  |D.S.m.h.E.m.D.Q.|
00000020  70 00 53 00 40 00 3d 00  35 00 69 00 31 00 62 00  |p.S.@.=.5.i.1.b.|
00000030  38 00 6b 00 35 00 46 00  42 00 54 00 37 00 73 00  |8.k.5.F.B.T.7.s.|
00000040  6d 00 45 00 33 00 5a 00  4f 00 5a 00 42 00 46 00  |m.E.3.Z.O.Z.B.F.|
00000050  62 00 3d 00 4a 00 59 00  37 00 62 00 33 00 6c 00  |b.=.J.Y.7.b.3.l.|
00000060  6a 00 4a 00 4b 00 48 00  71 00 48 00 35 00 7a 00  |j.J.K.H.q.H.5.z.|
00000070  57 00 54 00 4a 00 6f 00  34 00 6a 00 00 00 00 00  |W.T.J.o.4.j.....|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000098
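The dump above is consistent with the standard layout of an RT_STRING block: 16 entries, each a 16-bit length (counted in UTF-16 code units) followed by that many UTF-16LE characters. As a minimal sketch, not Viper's own code, the following rebuilds the 0x98 bytes from the hexdump and decodes them; the function name parse_string_table is an illustrative assumption.

```python
import struct


def parse_string_table(blob: bytes) -> list:
    """Decode one RT_STRING block: up to 16 length-prefixed UTF-16LE strings."""
    strings = []
    offset = 0
    for _ in range(16):
        if offset + 2 > len(blob):
            break
        # Each entry starts with its length in UTF-16 code units (not bytes).
        (length,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        raw = blob[offset:offset + 2 * length]
        strings.append(raw.decode("utf-16-le", errors="replace"))
        offset += 2 * length
    return strings


# Rebuild the resource from the hexdump: one empty entry, one 0x3c-character
# entry, then 14 empty entries (28 zero bytes).
payload = "GovdJlDSmhEmDQpS@=5i1b8k5FBT7smE3ZOZBFb=JY7b3ljJKHqH5zWTJo4j"
blob = (
    struct.pack("<H", 0)
    + struct.pack("<H", len(payload))
    + payload.encode("utf-16-le")
    + b"\x00" * 28
)

entries = parse_string_table(blob)
non_empty = [s for s in entries if s]
print(hex(len(blob)), non_empty)
```

Only the second entry of the block is populated; what the 60-character string encodes is not something the dump alone tells us.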

As shown at 2:22, we can also scan the full repository for all the files that share one or more resources with the one currently open:

apt1 viper NEWSREELS_sample_BAABD9B76BFF84ED27FD432CFC6DF241 > pe resources --scan
[*] Scanning the repository for matching samples...
[*] 5 relevant matches found
+---------------------------------------------------+----------------------------------+----------------------------------+
| Name                                              | MD5                              | Resource MD5                     |
+---------------------------------------------------+----------------------------------+----------------------------------+
| NEWSREELS_sample_2C49F47C98203B110799AB622265F4EF | 2c49f47c98203b110799ab622265f4ef | e343766d1fa3577f9006387578cace3b |
| NEWSREELS_sample_AF2F7B070245C90BD2A0A0845314173A | af2f7b070245c90bd2a0a0845314173a | e343766d1fa3577f9006387578cace3b |
| NEWSREELS_sample_A639F598D4C0B9AA7A4691D05F27D977 | a639f598d4c0b9aa7a4691d05f27d977 | e343766d1fa3577f9006387578cace3b |
| NEWSREELS_sample_0496E3B17CF40C45F495188A368C203A | 0496e3b17cf40c45f495188a368c203a | e343766d1fa3577f9006387578cace3b |
| NEWSREELS_sample_B8277CCE81E0A372BC35D33A0C9483C2 | b8277cce81e0a372bc35d33a0c9483c2 | e343766d1fa3577f9006387578cace3b |
+---------------------------------------------------+----------------------------------+----------------------------------+
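Conceptually, this kind of scan boils down to hashing every resource and intersecting the digests. The sketch below illustrates the idea under stated assumptions: the resources are already extracted as raw bytes, and the sample names, byte strings, and the helper scan_shared_resources are hypothetical, not Viper's implementation.

```python
import hashlib


def resource_digests(resources):
    """Return the set of MD5 hex digests for an iterable of raw resource bytes."""
    return {hashlib.md5(data).hexdigest() for data in resources}


def scan_shared_resources(current, repository):
    """List (sample name, shared resource MD5) pairs for samples in `repository`
    that share at least one resource with `current`."""
    wanted = resource_digests(current)
    matches = []
    for name, resources in repository.items():
        for digest in sorted(wanted & resource_digests(resources)):
            matches.append((name, digest))
    return matches


# Hypothetical data: placeholder bytes stand in for extracted PE resources.
version_info = b"placeholder RT_VERSION bytes"
current_sample = [b"placeholder RT_STRING bytes", version_info]
repository = {
    "sample_a": [version_info, b"some other resource"],
    "sample_b": [b"unrelated resource"],
}

for name, digest in scan_shared_resources(current_sample, repository):
    print(name, digest)
```

In the real output above, every match shares the RT_VERSION resource (e343766d1fa3577f9006387578cace3b) rather than the RT_STRING one.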

Similarly, we can scan the whole repository for samples sharing the same imphash:

apt1 viper NEWSREELS_sample_BAABD9B76BFF84ED27FD432CFC6DF241 > pe imphash --scan
[*] Imphash: 4f0aca83dfe82b02bbecce448ce8be00
[*] Scanning the repository for matching samples...
[*] 2 relevant matches found
+---------------------------------------------------+------------------------------------------------------------------+
| Name                                              | SHA256                                                           |
+---------------------------------------------------+------------------------------------------------------------------+
| NEWSREELS_sample_AF2F7B070245C90BD2A0A0845314173A | f737829e9ad9a025945ad9ce803641677ae0fe3abf43b1984a7c8ab994923178 |
| NEWSREELS_sample_A639F598D4C0B9AA7A4691D05F27D977 | 693f08996d40c0c2bdb25ae5457d44f9df694a8972a70fe989312753c7fe9ab4 |
+---------------------------------------------------+------------------------------------------------------------------+
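For readers unfamiliar with it, an imphash is an MD5 computed over the normalized import list, so binaries that import the same functions in the same order produce the same value. The following is a simplified sketch of that normalization, assuming the imports are already parsed into (DLL, function) pairs and ignoring imports by ordinal; the example imports are hypothetical and this is not the code Viper runs.

```python
import hashlib


def imphash(imports):
    """Compute an imphash-style digest from ordered (dll, function) pairs.

    Simplified: named imports only, imports by ordinal are not handled.
    """
    parts = []
    for dll, func in imports:
        lib = dll.lower()
        # Drop a trailing .dll/.ocx/.sys extension from the library name.
        stem, dot, ext = lib.rpartition(".")
        if dot and ext in ("dll", "ocx", "sys"):
            lib = stem
        parts.append(f"{lib}.{func.lower()}")
    # Order is preserved on purpose: it reflects the import table layout.
    return hashlib.md5(",".join(parts).encode("utf-8")).hexdigest()


# Hypothetical import list for illustration.
imports = [
    ("KERNEL32.dll", "CreateFileA"),
    ("KERNEL32.dll", "WriteFile"),
    ("WS2_32.dll", "connect"),
]
print(imphash(imports))
```

Because the import order feeds into the digest, the match is stricter than a comparison of import sets, which may explain why only two of the five fuzzy and resource matches show up here.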

Another interesting capability is retrieving the compile time of the sample you are currently analyzing and scanning the repository for all others sharing the same compile time. You can also specify a time window in minutes. In the cast we look for all samples compiled within a 24-hour time span:

apt1 viper NEWSREELS_sample_BAABD9B76BFF84ED27FD432CFC6DF241 > pe compiletime --scan --window 1440
[*] Compile Time: 2012-04-17 11:32:54
[*] Scanning the repository for matching samples...
[*] 2 relevant matches found
+---------------------------------------------------+----------------------------------+---------------------+
| Name                                              | MD5                              | Compile Time        |
+---------------------------------------------------+----------------------------------+---------------------+
| NEWSREELS_sample_02C65973B6018F5D473D701B3E7508B2 | 02c65973b6018f5d473d701b3e7508b2 | 2012-04-17 10:30:01 |
| NEWSREELS_sample_933B11BC4799F8D9F65466FB2E3EA659 | 933b11bc4799f8d9f65466fb2e3ea659 | 2012-04-17 10:29:00 |
+---------------------------------------------------+----------------------------------+---------------------+
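The compile time comes from the TimeDateStamp field of the COFF header, which sits 8 bytes after the "PE\0\0" signature. Here is a minimal sketch of reading it and applying a window in minutes. It assumes the timestamp is interpreted as UTC and that the window applies in both directions around the current sample, which may differ from how Viper does it; fake_pe builds a header-only stub for the demonstration and is purely hypothetical.

```python
import struct
from datetime import datetime, timedelta, timezone


def pe_compile_time(data: bytes) -> datetime:
    """Read TimeDateStamp from the COFF header of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ header")
    # e_lfanew at offset 0x3c points to the PE signature.
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # After the signature: Machine (2 bytes), NumberOfSections (2), TimeDateStamp (4).
    (stamp,) = struct.unpack_from("<I", data, e_lfanew + 8)
    return datetime.fromtimestamp(stamp, tz=timezone.utc)


def within_window(a: datetime, b: datetime, minutes: int) -> bool:
    """True if the two compile times are at most `minutes` apart."""
    return abs(a - b) <= timedelta(minutes=minutes)


def fake_pe(when: datetime) -> bytes:
    """Build a header-only stub carrying `when` as TimeDateStamp (not a loadable executable)."""
    header = b"MZ" + b"\x00" * (0x3C - 2) + struct.pack("<I", 0x40)
    coff = b"PE\x00\x00" + struct.pack("<HHI", 0x014C, 0, int(when.timestamp()))
    return header + coff


current = datetime(2012, 4, 17, 11, 32, 54, tzinfo=timezone.utc)
others = [
    datetime(2012, 4, 17, 10, 30, 1, tzinfo=timezone.utc),
    datetime(2012, 4, 17, 10, 29, 0, tzinfo=timezone.utc),
    datetime(2011, 1, 1, 0, 0, 0, tzinfo=timezone.utc),  # hypothetical unrelated sample
]

reference = pe_compile_time(fake_pe(current))
for when in others:
    stamp = pe_compile_time(fake_pe(when))
    print(stamp.isoformat(), within_window(reference, stamp, 1440))
```

Keep in mind that the timestamp is set by the linker and can be forged or zeroed, so a match is a lead rather than proof.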

The two additional hits show that the malware authors compiled these related samples within about an hour of each other. This gives interesting insights into the context and connections between the malware samples you have collected, and opens avenues for further investigation.

At the end of the cast you can see how you can also scan, for example, for all samples sharing the same PEiD signature with pe peid --scan, as well as the ones matching the same programming language with pe language --scan.

The details available in a PE32 file offer several interesting ways to mine and cluster the malware in your repository. What we've seen in this post is just a small fraction of what we can potentially achieve.

If you have ideas on ways to expand this particular module, feedback and contributions are welcome through GitHub.

published on 2014-08-28 13:00:00 by nex
