# Dissecting Smoke Loader | CERT Polska

**cert.pl/en/news/single/dissecting-smoke-loader/**

Smoke Loader (also known as Dofoil) is a relatively small, modular bot that is mainly used to drop various malware families.

Even though it’s designed to drop other malware, it has some pretty hefty malware-like capabilities of its own.

Despite being quite old, it’s still going strong, recently being distributed through RigEK and MalSpam campaigns. In this article we’ll see how Smoke Loader unpacks itself and interacts with the C2 server.

Smoke Loader first surfaced in June 2011, when it was advertised for sale on grabberz.com [1] and xaker.name [2] by a user called SmokeLdr.

_Smoke Loader being sold on grabberz.com_

What’s interesting is that Smoke Loader is sold only to Russian-language speakers [3].

Since all functionalities are clearly described in the mentioned forum posts up to 2016, there is no point in listing them all here.

The sample we’ll be analysing is [d32834d4b087ead2e7a2817db67ba8ca](https://www.virustotal.com/en/file/20dce650c10545ae85005b3fe159df250c4f1275edfe4439e2d5a2d0515029de/analysis/1524764893/).

_Diagram presenting the unpacking timeline_

If you’re only interested in the final payload, you can take a quick glance at the diagram above and skip to the final layer.

## Layer I

The first thing Smoke Loader hits us with is a simple PECompact2 or UPX compression. As with many executable compressors, both are pretty easy to decompress using publicly accessible software:

_PECompact being used to decompress the first layer_

_Decompressing UPX-packed sample_

That wasn’t hard, let’s move on.

## Layer II

_Entry function, which handles the debugging check and performs some useless API calls as a disguise_

## Debugger checks

The PEB structure is checked for signs of a debugger:

## Lots of garbage code

Almost every function is injected with pointless instructions in order to make the disassembly more complicated than it really is.
_A part of RC4 function, which contains a lot of useless code_

## RC4-encrypted imports

In this stage, almost all imports and library names are stored RC4-encrypted and are only decrypted right before being passed to LoadLibraryA and then to GetProcAddress.

The encrypted imports are first placed on the stack:

Then they are decrypted using RC4 with the hardcoded key:

Finally, the library name is passed to LoadLibraryA and the function name to GetProcAddress:

A custom import table is populated this way and used further in execution.

## Unpacking

Finally, a new process is created and two calls to WriteProcessMemory are performed:

_The writes are pretty characteristic and can be easily noticed in the Cuckoo report_

One of them writes the MZ header and the other writes the rest of the binary. If we concatenate these two writes, we’ll get the next layer.

## Layer III

We’re welcomed with:

_The exported start address_

Well, that’s not good. What we see is the result of several obfuscation methods and tricks. We’ll look at each one and try to understand how it works.

## Jump chains

[Almost all early-executed functions adopt a chained jumps obfuscation technique.](https://thisissecurity.stormshield.com/2018/03/20/de-obfuscating-jump-chains-with-binary-ninja/) Instead of being laid out in a normal, linear manner, the instructions are scattered around the function, with jump instructions connecting each one to the next.

_The control flow is all over the place_

If we were to write a script to follow the program’s flow and graph the instructions, we’d probably get something like this:

_Partially deobfuscated start function_

One can almost immediately see that the vast majority of instructions are used only to divert the natural program flow.
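The flow-following idea can be illustrated on a toy model. The sketch below is not the script used in this analysis: it assumes a hypothetical, simplified listing (`TOY_CODE`, a dict mapping made-up addresses to a mnemonic and an operand) instead of a real disassembly. It treats `jmp X`, as well as a `jz X` directly followed by `jnz X`, as an unconditional jump and simply skips over them.

```python
# Toy listing: address -> (mnemonic, operand). Addresses and instructions are
# invented for illustration; they are NOT taken from the analysed sample.
TOY_CODE = {
    0x10: ("push", "ebp"),
    0x11: ("jmp", 0x40),
    0x20: ("mov", "eax, fs:[0x30]"),
    0x21: ("jz", 0x50),
    0x22: ("jnz", 0x50),
    0x40: ("mov", "ebp, esp"),
    0x41: ("jmp", 0x20),
    0x50: ("retn", None),
}


def jump_target(code, next_addr, addr):
    """Return the target if execution at addr always jumps, else None."""
    mnem, op = code[addr]
    if mnem == "jmp":
        return op
    nxt = next_addr.get(addr)
    if mnem in ("jz", "jnz") and nxt is not None:
        mnem2, op2 = code[nxt]
        # jz X directly followed by jnz X (or the reverse) always lands at X.
        if {mnem, mnem2} == {"jz", "jnz"} and op == op2:
            return op
    return None


def linearize(code, start):
    """Follow the flow from start and return the non-jump instructions in order."""
    addrs = sorted(code)
    # In this toy model the "next instruction" is just the next known address.
    next_addr = dict(zip(addrs, addrs[1:]))
    out, seen, addr = [], set(), start
    while addr is not None and addr not in seen:
        seen.add(addr)
        target = jump_target(code, next_addr, addr)
        if target is not None:
            addr = target  # skip the jump itself, continue at its target
            continue
        mnem, op = code[addr]
        out.append(mnem if op is None else f"{mnem} {op}")
        if mnem == "retn":
            break
        addr = next_addr.get(addr)
    return out


print(linearize(TOY_CODE, 0x10))
```

On the toy listing, the eight scattered entries collapse to four meaningful instructions in execution order. A real script would have to work on actual disassembly and deal with genuine conditional branches, which this sketch simply keeps in the output.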
### Defeating

**Attempt I**

We tried creating an idaapi script that looks through all instruction blocks within a function and tries to concatenate blocks that are connected with each other via a 1:1 jump (a jump from exactly one possible address to exactly one possible location).

The author had probably thought about that and implemented jmp instructions using consecutive jnz and jz instructions. This doesn’t complicate our solution too much, though.

_A very naive Python script implementing the mentioned approach_

If we run it on the start function and strip the jumps, we get:

A lot better! But we can actually do even better by letting IDA do most of the work for us.

**Attempt II**

The only thing we need to do in order to make IDA recognize these blocks as a valid function is to make sure that all of the jumps are marked as a definitive change of control flow. While jmp instructions are marked as such by default, the jz/jnz instructions need to be patched to jmp instructions:

_Notice the newly-created dotted line that denotes the end of the function code_

This trick allows IDA to recognize function bodies and even attempt to decompile them:

_Decompiled start function after patching all jz/jnz instructions_

While (as almost always) the decompilation isn’t 100% correct, it gives us a good basic idea of what the function does.

This function, for example, loads the PEB structure and then accesses the OSMajorVersion and BeingDebugged fields.

## Debugging checks

In this layer, we’ve noticed 2 debugging checks, conveniently located right at the beginning of execution. While they are the same as in the previous stage, the approach differs slightly.
What is interesting is that the values read by the debugging checks are used in calculating the addresses of the next functions:

_Reading the BeingDebugged field from PEB_

_Reading the NtGlobalFlag field from PEB_

The code calculates the next jump address based on the values of the BeingDebugged and NtGlobalFlag fields. If either one is not equal to 0, the execution jumps to a random invalid place in memory. Harsh. Patching the binary or changing the values mid-debugging normally works, though.

## Virtualization checks

The binary tries to get the module handle of “sbiedll” (a library that is used in sandboxing processes in Sandboxie) using GetModuleHandleA. If it succeeds, which means Sandboxie is present on the system, the program exits.

The registry key System\CurrentControlSet\Services\Disk\Enum is queried, and if any of the following strings is found within the returned value, the program exits:

- qemu
- virtio
- vmware
- vbox
- xen

## Function body encryption

A vast majority of functions are encrypted:

_A function that is partially encrypted_

After deobfuscation, the encryption function turns out to be pretty simple:

_Decompiled code decryption method_

It accepts an address and a number of bytes in the eax and ecx registers respectively and xors all bytes in that range with a hardcoded byte.

What’s also interesting is that the binary tries to keep as little code unencrypted at a time as possible:

_Example of keeping the code encrypted_

We’re able to decrypt the chunks using an idaapi patching script:

_Simple idaapi script that xors a given region with a byte_

## Assembly tricks

This layer employs a few neat position-independent-code assembly tricks.
### Assembly Trick I

- call loc_4024A7: pushes the address of whatever follows the call (in this case the string “kernel32”) onto the stack and jumps over the data to the code
- pop esi: puts the string’s address into the esi register
- cmp byte ptr [esi], 0: the pointer can now be used like a normal rdata string

### Assembly Trick II

Instead of executing jmp eax, eax is first pushed onto the stack and then retn is executed.

### Assembly Trick III

call $+5 jumps to the next instruction (as the call $+5 instruction is 5 bytes long), but because it’s a call, it also pushes that address onto the stack.

In this case, this is used to calculate the program’s base address (0x004023AA - 0x23AA = 0x00400000).

## Custom imports

[This stage uses a custom import table using a djb2 hash lookup.](https://gist.github.com/lmas/664afa94f922c1e58d5c3d73aed98f3f)

It first iterates over 4 hardcoded library names, loads each one using LdrLoadDll and stores the handle.

Next, it iterates over the 4 corresponding arrays of import hashes and looks for matching values. When a match is found, it grabs the function’s address from the library thunk and stores it in an API table that is kept on the stack.
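The hash lookup described above can be sketched in Python. This is a minimal illustration, assuming the classic djb2 variant (seed 5381, multiply by 33, add the byte) truncated to 32 bits; the exact variant, seed and case handling used by the sample are not confirmed here. `FAKE_EXPORTS`, its addresses and the `resolve_imports` helper are hypothetical stand-ins for a parsed export table.

```python
def djb2(name: bytes) -> int:
    """Classic djb2: h = h * 33 + c, truncated to 32 bits (assumed variant)."""
    h = 5381
    for c in name:
        h = (h * 33 + c) & 0xFFFFFFFF
    return h


def resolve_imports(exports, wanted_hashes):
    """Return, for each wanted hash, the address of the matching export or None.

    exports: dict of export name (bytes) -> address, standing in for the
    export table of one loaded library.
    """
    by_hash = {djb2(name): addr for name, addr in exports.items()}
    return [by_hash.get(h) for h in wanted_hashes]


# Hypothetical export table; the addresses are made up for illustration.
FAKE_EXPORTS = {
    b"CreateFileA": 0x7C801A28,
    b"CloseHandle": 0x7C809BE7,
    b"Sleep": 0x7C802446,
}

# Only hashes are stored in the binary, so the names never appear in plaintext.
# The last hash belongs to a name that is not in the table above.
wanted = [djb2(b"Sleep"), djb2(b"CloseHandle"), djb2(b"VirtualAlloc")]
api_table = resolve_imports(FAKE_EXPORTS, wanted)
print([hex(a) if a is not None else None for a in api_table])
```

For an analyst, the same function is useful in reverse: hashing every export name of the loaded libraries and comparing the results against the hash arrays found in the binary recovers the imported function names.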
_Hashes of functions to be imported_

_Constructed API function table_

## Unpacking

Finally, the program uses RtlDecompressBuffer with COMPRESSION_FORMAT_LZNT1 to decompress the buffer and executes the final payload using PROPagate injection [4].

## Layer IV (final)

## String encryption

All strings are encrypted using RC4 with a hardcoded key:

_Function used to get a decrypted string from a specific index in the encrypted blob_

_Structure of encrypted strings blob_

In this sample, the buffer decrypts to:

_Decrypted strings_

## C2 URLs

C2 URLs are stored encrypted in the data section:

_Part of data section that contains the encrypted URLs_

The encrypted URL structure can be represented as:

_Encrypted C2 URL structure_

The encryption method is a simple xor routine with the byte key being derived from the dword key:

_Decompiled function used to decrypt C2 URLs_

Which can be rewritten in Python as:

_Output example_

## Packet structure

_Decompiled function used to pack and send command packets_

Which can be represented as a C structure:

_A struct representing the structure of command packet_

Packet encryption is done using RC4 yet again. It’s worth noting, however, that different keys are used for encrypting the outbound packets and decrypting the inbound ones:

_A part of decompiled function responsible for encrypting packets before sending them to the C2_

_A part of decompiled function responsible for decrypting packets before parsing them_

## Program routine

The binary starts by obtaining a User Agent for the IE version acquired by querying the registry key Software\Microsoft\Internet Explorer and its values svcVersion and Version. The obtained User Agent is used in later HTTP requests.

Next, it continuously tries to connect to http://www.msftncsi.com/ncsi.txt until it gets a response; this way it makes sure that the machine is connected to the internet.

Finally, Smoke Loader begins its communication routine by sending a 10001 packet to the C&C.
It gets a response with a list of plugins to be installed and a number of tasks to be fetched. The bot iterates over the task range and tries to get each task by sending a 10002 packet with the task number as an argument.

The task’s payload is often not hosted on the C&C server but on a different host, in which case a Location header with the real binary URL is returned instead.

Upon execution of the task, a 10003 packet is sent back with arg_1 equal to the task number and arg_2 equal to 1 if the task executed successfully.

_Graph representation of the communication between bot and C2_

## General IOCs

- Program dumps itself to %APPDATA%\Microsoft\Windows\[a-z]{8}\[a-z]{8}.exe
- Program creates a shortcut to itself in %APPDATA%\Microsoft\Windows\Start Menu\Programs\Startup\[a-z]{8}.lnk
- Performs a System\CurrentControlSet\Services\Disk\Enum\0 registry query
- GET requests to http://www.msftncsi.com/ncsi.txt
- POST requests with HTTP 404 responses that include data

Example request and response:

Yara rule:

## Collected IOCs

Malware configs:

Hashes:

## References

1. [https://grabberz.com/showthread.php?t=29680](https://grabberz.com/showthread.php?t=29680)
2. [https://web.archive.org/web/20160419010008/http://xaker.name/threads/22008/](https://web.archive.org/web/20160419010008/http://xaker.name/threads/22008/)
3. [http://stopmalvertising.com/rootkits/analysis-of-smoke-loader.html](http://stopmalvertising.com/rootkits/analysis-of-smoke-loader.html)
4. [http://www.hexacorn.com/blog/2017/10/26/propagate-a-new-code-injection-trick/](http://www.hexacorn.com/blog/2017/10/26/propagate-a-new-code-injection-trick/)

https://blog.malwarebytes.com/threat-analysis/2016/08/smoke-loader-downloader-with-asmokescreen-still-alive/