# [RE019] From A to X analyzing some real cases which used recent Emotet samples **[blog.vincss.net/2021/01/re019-from-a-to-x-analyzing-some-real-cases-which-used-recent-Emotet-samples.html](https://blog.vincss.net/2021/01/re019-from-a-to-x-analyzing-some-real-cases-which-used-recent-Emotet-samples.html)** ### 1. Introduction **Emotet (also known as Heodo, Geodo) is one of the most dangerous Trojan today. Through mass email spam** campaigns, it targets mostly companies and organizations to steal sensitive information from victims. Recent records show that Emotet is often used as a downloader for other malware, and is an especially popular delivery mechanism for banking Trojans, such as Qakbot and TrickBot, and also lead to ransomware attacks using Ryuk. [ANY.RUN’s annualreport pointed out that the most active malware in 2020 is Emotet.](https://any.run/cybersecurity-blog/annual-report-2020/) _Fig 1. Statistics of top threats by uploads for 2020_ In this article, we analyze in detail full attack flow in some real cases of recent Emotet samples which were discovered and handled by us while providing cyber security services to our customer: ¨ Sample 1: [· Document template: b836b13821f36bd9266f47838d3e853e](https://www.virustotal.com/gui/file/66cf65178099c0dc02f51ffb7f4f3f2fe6e6b9f216d855172eeed318023b3308) [· Loader binary: 442506cc577786006da7073c0240ff59](https://www.virustotal.com/gui/file/a2eefb7fd4de96d56e98eb9612701ce66ef8d2bfaff35ded09f42b44c4a085c4) ¨ Sample 2: [· Document template: 7dbd8ecfada1d39a81a58c9468b91039](https://www.virustotal.com/gui/file/dc40e48d2eb0e57cd16b1792bdccc185440f632783c7bcc87c955e1d4e88fc95) [· Loader binary: e87553aebac0bf74d165a87321c629be](https://www.virustotal.com/gui/file/9f3a07e6a4b0588cc4fcf4185639bde5806d212ea3b3ab8916fed2b1cc9414a7) ¨ Sample 3: [· Document template: d5ca36c0deca5d71c71ce330c72c76aa](https://www.virustotal.com/gui/file/694a91bfb3682ed18a198d7872e50684580dd7610d1534675c665d0367095660) ----- [Loader binary: 825b74dfdb58b39a1aa9847ee6470979](https://www.virustotal.com/gui/file/301c6be6158e5a28eeacab381f8e4266e4a15dc5bd149cb6724a3ea593bf0658) ### 2. Type of infection The main distribution method of Emotet malware is malicious email campaigns, using infected attachments, as well as embedded URLs. These emails may appear to come from trusted sources (cause the victim's email _account was taken over). This technique helps trick users into downloading the Trojan onto their machine._ Some illustration image of emails spread Emotet: _Fig 2. Examples of malicious emails with attachment_ ### 3. Document template and VBA code Emotet templates are constantly changing, the final target of attackers for leveraging templates to trick the victims into enabling macros to start the infection. ## 3.1. Sample 1 Document template: ----- _Fig 3. Sample 1’s document template_ This sample still acts in the usual way: Execute VBA code when opening document through Sub Document_open(). VBA code spawns powershell to execute encoded Base64 script. _Fig 4. VBA code spawns powershell to execute script_ The powershell script after decoding and deobfuscating usually look like the image below. It will download the payload which is an exe file to execute: ----- _Fig 5. Powershell script downloads payload from the C2 list for execution_ ## 3.2. Sample 2 Document template: _Fig 6. Sample 2’s document template_ This template also uses VBA, but there are some differences with Sample 1 as follows: VBA code is executed after closing document through Sub Document_Close(). Instead of using powershell, this sample spawns certutil.exe for decoding enncoded Base64 payload and then call rundll32 for executing the decoded payload. The payload and related information are hidden in the document in white font. _Fig 7. VBA code uses certutil for decoding payload and calls rundll32 to load payload_ Decode encoded base64 content will get VideoDownload.dll, this file has an exported function is In. This function is executed with the help of rundll32.exe. _Fig 8. Decoded payload is a DLL_ ----- _Fig 9. The expored function of DLL_ There is an embedded PE file in resource section of the above dll. The resource data is encoded. _Fig 10. DLL has a PE file that has been encoded_ The dll’s code when executed will load the content of a porn site, then retrieve the link of the .mp4 file (which is a hot keyword-related leaked sex clip of Vietnamese figure). It read bytes from mp4, through the loop, by using the read bytes as xor_key for decoding the above resource to get the complete PE file. Then it saves the decoded file to %temp%/tmp_e473b4.exe and execute this payload. ----- _Fig 11. Pseudocode performs decoding resource data and spawns new process_ ## 3.3. Sample 3 Document Template: _Fig 12. Sample 3’s document template_ Same as Sample 1: Execute VBA code when opening document through Sub Document_open(). VBA code also spawns powershell to execute encoded Base64 script. ----- _Fig13. VBA code spawns powershell to execute script_ The powershell script after decoding and deobfuscating will also performs the task of downloading the payload to execute: _Fig 14. Powershell script downloads payload from the C2 list for execution_ Differ from Sample 1 (use powershell to download loader is an exe file) and Sample 2 (decode DLL and _use this DLL to decrypt the loader as an exe file), in this Sample 3, the downloaded payload is a DLL file,_ exports Control_RunDLL function. Script uses rundll32 to execute this payload. So that, the downloaded payload is considered as a DLL loader. ### 4. Loader payload ## 4.1. Execution flow of loaders The payloads of Sample 1 and 2 (PDB path information: \eee\ggggggg\rseb.pdb) were built with Visual Basic: ----- _Fig 15. Loaders of Sample 1 and 2 were built with Visual Basic_ **Sample 3 was built with Visual C++ (PDB path information: E:\WindowsSDK7-Samples-** **master\WindowsSDK7-Samples-** **master\winui\shell\appshellintegration\RecipePropertyHandler\Win32\Release\RecipePropertyHandler.pdb)** _Fig 16. Loader of Sample 3 was built with Visual C++_ When first infected, the Emotet payload runs through two stages. During the first stage, it checks the victim system, if it's running with high privilege, it drops binary to CSIDL_SYSTEMX86, otherwise to **CSIDL_LOCAL_APPDATA. Finally, it launches the second instance. Payload running at the second stage will** communicate with C&C servers that embedded in its binary. _Fig 17. Sample 1 execution flow_ ----- _Fig 18. Sample 2 execution flow_ _Fig 19. Sample 3 execution flow_ ## 4.2. Technical analysis of the loader 4.2.1. Sample 1 and 2 These loaders when executed will allocate and unpack the main payload to the allocated memory and execute this payload: _Fig 20. Sample 1’s loader unpacks the main payload_ ----- _Fig 21. Sample 2’s loader unpacks the main payload_ These main payloads are quite small in size and were built with Visual C++: _Fig 22. The main payload of Sample 1 and 2_ ## 4.2.2. Sample 3 This sample, when executed, will get the address of two undocumented functions LdrFindResource_U and **LdrAccessResource from ntdll.dll. These functions are used to access resource data embedded in the** loader: _Fig 23. Sample 3’s loader accesses resource data_ ----- Next, it computes the MD5 hash of the pre initialized data and generates an RC4 key based on the computed hash. Then, use this RC4 key to decrypt the above resource data and execute the main payload: _Fig 24. Pseudocode performs decoding and executing the main payload_ The main payload is another DLL and also has an exported function is Control_RunDLL: _Fig 25. The main payload of Sample 3_ ### 5. Some techniques used in the main payload ## 5.1. Control Flow Flattening A program’s control flow is a path created out of the instructions that can be executed by the program. Disassemblers, like IDA, Ghidra, visualize control flow as a graph by creating a series of connected blocks (called “basic blocks”). In order to make reverse engineering more difficult, thwart the analysis and avoid detection, the main payload of Emotet usuallu apply an obfuscation technique is Control-flow flattening. Basically, this is a technique used to break the flow of a program's execution by flattening it. When the control flow is flattened, the program is divided into blocks, all of which are at the same level. Therefore, it will be difficult to determine the execution order of the program at the first glance. After divided into blocks, there is a control variable to determine which basic block should be executed. Its initial value is assigned before the loop. At each block will update the value of the control variable to redirect the program flow to another branch ----- Below is the illustration for the main function of each above payload: _Fig 26. The main function of the main payload of Sample 1_ _Fig 27. The main function of the main payload of Sample 2_ ----- _Fig 28. The main function of the main payload of Sample 3_ In order to deobfuscate this technique takes a lot of time and effort to do, so my personal experience as follows: Try using [HexRaysDeob plugin that was developed by RolfRolles.](https://github.com/RolfRolles/HexRaysDeob) Perform static analysis using IDA, trying to guess the purpose of the functions, and name them. Perform debug and synchronize function names, variables that set in IDA with debugger with the help of [Labeless plugin. During debugging, note the order in which the functions are executed and make a](https://github.com/a1ext/labeless/) comment back to IDA. ## 5.2. Dynamic modules resolve All payloads will rely on a pre-computed hash by the names of the DLLs to retrieve the base address of these DLLs when it needs to be used. In Sample 1 and 2, these hashes are passed directly to a function responsible for obtaining the base address of the DLL (f_resolve_modules_from_hash): ----- _Fig 29. Sampe 1 and 2 call f_resolve_modules_from_hash_ Particularly in Sample 3, there is a little bit of change, hash values are pre-computed according to the name of the DLL and the API function passed to the same function (f_get_api_funcs). Within this function, it uses these hash values to retrieve the base address of the DLL: _Fig 30. Sample 3 call f_resolve_modules_from_hash_ The search algorithm in all three payloads is similar, only difference in the xored value: _Fig 31. Pseudocode performs looking up the hashes of the DLL name_ Rewrite the hash function, combined with IDAPython to get a list of DLLs that Emotet uses: _Fig 32. Results when using IDAPython_ ----- The list of major DLLs that Emotet uses: **[+] userenv.dll** **[+] wininet.dll** **[+] urlmon.dll** **[+] shlwapi.dll** **[+] shell32.dll** **[+] advapi32.dll** **[+] crypt32.dll** **[+] wtsapi32.dll** **[+] kernel32.dll** **[+] ntdll.dll** _Fig 33. List of major DLLs that Emotet uses_ ## 5.3. Dynamic APIs resolve In all three payloads, when need to use which API function Emotet will search and call that function. Based on the base address of the given DLL, payloads resolve APIs by looking up the pre-computed hash. ----- In Sample 1 and 2,, these hashes are passed directly to a function responsible for obtaining API address (f_resolve_apis_from_hash): _Fig 34. Sampe 1 and 2 call f_resolve_apis_from_hash_ In Sample 3, as mentioned above, hash values are passed to the same function (f_get_api_funcs). Within this function calls to function (f_resolve_apis_from_hash) to retrieve the address of the API: _Fig 35. Sample 3 call f_resolve_apis_from_hash_ The search algorithm in all three payloads is similar, only difference in the xored value: _Fig 36. Pseudocode performs looking up the hashes of the API name_ Rewrite the hash function that payload uses, combined with IDAPython to retrieve all APIs and annotate to related code. The list of APIs used in these payloads are similar and similar to the other variants. The final result is as follows: ----- _Fig 37. The final result when using IDAPython to annotate related code_ ## 5.4. Decrypt strings All strings are encrypted and only decrypt at runtime. The structure of the encrypted data is shown as below. The decryption algorithm of the payloads is the same: _Fig 38. The payloads call the string decryption function_ Based on the above information, can use IDApython to create a script to decrypt data as follows: _Fig 39. Python code is used for decrypting data_ The list of strings obtained in payloads is quite similar: ----- _Fig 40. List of strings obtained after using the script_ ## 5.5. List of C2 (IP & Port) A list of C2 IP addresses and ports of Emotet payloads is stored in .data section as 8-byte blocks: _Fig 41. List of C2s is stored in each payload_ Through script can quickly retrieve the entire list of this C2: ----- _Fig 42. List of IP:Port used by payloads_ ## 5.6. RSA Public Key Through analysis, Emotet embeds an RSA public key in payloads. This RSA public key is also stored as a regular encrypted string and is decoded just like we did with strings. This key will then be used for the secure communication with the the C2 above. All three payloads above after decrypt have the same RSA Public Key: _Fig 43. RSA Public Key after decrypted_ ## 5.7. Enumerating running processes To get the list of the processes running on the victim machine, the payloads use APIs function **CreateToolhelp32Snapshot; Process32FirstW; Process32NextW. List the processes are guaranteed:** No process names where parent process ID is 0. No process is executed by Emotet. No duplicated process names. ----- _Fig 44. The payloads collect a list of the processes running on the victim machine_ ### 6. Conclusion Emotet was first discovered in 2014 as a banking Trojan, over time it continues to evolve and has always been a leading threat to organizations around the world. Emotet has once again proven to be an advanced threat capable of adapting and evolving quickly in order to wreak more havoc. This malware is mainly distributed through email spam campaigns, so to prevent it, organizations should regularly train information security awareness for end users. ### 7. References / Further Reading _[Click here for Vietnamese version.](https://blog.vincss.net/2021/01/re019-phan-tich-tu-a-den-x-chien-dich-tan-cong-thuc-te-su-dung-Emotet-gan-day.html)_ **Tran Trung Kien (aka m4n0w4r)** **Malware Analysis Expert** **R&D Center - VinCSS (a member of Vingroup)** -----