Dancing With Shellcodes: Cracking the latest version of Guloader
By Eli Salem
Published: 2021-04-19 · Archived: 2026-04-05 13:31:11 UTC
Guloader is a downloader that has been active since 2019. It is known to deliver various malware, most notably
Agent-Tesla, Netwire, FormBook, Nanocore, and Parallax RAT.
The malware architecture consists of a VB wrapper and a shellcode that does all the malicious activities of
Guloader. Although many malware use crypters that have shellcode in their initial droppers, the Guloader
shellcode is notorious for its anti-analysis capabilities; thus making the unpacking mechanism of Guloader much
more challenging.
Most of Guloader’s anti-analysis functionality has already been published by several security researchers.
However, for researchers who are not fully familiar with the Guloader shellcode, it can be challenging to
predict where these features are located, which might lead to a failed analysis.
In this article, I will present a step-by-step dynamic analysis of Guloader, covering the malware’s anti-analysis
functions and how to overcome them. I will also demonstrate the malware’s main objectives.
Note: Guloader heavily uses time checks and other traditional anti-analysis techniques. To save time, I use the
ScyllaHide plugin throughout this analysis.
https://elis531989.medium.com/dancing-with-shellcodes-cracking-the-latest-version-of-guloader-75083fb15cb4
However, several of Guloader’s anti-analysis techniques cannot be evaded without manual intervention, so I
will focus mainly (but not only) on them.
File metadata
Hash: d55259bcf47af7e645ab7b003aa2cd4071cb36c6
Sample metadata in Pestudio
Getting into the shellcode
In its initial state, Guloader is wrapped in a VB layer. To get past it, we first reach the entry point and then set a
breakpoint on VirtualAlloc. Next, we click Run 12 times (the VB wrapper calls VirtualAlloc several times,
but we only care about the 12th call).
When we return to user code from the 12th VirtualAlloc, we see the following:
12th VirtualAlloc
Now Guloader writes the shellcode to this newly allocated memory. The process consists of several JMP
instructions. Scroll down until you see a CALL to the register EDI (which points to where the shellcode is eventually
stored). Taking this CALL leads us to the shellcode itself.
Call the shellcode
Immediately after taking the CALL to EDI, we’ll see a jump to another location. Take this jump as well.
Take the jump
The shellcode
After taking the initial jump, we see three different functions. For our unpacking tutorial, we can skip them and go
straight to the JMP 602766, located at the end.
Take the jump
After taking the jump, we see an immediate CALL to 600144. Step into it.
Step into
Now we see several functions and a JMP at the end. The first of these functions is 6031A9.
Anti VM function
Anti-Analysis 1: Anti-VM
To our surprise, when we try to step over the CALL to function 6031A9, we encounter the following message
box.
Gotcha
Why did it happen?
Without our noticing, the shellcode pushed 8 pre-computed hashes onto the stack, in the following order:
push 0xB314751D
push 0xA7C53F01
push 0x7F21185B
push 0x3E17ADE6
push 0xF21FD920
push 0x27AA3188
push 0xDFCB8F12
push 0x2D9CC76C
These hashes are used by the function 6031A9 in the following manner:
1) The function uses the API call ZwQueryVirtualMemory (the native API counterpart of VirtualQuery) to scan the
process’s memory.
2) Strings found in memory are hashed with the djb2 algorithm and compared against the pre-computed hashes. Each
hash represents a string related to a virtual machine product (for example, 0xB314751D represents “vmtoolsdControlWndClass”).
3) If the scan with ZwQueryVirtualMemory finds one of these strings, the process creates the previously
mentioned message box.
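The hash comparison in step 2 can be sketched in a few lines of Python. This is the textbook djb2 (seed 5381, multiply by 33, add each byte, truncated to 32 bits). Guloader's exact variant, string encoding, and case handling are not shown in this article, so the sketch is not guaranteed to reproduce the constants above. The helper names and the sample blocklist are hypothetical.

```python
def djb2(data: bytes) -> int:
    """Textbook djb2: h = h * 33 + byte, kept to 32 bits."""
    h = 5381
    for b in data:
        h = (h * 33 + b) & 0xFFFFFFFF
    return h


def find_blocklisted(strings, blocklist):
    """Return the strings whose hash appears in the blocklist."""
    return [s for s in strings if djb2(s.encode("ascii")) in blocklist]


# Hypothetical blocklist built from a single string, only to show the lookup.
blocklist = {djb2(b"vmtoolsdControlWndClass")}
print(find_blocklisted(["Notepad", "vmtoolsdControlWndClass"], blocklist))
```

Storing only hashes means the shellcode never contains the VM-related strings in plain text, which is why they do not show up in a strings search.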
How do we overcome this anti-VM technique?
There are three different approaches we can take:
1) Change the pre-computed hashes on the stack before the call to 6031A9.
2) Fill the CALL instruction with no-operation (NOP) bytes.
3) Change the control flow by setting the EIP register to the address of the next instruction (after the
CALL to 6031A9).
For this example, I took the first approach and changed the suffix of each hash to “22”.
Changing the hashes on the stack
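To make the first approach concrete, here is a small Python sketch of the edit. It assumes that "suffix" means the lowest byte of each 32-bit value (the last two hex digits); the function name is hypothetical. Any change that makes the hashes stop matching would serve the same purpose.

```python
ORIGINAL_HASHES = [
    0xB314751D, 0xA7C53F01, 0x7F21185B, 0x3E17ADE6,
    0xF21FD920, 0x27AA3188, 0xDFCB8F12, 0x2D9CC76C,
]


def patch_low_byte(value: int, new_byte: int = 0x22) -> int:
    """Replace the lowest byte of a 32-bit value (assumed meaning of 'suffix')."""
    return (value & 0xFFFFFF00) | new_byte


patched = [patch_low_byte(h) for h in ORIGINAL_HASHES]
for old, new in zip(ORIGINAL_HASHES, patched):
    print(f"{old:08X} -> {new:08X}")
```

Because none of the patched values equals an original hash, the comparison inside 6031A9 can no longer match the VM-related strings.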
As we continue to step over the following functions, we encounter the function 601F28, which is two functions below
6031A9 (the anti-VM function).
Anti-Analysis 2: Time checks & CPUID
If we try to step over this function, we get stuck and cannot move forward.
Anti-Analysis function
Why did it happen?
Inside the function 601F28 there is another routine that combines two anti-analysis mechanisms: time checks
using RDTSC (Read Time-Stamp Counter) and anti-VM checks using CPUID.
Anti-Analysis function
How do we overcome this anti-analysis?
As with the first anti-VM check, we can change the control flow with the EIP register, or fill the CALL to
601F28 with NOPs.
After choosing our preferred method, we can go to the next JMP instruction.
NOP the function
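The NOP approach amounts to overwriting every byte of the CALL instruction with 0x90. A minimal Python sketch on a byte buffer follows. The sample bytes are hypothetical and assume a 5-byte near CALL (opcode E8 plus a 4-byte displacement); in practice the debugger's "fill with NOPs" feature does this for us.

```python
NOP = 0x90


def nop_out(code: bytearray, offset: int, length: int) -> bytearray:
    """Return a copy of `code` with `length` bytes at `offset` replaced by NOPs."""
    patched = bytearray(code)
    patched[offset:offset + length] = bytes([NOP]) * length
    return patched


# Hypothetical bytes: CALL rel32 (E8 + 4-byte displacement) followed by RET.
sample = bytearray.fromhex("E8 7A 1D 00 00 C3")
print(nop_out(sample, 0, 5).hex(" "))
```

The whole instruction has to be covered. Patching only part of it would leave the leftover displacement bytes to be decoded as unrelated instructions.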
After taking the jump, we immediately land on another CALL, this time to the function 6001C2. Step into it.
Step Into
Next, we see the function 602F54, which plays a big role in the main functionality of the shellcode.
This function is responsible for accessing the process environment block (PEB) and returning the address of an API call.
We also see a direct call to the register EAX, something that is always interesting to inspect when we are dealing
with shellcodes.
Resolving API Calls
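Conceptually, a resolver like 602F54 walks the list of loaded modules reachable from the PEB and looks up an export in the right module. The Python sketch below only models that lookup with plain data structures. The module list, base addresses, and RVAs are made up for illustration, and the lookup here is by name, whereas shellcode commonly compares hashes of the names instead (the article does not show which form 602F54 uses).

```python
from dataclasses import dataclass, field


@dataclass
class Module:
    name: str
    base: int
    exports: dict = field(default_factory=dict)  # export name -> RVA


def resolve(modules, dll_name, api_name):
    """Walk the module list and return the absolute address of an export, or None."""
    for mod in modules:
        if mod.name.lower() == dll_name.lower():
            rva = mod.exports.get(api_name)
            return None if rva is None else mod.base + rva
    return None


# Hypothetical module list and addresses, for illustration only.
loaded = [
    Module("ntdll.dll", 0x77000000, {"DbgBreakPoint": 0x1000}),
    Module("kernel32.dll", 0x76000000, {"TerminateProcess": 0x2A40}),
]
print(hex(resolve(loaded, "kernel32.dll", "TerminateProcess")))
```

The returned address is what ends up in EAX, which is why a `call eax` right after the resolver is worth inspecting.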
When we step over 602F54, we see that it returns the API call TerminateProcess. Then, we’ll take a jump to
6027A0.
Take the jump
After taking the jump, we find ourselves in a call to the function 6001ED.
Step Into
After stepping into this function, we see that we are at a location that calls the register EAX directly.
At this point, the register holds the API call EnumWindows (which enumerates all top-level windows on the screen).
EnumWindows
Anti-Analysis 3: Anti-VM / Anti-Sandbox
After we step over the call to EnumWindows, we see the line: cmp eax,c.
Using this line, the shellcode determines whether there are at least 12 (C in hexadecimal) windows on the machine. If not,
the process is terminated using the previously mentioned API call, TerminateProcess.
Check for at least 12 windows
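The logic of this check can be modeled in Python without touching the real API. The sketch below simulates an EnumWindows-style enumeration with a counting callback and then applies the same threshold as `cmp eax, c`. The function names and the list of window handles are hypothetical.

```python
MIN_WINDOWS = 0xC  # threshold taken from the `cmp eax, c` instruction


def count_top_level_windows(window_handles) -> int:
    """Simulate an EnumWindows-style loop: the callback runs once per window."""
    count = 0

    def callback(hwnd) -> bool:
        nonlocal count
        count += 1
        return True  # returning True keeps the enumeration going

    for hwnd in window_handles:
        if not callback(hwnd):
            break
    return count


def looks_like_real_desktop(window_handles) -> bool:
    """Mirror the JGE decision: continue only with at least 12 windows."""
    return count_top_level_windows(window_handles) >= MIN_WINDOWS


print(looks_like_real_desktop(range(30)))  # a busy desktop
print(looks_like_real_desktop(range(5)))   # a bare sandbox
```

A freshly booted sandbox often has very few top-level windows, which is what makes such a simple count usable as a heuristic.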
How do we overcome this anti-sandbox?
Flip the flag for the JGE jump if necessary. I did not have any issues with this check, however.
As we continue with the normal execution of the shellcode, we see more instances of the function 602F54. One of
these instances resolves the function ZwProtectVirtualMemory (the native API counterpart of VirtualProtect).
Right after, we see multiple PUSH 0 instructions and a CALL to the function 6034F4.
Getting into the Anti-breakpoint function
Anti-Analysis 4: Anti breakpoints
When we step into this function, we observe an interesting anti-debugging technique. In its first lines, the
shellcode gets the address of the function DbgBreakPoint and stores it at esp+18.
Getting DbgBreakPoint
Then it gets the function DbgUiRemoteBreakin and stores its address at esp+1C.
Getting DbgUiRemoteBreakin
Next, the shellcode takes the address of DbgBreakPoint (esp+18), moves it to the EAX register, and writes the
byte 90 to it.
As we remember, 90 is the opcode for NOP, which means that whenever DbgBreakPoint is called it no longer breaks,
because the breakpoint instruction has been replaced by a NOP.
Patching DbgBreakPoint
Then the shellcode does the same with DbgUiRemoteBreakin. This time, however, it patches the beginning of the function
with the bytes 6A, 00, B8 (push 0, followed by the opcode of mov eax, imm32) and then writes the address of ExitProcess
after them. So whenever a break-in goes through this function, the process is terminated instead.
Funnily enough, this anti-breakpoint mechanism itself sits behind another anti-analysis mechanism that uses RDTSC time
checks.
Patching DbgUiRemoteBreakin
In the end, from the disassembler point of view, the changes will look like this:
Before and after patch
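The two patches can be written out as raw bytes to see what they decode to. The Python sketch below rebuilds them from the description above. The ExitProcess address is a made-up placeholder, and whatever instruction transfers control to EAX afterwards is not reproduced here because this article does not list its bytes.

```python
import struct


def dbgbreakpoint_patch() -> bytes:
    """Single NOP written over the first byte of DbgBreakPoint."""
    return b"\x90"


def remote_breakin_patch(exit_process_addr: int) -> bytes:
    """6A 00 = push 0, B8 imm32 = mov eax, imm32 (address stored little-endian)."""
    return b"\x6A\x00\xB8" + struct.pack("<I", exit_process_addr)


# Hypothetical ExitProcess address, for illustration only.
print(dbgbreakpoint_patch().hex(" "))
print(remote_breakin_patch(0x76001234).hex(" "))
```

Read as instructions, the second patch pushes an exit code of 0 and loads the ExitProcess address into EAX, which matches the "terminate instead of break" behavior described above.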
How do we overcome this anti-breakpoint?
The best way is to bypass the function that is responsible for this anti-analysis mechanism, which is 6034F4. Either
the NOP or the control-flow solution works here.
NOP Anti-Analysis function
Anti-Analysis 5: Anti-VM
Next, we see the function 602038. If we step over it, we see the string “C:\Program Files\qqa\qqa.exe”. This
is because 602038 checks whether the QEMU guest agent is present on the machine, which is
another anti-VM feature of Guloader.
QEMU guest agent
In the next two calls, we see a call to 602F54, which resolves NtSetInformationThread. This API call is stored
in the EAX register and executed several instructions later. In this case, however, we need to pay attention
to the arguments NtSetInformationThread receives.
Anti-Analysis 6: NtSetInformationThread
The second argument is ThreadHideFromDebugger (0x11), which in this case causes the process to crash if it is
running under a debugger.
NtSetInformationThread Anti-Analysis
How do we overcome this anti-debugger technique?
ScyllaHide covers this technique. Alternatively, we can simply change the control flow or insert NOPs.
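The idea behind hook-based protection against this call can be sketched as a small filter: when the requested information class is ThreadHideFromDebugger, report success without forwarding the call. The Python below is only a model of that decision, not ScyllaHide's actual code; the function name and the `real_call` callback are hypothetical.

```python
THREAD_HIDE_FROM_DEBUGGER = 0x11  # ThreadInformationClass value (17 decimal)
STATUS_SUCCESS = 0


def filtered_set_information_thread(info_class: int, real_call):
    """Swallow ThreadHideFromDebugger requests, forward everything else."""
    if info_class == THREAD_HIDE_FROM_DEBUGGER:
        return STATUS_SUCCESS  # pretend it worked, the thread stays visible
    return real_call(info_class)


forwarded = []
status = filtered_set_information_thread(
    THREAD_HIDE_FROM_DEBUGGER, lambda c: forwarded.append(c) or STATUS_SUCCESS
)
print(status, forwarded)
```

Because the caller still receives a success status, the shellcode has no immediate sign that the request was ignored.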
After bypassing NtSetInformationThread, we keep stepping over until we reach a JMP at the end of this large
routine. In my case, it is 602773.
Take the jump
Right after we take the jump, we see a call to another function. Step into it.
Step into
After stepping into the function, we find ourselves in a unique location. Using other pre-computed hashes, the
shellcode searches for installed products with the APIs MsiEnumProductsA and MsiGetProductInfo (again with the
djb2 algorithm).
I will not focus on this technique, but it is explained in detail here.
MsiEnumProductsA and MsiGetProductInfo
After the execution of MsiEnumProductsA, we see the instruction JNE 6004C8. By default this jump is not taken,
but to bypass this anti-analysis check we change the ZF (zero flag) from 1 to 0 and take the
jump.
Change the flag
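Flipping the flag works because JNE is taken exactly when ZF is 0, and ZF is bit 6 of the EFLAGS register. A short Python sketch of that bit manipulation follows; the sample EFLAGS value is hypothetical.

```python
ZF_MASK = 1 << 6  # the zero flag is bit 6 of EFLAGS


def clear_zf(eflags: int) -> int:
    """Return EFLAGS with the zero flag forced to 0."""
    return eflags & ~ZF_MASK


def jne_taken(eflags: int) -> bool:
    """JNE (jump if not equal) is taken when ZF is 0."""
    return (eflags & ZF_MASK) == 0


eflags = 0x246  # hypothetical value with ZF set
print(jne_taken(eflags), jne_taken(clear_zf(eflags)))
```

In the debugger this is the same as double-clicking the ZF value in the registers pane right before the jump executes.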
Shellcode main function
Once we take the jump, we reach one of the most important functions in the shellcode. It relies
mainly on two important functions.
The first one is the already mentioned 602F54, which resolves API calls. The second one is 603B93, which is
responsible for executing them (except in a few cases). The latter is the main execution function, where the
most important API calls are executed.
These two functions will be used multiple times during the final stages of the shellcode. Set a breakpoint on
603B93 and step into it.
Two important functions
Because this function is responsible for executing the majority of the API calls, we want to
set breakpoints in strategic locations so that we can hit Run and speed things up.
My preferred locations are the call to EAX, which is where the API call is executed, and JMP
ECX, which is where the function returns to the core parent function.
However, before we reach these locations we need to bypass multiple anti-analysis checks that
take place right before them.
Execution function architecture
Anti-analysis 7: Hardware breakpoints
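The DR (debug register) values are read from the following locations:
[eax+4] = DR0
[eax+8] = DR1
[eax+C] = DR2
[eax+10] = DR3
[eax+14] = DR6
[eax+18] = DR7
(These offsets match the debug-register fields of the x86 CONTEXT structure, in which DR3 is followed directly by DR6
and DR7; DR4 and DR5 are not stored there.)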
The shellcode compares each of these values to 0. If one of them is not 0, a
hardware breakpoint is set. In that case, the shellcode jumps using the JNE 603C97 and the process is
terminated.
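The check can be reproduced offline on a captured buffer. The Python sketch below assumes that EAX points at an x86 CONTEXT structure, where a 4-byte ContextFlags field is followed by the debug-register slots at offsets 0x04 through 0x18. The function name and the sample buffers are hypothetical.

```python
import struct

# Offsets of the debug-register slots relative to EAX (see the list above).
DEBUG_SLOTS = (0x04, 0x08, 0x0C, 0x10, 0x14, 0x18)


def hardware_breakpoint_set(context: bytes) -> bool:
    """Return True if any debug-register slot in the buffer is non-zero."""
    return any(
        struct.unpack_from("<I", context, off)[0] != 0 for off in DEBUG_SLOTS
    )


# Hypothetical buffers: ContextFlags followed by six debug-register slots.
clean = struct.pack("<7I", 0x10010, 0, 0, 0, 0, 0, 0)
dirty = struct.pack("<7I", 0x10010, 0x00603B93, 0, 0, 0, 0, 0)
print(hardware_breakpoint_set(clean), hardware_breakpoint_set(dirty))
```

A single hardware breakpoint is enough to fail the check, since setting one fills an address slot.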
If we want to observe how this anti-analysis mechanism works, we can click “follow in dump” on one of these DR
locations (for example eax+4) and see that it holds the address of the hardware breakpoint we set. The slot
corresponds to the order in which we set the breakpoint.
Hardware breakpoint example
How do we overcome this technique?
If you set a hardware breakpoint, you can change the flag so that the JNE jump is not taken.
The easiest solution is to use the ScyllaHide plugin.
Anti-analysis 8: Software breakpoints
In this technique, the shellcode takes the address of the API call about to be executed from the EAX register, moves
one byte from it into BL (the low byte of the EBX register), and checks whether a software breakpoint has been placed on it.
If a software breakpoint is present, that byte will be one of the breakpoint opcodes (for example 0xCC, the INT 3
opcode that debuggers write when they set a software breakpoint).
Software breakpoint example
As expected, if a software breakpoint is present, the shellcode will go to the location that will terminate the
process.
Software breakpoint example
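As a rough model of this check, the Python sketch below inspects the first bytes of an API's code for breakpoint opcodes. Only 0xCC is named in this article; the two-byte INT 3 encoding (CD 03) is added here as an assumption about what "one of the breakpoint opcodes" may include. The function name and sample byte strings are hypothetical.

```python
INT3 = b"\xCC"           # one-byte INT 3, written by debuggers for software breakpoints
INT3_LONG = b"\xCD\x03"  # two-byte INT 3 encoding (assumed to be checked as well)


def has_software_breakpoint(api_prologue: bytes) -> bool:
    """Return True if the code starts with a known breakpoint opcode."""
    return api_prologue.startswith((INT3, INT3_LONG))


# Hypothetical prologues: an untouched one and one with a breakpoint on it.
print(has_software_breakpoint(bytes.fromhex("8BFF558BEC")))
print(has_software_breakpoint(bytes.fromhex("CCFF558BEC")))
```

This is also why breaking on `call eax` inside the shellcode is safer than placing a software breakpoint on the API itself.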
How do we overcome this technique?
Change the ZF to 0, or replace the instruction with NOPs. As mentioned before, the easiest solution is the
ScyllaHide plugin.
Finally, we have bypassed all of the anti-analysis mechanisms and can focus on Guloader’s main goal. Because we
already set breakpoints on the call to EAX and on JMP ECX, we can click Run and observe the functions being
executed.
The first API call that is interesting for us is CreateProcessInternalW (the internal, lower-level routine behind
CreateProcessA and CreateProcessW). In this case, the process to be created is RegAsm.exe, which also hints that the malware
to be downloaded is probably written in .NET (in this case, it’s Agent-Tesla).
Creating process
The RegAsm process is spawned in a suspended state, which indicates process hollowing injection. This
variation of process hollowing is a bit unique, but because we only care about unpacking the final payload I will
not cover it here. However, you can read here for more details.
RegAsm in suspend state
As we continue to observe the API calls being executed, we see NtMapViewOfSection. When we encounter
this function, step over JMP ECX to return to the parent function. Then continue to step over
instructions manually until you see an instruction that calls a function stored at [ebp+30]. This line
executes the API call NtWriteVirtualMemory (the native API counterpart of WriteProcessMemory).
This instruction writes a second shellcode to the RegAsm process.
Write the second shellcode
Now, we can go to the third argument of NtWriteVirtualMemory and click “follow in dump” to observe the new
shellcode that will be written.
Observing the second shellcode
Next, we can copy and dump the entire buffer that contains the second shellcode. In this way, we can debug it
without any dependency on RegAsm.
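If the buffer is copied out of the debugger as hex text rather than saved directly, a few lines of Python can turn it into a raw binary file. This sketch assumes the copied text contains only hex byte pairs separated by whitespace (no address or ASCII columns). The function names, sample bytes, and output file name are hypothetical.

```python
from pathlib import Path


def hex_text_to_bytes(dump_text: str) -> bytes:
    """Convert whitespace-separated hex byte pairs into raw bytes."""
    return bytes.fromhex("".join(dump_text.split()))


def save_shellcode(dump_text: str, out_path: str) -> int:
    """Write the decoded bytes to out_path and return how many were written."""
    data = hex_text_to_bytes(dump_text)
    Path(out_path).write_bytes(data)
    return len(data)


# Hypothetical sample, only to show the conversion.
sample_dump = """55 8B EC
90 C3"""
print(hex_text_to_bytes(sample_dump).hex(" "))
```

Calling `save_shellcode(copied_text, "second_stage.bin")` then produces a file that can be loaded by a shellcode runner for stand-alone debugging.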
Wrap the first shellcode
After the first shellcode creates the RegAsm process and injects a second shellcode into it, it will execute the API
call NtResumeThread to activate the second shellcode within the RegAsm memory.
Now we basically have two options: we can open a new debugger and attach it to RegAsm, or we can debug the
dumped second shellcode as a stand-alone shellcode using tools such as BlobRunner.
My preferred option is to debug it using the BlobRunner tool because I don't want to be dependent on RegAsm.
Also, I want to have the option to debug it over and over again as quickly as possible.
For those of you who are not familiar with the BlobRunner tool, please look at the following video.
Debugging the second shellcode
When we start to debug the second shellcode, we notice to our surprise that it starts the same way as the
first one. In fact, it is almost the same shellcode. This resemblance lets us bypass all the anti-analysis mechanisms that we already saw in the first shellcode.
Differences from the first shellcode
After we reach the main function we saw in the first shellcode, we will set the same breakpoints. Then, as we click
Run and step over functions, we start to see indications of additional capabilities that we have not seen in the first
shellcode.
First, we see a call to a location on the stack (in this case, [ebp+D8]) that executes the function
InternetOpenUrlA. We also see the C2 it will use.
Observing the C2
Then, in the function that executes API calls, we see other wininet API calls being executed.
Observing the C2
At this point I decided to conclude my analysis because we achieved the two goals of this article:
1) We learned how to crack the two shellcode stages of the Guloader malware.
2) We observed how to find the C2 that is responsible for downloading the additional malware.
Recap
When we sum up the entire architecture of Guloader, we observe several stages and key features:
1) The malware initially comes wrapped in a VB layer.
2) After the VB part ends, the entire malware activity is executed by a shellcode.
3) The shellcode contains multiple anti-analysis mechanisms, some of which cannot be bypassed without manual
intervention.
4) The shellcode creates the process RegAsm and injects a second shellcode into it with a unique variation of the
Process Hollowing injection.
5) The second shellcode downloads further malware.
The Guloader mechanism is depicted in the following diagram:
Guloader architecture
Conclusion
In this article, I covered the entire process of the Guloader malware and presented several anti-analysis
mechanisms from this shellcode-based downloader.
During this step-by-step observation, we saw how this malware’s unique characteristics challenge security
researchers, and also how unconventional Guloader is in the current cybercrime landscape.
References:
https://kienmanowar.wordpress.com/2020/06/27/quick-analysis-note-about-guloader-or-cloudeye/
https://www.crowdstrike.com/blog/guloader-malware-analysis/
https://www.blueliv.com/cyber-security-and-cyber-threat-intelligence-blog-blueliv/research/playing-with-guloader-anti-vm-techniques-malware/
https://blog.vincss.net/2020/05/re014-guloader-antivm-techniques.html
https://labs.k7computing.com/?p=21725
https://github.com/OALabs/BlobRunner
Source: https://elis531989.medium.com/dancing-with-shellcodes-cracking-the-latest-version-of-guloader-75083fb15cb4