How To Use Ghidra For Malware Analysis - Identifying, Decoding
and Fixing Encrypted Strings
By Matthew
Published: 2023-12-05 · Archived: 2026-04-05 16:46:31 UTC
In this post, we will investigate a Vidar Malware sample containing suspicious encrypted strings. We will use
Ghidra cross references to analyse the strings and identify the location where they are used.
Using this we will locate a string decryption function, and utilise a debugger to intercept input and output to
obtain decrypted strings.
We will then semi-automate the process, obtaining a full list of decoded strings that can be used to fix the
previously obfuscated Ghidra database.
Summary
During basic analysis of a Vidar file, we can see a large number of base64 strings. These strings cannot be
decoded using base64 alone, as there is additional encryption. By using Ghidra string references, we can find where
the base64 is used, and hence locate the function responsible for decoding.
With a decoding function found, it is trivial to find the "start" and "end" of the decryption process. Using this
knowledge we can load the file into a debugger and set breakpoints on the beginning and end of the decoding
function. This enables us to view the input (encoded string) and output (decoded string) without needing to reverse
engineer the decryption process.
By further adding a simple log command into the debugger (x32dbg), we can tell x32dbg to print all values at the
start and end of the decryption function. This is a means of automation that is simple to implement without coding
knowledge.
Once the encrypted/decrypted contents have been obtained, we can use this to manually edit the original Ghidra
file and gain a deeper understanding of the malware's hidden functionality.
Obtaining the File
The file can be downloaded here from Malware Bazaar.
SHA256: 0823253d24e0958fa20c6e0c4b6b24028a3743c5c895c577421bdde22c585f9f
Initial Analysis and Identifying Strings
We can download the file from Malware Bazaar using the link above, then unzip it using the
password infected .
https://embee-research.ghost.io/ghidra-basics-identifying-and-decoding-encrypted-strings/
Page 1 of 26
We like to create a copy of the original file with a shorter and more useful file name. In this case we
have chosen vidar.bin .
We can perform some basic initial analysis using Detect-it-easy. A typical workflow in detect-it-easy is to look for
strings contained within the file.
If we select the "strings" option, we can see a large number of base64-like strings.
(You could also use PeStudio or any other tooling that can identify strings)
The default minimum string length is 5, which results in a lot of junk strings. By increasing this to 10,
we can more easily identify strings of interest.
In the screenshot below we can see a group of base64-like strings. In many cases, encoded strings like these are
used to obfuscate functionality and Command-and-Control (C2) servers.
Hence, they are a useful indicator to home in on with tooling like Ghidra.
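The string filtering described above can be approximated in a few lines of Python. This is only a rough sketch of what a strings view does, not how Detect-it-easy works internally. The helper names `extract_strings` and `is_base64_like` are our own, and the base64 check is a simple heuristic that will also match some ordinary strings.

```python
import re

# Heuristic: base64 alphabet only, with up to two '=' padding characters.
BASE64_LIKE = re.compile(r"[A-Za-z0-9+/]+={0,2}")


def extract_strings(data: bytes, min_len: int = 10) -> list[str]:
    """Return runs of printable ASCII that are at least min_len characters long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def is_base64_like(s: str) -> bool:
    """Flag strings that look like base64 (right alphabet, length a multiple of 4)."""
    return len(s) % 4 == 0 and BASE64_LIKE.fullmatch(s) is not None


if __name__ == "__main__":
    # In practice this would be the bytes of vidar.bin; a tiny stand-in buffer is used here.
    sample = b"\x00tw+lvmZw5kffvene\x00This program cannot be run\x00lgWSvkdzsA==\x00"
    for s in extract_strings(sample):
        print(s, is_base64_like(s))
```

Raising `min_len` has the same effect as raising the minimum string length in the tool: fewer junk hits and a shorter list to review.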
Now that we've identified some interesting strings within the file, we can use Ghidra to analyse them further and
attempt to establish some context as to how they are used.
How To Load a File Into Ghidra
To analyse these strings further, we can go ahead and load the file into Ghidra.
This can be done by dragging the file into Ghidra, accepting all default options and allowing the Ghidra analysis
to run for a few minutes.
We can then continue our analysis by locating the same strings we found during the initial analysis. In this case,
we can start with the first base64 string of tw+lvmZw5kffvene
The screenshots below demonstrate how to perform a string search with Ghidra. Search -> For Strings
Ghidra will present a window like the one below; we can typically go ahead and accept the defaults.
Make sure that Selection Scope -> Search All is selected. Sometimes Ghidra changes to Selection
Scope -> Search Selection if you have something highlighted.
Once we've accepted the default search options, we can filter on the first few characters of our previous string, tw+ , to
locate it.
This will reveal 3 strings starting with tw+
We can double-click on any of the returned strings to go to its location within the file.
Ghidra will automatically recognise if the location storing the string has been used elsewhere in the file.
This is known as a cross reference (xref) and is an extremely useful concept to become familiar with.
In this view, we can also see that one Cross Reference (XREF) is available. This indicates that Ghidra has found
one location where the string is used.
Double-clicking the xref value will show us where the string has been referenced.
After double-clicking on the xref value, we can see the base64 string (as well as others) contained within the
function FUN_004016a6 .
We can also see each of these strings is passed to FUN_00401526 . Since every string is going to the same function,
it is very likely the one responsible for decryption.
Side note - These strings undergo additional obfuscation as well as base64. We won't be able to decode
them using base64 alone.
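We can quickly confirm the side note above with Python's standard library. The sketch below base64-decodes two of the strings seen in this post and checks whether the result is readable text. The helper `is_printable_ascii` is our own name, and "printable" here is just a rough test for plain ASCII.

```python
import base64


def is_printable_ascii(data: bytes) -> bool:
    """True if every byte falls in the printable ASCII range."""
    return all(0x20 <= b <= 0x7E for b in data)


for encoded in ("tw+lvmZw5kffvene", "lgWSvkdzsA=="):
    raw = base64.b64decode(encoded)
    # Print the raw bytes as hex, plus whether they look like readable text.
    print(encoded, raw.hex(), is_printable_ascii(raw))
```

Both strings decode to non-printable bytes, which is consistent with an extra layer of encryption sitting underneath the base64.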
If we click on the FUN_00401526 function taking all the encoded strings, we can see that it's rather long,
confusing and contains a lot of junk code.
Luckily, we don't need to analyse it in detail in order to decrypt the strings. Since we know the location of the
function within the file, we can use a debugger to obtain the decrypted content for us.
The name of the function is the location within the file. This is all we need to be able to locate it within
a debugger.
Eg for function FUN_00401526 , the location of the function will be 00401526 .
As a side note, if we look at the same function within the disassembly view on the left hand side, we
can see that there are 542 xrefs available.
This means that FUN_00401526 is used 542 times throughout the file. A number this high is another
strong indicator that the function is used for decoding.
We now know the location of a function that is likely responsible for decrypting the strings. Although we could
analyse it statically, this is difficult, time consuming and often unnecessary.
A better method is to load the file into a debugger and use breakpoints to monitor the function's location. This
method can be used to obtain input (encrypted string) and output (decrypted string) without needing to analyse the
function manually. We just need to know where the function starts.
Loading The File Into x32dbg
Since we now have a function to monitor, we can go ahead and load the file into x32dbg for further analysis.
We can start this by dragging the file into x32dbg and allowing the file to reach its entry point using F9 or
Continue .
Before continuing analysis in the debugger, we need to confirm the base address is the same as in Ghidra. This
ensures that the function will be stored at the same location.
The location within Ghidra and x32dbg will always be base address + xyz. But if the base address
differs, then we occasionally need to fix it.
We can double-check the base address by clicking on the Memory map option within x32dbg. The base address
will be the one on the same line as your file name.
The base address in our case was 0x000f0000 (this address may differ for you)
We need to make sure that this base address is aligned with Ghidra.
The base address can be found in Display Memory Map -> View Base Address .
In this case, Ghidra's base address is 0x00400000 . We can manually change this to match the 0x000f0000 found
in x32dbg.
Fixing the base address is as simple as changing the value to 0x000f0000
After selecting OK , Ghidra will reload the file with the new base address.
After reloading a base address, sometimes Ghidra will get lost. You may need to do another string
search + xref (same process as before) to identify the string decryption function again.
With the correct base address now loaded, the string decryption function will have a new name FUN_000f1526 to
reflect its new location.
We can now use this address 000f1526 to create a breakpoint within x32dbg.
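The arithmetic behind the rebase is simple: the offset of the function from the image base stays the same, and only the base changes. As a small sketch (the helper name `rebase` is our own), we can compute the new address directly from the values seen above. Remember that the x32dbg base address may differ on your machine.

```python
def rebase(address: int, old_base: int, new_base: int) -> int:
    """Translate an address from one image base to another."""
    offset = address - old_base
    if offset < 0:
        raise ValueError("address lies below the old image base")
    return new_base + offset


GHIDRA_BASE = 0x00400000  # default base address shown in Ghidra
X32DBG_BASE = 0x000F0000  # base address observed in the x32dbg memory map

# FUN_00401526 in Ghidra corresponds to this address in the debugger.
print(f"{rebase(0x00401526, GHIDRA_BASE, X32DBG_BASE):08x}")
```

This is also a handy way to translate addresses in the other direction, by swapping the two base arguments, without reloading the Ghidra database.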
Setting Breakpoints on the Decryption Function
We now want to create a breakpoint at the corrected address of the decryption function.
Using the new address of 000f1526 , we can go back to x32dbg and create a breakpoint using bp 000f1526
With the breakpoint set, we can let the malware run until the function is triggered.
When the breakpoint is hit, we can view the current encoded string within the stack window on the right-hand side
of x32dbg.
If we allow the function to complete using the Execute Until Return option, we can jump to the end of the
decryption function and see if any decrypted output is present.
Execute Until Return tells the debugger to allow the current function to finish without continuing
beyond the current function. This is an easy way to obtain function output without it getting lost
somewhere during execution.
The "Execute Until Return" button looks like this.
After the Execute Until Return has completed, we can observe the first decoded string HAL9TH within the
register window.
The decoded string is contained within EAX , which is the most common location where function
output will be stored.
Now that the decoded string is visible, we should note the current location of EIP within the debugger. This will
tell us where we can find a decrypted copy of the string.
In the screenshot below, we can see that this location is 0x000f16a3 . This is the end of the decryption function,
and we should create another breakpoint here.
Creating a breakpoint here is functionally identical to using Execute Until Return every time we hit
the function, but creating a second breakpoint is much easier.
The new breakpoint can be created with bp 000f16a3 or by pressing F2 on the address highlighted in green.
If we continue to execute using F9 or Continue , we will hit the original string decryption function again.
This time, there is a new encoded string present in the stack window lgWSvkdzsA== .
Allowing the malware to run with F9 again will trigger our second breakpoint, which contains the decoded
value of JohnDoe .
As you obtain decrypted values, it can be useful to google them to determine their purpose within the context of
malware.
According to CyberArk, the two values JohnDoe and HAL9TH are default values used by the Windows
Defender Emulator. The malware likely uses these values later to determine if it's being emulated inside of
Windows Defender.
Obtaining Additional Decoded Values
By allowing the malware to execute with F9 , we will continue to hit the existing breakpoints and observe
decoded values.
Here, we can see that the malware has decrypted some Windows API names (LoadLibraryA, VirtualAlloc) as well
as strings related to Crypto Wallets (Ethereum, ElectronCash, Binance).
This knowledge allows us to assume that the malware is dynamically loading APIs and likely stealing Crypto
Wallet data.
If we recall, there were 542 references to the string decryption function before. Since there are a few too many to
observe manually, we can perform some basic automation using a debugger.
Automating the Process With Conditional Breakpoints
Now that we have existing breakpoints at the start and end of the decryption function, we can add a log condition
to print the interesting values to the log window.
We can add a log condition by modifying our existing breakpoints. We can do this within the breakpoint window,
and then Right-Click -> Edit on the two existing breakpoints.
Printing Encoded Strings With x32dbg
Our first breakpoint is at the "start" of the decryption function, and we know from previous analysis that the
encoded value will be inside the stack window.
Observing the stack window closer, we can see that the exact location is [esp+4]
We can now tell the breakpoint to log the string contained at [esp+4]
We can do this with the command Encoded: {s:[esp+4]} . The "Encoded: " part is not necessary but it makes the
output easier to read.
Since we don't need to stop at every breakpoint (we just want to log the results), we can add the command
run; in the Command Text field.
This will tell x32dbg to resume execution after printing the output.
Printing Decoded Strings with x32dbg
We can repeat the same process for the second breakpoint.
This time instead of printing [esp+4] , we want to print the decoded value contained in eax
After editing the second breakpoint, we want it to look something like this.
This should be identical to the previous breakpoint, with only [esp+4] being replaced with eax .
We can also change Encoded: to Decoded: to make the final output easier to read.
With the new breakpoints saved, we can restart the malware or allow it to continue its current execution. This will
print all encoded and decoded values to the log window.
(You can find the log window next to the breakpoints window)
After restarting the malware and leaving the breakpoints intact, we can see our initial encoded string and its
decoded value of kernel32.dll .
We can also see additional decoded values related to Ethereum key stores.
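If we save the log window contents to a text file, the encoded and decoded values can be paired up with a short script. The sketch below assumes each log line starts with the Encoded: or Decoded: prefix we configured earlier, and that the decryption function does not call itself, so every Decoded: line belongs to the most recent Encoded: line. The function name `pair_log_lines` is our own, and the sample lines are built from values mentioned in this post rather than copied from a real log.

```python
ENCODED_PREFIX = "Encoded: "
DECODED_PREFIX = "Decoded: "


def pair_log_lines(lines):
    """Pair each 'Encoded:' line with the 'Decoded:' line that follows it."""
    pairs = []
    pending = None  # most recent encoded value still waiting for its output
    for line in lines:
        line = line.rstrip("\r\n")
        if line.startswith(ENCODED_PREFIX):
            pending = line[len(ENCODED_PREFIX):]
        elif line.startswith(DECODED_PREFIX) and pending is not None:
            pairs.append((pending, line[len(DECODED_PREFIX):]))
            pending = None
        # Any other debugger output is ignored.
    return pairs


if __name__ == "__main__":
    sample_log = [
        "Encoded: tw+lvmZw5kffvene",
        "Decoded: kernel32.dll",
        "Encoded: lgWSvkdzsA==",
        "Decoded: JohnDoe",
    ]
    for encoded, decoded in pair_log_lines(sample_log):
        print(f"{encoded} -> {decoded}")
```

Keeping the values as pairs makes it easier to match each decoded string to the right reference in Ghidra later on.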
Obtaining Only Decrypted Values
By temporarily disabling the initial breakpoint (right click -> disable) , we can print only the decoded values.
Here, we can see some potential encryption keys, as well as SQL commands used to steal Mozilla Firefox cookies.
We can also observe that the malware attempts to steal credit card information from web browsers.
If we go back to Ghidra, we can revisit the initial function containing references to encrypted strings.
Since we now have both the encrypted and decrypted values, we can edit the Ghidra view to reflect the decoded
content.
Here, we can see decoded values within x32dbg, reflecting the same encoded values as the above screenshot.
We can also note that after each call to the decoding function, the result is stored inside of a global variable
(indicated by a green DAT_00138e98 etc, on the left-hand side).
This usually means that the same variable will be referenced each time the decoded string is used. If we
rename the variable once, it will be renamed in all other locations that reference it.
We will see this in action in a few more screenshots.
Using the output from x32dbg, we can begin renaming those global variables DAT_000* etc to their decoded
values.
This will significantly improve the readability of the Ghidra code.
This process can be done manually or by saving the x32dbg output and creating a Ghidra script. The
process of scripting this in Ghidra is relatively complicated and will be covered in a later post.
For now, we can edit the names manually (Right Click -> Rename Global Variable)
Below we can see the same code after some slight renaming. Make sure to reference the x32dbg output.
We like to prepend each variable with str_ to indicate that it's a string. This is optional but improves
the readability of the code.
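Decoded strings often contain characters such as dots or spaces that do not work well in a variable name, so it helps to normalise them first. The following is a small sketch of one possible naming scheme using the str_ prefix. The helper `make_label`, the length limit and the replacement rules are our own choices, not anything required by Ghidra.

```python
import re


def make_label(decoded: str, prefix: str = "str_", max_len: int = 60) -> str:
    """Turn a decoded string into a tidy name such as str_kernel32_dll."""
    # Replace anything that is not a letter, digit or underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", decoded)
    # Collapse runs of underscores and trim them from both ends.
    cleaned = re.sub(r"_+", "_", cleaned).strip("_")
    return prefix + (cleaned[:max_len] or "empty")


for value in ("kernel32.dll", "JohnDoe", "HAL9TH"):
    print(make_label(value))
```

The same helper could later feed a renaming script, but for manual renaming it simply keeps the names consistent.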
With the DAT_* locations modified to their decoded values, any location within Ghidra that contains the same
DAT_ value will now have a suitable name, making it much easier to infer the purpose of the function.
To determine where a variable is used, we can again use cross references. Double clicking on any of
the DAT_* values will show its location and any available cross references where it is used.
For example, here is the function containing "JohnDoe" before the DAT_* values are renamed.
If we had encountered this function without first decrypting strings, it would be difficult to tell what the function
is doing.
After marking up the DAT_* values with more appropriate names, the function looks like this.
Since we googled these values and determined they are used for Defender Emulation checks, we can infer that this
is (most likely) the purpose of the function.
Using that assumption, we can change the name to something more useful.
Now, anywhere that function is called will be much more understandable.
To see where a function is called, we can double click it and view the x-refs again to see where the
function is used.
Here is one such reference, which doesn't make much sense at an initial glance.
After renaming the function to mw_checkDefenderEmulation , it begins to make more sense.
After renaming all remaining DAT_* variables, it begins to make even more sense.
The malware is temporarily going to sleep and repeatedly checking for signs of Defender Emulation.
A similar concept can be seen with the decoded string for VirtualAlloc.
Below is a function referencing VirtualAlloc, prior to renaming variables.
After renaming, we see that its primary purpose is creating memory using VirtualAlloc.
(There are some other things going on, but the primary purpose is memory allocation, hence we can
rename this function to mw_AllocateWithVirtualAlloc )
This process can be repeated until all points of interest have been labelled with appropriate values.
This is time-consuming if you wish to mark up an entire file, but it is effective and will reveal a significant portion
of the file's previously hidden functionality.
Once you're comfortable with performing this process manually, you can eventually create a script to do the same
thing for you.
Creating a script will still require obtaining the decrypted strings through some means, but renaming everything
can be done well with a Ghidra script.
Conclusion
We have now looked at how to identify basic obfuscated strings, decrypt them, and fix their values within Ghidra.
Although this is a relatively simple example, the same overall process and workflows are repeatable across many,
many malware samples.
As you become more confident, many of these steps can be automated further or scripted. The renaming process
can be replaced with a Ghidra script, and the "debugger" process can be replaced with scripted Emulation
(Unicorn, Dumpulator etc).
Regardless, this blog demonstrates core skills that form the baseline for exploring future
automation.
Source: https://embee-research.ghost.io/ghidra-basics-identifying-and-decoding-encrypted-strings/