{
	"id": "92b31eaa-1c3f-4e6e-a795-16089cd2a65f",
	"created_at": "2026-04-06T00:10:55.454753Z",
	"updated_at": "2026-04-10T03:20:25.674956Z",
	"deleted_at": null,
	"sha1_hash": "0c5591a051dc794a3a6bd43f2f9c890e915a3a90",
	"title": "Reverse Engineering Gootkit with Ghidra Part I",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 1357917,
	"plain_text": "Reverse Engineering Gootkit with Ghidra Part I\r\nBy Open Malware - Danny Quist\r\nArchived: 2026-04-05 17:15:34 UTC\r\nGhidra is pretty handy for looking at malware. This series of posts is an informal overview of what I do. Gootkit is\r\na great implant for learning the functionality of Ghidra. Gootkit is a NodeJS server with packaged JavaScript\r\nimplementing the implant functionality. There are lots of libraries linked into the main executable including Node,\r\nOpenSSL, and many more. As a reverse engineer it is difficult to identify statically linked open-source libraries. In this post, I\r\nwill go through my analysis process to use and understand Ghidra’s functionality.\r\nI will begin with basic code analysis and with how to rename variables and types. I am going to\r\navoid dynamic analysis initially, because dynamic analysis is something that you can buy or implement cheaply\r\nenough. In a real-world scenario I typically start with dynamic analysis using a range of tools, then delve into the code\r\nas a secondary step.\r\nThe purpose is to learn Ghidra, not to do a great job at reverse engineering all of Gootkit. It is highly informal, and\r\nmeant to be that way.\r\nGhidra All the Things!\r\nThere are now a few tutorials available on installing and configuring Ghidra. Create a new project, and\r\nthen import the decrypted rbody32 sample into the project. The sample I will be using is:\r\n$ shasum rbody32.x.dec\r\n6170e1658404a9c2655c13acbe1a2ad17b17feae\r\nIt is a decoded version of the file downloaded from a compromised Gootkit site. 
While Gootkit is the topic for this\r\nblog, this process can be applied generally to anything else.\r\nYour import summary should look a lot like this:\r\nhttps://dannyquist.github.io/gootkit-reversing-ghidra/\r\nPage 1 of 12\n\nFigure 1: Ghidra import summary for the relevant Gootkit example\r\nGhidra Import Summaries\r\nImport summaries tell you critically important facts about the sample that you’re looking at. The key thing to\r\nremember is that Ghidra is primarily a source code reverse engineering tool. There are a few salient bits to draw\r\nyour attention to:\r\nFirst, compiler identification. In this case Ghidra identifies VisualStudio:unknown as the likely compiler. This\r\nmakes sense, as the sample is based on NodeJS, which is a C++ program, and Visual Studio is the compiler of choice\r\nfor Windows. Knowing the compiler is important later when you’re puzzling through some obtuse assembly code,\r\ntrying to figure out if the compiler generated some weird code, or the malware author was being tricky. Ghidra is\r\nexcellent at identifying and categorizing compiler-generated nonsense, which saves a bunch of time.\r\nSecond, Compiler ID appears to be the platform that the compiler was run on. As you look at more assembly\r\ncode, you’ll get a good idea of how each compiler generates code for standard C and C++ programming patterns.\r\nMy indicator when looking at code is whether it was hand-rolled assembly or compiler generated.\r\nTypically hand-rolled, artisanally crafted assembly is a good indication that there are shenanigans afoot. Hand-coded\r\nassembly can be significantly more difficult to understand, whereas a compiler will try to do things the same\r\nway every time.\r\nWhy do I care so much about compiler-produced versus hand-coded assembly? As an analyst, you have a limited budget\r\nof time and attention to spend on any given bit of code. 
During an investigation I tend to hit a point of\r\ndiminishing returns where fatigue sets in, and I start to miss critically important details. The code placed around\r\nchecking return values and stack canaries is something I spend way too much time classifying in a sample. If a\r\ntool can identify that, I can label it as not important and go on with life. If the tool does not identify that, or more\r\nlikely I get drawn in anyway, there are all sorts of suspicious APIs that are very distracting: ExitProcess,\r\nanything thread related, and so on.\r\nThe additional information in the import summary is an excellent resource too. Looking at the high-level DLLs the sample is using can give\r\nyou an idea of what the functionality is going to be.\r\nExisting Gootkit Research\r\nLargely this document will consist of reproducing the already existing Gootkit analyses. Gootkit is served from a\r\ncompromised host and runs a small command and control server. The user is tricked/hacked into downloading a\r\ncompromised PDF/DOC/implant, which then contacts the call-home server. Generally if you see .*/rbody32 or\r\n.*/rbody320 in the URL, you’ve most likely got the right sample.\r\n@jgegeny has a copy of the extracted JavaScript files. The functionality, the signatures, and the overall path to success\r\nall depend on understanding the JavaScript. I will focus on trying to extract those files.\r\nIn general the things you need to know about Gootkit:\r\n1. It’s based on an all-in-one compiled version of a NodeJS application. If you ever needed a more clear and\r\npresent indication that Node is evil, look no further.\r\n2. It has a second DLL inside of it to handle password and credential harvesting.\r\n3. All of the functionality exists as JavaScript files, which we would like to decode and obtain.\r\nAnalyzing Gootkit\r\nAnalysis Goals\r\n1. Generate new indicators of compromise\r\n2. Find attribution information for the authors\r\n3. 
Show the functionality of Ghidra\r\n4. Extract all the JavaScript code\r\nAssumptions\r\n1. There is JavaScript hiding inside Gootkit, and it is a good source for IOCs.\r\n2. The JavaScript files are probably compressed or encrypted.\r\n3. The Password Grabber DLL is also embedded in this binary.\r\nDouble-click the rbody32.x.dec inside of the project view and enjoy the 1337 dragon graphic animation. The\r\nanswer to “would you like to analyze now?” is always yes.\r\nFigure 2: An exercise in clicking the Yes button until something happens\r\nFigure 3: Be sure to select ‘Aggressive Instruction Finder’ and bravely ignore all the warnings.\r\nGhidra Analysis Options\r\nFigure 3 shows the analysis options that Ghidra has available. Similar to IDA, you should most likely ignore these\r\nindividual settings and just accept the defaults. (The exception is Aggressive Instruction Finder.)\r\nLooking at some of the default options, there are all sorts of goodies available. I’ll go through my favorites so far:\r\n1. Apply Data Archives - Search for embedded archive formats, and display information about them. Have a\r\nblob of zip/base64/lznt1 data you find? Ghidra looks for these as well and calls them out.\r\n2. Embedded Media - More often than not, especially if your sample is trying to impersonate a benign\r\nprogram, you’ll find media or other shady information embedded. This will create bookmarks for you to\r\nuse and analyze later.\r\n3. Windows .* - All of the internal things that Windows compilers use to make life difficult. Previously these\r\nall had to be waded through individually. 
Now Ghidra will figure them out, add salient information to the\r\nanalysis, and generally save you time.\r\nHopefully in the time it took you to read the above, your analysis is finished. Let’s jump right into the\r\nGUI and start using our workflow.\r\nGUI Overview\r\nAfter all the analysis is completed, you should be presented with the business end of Ghidra, its GUI. Take in the\r\nWindows 95 era Java Swing GUI, and remember a time when you could hot-patch the page fault handler without\r\nthe Windows kernel immediately labeling you as a malcontent.\r\nFigure 4: First view of the GUI with annotations. Clean version without the annotations can be found here\r\nEnable Entropy Visualization\r\nThis is a cool trick that saved me a lot of time. Enable entropy visualization: click the drop-down menu on the top\r\nright of the Listing view, and select “Show Entropy.”\r\nFigure 5: Click the pulldown to enable entropy visualization\r\nEntropy, a measure of randomness, is useful for identifying encrypted or compressed portions of the\r\nexecutable. This is probably a good time for you to learn some math if you’re not already familiar. Wikipedia\r\nprovides a good overview of entropy if you’re into that sort of thing. All you need to know is that higher\r\nentropy (red in this case) means that there is likely a compressed or encrypted blob of data. Goal 4 of our analysis\r\ngoals is to extract the compressed JavaScript, so this is a good place to start looking.\r\nHigh entropy does not always mean compressed or encoded data, nor does it mean that all encoded or compressed data\r\nis high entropy. All things being equal, it does mean something you should take a look at. 
In general, it’s a good\r\nplace to start looking and I appreciate that Ghidra includes this as a default option.\r\nFigure 6: The code listing with the high-entropy portions\r\nAnalysis: Find the Embedded Code Part 1 - A Failure\r\nNow that we have a good entropy visualization, let’s try to take a shortcut to finding the compressed code.\r\nInspect the High Entropy Areas\r\nIf you click next to the red area in the executable, you should see a reference to the entropy being somewhere\r\nclose to 8 in the tool-tip pop up. Select as close to the top as you can, then scroll the code view up until you see\r\nreferences to functions. Why functions? Because the address cross-references (XREFs) can contain random data,\r\nand not necessarily what you’re looking for. Code references are where the executable is looking at that specific\r\naddress. From here we will inspect all of the XREFs and look for anything that looks like encryption.\r\nWhat does encryption code look like? This is a hard question. One way to answer that is to compile a bunch of\r\nencryption reference code, and look at what code is generated. In the end a couple of rules-of-thumb apply:\r\nHow to find encoding, encryption, and obfuscation the hard way\r\n1. Is there an xor with differing operands? xor eax, 0x42 would be an example, and xor eax, eax would\r\nnot, since it only zeroes the register.\r\n2. Are there lots of shift instructions in the same code? The shl and shr instructions are the most\r\nnotable.\r\n3. There’s a noticeable loop structure\r\n4. 
Data is modified, and stored somewhere else in the program\r\nWith an eye on those details, I will inspect each of the listed cross references to see if I can infer what the\r\ncompressed code is.\r\nThe first reference occurs at address 0x100f56f9 inside of FUN_100f56b0, and is a good example of what we are\r\nnot looking for.\r\nFigure 7: FUN_100f56b0 assembly view and its decompilation\r\nRename Global Variables and Functions Using ADD\r\nThe first thing to do is to change the name of DAT_104af3ed to something more noticeable. Since reverse\r\nengineering is all about abductive reasoning, I’m going to assume (abduct) that this is compressed or encrypted\r\ncode. If any facts present themselves that contradict this assumption, I will modify my assumption and\r\nsubsequently change the variable name to match my new assumption. Abductive reasoning is a good lifestyle\r\nchoice, but that’s a highly personal matter. In the grand effort to increase global information entropy and confusion,\r\nand to make a slightly offensive joke, I call this the Abductive Data Describer (ADD) workflow.\r\nGaze Upon the Magnificence of the Decompiler\r\nYou should notice that the decompilation window now has code in it. You may also notice that there are no\r\ngotos in this code. Further inspection will reveal that aside from automatically assigned labels, the code looks\r\nmore or less reasonable. When I first reversed Gootkit with Ghidra and saw this decompilation, I had a very Jodie\r\nFoster in Contact moment. Decompiler quality is informally judged by\r\nhow many gotos it produces instead of the more common if/else/switch/throw/catch statements. C and C++\r\ndevelopers are threatened from birth against using gotos, except in some very narrow circumstances, so a\r\ndecompiler using them is akin to taking a shortcut. 
In practice I have found that once you fully fill out the types of\r\nall the variables, the decompiler outputs legible C code. Programming idioms and patterns matter, so it’s a good\r\nidea to study them.\r\nLet’s rename a variable using our ADD workflow:\r\nFigure 8: Rename the variable pointing to the high-entropy code to something more descriptive\r\nRename your Functions\r\nThis function is most likely not what we are looking for; however, we have invested some time in looking at it. It’s\r\na good idea to rename a function any time you have a high-level idea of what it does. My names tend to\r\nbe pretty descriptive, and describe both my confidence in and the contents of the function. I use uncertain names\r\nlike some_xors_and_bitshifts to imply how little time I’ve spent on it. Later, if I spend more time on it and actually\r\nknow what its function is, I’ll change it to something more specific, like high_entropy_flag_mod().\r\nThere are no xor instructions, and there is no loop. Likely this is a helper function that is looking at the flags of the\r\ndata. It’s a good idea to rename functions with your best guess (ADD), so I’m going to do that: I’ve relabeled\r\nthis function as high_entropy_flag_mod().\r\nRename your Variables\r\nIf you figure out the types used in a code sample, you can redefine those as well using CTRL-L, or by right-clicking\r\nand selecting ‘Retype Variable’. The more correct information you provide about the types, the more accurate the\r\ndecompiler output will be.\r\nNext function! To get back to the data view, click the left arrow button until you see the view again. This works\r\nsimilarly to the escape key in IDA and Binary Ninja. 
If you renamed the function, your listing should look like\r\nthis:\r\nFigure 9: The updated code listing once you have renamed the referencing function\r\nNotice that all but one of the functions have been renamed, reducing how many functions you need to analyze.\r\nThere is only one remaining, FUN_100f7680, and it bears inspection. The decompiler shows that a lot of our\r\nencryption qualifications are met: xors, bit shifting, and even a do {} while () loop! Upon further inspection,\r\nthe only xor in the code is at the very top of the function. This is a trick that Visual Studio uses to detect stack\r\nbased buffer overflows, called a stack canary. If you see an xor at the beginning of a function, this is most likely what it\r\nis. Similarly, there will be a subsequent function call that reverses the process, and exits the program if the check fails.\r\nFurther inspection of the function shows that this is just a flag checking algorithm inside of a loop. Rename the\r\nfunction (I used high_entropy_loop_flag_check()) and move on. A good next step is to look at the XREFs for\r\nthe function, and look at the parent code. I only saw one XREF, FUN_100f7bc0, so that is the next target.\r\nInferring Functionality using API Calls\r\nThe first thing I noticed about FUN_100f7bc0 is the set of API calls being made. These function calls give us an idea\r\nabout what the program is being used for. Looking up API calls on MSDN will give you an idea about what the\r\ndeveloper is doing.\r\nAPI Call (MSDN): Typical Usage\r\nWaitForSingleObject: Wait until the specified object is available or the wait times out. Typically used to implement a mutex, semaphore, or other multiprocess primitive\r\nMultiByteToWideChar: Convert a multi-byte character string to a ‘wide’ character string. Unicode in Windows is full of pain and misery due to an early Windows design decision to ignore Unicode\r\nWriteConsoleW: Write a buffer to the console. The W stands for ‘wide’. An A at the end would indicate an ANSI (narrow) string\r\nGetLastError: Why did my last function return an error? The Linux pattern is to use errno, then bitterly complain about reentrancy issues\r\nTable 1: A listing of API calls found in FUN_100f7bc0\r\nConclusion: This is OpenSSL\r\nI quickly came to the realization that despite my initial hopes, this is not a decryption function. I follow a similar\r\nrenaming process for all of the referenced functions, until everything is renamed. This particular branch of code\r\nseems to focus on outputting data to the terminal.\r\nSometimes you win, and sometimes you lose. I figured out I was in the wrong area when I scrolled a bit further\r\ndown in the listing and saw this jump out at me:\r\nSince the implant portion of Gootkit is packaged JavaScript with an embedded NodeJS server, which uses\r\nOpenSSL, this is likely just a statically linked copy of the OpenSSL code. In other words, a false lead.\r\nNext Steps\r\nIn the next post, I will go over Ghidra’s binary diffing feature and see if it can help identify embedded libraries.\r\nSource: https://dannyquist.github.io/gootkit-reversing-ghidra/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"references": [
		"https://dannyquist.github.io/gootkit-reversing-ghidra/"
	],
	"report_names": [
		"gootkit-reversing-ghidra"
	],
	"threat_actors": [],
	"ts_created_at": 1775434255,
	"ts_updated_at": 1775791225,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/0c5591a051dc794a3a6bd43f2f9c890e915a3a90.pdf",
		"text": "https://archive.orkl.eu/0c5591a051dc794a3a6bd43f2f9c890e915a3a90.txt",
		"img": "https://archive.orkl.eu/0c5591a051dc794a3a6bd43f2f9c890e915a3a90.jpg"
	}
}