{
	"id": "d6177e05-9bb6-46c4-84e4-7884922eed77",
	"created_at": "2026-04-06T00:21:01.522995Z",
	"updated_at": "2026-04-10T03:21:54.996852Z",
	"deleted_at": null,
	"sha1_hash": "88648f039c62dabeadddc2beb6f182ef9f1cfab9",
	"title": "Teasing the Secrets From Threat Actors: Malware Configuration Parsing at Scale",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 1725561,
	"plain_text": "Teasing the Secrets From Threat Actors: Malware Configuration\r\nParsing at Scale\r\nBy Mark Lim, Daniel Raygoza, Bob Jung\r\nPublished: 2023-05-03 · Archived: 2026-04-05 20:40:35 UTC\r\nExecutive Summary\r\nConfiguration data that changes across each instance of deployed malware can be a gold mine of information\r\nabout what the bad guys are up to. The problem is that configuration data in malware is usually difficult to parse\r\nstatically from the file, by design. Malware authors know the intelligence value as they provide directives for how\r\nthe malware should behave.\r\nMalware is like most complex software systems in that there are many advantages for code reuse and abstraction.\r\nTherefore, it is not surprising to see that the concept of software configuration is pervasive across the various\r\nmalware families we analyze. After all, it’s pretty hard to imagine a stereotypical cybercriminal wanting to bother\r\nwith recompiling their code to change an IP address or whatever else, when going after different targets.\r\nBut the good news is that statically armored configuration data can often easily be found and parsed directly from\r\nmemory. We will cover a nice example of an IcedID (information stealer) configuration, how it was obfuscated\r\nand how we’ve extracted it.\r\nPalo Alto Networks customers receive improved detection for the evasions discussed in this blog through\r\nAdvanced WildFire. As we continue to parse and extract this information from malware families at scale, we hope\r\nto build out a pool of threat intelligence that will better help us understand the campaigns and tactics of the\r\nvarious threat actors who are targeting various organizations.\r\nRelated Unit 42 Topics Memory Detection\r\nWhat Are Malware Configurations?\r\nSo what exactly do we mean by the term “configuration” when talking about malware? Outside the context of\r\nmalware, we think of configuration in terms of defining how systems should behave. 
For example, we would\r\nconsider the rules used to define which networking routes for a firewall are allowed, or which font size your web\r\nbrowser uses while you read this, as configurable information.\r\nFor malware, this is no different. Malware configurations are just collections of elements that define how a\r\nmalware operates, such as the following:\r\nCommand-and-control (C2) network addresses\r\nPasswords for remote administrators\r\nFile paths in which to drop persistent payloads\r\nhttps://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing\r\nPage 1 of 10\n\nThe way these elements are embedded in malware components tends to be specific to each malware family. Also,\r\nthey might evolve over time as malware undergoes development, or when malware authors change their build\r\nprocess.\r\nGenerally speaking, malware configuration elements tend to be the properties of malware that the authors want to\r\nmake easily editable between campaigns and deployments without requiring manual code edits for each one.\r\nMalware configuration elements can also expose latent behaviors and malware infrastructure that are not typically\r\nobservable under routine dynamic analysis.\r\nMalware configurations have intelligence value for security practitioners because they provide insights into\r\ncampaigns over time. In some cases, defenders could use them as actionable artifacts for network detection, or for\r\nidentifying infected hosts. The successful extraction and validation of a malware configuration can also be used to\r\nreinforce our confidence when identifying a file as malicious.\r\nBecause malware configurations have value to security systems and defenders alike, it is state-of-practice for\r\nmodern malware authors to protect their configuration elements using different techniques. These protections\r\noften include a blend of encryption, obfuscation and compression. 
They might also be layered with evasive\r\ntechniques.\r\nThis protection poses a significant challenge for malware configuration extractors that operate solely by using\r\nstatic analysis, because all of these protections must be detected and bypassed before extraction can be performed.\r\nUsing an advanced dynamic analysis sandbox combined with intelligent runtime memory analysis makes it\r\npossible to bypass many of these protections and pinpoint the best opportunities to perform extraction.\r\nWhen we represent and store these configurations using standardized schemas, it enables us to extract maximum\r\nvalue through automation, machine learning and interactive analysis. The DC3-MWCP library defines a schema\r\nfor many of the most common configuration element types, and it provides a simple library for serialization to\r\nJSON.\r\nThe MITRE MAEC and STIX projects also provide us with a more general vocabulary for representing malware\r\nconfiguration elements. This also allows us to correlate the elements with observable objects collected during\r\ndynamic analysis.\r\nIcedID Analysis\r\nLet’s look at one IcedID binary and how its configurations are encrypted.\r\nHash 05a3a84096bcdc2a5cf87d07ede96aff7fd5037679f9585fee9a227c0d9cbf51\r\nThis particular attack chain, shown in Figure 1, was discovered in early November 2022. It delivered IcedID, an\r\ninformation stealer also known as Bokbot, as the final payload. This threat is well-known malware that has been\r\nattacking people since 2019.\r\nThe following diagram shows the infection chain.\r\nhttps://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing\r\nPage 2 of 10\n\nFigure 1. IcedID infection chain.\r\nAuthors of IcedID took pains to hide their configurations. Recent samples of IcedID stage two would only be\r\ndownloaded if the victim’s machine matched the requirements of the threat actor.\r\nThe configurations of IcedID consisted of C2 URLs and their campaign IDs. 
The C2 URLs included some that\r\nmight not be revealed during the execution of the IcedID binaries. The campaign ID links IcedID samples back to\r\nspecific threat actors.\r\nWe will go through the following steps to extract the configurations found in the IcedID stage one and two\r\nbinaries:\r\n1. Unpack the IcedID binary\r\n2. Locate the encrypted configuration data blob\r\n3. Extract the encryption key\r\n4. Decrypt the configuration data blob with the encryption key\r\nUnpacking IcedID Stage One\r\nIcedID stage one unpacks itself by first allocating memory using the VirtualAlloc function. This is followed by\r\nerasing the allocated memory using the Memset function, as shown in Figure 2. Finally, it copies the unpacked\r\ndata to the allocated memory using the Memmove function.\r\nTo dump the unpacked data, we set a breakpoint at Memmove. The second argument of Memmove contains the\r\naddress of the unpacked data. Figure 2 also shows the DOS MZ header of the unpacked IcedID stage one in the\r\nright-hand side of the hex dump.\r\nFigure 2. Unpacking IcedID stage one.\r\nhttps://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing\r\nPage 3 of 10\n\nLocating the Encrypted Configuration Data Blob\r\nNext, we located the encrypted configuration data blob using the unpacked stage one IcedID. While debugging the\r\nunpacked IcedID stage one file, we set a breakpoint at the address that called WinHttpConnect, as shown in Figure\r\n3. The address pointed to by register RDI contains the string of the C2 URL.\r\nFigure 3. Debugging IcedID stage one.\r\nBy backtracing the code, we located a function that used the decrypted configuration as shown in Figure 4.\r\nFigure 4. Tracing code in IcedID stage one.\r\nTracing the code flow back, we found the loop that decrypted the configuration, as shown in Figure 5.\r\nFigure 5. 
Configuration decryption loop for IcedID stage one.\r\nThe instruction at 0x7FEF33339CD loaded the address of the encrypted configuration data blob\r\n(Encrypted_Config) into register RDX.\r\nExtracting the Encryption Key\r\nThe instruction at 0x7FEF33339D4 reads the encryption key. The key is located at an offset of 0x40 bytes from the address of\r\nEncrypted_Config. We also learned the configuration is 0x20 bytes long. An XOR loop was used to decrypt the\r\nconfiguration.\r\nDecrypting the Configuration Data Blob With the Encryption Key\r\nAfter gathering the encryption key, the encrypted data blob and the decryption routine, we can now decrypt the\r\nconfiguration using the script shown in Figure 6.\r\nFigure 6. Configuration decryption script for IcedID stage one.\r\nThe decrypted IcedID stage one configuration has the format shown in Figure 7.\r\nFigure 7. IcedID stage one configuration format.\r\nFrom the decrypted configuration, we can extract the following IoCs:\r\nC2 URL bayernbadabum[.]com\r\nCampaign ID 1139942657\r\nNow, we will decrypt the configuration for the IcedID stage two binary.\r\nUnpacking the IcedID Stage Two Binary\r\nAs the IcedID stage two binary uses the same packer as stage one, we will not repeat the unpacking steps here.\r\nLocating the Encrypted Configuration Data Blob\r\nWe set a breakpoint at the address that calls WinHttpConnect, as shown in Figure 8.\r\nFigure 8. Debugging IcedID stage two.\r\nAfter tracing the code, we located the function that used the decrypted configuration, as shown in Figure 9.\r\nFigure 9. Tracing code in IcedID stage two.\r\nExtracting the Encryption Key\r\nTracing the code flow even further back, we found the function that decrypts the configuration. 
The first few\r\ninstructions located the encrypted configuration blob. The encrypted blob is 0x25c bytes long. The encryption key\r\nis the last 0x10 bytes of the encrypted configuration blob, as shown in Figure 10.\r\nFigure 10. Loading the encryption key for IcedID stage two.\r\nAfter retrieving the encryption key, the next step is the loop to decrypt the encrypted blob, as shown in Figure 11.\r\nhttps://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing\r\nPage 6 of 10\n\nFigure 11. Configuration decryption loop for IcedID stage two.\r\nDecrypting the Configuration Data Blob With the Encryption Key\r\nWe replicated the instructions in the decryption loop using Python. After gathering the encryption key, encrypted\r\ndata blob and the decryption routine, we can now decrypt the configuration using the following script (shown in\r\nFigure 12).\r\nFigure 12. Configuration decryption script for IcedID stage two. Note: Jquinn147 and myrtus0x0\r\npublished a similar configuration decryption script for IcedID in May 2021, called IcedDecrypt\r\n(GitHub).\r\nThe decrypted IcedID stage two configuration has the following format, shown in Figure 13.\r\nhttps://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing\r\nPage 7 of 10\n\nFigure 13. Configuration format for IcedID stage two.\r\nFrom the decrypted configuration, we can extract the following indicators of compromise (IoCs):\r\nC2 URLs\r\nnewscommercde[.]com\r\nspkdeutshnewsupp[.]com\r\ngermanysupportspk[.]com\r\nnrwmarkettoys[.]com\r\nC2 URI news\r\nCampaign ID 1139942657\r\nWe have manually decrypted the configuration for both the IcedID stage one and two binaries.\r\nScaling Up\r\nNow that we’ve discussed the work of figuring out how to target the configuration data in memory, the next\r\nchallenge is to figure out how to perform this at scale. 
The massive scale of most malware processing systems\r\nmeans that most practitioners looking to build out a configuration extraction system will need to be careful about\r\nadding additional overhead. This means that we will need a mechanism to intelligently identify only the samples\r\nof interest for each parser, so we’re not unnecessarily running dozens of parsers across millions of samples.\r\nWe think a reasonable approach to this problem involves using intelligent runtime memory analysis, as it provides\r\nus with excellent visibility into the secrets malware authors want to protect. A typical workflow for our malware\r\nconfiguration extractors includes the following activities:\r\nScanning memory and/or other dynamic analysis artifacts\r\nApplying a noise filter on the results to identify the best candidates for extraction\r\nhttps://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing\r\nPage 8 of 10\n\nPerforming extraction using the best fitting module and storing the results for reporting and indexing\r\nGeneralizing this common workflow presented us with the opportunity to make the following improvements:\r\nOptimizing the search phase by only scanning analysis data once in most cases\r\nApplying abstractions and reusable code for many common tasks\r\nLimiting the impact of modules with problematic inputs or other bugs\r\nGiving our security researchers visibility into the performance of their modules\r\nThe following example shows some of the IoCs from a recent IcedID extractor after being deployed at scale.\r\nHaving a nice framework for deploying configuration extractors means that once you are finished crafting a\r\nconfiguration extraction script, it’s time to kick your feet up and relax while hundreds of configurations flow into\r\nyour malware configuration database.\r\nFigure 14. 
IoCs from IcedID samples.\r\nConclusion\r\nThank you for joining us in this overview of malware configurations and why we are working hard to parse this\r\ninformation at scale in Advanced WildFire. Reverse engineering variants of each malware family allows us to build\r\nout parsers to extract meaningful and relevant data for all of them at scale.\r\nThere is a staggering amount of diversity among payloads in the malware landscape, which makes the task of\r\nsupporting them all more or less impossible. Where possible, we use metrics-based approaches to prioritize focus\r\non the malware families and variants most relevant to our customers. In this ongoing area of research, our team\r\nwill continue to expand support for new malware families and variants.\r\nPalo Alto Networks customers receive protections from threats such as those discussed in this post with Advanced\r\nWildFire.\r\nIndicators of Compromise\r\n05a3a84096bcdc2a5cf87d07ede96aff7fd5037679f9585fee9a227c0d9cbf51\r\nSource: https://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"references": [
		"https://unit42.paloaltonetworks.com/teasing-secrets-malware-configuration-parsing"
	],
	"report_names": [
		"teasing-secrets-malware-configuration-parsing"
	],
	"threat_actors": [],
	"ts_created_at": 1775434861,
	"ts_updated_at": 1775791314,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/88648f039c62dabeadddc2beb6f182ef9f1cfab9.pdf",
		"text": "https://archive.orkl.eu/88648f039c62dabeadddc2beb6f182ef9f1cfab9.txt",
		"img": "https://archive.orkl.eu/88648f039c62dabeadddc2beb6f182ef9f1cfab9.jpg"
	}
}