{
	"id": "bd295ff2-b978-4b23-b0a3-f157112adf67",
	"created_at": "2026-04-06T00:10:36.122529Z",
	"updated_at": "2026-04-10T13:12:53.507804Z",
	"deleted_at": null,
	"sha1_hash": "1d46156a7fb228766334a6057329ced5aa543569",
	"title": "Deep Technical Dive – Adventures in Evasive Malware – Ariel's blog",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 3250505,
	"plain_text": "Deep Technical Dive – Adventures in Evasive Malware – Ariel's blog\r\nBy AK\r\nArchived: 2026-04-05 17:25:14 UTC\r\nNymaim is mostly known worldwide as a downloader, although it seems to have evolved from earlier versions and now\r\nhas new functionality for obtaining data from the machine with no need to download a new payload. Some of the\r\nexported functions allow harvesting passwords and browser data from the machine, which stay hidden on the file\r\nsystem until communication occurs. Payloads downloaded from the C\u0026C are not saved locally on the machine but\r\ninstead are loaded dynamically to memory with a unique internal calling convention.\r\nOne of the signature features I noticed when I began analyzing the Nymaim payload was its novel anti-reverse\r\nengineering and obfuscation techniques. To frustrate the analyst, the code for a single function is split into many different pieces,\r\nwhich have to be pieced together in order to fully understand the code. Most of the code is heavily obfuscated using\r\n‘spaghetti code’ methods, but we’ll dive into that in a 1 (bit).\r\nIn addition to the already obfuscated code, the DGA (Domain Generation Algorithm) uses quite an interesting\r\ntechnique to make sure it won’t be sink-holed easily, which also makes analysis more challenging.\r\nIn this blog, I will review the anti-reverse engineering techniques the malware authors implemented in the code,\r\nexplain the unique DGA they made, and show different automation concepts to conquer the code and make the\r\nanalyst’s life a lot easier.\r\nAnd so it begins…\r\nIn general, when I dive into a new piece of malware, I begin with a set of goals or objectives I need to discover and\r\nunderstand, such as the malware’s DGA mechanism, or its protocol and functionality. When I focus\r\non the DGA, for instance, I expect the malware to hit (at some point while debugging) a DNS resolving function\r\nsuch as getaddrinfo, gethostbyname or any similar API. 
Unfortunately, Nymaim hit none of the expected\r\nDNS resolving APIs. Confused for a moment, I decided to try a breakpoint on the sendto function, and\r\nindeed the breakpoint was hit. It was a crafted DNS request with a messed-up call stack and a hardcoded DNS server. I\r\ncouldn’t conclude anything definitive from that, so I had to find the caller of the sendto function manually. Following the RETs\r\nand JMPs, I finally got to the function that called sendto. But wait, it looks so weird! (Dramatic\r\ndrumming…)\r\nhttps://arielkoren.com/blog/2016/11/02/nymaim-deep-technical-dive-adventures-in-evasive-malware/\r\nPage 1 of 16\n\nFig. 1, The calling convention to the sendto function\r\nNo way this is the sendto function!\r\nSo, it continues! Obfuscation is legit code protection\r\nLet us examine the IDA snippet above (Fig. 1), while keeping in mind what the sendto function looks like:\r\nWS2_32!sendto(SOCKET s,\r\n                       const char *buf,\r\n                       int len,\r\n                       int flags,\r\n                       const struct sockaddr *to,\r\n                       int tolen)\r\nThere are 6 arguments in total. After static analysis of the code, the arguments passed on the stack don’t make\r\nmuch sense in terms of what sendto is expecting (value-wise). Also, there are 9 push opcodes in total. Something\r\nfishy is going on in there. Let’s examine the last call, call sub_1805525, which is the instruction I\r\nreached when manually returning from the sendto function.\r\n\u003cSpoilerAlert\u003e\r\nThis function is one of many spaghetti functions found in the code\r\n\u003c/SpoilerAlert\u003e\r\nFig. 
2, What the called function looks like\r\nTo fully comprehend what is going on, we first have to understand how\r\nthe stack looks when this function is called, in terms of EBP offsets:\r\nFirst EAX is pushed (arg_8), and then two more DWORDs, arg_4 (0xCF260F5F) and arg_0 (0x30D8FC16).\r\nThen the function is called (call sub_1805525), which puts the appropriate return address as the last value on the\r\nstack, and that’s all we need to know stack-wise for now when calling this function.\r\nThen, inside the called function, the function’s prologue happens:\r\npush ebp\r\nmov ebp, esp\r\nThis puts the current stack address into the base register (EBP), so that stack variables are addressed relative to EBP and\r\nnot ESP. Let’s see what this function does exactly (as seen in the IDA snippet above):\r\n(0) + (1) overwrite arg_8 with the RetAddress, (2) + (3) sum the values of the two DWORDs pushed on the\r\nstack (arg_0 + arg_4), (4) add the result of the last operation to arg_8, which was already\r\noverwritten with the RetAddress.\r\nBasically, it receives two numbers and a dummy stack value, 3 arguments in total, resulting in a new return\r\naddress with the value of [ReturnAddress + arg_0 + arg_4].\r\nCross-referencing this whole mathematical function shows me it is called from 36 more places. There are dozens (!) more variants of this function and about 2600 different places in which the\r\nvariants are called inside the code.\r\nBack to analyzing, the new address should be:\r\n[0x0183BF0B + 0x30D8FC16 + 0xCF260F5F], and truncating the sum to 32 bits results in [0x0182CA80]\r\nFig. 
3, API obfuscation for some API calls, sendto commented\r\nGreat success! The above snippet (Fig. 3) is another part of the obfuscation. The function that is called\r\nnext (sub_180D32D) is some API wrapper. Actually, there are no standard API calls anywhere in the code;\r\neverything is calculated dynamically… everything. It’s terrible, I know.\r\nDiving into that API-Wrapper function is possible (and actually required for the most part). However, I won’t do\r\nthat in the scope of this blog post.\r\nSo this spaghetti calling convention messes up the code, and I will have to fix it if I want to do any effective static\r\nanalysis of it. Before I present the solution for this problem, however, let’s examine the rest of the unresolved\r\nissues in the function that calls sendto.\r\nFig. 4, Caller to the sendto function, extra unresolved code\r\nThe next thing we need to investigate is the repeated function sub_183AC7E\r\nFig. 
5, Push reg obfuscation\r\nI will make it easy: this is a huge switch-case that puts a register value on the stack depending on the given value.\r\nFor example, the following code (our sendto scenario):\r\nseg000:0183BED7 push 10h\r\nseg000:0183BED9 push 72h ; 'r'\r\nseg000:0183BEDB call sub_183AC7E\r\nseg000:0183BEE0 push 6Fh ; 'o'\r\nseg000:0183BEE2 call sub_183AC7E\r\nseg000:0183BEE7 push 6Ch ; 'l'\r\nseg000:0183BEE9 call sub_183AC7E\r\nseg000:0183BEEE push 73h ; 's'\r\nseg000:0183BEF0 call sub_183AC7E\r\nseg000:0183BEF5 push dword ptr [ebp-184h]\r\nCan be translated to\r\npush 10h\r\npush esi\r\npush ebx\r\npush eax\r\npush edi\r\npush dword ptr [ebp-184h]\r\nNow I can peacefully say I know everything I need to de-obfuscate this sendto call (well, not everything: I did\r\nskip the API-Wrapper function, but everything besides that). With all this new information at hand, we can move\r\non to the next part.\r\nTomāto-Tomăto, Potāto-Potăto It’s all the same\r\nThere are two problems I aim to solve: fixing the spaghetti code calling convention, and fixing the push_reg function.\r\nThese two functions rule most of the code, so fixing them should be a huge step forward in understanding the\r\ncode and statically analyzing it.\r\nSo how is it done? Easy, Magic!\r\nOr, by its unofficial name, IDA-Python: scripting an automation process that goes over all of the code and, wherever one of\r\nthese functions occurs, changes it to a simpler and more readable code format while retaining the same\r\nfunctionality.\r\nSo let’s get practical, shall we? Starting with the push_reg function.\r\nI need to change every call to that function, which is made up of two opcodes:\r\n6A XX push \u003cBYTE\u003e\r\nE8 XX XX XX XX call \u003cDWORD\u003e\r\nPush and Call, which together take up 7 bytes in memory. 
If I could replace these 7 bytes with the appropriate\r\nPush \u003cRegister\u003e opcode and do it over all of the code, it would be a big step in de-obfuscating the code.\r\nSo now that I know exactly what I want to replace, I wrote a script which does exactly that:\r\nimport idaapi\r\nimport idc\r\nfrom idautils import XrefsTo\r\n\r\nPUSH_REGISTER_ADDR = 0x0183AC7E\r\nPUSH_REG_VALUE = 0x6C\r\nSIZEOF_PUSH_BYTE = 2\r\n\r\ndef fix_reg_push(function_address):\r\n    patched_counter = 0\r\n    unpatched_counter = 0\r\n    # Maps the pushed byte value to the matching one-byte push \u003creg\u003e opcode\r\n    values_to_patch = {PUSH_REG_VALUE     : 0x50, # push eax\r\n                       PUSH_REG_VALUE + 1 : 0x51, # push ecx\r\n                       PUSH_REG_VALUE + 2 : 0x52, # push edx\r\n                       PUSH_REG_VALUE + 3 : 0x53, # push ebx\r\n                       PUSH_REG_VALUE + 5 : 0x55, # push ebp\r\n                       PUSH_REG_VALUE + 6 : 0x56, # push esi\r\n                       PUSH_REG_VALUE + 7 : 0x57} # push edi\r\n\r\n    # Go through all xrefs\r\n    for xcall in XrefsTo(function_address):\r\n        # The push \u003cBYTE\u003e is expected right before the call\r\n        push_addr = xcall.frm - SIZEOF_PUSH_BYTE\r\n\r\n        # Make code if it is not already\r\n        opcode_length = idc.MakeCode(push_addr)\r\n        if SIZEOF_PUSH_BYTE != opcode_length:\r\n            print \" [*] fix_reg_push not code [0x%08X]\" % push_addr\r\n            unpatched_counter += 1\r\n            continue\r\n\r\n        # Obtain previous opcode address\r\n        push_addr = idc.PrevHead(xcall.frm)\r\n\r\n        # Sanity check 2\r\n        if \"push\" != idc.GetMnem(push_addr):\r\n            print \" [*] fix_reg_push not push instruction [0x%08X]\" % push_addr\r\n            print idc.GetMnem(push_addr)\r\n            unpatched_counter += 1\r\n            continue\r\n\r\n        # Get new value\r\n        push_value = idc.GetOperandValue(push_addr, 0)\r\n        byte_val = values_to_patch.get(push_value, None)\r\n        if None == byte_val:\r\n            print \" [*] fix_reg_push unexpected push value [0x%08X]\" % push_value\r\n            unpatched_counter += 1\r\n            continue\r\n\r\n        # Patch code\r\n        idaapi.patch_word(push_addr, 0x04EB) # EB 04: jmp over the next 4 bytes\r\n        idaapi.patch_long(push_addr + 2, 0x90909090) # 4 nops that the jmp skips\r\n        idaapi.patch_byte(push_addr + 6, byte_val) # push \u003creg\u003e\r\n\r\n        patched_counter += 1\r\n\r\n    print \" [*] fix_reg_push - Total: [%d]\\npatched functions: [%d]\\nunpatched functions: [%d]\" % (patched_counter + unpatched_counter, patched_counter, unpatched_counter)\r\n\r\n\r\ndef main():\r\n    fix_reg_push(PUSH_REGISTER_ADDR)\r\n\r\nif \"__main__\" == __name__:\r\n    main()\r\nThe code above is separated into a couple of sections:\r\nCalling my fix_reg_push function with the address of the function which handles the push register by\r\nvalue\r\nRunning through all the Xrefs of the function and making IDA identify the bytes as code if it hasn’t already.\r\nOtherwise there would be issues identifying opcodes later in the script\r\nMaking sure the xref is valid and working as expected. I don’t want to create any weird code patches, so I make\r\nsome necessary sanity checks\r\nPatching the code, changing the 7 original bytes to PUSH \u003creg\u003e and JMP \u003cbyte\u003e for better code clarity.\r\nLet’s examine the before and after results:\r\nBefore After\r\nAs you can see, I translated the reg_push functions (all of them) into simple, readable, de-obfuscated push opcodes\r\nwhich have a length of one byte. I could have just done a NOP-slide for the rest of the bytes left, but I decided to\r\nimplement a jmp opcode instead with the memory I had left to overwrite. It’s a matter of taste. The code became\r\nmuch more readable, and now I can finally read which register represents which value on the stack. This function\r\nwas fixed at over 3,900 places in the code, so it was definitely worth it.\r\nAnd that’s it for the first part.\r\nPatching the code in IDA made everything a lot more readable in terms of static analysis. 
Next, there is still that\r\nspaghetti calling convention I will have to fix, but as I investigated more of the code, I noticed there are dozens of\r\nvariations with different calculations being made, and for each one of those, there are a dozen more duplicates\r\nwhich look identical to each other. The only logical thing left for me to do was to make a regex to find every\r\nmatching function.\r\nFig. 6, three spaghetti functions found in the code, using add, xor and sub for calculations\r\nFortunately, finding the common base between all functions wasn’t so hard. All of them have more or less the\r\nsame prologue, and pretty much the same epilogue. So creating some kind of byte regex to find all of them (and\r\nfix them!) isn’t very hard. So I’ve done just that.\r\nAfter automatically finding all of these spaghetti functions, I will patch the code just as I have done with the\r\n‘push_reg’ functions. Only this time I have a lot more “space” in terms of bytes to do so.\r\nFig. 7, focusing on the sendto call\r\nIn total, there are 16 bytes that I would like to change to just a CALL (5 bytes), so I have enough space to overwrite\r\nas I want. This method is practically the same as the method I used before, so there is no reason to put another\r\ncode block to show how it’s done. Looking for all variants of these functions gave me a result of almost 100\r\ndifferent variations, with a total of approximately 3,000 different Xrefs in the code (for all variants).\r\nThe final result after patching both the spaghetti calling convention and the push register by value:\r\nFig. 8, Final patch\r\nYou Can Run, But You Can’t Hide…\r\nFinally, having the important parts de-obfuscated, I could continue on to the DGA. 
Let me say up front: the\r\nauthors’ intent to avoid being sinkholed paid off, good job! It has been a while since I’ve seen someone trying to\r\nprotect their code and their DGA as much as they did. So let’s get to it.\r\nMost malware with a DGA uses some value which changes periodically. This one is no different and uses\r\nthe current date (day, month, year) to calculate its DGA. Though it’s not as simple as it sounds: instead of\r\nusing some sort of builtin linear random function (such as msvcrt!rand and msvcrt!srand), they implemented\r\ntheir own functions for making random numbers and setting the initial seed. Their MagicSeed (I’m going to use\r\nthat term a lot) is the number calculated every day, generated from the current date for example, and is made up of\r\n128 bits. Every time anything needs to obtain the MagicSeed’s value, the MagicSeed changes as well. So I had to\r\nfollow all of the code very carefully, so as not to miss anything regarding the MagicSeed’s usage.\r\nHow It All Works\r\nI will now explain how the malware reaches the C\u0026C server and the obfuscation behind the DGA.\r\nAs you would expect from any malware, they make a simple domain list using a MagicSeed, then try to resolve\r\neach of the domains created, using Google’s DNS servers to prevent being DNS-sinkholed, until one resolves,\r\nand that would usually be the C\u0026C server. However, this is not our case, because it would be too boring to talk\r\nabout just that, wouldn’t it?\r\nIt gets more complicated: when trying to resolve all of the generated domains, only the first domain\r\nwhich resolves into exactly two different IP addresses is used. 
For example, these domains (generated\r\non 30/09/2016):\r\nGenerated Domain | Resolved IP addresses\r\njfwwqi.com | -\r\navljz.net | 4.2.0.1, 4.2.0.2, 4.2.0.3\r\nhlrhtvl.com | -\r\nmcodqfban.com | 192.168.0.1, 192.168.0.2\r\nxdvhfogmw.pw | 13.37.80.80\r\nobsvi.com | -\r\nigcvdloatwf.in | -\r\nzcekjgrmmx.in | -\r\nThe only domain that will be used from this list would be\r\nmcodqfban.com\r\n192.168.0.1\r\n192.168.0.2\r\nbecause it resolves into exactly two different IP addresses.\r\nYet, these two IP addresses have no direct connection to the C\u0026C server. They are just going to be another\r\nstepping stone in Nymaim’s logic, used to create a new MagicSeed number.\r\nThat new MagicSeed is then used to create a new domain list, with exactly the same algorithm that generated the first domain list.\r\nBut hold on, there is more:\r\nBefore trying to use this newly created domain list, a checksum algorithm is run over it,\r\nand the result is compared with a builtin list of checksums.\r\nThis probably means that the domains themselves are finite and have probably been pre-bought, or the authors are just\r\nwaiting for the right time to buy a new domain that matches their checksum list.\r\nAfter the list passes the checksum check, the first domain in the list is taken and its TLD is changed to “.COM”.\r\nAfter all this effort, I would guess that this domain is all that is left, and that the IP addresses it resolves to\r\nare the C\u0026C server. However, my guess was wrong. The IPs resolved from that newly\r\ncreated domain are not yet the correct IP addresses of the C\u0026C servers. For every IP address we get from the DNS\r\nrequest, a loop of XOR and rotation calculations is run over the address in order to obtain\r\nthe real C\u0026C server IP addresses (Finally!). 
Let’s summarize everything with some pseudo code:\r\ntlds = [\".net\", \".com\", \".in\", \".pw\"]\r\n\r\nGenerateDomains(magic_num)\r\n{\r\n    domains = []\r\n\r\n    seed = CreateUniqueSeed(TODAYS_DATE)\r\n    rand = GetRandomNumber(seed)\r\n\r\n    for(int i = 0; i \u003c 16; i++)\r\n    {\r\n        domain_str = GenString(rand, seed, magic_num)\r\n        domain_str += tlds[GetRandomNumber(seed)]\r\n        domains += [domain_str]\r\n    }\r\n    return domains\r\n}\r\nResolveDomains(domain_list)\r\n{\r\n    for(int i = 0; i \u003c 16; i++)\r\n    {\r\n        ip_addresses = DnsResolve(domain_list[i])\r\n        if (2 == ip_addresses.length())\r\n            return ip_addresses\r\n    }\r\n}\r\nMain()\r\n{\r\n    domains = GenerateDomains(0)\r\n    ips = ResolveDomains(domains)\r\n\r\n    new_domains = GenerateDomains(ips)\r\n    domain = ReplaceTld(new_domains[0], \".com\")\r\n\r\n    real_ips = DnsResolve(domain)\r\n    real_ips = XorIPS(real_ips)\r\n\r\n    CommunicateWithRealServer(real_ips)\r\n}\r\nGenerateDomains\r\nCreating a unique seed based on current date\r\nGenerate random number from seed\r\nCreate a domain string from generated random number and the seed\r\nCreate a new random, use it to append TLD\r\nReturns a list of 16 domains created\r\nResolveDomains\r\nTrying to resolve domain list ip addresses\r\nCheck if exactly 2 IP addresses were obtained in the dns request\r\nReturn list of resolved addresses\r\nMain\r\nGenerating first list of domains\r\nGet good matching ip addresses (Only 2 ip addresses)\r\nGenerate new list of domains from the ips we got\r\nChange TLD of the first domain from the list generated\r\nResolve domain\r\nObtain real C\u0026C ip addresses through calculations\r\nCommunicate with C\u0026C\r\nI have also added a graph form for convenience\r\nFig. 
9, Graph format of the pseudo code\r\nThis is a lot of work just to get a C\u0026C server IP address. Those little tricks they used made it harder\r\nto reverse and understand the Nymaim code, and harder to sink-hole the malware as well.\r\nSo here we see a prime example of how malware authors try to avoid being sink-holed by using obfuscation\r\nmethods as protection for their code.\r\nBut then again, everything can be conquered and beaten if you put on your malware thinking cap and put your\r\nmind to it.\r\nRef analyzed sample:\r\nc41ffc1fd6e3f5157181b6e45f45f4fe\r\nSource: https://arielkoren.com/blog/2016/11/02/nymaim-deep-technical-dive-adventures-in-evasive-malware/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia",
		"ETDA"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://arielkoren.com/blog/2016/11/02/nymaim-deep-technical-dive-adventures-in-evasive-malware/"
	],
	"report_names": [
		"nymaim-deep-technical-dive-adventures-in-evasive-malware"
	],
	"threat_actors": [],
	"ts_created_at": 1775434236,
	"ts_updated_at": 1775826773,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/1d46156a7fb228766334a6057329ced5aa543569.pdf",
		"text": "https://archive.orkl.eu/1d46156a7fb228766334a6057329ced5aa543569.txt",
		"img": "https://archive.orkl.eu/1d46156a7fb228766334a6057329ced5aa543569.jpg"
	}
}