{
	"id": "7fd86ccb-3c16-4fd0-a292-6fb093eec4ec",
	"created_at": "2026-04-06T00:09:58.196887Z",
	"updated_at": "2026-04-10T13:12:20.783084Z",
	"deleted_at": null,
	"sha1_hash": "d78ba334e864f4aca4b6e68ee4db0b14895b4457",
	"title": "A quirk in the SUNBURST DGA algorithm",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 412490,
	"plain_text": "A quirk in the SUNBURST DGA algorithm\r\nBy Nick Blazier, Jesse Kipp\r\nPublished: 2020-12-18 · Archived: 2026-04-05 14:18:22 UTC\r\nOn Wednesday, December 16, the RedDrip Team from QiAnXin Technology released their discoveries (tweet, github)\r\nregarding the random subdomains associated with the SUNBURST malware which was present in the SolarWinds Orion\r\ncompromise. In studying queries performed by the malware, Cloudflare has uncovered additional details about how the\r\nDomain Generation Algorithm (DGA) encodes data and exfiltrates the compromised hostname to the command and control\r\nservers.\r\nBackground\r\nThe RedDrip team discovered that the DNS queries are created by combining the previously reverse-engineered unique guid\r\n(based on hashing of hostname and MAC address) with a payload that is a custom base 32 encoding of the hostname. The\r\narticle they published includes screenshots of decompiled or reimplemented C# functions that are included in the\r\ncompromised DLL. This background primer summarizes their work so far (which is published in Chinese).\r\nRedDrip discovered that the DGA subdomain portion of the query is split into three parts:\r\n\u003cencoded_guid\u003e + \u003cbyte\u003e + \u003cencoded_hostname\u003e\r\nAn example malicious domain is:\r\n7cbtailjomqle1pjvr2d32i2voe60ce2.appsync-api.us-east-1.avsvmcloud.com\r\nWhere the domain is split into the three parts as\r\nEncoded guid (15 chars) byte Encoded hostname\r\n7cbtailjomqle1p j vr2d32i2voe60ce2\r\nThe work from the RedDrip Team focused on the encoded hostname portion of the string; we have made additional insights\r\nrelated to the encoded hostname and encoded guid portions.\r\nAt a high level, the encoded hostnames use one of two encoding schemes. 
If all of the characters in the hostname are\r\ncontained in the set of domain name-safe characters \"0123456789abcdefghijklmnopqrstuvwxyz-_.\" then the\r\nOrionImprovementBusinessLayer.CryptoHelper.Base64Decode algorithm, explained in the article, is used. If there are\r\ncharacters outside of that set in the hostname, then the OrionImprovementBusinessLayer.CryptoHelper.Base64Encode is\r\nused instead and ‘00’ is prepended to the encoding. This allows us to simply check if the first two characters of the encoded\r\nhostname are ‘00’ and know how the hostname is encoded.\r\nThese function names within the compromised DLL are meant to resemble the names of legitimate functions, but in fact\r\nperform the message encoding for the malware. The DLL function Base64Decode is meant to resemble the legitimate\r\nfunction name base64decode, but its purpose is actually to perform the encoding of the query (which is a variant of base32\r\nencoding).\r\nThe RedDrip Team has posted Python code for encoding and decoding the queries, including identifying random characters\r\ninserted into the queries at regular character intervals.\r\nOne potential issue we encountered with their implementation is the inclusion of a check clause looking for a ‘0’ character in\r\nthe encoded hostname (line 138 of the decoding script). This line causes the decoding algorithm to ignore any encoded\r\nhostnames that do not contain a ‘0’. We believe this was included because ‘0’ is the encoded value of a ‘0’, ‘.’, ‘-’ or ‘_’.\r\nSince fully qualified hostnames are composed of multiple parts separated by ‘.’s, e.g. ‘example.com’, it makes sense to\r\nexpect a ‘.’ in the unencoded hostname and therefore only consider encoded hostnames containing a ‘0’. 
However, this\r\ncauses the decoder to ignore many of the recorded DGA domains.\r\nAs we explain below, we believe that long domains are split across multiple queries where the second half is much shorter\r\nand unlikely to include a ‘.’. For example, ‘www2.example.c’ takes up one message, meaning that in order to transmit the\r\nentire domain ‘www2.example.com’ a second message containing just ‘om’ would also need to be sent. This second message\r\ndoes not contain a ‘.’ so its encoded form does not contain a ‘0’ and it is ignored in the RedDrip team’s implementation.\r\nThe quirk: hostnames are split across multiple queries\r\nA list of observed queries performed by the malware was published publicly on GitHub. Applying the decoding script to this\r\nset of queries, we see that some queries appear to be truncated, such as grupobazar.loca, but also that some decoded hostnames\r\nare curiously short or incomplete, such as “com”, “.com”, or a single letter, such as “m” or “l”.\r\nWhen the hostname does not fit into the available payload section of the encoded query, it is split up across multiple queries.\r\nQueries are matched up by matching the GUID section after applying a byte-by-byte exclusive-or (xor).\r\nAnalysis of first 15 characters\r\nNoticing that long domains are split across multiple requests led us to believe that the first 16 characters encoded\r\ninformation to associate multipart messages. This would allow the receiver on the other end to correctly re-assemble the\r\nmessages and get the entire domain. 
The RedDrip team identified the first 15 bytes as a GUID; we focused on those bytes\r\nand will refer to them subsequently as the header.\r\nWe found the following queries that we believed to be matches without knowing yet the correct pairings between message 1\r\nand message 2 (payload has been altered):\r\nPart 1 - Both decode to “www2.example.c”\r\nr1q6arhpujcf6jb6qqqb0trmuhd1r0ee.appsync-api.us-west-2.avsvmcloud.com\r\nr8stkst71ebqgj66qqqb0trmuhd1r0ee.appsync-api.us-west-2.avsvmcloud.com\r\nPart 2 - Both decode to “om”\r\n0oni12r13ficnkqb2h.appsync-api.us-west-2.avsvmcloud.com\r\nulfmcf44qd58t9e82h.appsync-api.us-west-2.avsvmcloud.com\r\nThis gives us a final combined payload of www2.example.com.\r\nThis example gave us two sets of messages where we were confident the second part was associated with the first part, and\r\nallowed us to find the following relationship where message1 is the header of the first message and message2 is the header\r\nof the second:\r\nBase32Decode(message1) XOR KEY = Base32Decode(message2)\r\nThe KEY is a single character. That character is xor’d with each byte of the Base32Decoded first header to produce the\r\nBase32Decoded second header. We do not currently know how to infer what character is used as the key, but we can still\r\nmatch messages together without that information. Since A XOR B = C where we know A and C but not B, we can instead\r\nuse A XOR C = B. 
This means that in order to pair messages together we simply need to look for messages where XOR’ing\r\nthem together results in a repeating character (the key).\r\nBase32Decode(message1) XOR Base32Decode(message2) = KEY\r\nLooking at the examples above this becomes\r\nMessage 1 Message 2\r\nHeader r1q6arhpujcf6jb 0oni12r13ficnkq\r\nBase32Decode (binary), Message 1: 10110100010011011011111101101001000000001100101011111101111000101001110100000101\r\nBase32Decode (binary), Message 2: 11011001001000001101001000000100011011011010011110010000100011111111000000000100\r\nWe’ve truncated the results slightly; the first two lines below show the two binary representations and the third line shows\r\nthe result of the XOR.\r\n101101000100110110111111011010010000000011001010111111011110001010011101\r\n110110010010000011010010000001000110110110100111100100001000111111110000\r\n011011010110110101101101011011010110110101101101011011010110110101101101\r\nWe can see the XOR result is the repeating sequence ‘01101101’, meaning the original key was 0x6D or ‘m’.\r\nWe provide the following Python code as an implementation for matching paired messages (Note: the decoding functions are\r\nthose provided by the RedDrip team):\r\n# string1 is the first 15 characters of the first message\r\n# string2 is the first 15 characters of the second message\r\ndef is_match(string1, string2):\r\n    encoded1 = Base32Decode(string1)\r\n    encoded2 = Base32Decode(string2)\r\n    xor_result = [chr(ord(a) ^ ord(b)) for a,b in zip(encoded1, encoded2)]\r\n    match_char = xor_result[0]\r\n    for character in xor_result[0:9]:\r\n        if character != match_char:\r\n            return False, None\r\n    return True, \"0x{:02X}\".format(ord(match_char))\r\nThe following are additional headers which, based on the payload content, Cloudflare is confident are pairs (the payload has\r\nbeen redacted because it contains hostname information that is not yet publicly available):\r\nExample 1:\r\nvrffaikp47gnsd4a\r\naob0ceh5l8cr6mco\r\nxorkey: 0x4E\r\nExample 
2:\r\nvrffaikp47gnsd4a\r\naob0ceh5l8cr6mco\r\nxorkey: 0x54\r\nExample 3:\r\nvvu7884g0o86pr4a\r\n6gpt7s654cfn4h6h\r\nxorkey: 0x2B\r\nWe hypothesize that the xorkey can be derived from the header bytes and/or padding byte of the two messages, though we\r\nhave not yet determined the relationship.\r\nUpdate (12/18/2020):\r\nErik Hjelmvik posted a blog post explaining where the xor key is located. Based on his code, we provide a Python\r\nimplementation for converting the header (first 16 bytes) into the decoded GUID as a string. Messages can then be paired by\r\nmatching GUIDs to reconstruct the full hostname.\r\ndef decrypt_secure_string(header):\r\n    decoded = Base32Decode(header[0:16])\r\n    xor_key = ord(decoded[0])\r\n    decrypted = [\"{0:02x}\".format(ord(b) ^ xor_key) for b in decoded]\r\n    return ''.join(decrypted[1:9])\r\nUpdated example:\r\nMessage 1 Message 2\r\nHeader r1q6arhpujcf6jb 0oni12r13ficnkq\r\nBase32Decode Header (hex) b44dbf6900cafde29d05 d920d2046da7908ff004\r\nBase32Decode first byte (xor key) 0xb4 0xd9\r\nXOR result (hex) 00f90bddb47e495629 00f90bddb47e495629\r\nSource: https://blog.cloudflare.com/a-quirk-in-the-sunburst-dga-algorithm/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://blog.cloudflare.com/a-quirk-in-the-sunburst-dga-algorithm/"
	],
	"report_names": [
		"a-quirk-in-the-sunburst-dga-algorithm"
	],
	"threat_actors": [
		{
			"id": "aa73cd6a-868c-4ae4-a5b2-7cb2c5ad1e9d",
			"created_at": "2022-10-25T16:07:24.139848Z",
			"updated_at": "2026-04-10T02:00:04.878798Z",
			"deleted_at": null,
			"main_name": "Safe",
			"aliases": [],
			"source_name": "ETDA:Safe",
			"tools": [
				"DebugView",
				"LZ77",
				"OpenDoc",
				"SafeDisk",
				"TypeConfig",
				"UPXShell",
				"UsbDoc",
				"UsbExe"
			],
			"source_id": "ETDA",
			"reports": null
		}
	],
	"ts_created_at": 1775434198,
	"ts_updated_at": 1775826740,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/d78ba334e864f4aca4b6e68ee4db0b14895b4457.pdf",
		"text": "https://archive.orkl.eu/d78ba334e864f4aca4b6e68ee4db0b14895b4457.txt",
		"img": "https://archive.orkl.eu/d78ba334e864f4aca4b6e68ee4db0b14895b4457.jpg"
	}
}