{
	"id": "cfe989cf-3394-4299-9dab-11fa4dcaeb0d",
	"created_at": "2026-05-06T02:02:42.845303Z",
	"updated_at": "2026-05-06T02:03:52.6517Z",
	"deleted_at": null,
	"sha1_hash": "f69347192ba97b49b4053fc8fe59d17679174581",
	"title": "When Malware Authors Study Algebra: The Group Theory Inside Bedep's DGA",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 123515,
	"plain_text": "When Malware Authors Study Algebra: The Group Theory Inside Bedep's DGA\r\nBy Threat Research Team\r\nArchived: 2026-05-06 02:00:47 UTC\r\nTLDR: Bedep was a malware family active mainly in 2014 and 2015 that used an unusually sophisticated domain\r\ngeneration algorithm, or DGA, to hide its command-and-control infrastructure. Instead of relying only on the date,\r\nit used real foreign exchange rates published by the European Central Bank, making its future domains much\r\nharder to predict in advance. Under the hood, the malware used ideas from group theory to generate a fixed set of\r\nunique domains without collisions, a rare example of advanced mathematics being applied directly in real-world\r\nmalware.\r\nMost domain generation algorithms are boring. A linear congruential generator, a date seed, some modular\r\narithmetic, a character table -- seed = (seed * 0x41C64E6D + 0x3039) % 2^31, rinse, repeat. You reverse them in\r\nan afternoon, pre-compute next month's domains over lunch, and move on.\r\nBedep is different. Its author implemented a number theory textbook inside a 32-bit DLL, and the result is one of\r\nthe most mathematically elegant pieces of malware ever shipped.\r\nThis post is a deep dive into that engineering.\r\nWhat is Bedep?\r\nBedep is an ad-fraud botnet that was active from late 2014 through 2015, delivered exclusively through the Angler\r\nexploit kit. When a Flash zero-day (CVE-2015-0311) was burning, Angler was dropping Bedep. The malware\r\nitself was a modular downloader focused on click fraud, but its infrastructure is what made it interesting. In early\r\n2015, new variants appeared with a DGA for C2 resolution. 
A three-day sinkhole by Arbor Networks (ASERT)\r\ncaught phone-homes from roughly 82,000 unique IPs, spread across every continent except -- notably -- Russia.\r\nDennis Schwarz at ASERT originally reversed the DGA and published a proof-of-concept Python\r\nreimplementation and a detailed write-up titled \"Bedep's DGA: Trading Foreign Exchange for Malware\r\nDomains.\" The original blog post has since been taken down from NETSCOUT's site, but survives on the\r\nWayback Machine. The PoC lived at github.com/arbor/bedep_dga. Dennis got the algorithm working but noted\r\nthat parts of it were opaque -- he called the core transform a \"blackbox\" and flagged the embedded table as \"likely\r\nsomething Fermat number related.\" He was right. Let's open that box.\r\nThe ECB trick: Seeding a DGA from the financial markets\r\nBefore the math starts, Bedep needs a seed that both the bot and the botmaster can independently derive without\r\ncommunicating. Most DGAs use the date. Bedep uses the date and the global foreign exchange market.\r\nhttps://www.gendigital.com/blog/insights/research/the-group-theory-inside-bedeps-dga\r\nPage 1 of 6\n\nThe algorithm fetches two XML files from legitimate public services:\r\n1. UTC time from earthtools.org/timezone/0/0 -- the current timestamp, converted to \"days since year\r\nzero.\"\r\n2. Euro foreign exchange reference rates from www.ecb.europa.eu/stats/eurofxref/eurofxref-hist-90d.xml -- the European Central Bank's daily publication of EUR conversion rates against 31+ world\r\ncurrencies, updated every business day.\r\nThe time is used to select which day's rates to use. Specifically, Bedep picks \"last Tuesday's\" exchange rates --\r\nmeaning the preceding week's Tuesday until Thursday, after which it rolls forward to the current week's Tuesday.\r\nOnly rates falling on a Monday boundary (in the algorithm's day-counting scheme) are considered. 
This gives a\r\nroughly weekly rotation with a deterministic selection rule.\r\nFrom the chosen date, Bedep extracts up to 48 currency rates -- USD, JPY, GBP, CZK, BGN, HUF, and so on.\r\nEach rate is parsed through a slightly broken custom atof() implementation (off by a bit in the least significant\r\ndigits -- a quirk, not a feature), packed into a 64-bit IEEE double, and its low 32-bit dword is extracted for use in\r\nthe algorithm.\r\nHere is what a single run looks like, from the PoC output for April 7, 2015:\r\nparsed 31 currencies from 2015-04-07 (currency date):\r\n  USD: 1.0847  JPY: 130.33  BGN: 1.9558  CZK: 27.455  DKK: 7.4714  GBP: 0.7286  HUF: 299.08\r\n  PLN: 4.0578  RON: 4.4165  SEK: 9.374  CHF: 1.0438  NOK: 8.73  HRK: 7.619  RUB: 59.8265\r\n  TRY: 2.8079  AUD: 1.4192  BRL: 3.3979  CAD: 1.3563  CNY: 6.7241  HKD: 8.4086  IDR: 14091.77\r\n  ILS: 4.267  INR: 67.598  KRW: 1183.28  MXN: 16.1919  MYR: 3.9527  NZD: 1.4423  PHP: 48.3\r\n  SGD: 1.4724  THB: 35.335  ZAR: 12.8345\r\nWhy is this clever? Because the ECB rates are:\r\nPublicly available -- anyone can fetch the same XML\r\nGlobally consistent -- every bot on every continent gets the same data\r\nUnpredictable in advance -- nobody can predict next Tuesday's EUR/USD rate\r\nHistorically verifiable -- the ECB publishes archives going back years\r\nImpossible to tamper with -- the ECB is not going to modify their reference rates to help you sinkhole a\r\nbotnet\r\nAnd because the malware fetches this from ecb.europa.eu, the request looks like legitimate traffic. 
A Snort\r\nsignature (SID 33188) was written to detect it, but it fired constantly on real users checking exchange rates.\r\nThe mathematical core\r\nThis is where Bedep stops being a clever botnet and starts being a discrete mathematics exercise.\r\nPrecomputed subgroup parameters\r\nThe DLL embeds a lookup table -- 102 entries of 8 dwords each, stored as JSON in the PoC\r\n(transform2_table_varN.json, one per config variant). At first glance it looks like random constants. It is not.\r\nThe transform2 function takes a hash derived from the exchange rates and days-since value, then uses it as an\r\nindex into this table. Each table entry, after being decoded through config-specific XOR and multiply constants\r\n(0x663d81 * entry[0] ^ value3 and 0x281 * entry[1] ^ value2), yields two critical values:\r\np -- a prime number\r\nq -- an integer such that q divides p - 1\r\nThis is the subgroup structure. The table stores primes whose group orders (p - 1) have known factorizations with\r\nsmall prime factors. The malware doesn't need to factor p - 1 at runtime -- that hard work was done offline by the\r\nauthor and baked into the DLL. Dennis Schwarz noted these were \"likely something Fermat number related,\" and\r\nhe was onto something: the primes are chosen so that their totient has a smooth factorization, making generator-finding tractable.\r\nThe number of domains to generate comes directly from q - 1. For the first config, this works out to 22 domains;\r\nfor the second, 28 -- totaling 50 domains per weekly rotation.\r\nFinding generators: A textbook primitive root search\r\nWith p and the prime factorization of p - 1 in hand, Bedep needs a generator of the multiplicative group\r\n\\((\\mathbb{Z}/p\\mathbb{Z})^*\\) -- or more precisely, of a specific subgroup of order q.\r\nThe transform8 function does this in two stages. 
First, it runs a trial-division prime sieve: starting from 3,\r\nstepping by 2 (skipping evens), it tests each candidate against all previously found primes. It collects up to 37\r\nsmall primes, then filters for those that actually divide q (the subgroup order). This gives the prime factorization of\r\nq.\r\nThen transform3 finds the primitive root itself. This is the standard algorithm straight from a number theory\r\ntextbook:\r\ndef find_primitive_root(p, prime_factors_of_order):\r\n  for g in range(2, p // 2):\r\n    is_generator = True\r\n    for q_i in prime_factors_of_order:\r\n      # If g^((p-1)/q_i) ≡ 1 (mod p), then g generates\r\n      # a proper subgroup, not the full group. Reject it.\r\n      if pow(g, (p - 1) // q_i, p) == 1:\r\n        is_generator = False\r\n        break\r\n    if is_generator:\r\n      return g\r\n  return None\r\nThe actual code in transform3 is this logic rendered in C-style loop structure with manual index tracking, but\r\nthe mathematics is identical. For each candidate g starting from 2, it checks:\r\n\\[ g^{(p-1)/q_i} \\not\\equiv 1 \\pmod{p} \\quad \\text{for all prime factors } q_i \\text{ of } p-1 \\]\r\nIf g passes all checks, it generates the full group. This is literally the algorithm from Section 11.1 of Shoup's A\r\nComputational Introduction to Number Theory and Algebra, implemented in a malware DLL.\r\nWalking the cyclic group\r\nNow comes the payoff. With a prime p, a generator g, and a subgroup order q, the transform7 function sets up\r\nthe DGA's iteration state:\r\nmodulus = p                      # field_c\r\nseed    = pow(g, random_e, p)    # field_4 -- starting element\r\nstep    = pow(g2, random_f, q)   # field_8 -- step exponent\r\nThe starting position seed is g raised to a random exponent (derived from rdtsc, the CPU timestamp counter).\r\nThe step is similarly randomized from a second generator. 
Both are computed via modular exponentiation.\r\nThen, for each domain, the iteration is simply:\r\nseed = pow(seed, step, modulus)   # one step through the cyclic group\r\ndomain = seed_to_domain(seed, currencies)\r\nThat single line -- pow(seed, step, modulus) -- is the entire DGA engine. Each call advances the state to the\r\nnext element of the cyclic subgroup.\r\nBecause the subgroup has order q, and step is coprime to q (guaranteed by the generator construction), this walk\r\nvisits every non-identity element of the subgroup exactly once before cycling. The number of domains generated\r\nequals q - 1, which is precisely the number of non-identity elements. No collisions. No wasted iterations. No off-by-one hoping you hit enough domains.\r\nThe key insight: the set of group elements visited is deterministic (fixed by p, g, and the exchange rates), but the\r\norder of visitation depends on the random exponents from rdtsc. Every infected machine walks through the\r\nsame set of q - 1 elements, generating the same q - 1 domain names, but potentially in a different order. Same\r\ndestinations, different paths.\r\nflowchart TB\r\n  subgraph cyclic_group [\"Cyclic Subgroup of (Z/pZ)*\"]\r\n    E1[\"g^1 mod p\"] --\u003e E2[\"g^2 mod p\"]\r\n    E2 --\u003e E3[\"g^3 mod p\"]\r\n    E3 --\u003e E4[\"...\"]\r\n    E4 --\u003e Eq[\"g^(q-1) mod p\"]\r\n    Eq --\u003e E1\r\n  end\r\n  subgraph walk [\"DGA Walk (order depends on rdtsc)\"]\r\n    S1[\"seed_0\"] --\u003e|\"pow(s, step, p)\"| S2[\"seed_1\"]\r\n    S2 --\u003e|\"pow(s, step, p)\"| S3[\"seed_2\"]\r\n    S3 --\u003e|\"pow(s, step, p)\"| S4[\"...\"]\r\n    S4 --\u003e|\"pow(s, step, p)\"| SN[\"seed_(q-2)\"]\r\n  end\r\n  S1 -.-\u003e|\"= g^e mod p\\n(random start)\"| cyclic_group\r\n  S2 -.-\u003e|maps to| E3\r\n  S3 -.-\u003e|maps to| Eq\r\nFrom Group Elements to Domain Names\r\nEach group element is a 32-bit integer. 
Turning it into a domain name is the final step, handled by transform11.\r\nThis is less mathematically elegant and more of a traditional mixing function, but it has a nice property: every\r\ncharacter position in the domain uses a different currency rate.\r\nThe algorithm cycles through the full list of parsed currencies (31 in the example above). For a domain of length\r\n18, that means character 0 is influenced by USD, character 1 by HRK, character 2 by MXN, character 3 by GBP,\r\nand so on. Each character is computed by:\r\n1. Multiplying the currency's 3-letter name (packed as a 32-bit int) and its 64-bit floating-point rate by large\r\nconstants\r\n2. XORing and shifting the result with the current seed and the days-since value\r\n3. Taking the result modulo 26 (for positions 2+) or modulo 36 (for the last two positions, allowing digits)\r\n4. Adding the ASCII offset for 'a'\r\nThe domain length itself is derived from the seed XORed with a rate-dependent value, bounded between 12 and\r\n18 characters. All domains use the .com TLD.\r\nA trace from the PoC output shows the mixing in action:\r\nseed: 0xa13c9652\r\n  mixing in USD's rate\r\n  mixing in HRK's rate\r\n  mixing in MXN's rate\r\n  mixing in GBP's rate\r\n  ...\r\n  mixing in ZAR's rate\r\n  domain: rrpohktjlscncqxvt3.com\r\nseed: 0x558af439\r\n  mixing in USD's rate\r\n  ...\r\n  
domain: wjavcjhazzxyxotkbi.com\r\nEach domain is a fingerprint of the entire exchange rate vector, filtered through one specific element of the cyclic\r\ngroup.\r\nThe Full Pipeline\r\nPutting it all together:\r\nflowchart LR\r\n  ECB[\"ECB XML\\n31+ currency rates\"] --\u003e DateSelect[\"Select last\\nTuesday's rates\"]\r\n  Earth[\"earthtools.org\\nUTC timestamp\"] --\u003e DateSelect\r\n  DateSelect --\u003e Mix[\"XOR/multiply\\nrates + days_since\"]\r\n  Mix --\u003e TableLookup[\"Lookup table\\n→ prime p, order q\"]\r\n  TableLookup --\u003e GenFind[\"Sieve primes,\\nfind generator g\"]\r\n  GenFind --\u003e GroupSetup[\"seed = g^rand mod p\\nstep = g2^rand mod q\"]\r\n  GroupSetup --\u003e Walk[\"Iterate:\\nseed = seed^step mod p\"]\r\n  Walk --\u003e CharMix[\"Mix seed with\\ncurrency rates\"]\r\n  CharMix --\u003e Domains[\"22-28 domains\\nper config\"]\r\nSeven configs were observed in the wild, each with its own embedded table, XOR constants, and currency count\r\n(36 or 48). Each malware variant embedded two configs that ran per rotation -- the first generating 22 domains,\r\nthe second 28, yielding 50 domains total. Across the observed campaign, domains were registered 2-5 days ahead\r\nof use through \"Domain Context\" registrar with Regway nameservers, all resolving to infrastructure at\r\n5.196.181.244 and 46.105.251.1.\r\nWhy this matters\r\nCompare Bedep's approach with its contemporaries:\r\nFamily       DGA Technique                  Seed\r\nConficker.C  Date + simple arithmetic       Current date\r\nGamaredon    Nested character-range loops   Hardcoded ranges\r\nNecurs       CRC32 hash of date components  Date + constant\r\nMatsnu       Dictionary word concatenation  Days since epoch\r\nBedep        Cyclic group walk mod p        ECB exchange rates\r\nEvery other DGA on this list can be pre-computed arbitrarily far into the future. 
Bedep cannot -- because\r\nTuesday's exchange rates don't exist until Tuesday. You can sinkhole Conficker's domains for next year during\r\nyour coffee break. For Bedep, you have to wait for the ECB to publish, then race the botmaster to register the\r\ndomains.\r\nThe mathematical guarantee is real: the cyclic group structure ensures exactly q - 1 distinct domains per rotation\r\nwith no collisions and no wasted iterations. The primitive root construction ensures the generator produces every\r\nelement. The precomputed tables avoid expensive factorization at runtime. The exchange rate seed provides\r\nunpredictable entropy from a trusted, globally-consistent, publicly-verifiable source.\r\nThis is not someone who copy-pasted a DGA from a forum. The author understood multiplicative groups modulo\r\nprimes, knew how to find primitive roots, precomputed smooth-order primes offline, and wired the whole thing to\r\nthe European Central Bank's daily publications. Whether they learned this from a university course, a textbook, or\r\na cryptography library's source code, the result is unmistakable: this is applied algebra, deployed in production, at\r\nscale.\r\nThe original ASERT analysis is preserved on the Wayback Machine. The proof-of-concept implementation was\r\npublished on ASERT's GitHub. The sample analyzed here is e5e72baff4fab6ea6a1fcac467dc4351 (MD5) / \r\nd0fb1b66b6e4da395892327be9f39adb4533e7759ace39f67bdde0bb1cdaef35 (SHA256).\r\nCredits: The DGA was originally reversed by Dennis Schwarz at Arbor Networks / ASERT. This post builds on his\r\nwork, focusing on the mathematical structure he identified but didn't fully unpack.\r\nThreat Research Team\r\nA group of elite researchers who like to stay under the radar.\r\nSource: https://www.gendigital.com/blog/insights/research/the-group-theory-inside-bedeps-dga",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://www.gendigital.com/blog/insights/research/the-group-theory-inside-bedeps-dga"
	],
	"report_names": [
		"the-group-theory-inside-bedeps-dga"
	],
	"threat_actors": [],
	"ts_created_at": 1778032962,
	"ts_updated_at": 1778033032,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/f69347192ba97b49b4053fc8fe59d17679174581.pdf",
		"text": "https://archive.orkl.eu/f69347192ba97b49b4053fc8fe59d17679174581.txt",
		"img": "https://archive.orkl.eu/f69347192ba97b49b4053fc8fe59d17679174581.jpg"
	}
}