Peering into spam botnets. Peering into spam botnets. Jaroslaw Jedynak Maciej Kotowicz 1 Intro Spam in probably the biggest vector of infection for both commodity and targeted malware. One of the reasons it earned this positions are spam botnets, malware that has only one job, to send as many malicious mails as possible. Most of those beast operates for years and are very resilient to takedowns, mostly due to its complicated infrastructure and protocols. For some more notorious spammers it’s possible to find analysis of their protocols but most of those analyses are quite outdated. In this paper we would like to provide detailed description how the those malware communicate and how we can leverage this to get better understanding of they operations and get more malware directly from the source. We handpicked couple of families that are either engaged with spamming for very long time or their protocol is unique and outstanding. The list is composed of • Necurs • Tofsee • Kelihos • Send Safe • Emotet1 2 Emotet Emotet is an offspring[1] of the long lived malware family that started with cridex and allegedly give birth to such malware as Dridex or Dyre. It appears in June 2014[2] targeting solely clients of German banks. It was and, as far we know still is distributed only by spam emails, that originate from previously infected machines. In the past it was trivial to distinguish Emotet’s malspam from others, those emails were always impersonating DHL shipment orders and had very unique URL patterns under which malware was located. 1its spamming module to be precise, and its newest version 1 Example URL[3] Purpose http://freylau.de/VMfWxYqJme First landing page www.buziaki.gorzow.pl/CgHsoRfvpqGk2/8114721795964851.zip Redirect very long name with lots of dashes2 Final Malware Today they shifted their tactics and are using more generic approach, dropping Word document that contains powershell command responsible for downloading and executing emotet. While we didn’t analyze closely how spamming module operate in the past, based on how general C&C communication change we can assumed that it had very little in common with today’s protocol. During our research we found emotet’s protocol, while rather simple, quite fascinating. Here we wont delve into details of it, since we already described it on our blog[4], just reminded that its and AES-encrypted blob of binary data. Based on educated guesses we discovered that binary blob appearing in commu- nication is in fact an a modified version of google’s proto buffers. At the time of writing we are not sure if modification came from sloppy custom implementation or by other means. For purpose of our analysis we assumed that it was a deliberate move. 2.1 spam operation Most spamming malware are design to behave like SMTP client, they communi- cate directly with email servers of their victims, authors of emotet took a different approach. Many properly configured SMTP servers are either blacklisting or graylisting messages from untrusted or unknown sources, those mechanism were introduced to protect common people from receive bulk of unsolicited messages. To work around that Emotet is using trusted services like gmail,yahoo or live.com as their relays abusing firstly stolen credentials, for which they have a separate module. This is clearly visible in configuration data received from C&C id: 2075010 mail_server: "sd-smartermail.sdsolutions.de" port: 25 login: "info@prodia-bamberg.de" password: "xxxxxxxxx" email: "info@prodia-bamberg.de", id: 2129057 mail_server: "mx1.kabsi.at" port: 587 login: "h2199a00" password: "xxxxxx" email: "h2199a00@kabsi.at", id: 2136311 2Dhl_Status_zu_Sendung_340839262847_____ID_S01_DHL__DEM06_MS02_06_2015___A23_21_15.exe 2 http://freylau.de/VMfWxYqJme mail_server: "host72.kei.pl" port: 587 login: "dyspozytor-pdz@kolprem.pl" password: "xxxxxxxxxx" email: "dyspozytor-pdz@kolprem.pl", 3 Kelihos Kelihos, also known as Hlux, was one of older spam botnets. It was first discovered around December 2010 Finally in April 2017, after many previous attempts to take it down, botnet operator was arrested, and FBI has begun sinkholing the botnet[5]. Because of this, this part of the paper is provided mostly for historical reasons - described techniques probably won’t work, because peers are dead - unless Kelihos comes back from the dead in future. Nevertheless, we think that this botnet was interesting enough that it still deserves a mention here. Additionally, a great write-up on Kelihos communication can be found on Fortinet blog[6]. The scope of this paper is very similar, though we focused more on the implementation side, and provided few bits of code. Additionally we think that Kelihos’s unusual approach to encryption is just interesting to read about. 3.1 Peer handshake Kelihos uses surprisingly solid cryptography in its P2P communication - each bot has his own key pair (generated using Crypto++ library). Communication is encrypted using asymmetric cryptography, and because of this, it’s impossible to decrypt it, even when whole traffic is captured. When Kelihos wants to perform a key exchange with a peer, it generates 16- byte random key, and signs it with its private key with PKCS1 RSA/SHA1. Handshake message contains this random data, a signature for it, and a public key. Kelihos packs this fields using simple structure, presented in Figure 1 Handshake can be generated by with a help of the following Python code: flags = 'e78673536c555545'.decode('hex') # timestamp and magic blocks = '03104840'.decode('hex') # [0x03, 0x10, 0x48, 0x40] # 3 blocks, 16 bytes of random data, 0x48 bytes of public key, 0x40 bytes of signed data hdr = flags + blocks randbytes = Random.new().read(16) pubkey = rsakey.publickey().exportKey(format='DER')[20:] hash = hashlib.sha1(randbytes).digest() pad = '01ffffffffffffffffffffffffffffffffffffffffffffffffffff00'.decode('hex') 3 Figure 1: Peer handshake packet structure sha1_designator = '3021300906052b0e03021a05000414'.decode('hex') signature = rsakey.sign(pad + sha1_designator + hash, '')[0] signature = number.long_to_bytes(signed) handshake_request = hdr + randbytes + pubkey + signature Receiving data is more complicated - data is first encrypted using Blowfish cipher in CBC mode, and after that, we have a similar structure (three blocks, with random data, public key, and signature). Exemplary decryption code: data = sock.recv(100000) rsa_enc, blowfish_enc = kelihos_get_blocks(data) # parse blocks - response has two ones blowtmp = rsakey.decrypt(rsa_enc) blowkey = blowtmp[-16:] print 'KELIHOS_HANDSHAKE_RESPONSE' print ' - rsa_encoded', rsa_enc.encode('hex') print ' - rsa_decrypt', blowtmp.encode('hex') print ' - rsa_result', blowkey.encode('hex') print ' - blowfish_enc', blowfish_enc.encode('hex') iv = blowfish_enc[1:9] cipher = Blowfish.new(blowkey, Blowfish.MODE_CBC, iv) msg = cipher.decrypt(blowfish_enc[9:]) some_data = msg[:12] 4 rsa_key = msg[12:12+72] signeddata = msg[12+72:12+72+64] print ' - blowfish_dec', msg.encode('hex') print ' - some_data', some_data.encode('hex') print ' - rsa_key', rsa_key.encode('hex') print ' - signeddata', signeddata.encode('hex') key = RSA.importKey(rsa_key) sign = key.encrypt(signeddata, 'ignored')[0] print ' - sign', sign.encode('hex') print ' - hash', sign[-20:].encode('hex') This mechanism of key exchange is a good example of correctly used asymmetric crypto - it actually made an analysis of traffic harder, because we needed to dump private keys if we wanted to analyze raw traffic. 3.2 Peer list exchange During peer data exchange, Kelihos serializes all relevant information into a certain big structure, which is then encrypted. In contrast to handshake, most encryption methods used here are homemade by malware authors and not very cryptographically sound. Most interesting and quite unique idea here is using eight encryption algorithms in a random order determined by random 16 bytes from the header. First, a list of encryption functions is created and shuffled (with random generator seeded by 16 byte header). Seeding algorithm looks like this: void crypto_function_list(crypto_function_list *crypt, string_t *seed) { strc = init_temporary_structure(); strc.str = seed; strc.offset = 0; list_insert_by_string_offset(&strc, func_xor_1byte); list_insert_by_string_offset(&strc, func_viscrypt); list_insert_by_string_offset(&strc, func_mathops); list_insert_by_string_offset(&strc, func_bitcrypt1); list_insert_by_string_offset(&strc, func_pairwise_swap); list_insert_by_string_offset(&strc, func_simple); list_insert_by_string_offset(&strc, func_reverse); list_insert_by_string_offset(&strc, func_bitcrypt2); it = strc.list.tail->prev; // cyclic list while (it != strc.list.tail) 5 { append_list(crypt, &it->data); it = it->prev; } free_list(&strc.list); return crypt; } Where list_insert_by_string_offset is using consecutive characters from seed as offsets for inserting functions in random order into function list. void list_insert_by_string_offset(temporary_struct *strc, crypto_function func) { seed = strc->str->length ? *getCharAtPos(strc->str, strc->offset) : strc->list.size; offset = seed % (strc->list.size + 1); list_insert_at_posiion(&strc->list, offset, func); if (++strc->offset >= strc->str->length) strc->offset = 0; } And after that functions are called consecutively on plaintext. name description xor_1byte xor every byte in string with the same byte viscrypt visual crypt algorithm (xor string with string[1:]+chr(len(string))) mathops meaningless mathematical operations on every byte - (see appendix) bitcrypt1 meaningless bitwise operations on every byte (see appendix) bitcrypt2 meaningless bitwise operations on every byte (see appendix) pairwise_swapswap(string[0], string[1]), swap(string[2], string[3]), swap(string[4], string[5]), . . . simple swap nibbles in every byte reverse reverse string Non-obvious encryption methods are shown in appendix 10.1. All these encryption functions are trivially decryptable with a bit of cryptonalysis. It’s possible that malware creators think that combining multiple weak encryption algorithms will make a strong one - but we believe that this is just an attempt at obfuscation and slowing researchers down, not really a proper encryption scheme. Especially that, after that, standard Blowfish encryption is used again (with random 0x10 bytes as a key). Finally, Blowfish key is encrypted with remote peer’s public key. Now malware creates three data blocks: 6 • random bytes determining decryption function order • encrypted Blowfish key • encrypted peer list First block is additionally encrypted with viscrypt and bitcrypt1 methods, then few bytes of random data are prepended to it, and finally, one byte with obfuscated length of that random data is prepended. All three blocks are concatenated, and encrypted with bitcrypt1 method, just in case. After that, length of every block is packed into header. Header contains 6 dwords, with following meaning: • block 1 length = HEADER[1] - HEADER[0] • block 2 length = HEADER[2] - HEADER[1] • block 3 length = HEADER[3] - HEADER[2] • unk1 length = HEADER[4] - HEADER[3] - 95 • message type = HEADER[5] - HEADER[4] - 197 All but first four bytes of the header are additionally encrypted with viscrypt and bitcrypt2 methods. This probably sounds really convoluted and compli- cated - because it is. While using asymmetric cryptography and Blowfish is a good idea, we don’t see any reason for all other complicated steps - unless malware creators just wanted to waste researchers time. Whole encryption process is summarized in Graph 2. Figure 2: Kelihos encryption method If we want to decrypt data we need to go through all this steps, but in reverse. First we should decrypt “header”, and compute block lengths. After that, decrypt all three blocks using bitcrypt1, and recover Blowfish key and random seed. Finally decrypt serialized peers data using that key and seed. Commented routine for most of this operation can by found in appendix sec. 10.1. 4 Necurs Necurs is one of the biggest botnets in the world - with more than 1.5 million infected computers, it has active bots in almost all countries, several hundred 7 thousand of which are online at any given time. Compromised machines usually send spam emails to a large number of recipients, though the botnet has the capability to act as a proxy or perform DDoS attacks. 4.1 High-level overview Necurs communication protocol is complicated, definitely not pretty, and full of strange quirks[7]. For example, three different compression algorithms are used, encryption algorithms are home-made and serve more for obfuscation than securing transmission, and a lot of structures are unnecessarily bloated. Necurs botnet is divided into sub-botnets - each Necurs binary has hardcoded and immutable botnet_id saved in its resources. Sub-botnets have different C&C servers, peer lists, and DGA (botnet_id is used as part of DGA seed). Currently, we know of three botnets: ID 5 (biggest one), 9 (smaller one) and 7 (small, and C2 is long dead) The botnet is an example of a hybrid network, i.e. a mixture of centralized (that simplifies and speeds up management) and peer-to-peer decentralized model (making it much more resistant to takedowns) - and additionally, DGA is implemented. With so many features it’s not surprise that Necurs had survived so long. The malware attempts to connect to the C2 server, whose IP address is retrieved in a number of different ways: • First, a couple of domains or raw IP addresses are embedded in the program resources. • If the connection fails, Necurs runs domain generation algorithm, crafting up to 2048 pseudorandom names, generation of which depends on current date and seed hardcoded in encrypted resources, and tries them all in a couple of threads. If any of them resolves and responds using the correct protocol, it is saved as a server address. • If all these methods fail, C2 domain is retrieved from the P2P network – the initial list of about 2000 peers (in form of IP+port pairs) is hardcoded in the binary. During analysis, Necurs used the last method, since none of the DGA domains was responding. It is, however, possible, that in the future the botnet’s author will start to register these domains – a new list of potential addresses is generated every 4 days. After establishing a successful connection to the C2, Necurs downloads (using a custom protocol over HTTP) needed information, most notably additional modules (spam module, proxy module, rootkit) and additional C2s (for example spam C2). After that, each module is started. Finally, spam module requests templates and variables from spam C2. When all necessary information is downloaded, spam is being sent. This process is summarized by Graph 3. 8 Figure 3: Necurs Communication 9 4.2 Binary resources If we want to start communicating with Necurs, we first have to decrypt its resources. They are stored in binary in encrypted form. To find them, we have to find two consecutive qwords in memory that satisfy the following equation: a * 0x48F1398FECF + 12345678901253 === b (mod 2**64) They mark first bytes of encrypted resources. In Python it’s simple five-liner: def get_base(dump): for i in range(len(dump) - 0x10): a, b = struct.unpack(">= 8 if not id: break res = dump[base + 12:base + 12 + sz] resources.append({"offset": base, "id": id, "content": res}) base += 12 + sz Most interesting resource types are presented in Table 3 Table 3: Necurs resources Id Meaning Example 0x5148b92048028c4e Botnet ID 5 0x59e80beb0279afba Peer list . . . 0x2f26c75348f3f531 P2P communication key . . . 0x7c7b239242b0aec2 C2 communication key . . . 0x6fa46c4146c2c285 C2 url path /forum/db.php 0x7ddd7ae7c4e9d441 C2 domain npkxghmoru.biz 4.3 DGA and P2P Now we need C2 server address. As we noted, there are three ways to get it. If it’s stored in static resources, we already have it. Unfortunately, this is often not the case (or the one stored is obsolete) and we need to resort to other techniques. Second option is DGA algorithm. Domain list changes every four days, and depends only on current date and botnet ID: def dga_mix_and_hash(param): 11 param %= 2 ** 64 for i in range((param & 0x7F) + 21): param = (param + ((7 * param) ^ (param << 15)) + 8 * i - (param >> 5)) % 2**64 return param def dga_generate_one(year, month, day, seed, rnd_param): domain = "" mix = dga_mix_and_hash(year) mix = dga_mix_and_hash(month + mix + 0xAAAA) mix = dga_mix_and_hash((day >> 2) + mix) mix = dga_mix_and_hash(rnd_param + mix) nxt = dga_mix_and_hash(seed + mix) for i in range((mix % 15) + 7): nxt = dga_mix_and_hash((nxt >> 32 << 32) + (nxt % 2**32) + i) domain += chr(ord('a') + nxt % 25) nxt = dga_mix_and_hash((nxt >> 32 << 32) + (nxt % 2**32) + 0xABBEDF) tld_ndx = ((nxt >> 32 << 32) + (nxt % 2 ** 32)) % 43 return domain + "." + [ "tj", "in", "jp", "tw", "ac", "cm", "la", "mn", "so", "sh", "sc", "nu", "nf", "mu", "ms", "mx", "ki", "im", "cx", "cc", "tv", "bz", "me", "eu", "de", "ru", "co", "su", "pw", "kz", "sx", "us", "ug", "ir", "to", "ga", "com", "net", "org", "biz", "xxx", "pro" ][tld_ndx] In practice, malware creators have never used this technique (as far as we know), and it’s used solely by malware researchers for tracking purposes. And finally, most reliable and useful method for getting the C2 address: asking P2P network for it. All P2P communication happens over UDP protocol. The outermost layer of communication looks like this (as C-structure): struct outer_layer{ uint32_t key; uint32_t checksum; uint8_t data[]; }; This data is encrypted using key calculated as a sum of the key field and the first 32 bits of the public key contained in file resources. This homemade encryption algorithm is equivalent to the following Python code: def rolling_xor(outer_layer): msg = outer_layer.data check = outer_layer.key buff = "" for c in msg: c = chr(ord(c) ^ check & 0xff) 12 buff += c check = (check + (8 * (check + 4 * ord(c))) ^ ror4(check, 7)) % 2**32 assert outer_layer.checksum == check return buff Inside outer layer we have real messages, wrapped into another small structure: struct inner_layer { uint32_t size_flags; // packed size and flags (length << 4 | flags) uint8_t data[]; }; The most interesting message type is greeting/handshake: struct greeting{ uint32_t time; // Milliseconds since 1900-01-01 uint64_t last_received; // ID of last received message - zeroes intially uint8_t flags; }; And the response should look like this: struct response{ uint32_t version_low; uint8_t version_high; uint8_t size[3]; // Little Endian resourceList resources; uint8_t signature[]; }; The whole message is signed using a key from file resources. Most important part of this structure is resources - resource list, in the same format as ones stored inside executable. Interestingly, peers don’t send new neighborhood list – these are sent by the C2 itself. The most likely reason for this measure is avoiding P2P poisoning since it is known that peer list received from the main server is authorized and correct. 4.4 C&C communication C2 protocol is vaguely similar to the P2P one, but encryption routines and structures it uses are a bit different – also, the underlying protocol is HTTP (POST payload) instead of raw UDP sockets. The first stage is exactly the same (outer_layer structure), with different constants in encryption algorithm: def xor_encrypt(outer_layer): res = outer_layer.key buf="" for c in outer_layer.data: c = ord(c) ^ res & 0xff 13 res = (res + (2 * (res + 4 * c)) ^ ror4(res, 13)) % 2**32 buf += chr(c) assert res == outer_layer.checksum return buf But after decryption we get another structure: struct cc_structure{ uint64_t random_data; // random 8 bytes, probably to increase entropy uint64_t botID; uint64_t millis_since_1900; uint8_t command; // 0 - get command, 1 - download file, 2 - ping. uint8_t flags; // 1 - RSA sign, 2 - compress, 4 - timePrecision uint8_t payload[]; }; Contents of the payload field (perhaps compressed, depending on the second bit of flags) depends on message type (command field): • If command == 1 (download file), the payload is simply a SHA-1 hash of the requested file. • If command == 0 (get command request), the payload structure is much more complex – again, a list of resources, but with a different structure. Every resource has the following header: struct cc_resource{ uint8_t type; uint64_t id; uint8_t data[]; }; Where id is request/response id - tables 4, 5 contains possible request and responses that bot can send and receive. Table 4: IDs used in HTTP request commands: Id Meaning 0x4768130ffd8b1660 Botnet Id 0x50a29bce1ea74ddc Seconds since start 0x5774f028d11237ac System language 0xc3759a8411bcfb90 Public IP 0xd8cc549b8fb48978 Is user admin? 0x0a8aa0eec8402790 Is win64? 0xa6f73a722b8d2144 Is rootkit installed? 0x9924541302c75f90 Public TCP port for P2P 0x543591d7e21cfc94 Current hash of peer list 14 Table 5: IDs used in HTTP response commands Id Meaning 0x4008cdaf91d42640 P2P Peer list 0x49340b1574c451a4 HTTP C2 domain list 0xd2b3cb6d2757a62c Sleep for N seconds 0xf7485554ea9dfc44 Download and execute module 0x3cae696275cd12c4 Download and execute rootkit Data depends on resource type struct cc_resource_type_0 { uint32_t size; uint8_t data[]; // length=size }; struct cc_resource_type_1 { uint32_t data; }; struct cc_resource_type_2 { uint64_t data; }; struct cc_resource_type_3 { uint64_t data; }; struct cc_resource_type_4 { uint16_t size; uint8_t data[]; // length=size+1 }; struct cc_resource_type_5 { uint8_t data[20]; }; Type 4 is usually used to send text data, which is probably the reason why the resource size is increased by one (for null terminator). A client sends a list of such resources to the C2. We were able to identify the meaning of some of them: • DGA seed • Number of seconds since malware start • Unix timestamp of malware start • OS version and its default language • Computer’s IP (local if behind NAT) 15 • UDP port used to listen for P2P connections • Custom hash of current peer list The server responds with a very similar format, depending on command type: • If command == 1, response is just requested file contents (usually com- pressed, depending on flags) • If command == 0, response is again more complicated - list of resources in the same format as in request. One of more interesting resources that we can receive from server is new peer list (if we sent hash that doesn’t match one in C2) or new DLL announcement. The latter resource again has its own structure for communication purposes, also made of concatenated sub-resources of the following form: struct subresource{ uint32_t size; uint8_t unknown[18]; uint8_t sha1[20]; char cmdline[]; // length=size-42 }; The command should be interpreted as a request for running DLL identified by its SHA-1 with command line parameters stated in cmdline field – in practice, the argument is a newline-separated list of C2 addresses (with HTTP path) to be connected to. 4.5 Spam module - communication The last protocol we will describe (but very important one), is the communication of the downloaded DLL module, whose responsibility is to send spam emails. The information is wrapped in the following structure (sent as POST data over HTTP): struct spam_wrap{ uint8_t data[]; uint32_t crc32; uint32_t key; // 4th bit of key is compression flag. }; The encryption algorithm used: def encrypt(msg, key): key=rol4(key, 0x11) res="" for c in msg: tmp=ror4(key, 0xB) key+=((0x359038a9*key)^tmp)&0xFFFFffff 16 res+=chr( (ord(c)+key) & 0xFF ) return res So messages are packed like this: def send_message(json): rnd = rtdsc() enc = encrypt(json, rnd) checksum = rol4(crc32(enc) ^ rnd, 5) payload = enc + struct.pack("