# Dark Matter: Uncovering the DarkComet RAT Ecosystem

## ABSTRACT

where an attacker interacts with each victim individually, scouring
greater individual harm they beget. Unfortunately, aside from a paper
are the victims of RATs. A considerable challenge of studying

## CCS CONCEPTS

- Security and privacy → **Malware and its mitigation.**

## KEYWORDS

**ACM Reference Format:**
2020. Dark Matter: Uncovering the DarkComet RAT Ecosystem. In _Proceedings of The Web Conference 2020 (WWW ’20), April 20–24, 2020, Taipei, Taiwan._

## 1 INTRODUCTION

_WWW ’20, April 20–24, 2020, Taipei, Taiwan_

WWW ’20, April 20–24, 2020, Taipei, Taiwan. Farinholt, et al.

tively. Most recently, Farinholt et al. [25] studied the behavior of DarkComet operators themselves in the wild, while Rezaeirad et al. [58] investigated the DarkComet ecosystem by sinkholing thou-

◦ **Operator: Miscreant interactively controlling a victim’s com-**
◦ **Victim: User whose computer is infected with a RAT stub, who**
◦ **Controller: Software used by an operator to configure and**
◦ **Stub: Malware on a victim’s computer that communicates with**

❖ We describe a methodology for tracking controllers of Dark
❖ We describe a methodology for identifying real DarkComet
❖ We detail the process by which we collect information about

our ethical and legal considerations. Section 4 describes how

## 2 BACKGROUND

## 2.1 DarkComet RAT

42, 46, 52, 59, 61]. Marczak et al. [42] provided a particularly detailed

_2.1.1_ _Downloading Victim Databases. DarkComet allows an oper-_
_any file from the controller, without operator notification. Further_
for research purposes. Breen’s dc-toolkit [7] provides a set of
loaded with the dc-toolkit, highlighting some of its sensitive

_2.1.2_ _Hack Pack Sharing. DarkComet was initially offered freely_
for download by its author, DarkCoderSc, from an official site [40].
what is known as hacking packs or hack packs, collections of RATs
`DarkComet.exe`, run from a directory that contains its supporting DLLs (e.g., `SQLite.dll`) that hack pack distributors simply com
the victim SQLite database, stored in a file called `comet.db`. Most hack packs also include this database file, which contains records of

## 2.2 RAT Controller Discovery

in industry. Marczak et al. [42] created a scanner that was able
Most recently, Farinholt et al. [25] presented a scanner that used

## 2.3 Estimating Infected Population

surement studies. Ramachandran et al. [54] proposed a method for
Antonakakis et al. [4] used a variety of techniques to gauge the
running a so-called milker to obtain attack commands.

## 2.4 Understanding Data Set Pollution

Kanich et al. [34] showed that data set pollution caused by interfer
of the methodology showcased by Yokoyama et al. [63]. Specifically related to RAT malware, Rezaeirad et al. [58] recently investigated

## 3 DATA COLLECTION

**Figure 1: Illustration of our data collection methodology, from sourcing DarkComet configurations from threat feeds to downloading databases from detected hosts.**

## 3.1 Controller Discovery

addresses. Of the sources in Figure 1, only Shodan does not provide

_3.1.1_ _Anonymizing Infrastructure Usage. 16.5% of scanned Dark-_
Relakks VPN.[1] Such a relatively small population of hosts using

[1] We use MaxMind and Recorded Future to compile a list of anonymized IP ranges.

_3.1.2_ _Dynamic DNS Usage. As Dynamic DNS (DDNS) is a popular_

## 3.2 Victim Database Acquisition

Breen released the dc-toolkit [7], a Python tool for blind file
of this tool to collect victim databases and DarkComet configuration _files from DarkComet controllers discovered by our scanner._

_3.2.1_ _DarkComet Victim Database. 
On execution, the DarkComet_ controller executable (`DarkComet.exe`) creates or loads a file in its working directory named `comet.db`. This SQLite database man
ing Section 3.2.2. We downloaded this file from `DarkComet.exe`’s
tim record (a taint) to its `dc_users` table (continue to Section 3.2.2
records uniquely in the `dc_users` table.

_3.2.2_ _Victim Database Schema. DarkComet uses a SQLite database,_ stored in a file named `comet.db`, to manage victim connections

**dc_users. This table contains a single row for every unique vic-**
this row are self-explanatory. `userGroup` references the `groupId` field in `dc_groups`. `UUID` is the victim machine’s hardware profile ID, returned by the function `GetCurrentHwProfile`, sometimes
`userIP` and `userName` fields before storing them. Prior to hashing

**dc_keyloggers. This table stores victim keystrokes. Each row con-**
tains the keystrokes logged from a victim, denoted by a `UUID` that references `dc_users`, on a given day. The `name` field refers to the daily
point all stored daily logs are uploaded at once. The contents

**dc_groups. This table allows for attackers to sort and annotate**

_3.2.3_ _Databases from Hack Packs. To supplement our data set of_

_3.2.4_ _Database Download Failures. In the first month of operation,_
tacks. Since then, we have used SOCKS5 proxying to anonymize
due to SOCKS5 proxying during large file downloads.
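
The schema described in Section 3.2.2 can be illustrated with a small, self-contained sketch. It builds an in-memory SQLite database with the three tables and columns named above, then joins keylog rows to victims and groups. Only the table names, column names, and sample values (taken from Table 1) come from the paper; the column types, the helper names, and the reading of the `userIP` example as external address, internal address, and port are assumptions made for illustration.

```python
import re
import sqlite3

# Table and column names follow Section 3.2.2; the column types are assumed.
SCHEMA = """
CREATE TABLE dc_groups (groupId INTEGER, groupTitle TEXT);
CREATE TABLE dc_users (UUID TEXT, userIP TEXT, userName TEXT,
                       userOS TEXT, userGroup INTEGER);
CREATE TABLE dc_keyloggers (UUID TEXT, name TEXT, content BLOB);
"""

# Assumed reading of the Table 1 example "8.8.8.8 / [10.0.0.5] : 1604":
# external address, internal address in brackets, then a port number.
USER_IP_RE = re.compile(r"^(?P<wan>\S+) / \[(?P<lan>[^\]]*)\] : (?P<port>\d+)$")


def parse_user_ip(value):
    """Split a userIP string into (external, internal, port), or return None."""
    match = USER_IP_RE.match(value.strip())
    if match is None:
        return None
    return match.group("wan"), match.group("lan"), int(match.group("port"))


def build_example_db():
    """Create an in-memory database holding the example rows from Table 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    uuid = "{846ee340-7039-11de-9d20-806e6f6e6963-12345678}"
    conn.execute("INSERT INTO dc_groups VALUES (0, 'Webcam')")
    conn.execute(
        "INSERT INTO dc_users VALUES (?, ?, ?, ?, ?)",
        (
            uuid,
            "8.8.8.8 / [10.0.0.5] : 1604",
            "DESKTOP-432AHT11 / Administrator",
            "Windows 7 Service Pack 1 [7601] 32 bit ( C:\\ )",
            0,
        ),
    )
    # One daily keylog row; the content is left empty in this sketch.
    conn.execute(
        "INSERT INTO dc_keyloggers VALUES (?, ?, ?)",
        (uuid, "2015-12-10-5.dc", b""),
    )
    conn.commit()
    return conn


def keylog_days_per_victim(conn):
    """Return (UUID, group title, number of daily keylog rows) per victim."""
    query = """
        SELECT u.UUID, g.groupTitle, COUNT(k.name)
        FROM dc_users AS u
        LEFT JOIN dc_groups AS g ON g.groupId = u.userGroup
        LEFT JOIN dc_keyloggers AS k ON k.UUID = u.UUID
        GROUP BY u.UUID, g.groupTitle
    """
    return conn.execute(query).fetchall()
```

Calling `keylog_days_per_victim(build_example_db())` returns one row per victim, which is the kind of per-victim aggregate the later analysis sections rely on. A real `comet.db` may use different column types or contain additional tables.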
Additionally,

## 3.3 DarkComet Configuration File

DarkComet also uses an INI file named `config.ini` to manage

| Table | Column | Format | Example |
|---|---|---|---|
| `dc_users` | `UUID` | - | `{846ee340-7039-11de-9d20-806e6f6e6963-12345678}` |
| | `userIP` | / [] : | `8.8.8.8 / [10.0.0.5] : 1604` |
| | `userName` | / | `DESKTOP-432AHT11 / Administrator` |
| | `userOS` | [] ( ) | `Windows 7 Service Pack 1 [7601] 32 bit ( C:\\ )` |
| | `userGroup` | | `0` |
| `dc_keyloggers` | `UUID` | | `{846ee340-7039-11de-9d20-806e6f6e6963-12345678}` |
| | `name` | -.dc | `2015-12-10-5.dc` |
| | `content` | | |
| `dc_groups` | `groupId` | | `0` |
| | `groupTitle` | | `Webcam` |

**Table 1: The schemas of the tables of importance in the DarkComet database.**

uration section header with the victim’s database UUID. Further,
`config.ini` also contains automation information, listing the tasks

## 3.4 Ethical and Legal Considerations

**Respect for persons. Since “participation” in this study is not**
**Beneficence. We believe that our analysis does not create fur-**
**Justice. The benefits of this work are distributed to the wider public,**
**Respect for law and public interest. We describe the legal frame-**
gal conduct, establishing legal proof of criminal conduct is not the

## 4 DATA ANALYSIS & PROCESSING

**However, the raw data we have downloaded is far from ready for analysis. We assert that a single IP address is not syn-**
database inheritance. Our technique uncovers unexpected operator _behaviors, also detailed in Section 4.1._
may not be real victims. Rezaeirad et al. [58] demonstrated that
apply the technique described by Rezaeirad et al., and then improve

## 4.1 Database Attribution

downloaded databases to 1,162 controllers. 667 of these controllers
for each database in our data set.
Therefore, 1,162 is an overestimate
ferent hostname, we use the records in `dc_users` to construct an

_4.1.1_ _DarkComet Database Ancestry. The dc_users table in a_
the `dc_users` table. Returning users are identified by their UUIDs,
the order of the records in `dc_users` describes the order in which
`dc_users` table from a controller, we expect it to have new victims appended to the end, so that the previously downloaded `dc_users`
Furthermore, recall that we add a unique victim record, or taint, to the `dc_users` table each time we download it because the pro
controller’s `dc_users` table should, therefore, not only contain a
Using the monotonic growth property of the `dc_users` table
trollers from 1,162 (identified by hostname only) to 1,029. Thus,

_4.1.2_ _Database Divergence. If two controllers start with the same_
and 3.2.3): their `dc_users` tables will each contain the set of victims
victims. We use the term divergence to describe cases where two
`dc_users` tables.
containing their common prefix, we infer such an ancestor database
with an edge from a parent to child if the `dc_users` table of the parent is a prefix of the `dc_users` table of the child, that is if the
of convergence in the DarkComet inheritance tree because there is
troller reverts to an earlier version of the database. This happens

_4.1.3_ _Hack Pack Prevalence. Of note is that 68% of controllers’_

_4.1.4_ _Controller Attrition. We managed to download just a single_

**Figure 2: Fragment of the reconstructed DarkComet database inheritance tree. Open circles are sequences of databases downloaded from a single controller; grey rectangles are known hack packs; black rectangles are inferred hack packs not part of our corpus of hack packs; black circles are single-controller reversion points.**

_4.2.1_ _Static Anomaly Detection. Rezaeirad et al. [58] indicated that_
to the victim records in our DarkComet databases’ `dc_users` tables
anomalous by these rules.
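
The prefix relationship used in Sections 4.1.1 and 4.1.2 can be sketched in a few lines. The sketch treats each downloaded `dc_users` table as an ordered list of victim UUIDs, tests whether one list is a prefix of another (the parent-to-child edge condition), and computes the common prefix of two diverged lists (the inferred ancestor). It is a simplification under stated assumptions: taint records and all other bookkeeping are ignored, and the function names and sample identifiers are hypothetical.

```python
from typing import List, Sequence


def is_prefix(parent: Sequence[str], child: Sequence[str]) -> bool:
    """True if every record of `parent` appears, in order, at the start of `child`."""
    return len(parent) <= len(child) and list(child[: len(parent)]) == list(parent)


def common_prefix(a: Sequence[str], b: Sequence[str]) -> List[str]:
    """Longest shared leading run of records: a candidate ancestor database."""
    prefix = []
    for left, right in zip(a, b):
        if left != right:
            break
        prefix.append(left)
    return prefix


def classify(a: Sequence[str], b: Sequence[str]) -> str:
    """Describe how two ordered dc_users UUID lists relate to each other."""
    if list(a) == list(b):
        return "identical"
    if is_prefix(a, b):
        return "a is an ancestor of b"
    if is_prefix(b, a):
        return "b is an ancestor of a"
    if common_prefix(a, b):
        return "diverged from a common ancestor"
    return "unrelated"


# Hypothetical UUID sequences standing in for three downloaded tables.
hack_pack = ["u1", "u2"]
controller_x = ["u1", "u2", "u3"]
controller_y = ["u1", "u2", "u4", "u5"]
```

With these sample lists, `hack_pack` is a prefix of both controller tables, while the two controller tables share only the two-record prefix and are therefore classified as diverged.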
Note that anomalies marked with a † are

_4.2.2_ _Hack Pack Victim Detection. In Section 4.1.1 we described_

**Description** **Records** **UUIDs**
In hack pack † Missing expected keystrokes † <0.1% <0.1% Anomalous keystrokes † <0.1% <0.1% <0.1% <0.1% <0.1% <0.1% <0.1% <0.1% <0.1% <0.1% New anomalous victims † Total victims in hack packs † **Total unique victims**

**Table 2: Records and UUIDs filtered by our anomaly detection logic. Many records exhibit more than one anomaly.**

## 4.2 Identifying Victim Pollution

tion efforts in terms of records, that is, rows in `dc_users` tables. In the 6,620 databases’ `dc_users` tables, there are 7,704,586 total

_4.2.3_ _Keylog Validation. Using metadata from the keylog table, we_
`dc_keyloggers` table, described in Section 3.2.2, contains a file per
and dumps them, a file per day, to the controller’s `dc_keyloggers`
leaves a conservative estimate of 57,805 victims. Had we only
applied the rules described by Rezaeirad et al. [58], we would have

## 5 OBSERVATIONS & APPLICATIONS

**Figure 3: CDF of victim infection duration (n=6,354).**

**Figure 4: CDF of controller age (n=507).**

(Plot legend: Hack Pack, No Hack Pack.)

## 5.1 Operator Takedown

arrested a DarkComet operator with over 2,000 victims, an operator _we had also been tracking. In our data set, we find evidence of several_

**Infection Rate. Over our 213-day measurement period, controllers**
from each controller and the last, an aggregate rate of 69 new **victims per day, or a new victim every 20 minutes. Infection rates**
of which infected 463 victims in just 23 days at a rate of 20 new _victims per day. Rapidly growing campaigns could be prioritized_

**Total Victims. The total number of victims infected per controller**
our data set have over 9,000 victims each; the next closest has 3,468.

**Campaign Longevity. 
We consider the operational age of a con-**

## 5.2 Infection Duration & Cleanup

Çetin et al. [12, 13] have demonstrated the effectiveness of ISP
their victims using group names, recorded in the `dc_groups` data

## 5.3 Operator and Victim Geography

**Controller** **Country** **Count** **Footprint** **Victims**

**Table 3: Sample of controller locations and victim counts. Controller footprint is the total number of victims controlled by all controllers from a given country.**

to understand the geographic relationship between them.[2] This
arately in the Anonymous VPN row. Controllers whose location we could not determine are counted in the Other row. Figure 5

[2] We geo-locate IP addresses using MaxMind’s Precision Insights service [44].

Rezaeirad et al. [58] evince that attackers and victims in the
the majority of victim hosts (77%) are not in the same country
74% of attackers are in the same country as the majority of their victims. Colocation of operator and victim is the norm for operators
ment operations would likely be both successful and high-impact.

**Figure 5: Number of victims for each combination of controller and victim country. Rows denote controller countries and columns victim countries, based on geo-located IP address (excluding VPN providers).**

## 5.4 Observed Harm to Victims

we detected in our data set. We quantify the harm incurred by the
The 2,664 victims in this sample set had 210,835,801 keystrokes captured over 25,315 days, amounting to over 162,098 hours of
tim’s webcam. We were able to download the `config.ini` file from

## 6 DISCUSSION

**Campaign Tracking. Our method of obtaining victim databases**
**Victim Identification. The pollution reduction heuristics we im-**
Çetin et al. 
[12, 13] demonstrated the potential for victim notifi
**Understanding Victim Harm. Understanding the harms incurred**
**Study Limitations & Extensibility. Our data collection method-**

## 7 CONCLUSION

## ACKNOWLEDGMENTS

## REFERENCES

understanding the botnet phenomenon. In _ACM Internet Measurement Conference (IMC)_, 2006.

[2] N. Anderson. How an omniscient internet “sextortionist” ruined the lives of teen

Understanding the Mirai Botnet. In _USENIX Security Symposium (USENIX)_, 2017.

_Security BSides London_, 2015.

gardens. In _Fourteenth Symposium on Usable Privacy and Security (SOUPS 2018)_,

violence. In _2018 IEEE Symposium on Security and Privacy (SP)_, pages 441–458.

[16] D. Dagon, C. Zou, and W. Lee. Modeling Botnet Propagation Using Time Zones. In _Networked and Distributed System Security Symposium (NDSS)_, 2006.

Scanning and Its Security Applications. In _USENIX Security Symposium (USENIX)_, 2013.

_Botnet Fighting Conference (Botconf)_, 2013.

[23] B. Enright, G. M. Voelker, S. Savage, C. Kanich, and K. Levchenko. Storm: When researchers collide. _USENIX ;login:_, 2008.

amateur darkcomet rat operators in the wild. In _IEEE Symposium on Security and Privacy (S&P)_. IEEE, 2017.

enforcement actions announced. _FBI News_, 2014. https:

syrian activists promises security, delivers spyware. _Electronic Frontier Foundation_, 2012.

_Black Hat USA_, 2017.

botnets: Overview and case study. In _USENIX Workshop on Hot Topics in Understanding Botnets (HotBots)_, 2007.

[31] J. Hertz, S. Denbow, and J. Wetzels. Darkcomet server 3.2 remote file download.

[34] C. Kanich, K. Levchenko, B. Enright, G. M. Voelker, and S. Savage. The heisenbot uncertainty problem: Challenges in separating bots from chaff. In _USENIX Conference on Large-scale Exploits and Emergent Threats (LEET)_, 2008.

[35] B. Krebs. ‘luminositylink rat’ author pleads guilty. _Krebs on Security_, 2018.

[36] B. Krebs. 
Canadian police raid ‘orcus rat’ author. _Krebs on Security_, 2019.

_Network and Distributed System Security Symposium (NDSS)_, 2017.

[39] S. Le Blond, A. Uritesc, C. Gilbert, Z. L. Chua, P. Saxena, and E. Kirda. A look at targeted attacks through the lense of an ngo. In _USENIX Security Symposium (USENIX)_, 2014. https://seclab.ccs.neu.edu/static/publications/sec2014ngo.pdf

governments hack opponents: A look at actors and technology. In _USENIX Security Symposium (USENIX)_, 2014.

[46] R. McMillan. How the boy next door accidentally built a syrian spy tool. _Wired_,

performing effective botnet takedowns. In _ACM Conference on Computer and Communications Security (CCS)_, 2013.

_USENIX Workshop on Hot Topics in Understanding Botnets (HotBots)_, 2007.

Using DNSBL Counter-intelligence. In _Steps to Reducing Unwanted Traffic on the Internet - Volume 2_, 2006.

trojan ecosystem. In _USENIX Security Symposium (USENIX)_. USENIX

syrian conflict. _Trend Micro: TrendLabs Security Intelligence Blog_, 2012.

Takeover. In _ACM Conference on Computer and Communications Security (CCS)_,

[63] A. Yokoyama, K. Ishii, R. Tanabe, Y. Papa, K. Yoshioka, T. Matsumoto, T. Kasama,

Malware Sandboxes to Provide Intelligence for Sandbox Evasion. In _Research in Attacks, Intrusions, and Defenses_, 2016.