p0sT5n1F3r Security Report 10.2019 Reverse Engineering of a breach SUMMARY 1. Introduction 2. First results 3. An important discovery p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 3 Introduction First results The Yarix team analyzed a very insidious backdoor, recognized today as 100% clean: a complex artifact designed exclusively for the Customer’s environment. The hooks exploited by this module are offered natively by the Apache module. Let’s start from the beginning: what is an Apache module? At high level it can be considered as a sort of library: additional code used to extend native functionalities - in this case of the Web Server. Today, in the standard package distributions, we find some already installed by default, such as mod_ssl or mod_php. In this specific case we found a module, mod_dir.so, which imitates in all respects the functionality of the standard one but adds others, really insidious. The framework provided by Apache provides the developer with a series of hooks. These allow to run additional code during the different states of execution of the process. During a recent engagement our Incident Response Team had the opportunity to analyze a very insidious backdoor implanted in an Apache web server. This allows sniffing HTTPS traffic when it is licitly decrypted by the web-server. What caught our attention was the fact that this component, while revealing many of the features of the Linux/Cdorked.A backdoor, may today be recognized as 100% clean on Virus Total. It is not every day that we find ourselves faced with cases like this: we are definitely facing a complex artifact designed exclusively for the Customer’s environment which, thanks to extensive use of encryption, is not detected by any anti- malware platform, not even the most advanced ones that have conquered the market in recent years with behavioral and / or machine learning algorithms. Apache backdoors are nothing new: unfortunately, those who are victims of malware such as Linux/Cdorked.A should remember its features and how it interacted with the infamous Blackhole exploit kit. The use of external modules or plugins for web-servers is a known persistence technique, used and abused over the years but still utilized today. Even last year, Palo Alto intelligence sources revealed that the OilRig group used the RGDoor module as a backdoor for the IIS web-server in attacks in the Middle East. The case Static analysis2. 1. p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 4 In particular, the hooks exploited by this module (img 1) are offered natively by the Apache framework and therefore we can know what and how they should be used (ref here). Hooks used by the module Image 1 https://httpd.apache.org/docs/2.4/developer/modguide.html p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 5 • ap_hook_child_init: place a hook that executes when a child process is spawned (commonly used for initializing modules after the server has forked). • ap_hook_post_config: place a hook that executes after configuration has been parsed, but before the server has forked. • ap_hook_insert_filter: place a hook that executes when the filter stack is being set up. • ap_hook_handler: place a hook that executes on handling requests. • ap_hook_log_transaction: place a hook that executes when the server is about to add a log entry of the current request. • ap_hook_register_input_filter: place a hook that executes a custom function when input is required. • ap_hook_fixups: place a hook that executes right before content generation. Based on the description of these functions, we have an idea of how the module works (img 2). Request processing in Apache 2 Ref: Apache Tutor Image 2 Data Axis Metadata Processing Output Filters Input Filters Processing AxisLoggingAccept Request Content Generator http://www.apachetutor.org/dev/request p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 6 Analyzing the individual functions that are performed by these hooks we can try to understand what this module actually does. HOOK 1 | ap_hook_child_init The first function performed is the one that is called by ap_hook_child_init. Thanks to IDAPro’s capabilities it is possible to understand which functions of the standard library are called. It is therefore possible to get a high-level idea of the actions performed as soon as Apache creates a child process to handle a request that arrives towards the server port 80 or 443 (img 3). Image 3 Module Initialization routine During initialization the module attempts to create a mutex and, if it fails, generates an error which is then logged by the _ap_log_error function. A sort of debug printf on an error_log. The other function called apr_shm_baseaddr_get is more interesting and gets, through the RDI register, the base address of the shared memory segment. Unfortunately, at the moment we are not able to understand much more from static analysis. HOOK 2 | ap_hook_post_config The second hook ap_hook_post_config that we are going to analyze is executed immediately before the father Apache process reduces its root privileges to www-data. In fact these privileges serve to allocate portions of memory which will then be used during the operation of the module. It is possible to note the portion of code that allocates a new memory pool (img 4). p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 7 Portion of code that allocates a memory pool Image 4 Until now we found nothing really strange or malicious or interesting. The module begins to show its capabilities in the functions called by the other hooks. It is here that we meet for the first time the string that from now on will represent the name of our malware: p0sT5n1F3r= Our attention is immediately struck by this string which is passed as an argument to the function ap_register_input_filter. The official documentation shows that the prototype of this function is: ap_register_input_filter (“filter name”, filter_function, AP_FTYPE_CONTENT) The first parameter is the name of the filter, the second is the function performed by the filter and the third is the type of filter. So we know that: • the module registers an input filter • the filter is called p0sT5n1F3r= • the filter, when invoked, executes the function sub_38C7 • the filter acts on the content of the request and not on its headers Before going into the details of how it works let’s try to get an overview of the other functions used by the module, so as to focus on what is really interesting. Reverse engineering in general is an activity that requires a lot, a lot of time and the risk of getting lost in the technicalities of assembly language is very high. p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 8 HOOK 3 | ap_hook_insert_filter Let’s take a look at the function ap_hook_insert_filter now. Its prototype is: ap_hook_insert_filter(filter_insert, NULL, NULL, APR_HOOK_MIDDLE) Thus, the inserted filter is given by the function passed as the first argument, ie the function sub_34AB. This function takes care of building the filter which will then be activated inside the module and in fact the function ap_regexec is used, whose role is that of “Match to NUL-terminated string against a pre-compiled regex” (img 5). Role of the function ap_regexec Image 5 p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 9 It is clear that the regular expression passed to the function resides in the position of the qword_209628 but it is clearly obtained at runtime, since in static analysis this location is empty. What did we understand until now? From the static analysis of these functions we were able to understand that the module: - is characterized by the string p0sT5n1F3r= - takes care of inserting an input filter within the task of processing requests coming to the web-server - acts on the body of requests and not on headers - the filter is activated only if it meets the exact match of a string that is obtained at runtime. p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 10 An important discovery The approach we are following was useful to begin to understand how the module works because the code was not obfuscated and none of the functions, custom or standard libraries, were resolved at runtime. This unfortunately is no longer true for the two other functions, the most interesting, which are called by the ap_hook_handler hook and by ap_register_input_filter. In the first case we are dealing with a clearly more complex function that makes extensive use of encrypted strings (img 6-7). Continuing analysis3. Function performed when the module is invoked Image 6 p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 11 Encrypted strings Image 7 Runtime resolution of the CALL instruction Image 8 In the second case we find calls to functions dynamically resolved at runtime like this one highlighted below: the CALL instruction is resolved at runtime by calling a memory address placed in the RAX register (img 8). p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 12 Before tackling the reversing of these two functions we decide to change approach. Analyzing the encrypted strings we find one that we highlighted earlier, Dz27Dz27, which has an important feature (img 9). Encrypted string Image 9 It is located at a very precise address within the binary and is called several times within the previous two functions that we have decided not to analyze. For example, we find two calls in the function sub_3D17 (img 10). Functions that call the string Image 10 They are two locations, loc_4471 and loc_4170, which use exactly the same data. Very interesting is also the fact that the dword_209464 is positioned exactly after the string highlighted before, almost as if they were the declarations of two variables closely related in some way. Curiosity obviously pushes us to check out what this function does (img 11). p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 13 The encryption function Image 11 In this shape it does not tell us much. However, if we see the decompiled code (img 12). Decompiled function code Image 12 p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 14 Those who deal with malware reverse engineering and encryption algorithms will have understood that in cases like this - nine times out of ten - this is the RC4 encryption algorithm. This is in fact the part of the algorithm known as KSA (Key Scheduling Algorithm) (img 13). KSA (Key Scheduling Algorithm) Source here Custom implementation of RC4 encryption Source here Image 13 Image 14 The key and its length are identified as arguments of the function. Once this new information is obtained, we are therefore able to trace where this algorithm is used to encrypt and decrypt data saved statically in the binary. We find the first reference in this interesting function in which we were also able to identify the second part of the RC4 algorithm, the one known as Pseudo-random generation algorithm (PRGA) (img 14). https://en.wikipedia.org/wiki/RC4 https://en.wikipedia.org/wiki/RC4 p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 15 Having therefore the RC4 key and the length of the encrypted buffer we can proceed to decrypt the buffer. To do this we will use CyberChef (img 15). CyberChef Source here Image 15 These evidences have been disruptive for the outcome of the incident response and resolution of the case. Analysing these strings we can surely proceed forward in the forensic activity of the machine and therefore obtain Indicators Of Compromise (IoC). Another very important indication of how specifically the module was built for the Customer’s environment is the presence of the string /Home/Acquisto. This information fits exactly with what we have understood from the static analysis: the regular expression created during the initialization of the module would seem to look for the match with this string which, incidentally, is exactly the URL that deals with finalizing the monetary transaction where the user enters the data for payment. The RC4 algorithm is used in many other functions within the module: each data is never in clear text but always encrypted. However, having the encryption key we can proceed a little faster. In particular, a full-bodied buffer inside the track aroused the curiosity of the reverse engineering team (fig.16). http://icyberchef.com/ p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 16 Encrypted buffer Image 16 A buffer of around 12KB that we can decipher with the same methodology as before (img 17). Decrypted buffer Image 17 p0sT5n1F3r // Reverse Engineering of a breach © 2019 Yarix - All rights reserved 17 modIframe Image 18 An html page, saved inside the module, showing a title mod_sniffer and an image called modIframe (img 18). Apart from the nice subtitle the page shows some interesting information: there are in fact variables that are obviously resolved at runtime like the kernel version or uptime and others of which, at the moment, we do not know the meaning. The module is still in the analysis phase. We share the hashes that identify it: MD5 1720aca23d81e0aa6fa28096781294c3 SHA-1 df454026aac01ad7e394c9f5c2bfdb12fea9a0e0 SHA-256 1c55ffee91e8d8d7a1b4a1290d92a58c4da0c509d5d8d2741cec7f4cf6a099bd Author Mario Ciccarelli Head of IR Team Yarix - Digital Security Division Var Group @Kartone Special thanks to Alessandro Amadori Solution Specialist - Digital Security Division Var Group JYARIX a vargroup company Yemen eee