### Ransomware Encryption Internals: A Behavioral Characterization ###### Antonio Cocomazzi Threat Intelligence Researcher, SentinelOne ----- #### whoami ###### ➔ Threat Intelligence Researcher @ ``` SentinelOne ➔ Mainly deal with malware analysis and reverse engineering ➔ Free time = coding offensive tools + deepin into Windows internals ➔ Previously presented at BlueHat, ``` ``` @splinter_code @antonioCoco ``` ----- #### Why this research ###### ➔ Data encryption is the core functionality of every Ransomware and it ``` enables their successful operations to extort money from the victims ➔ Static indicators are acceptable but behavioral indicators are gold ➔ Extracting Behavioral Indicators means deep knowledge -> lots of study -> very time intensive ➔ Providing a behavioral characterization should ease this --^ ➔ Identifying behavioral commonalities can provide detection opportunities generic enough to identify all the most advanced Ransomware families, instead of relying of specific detection for specific families ``` ----- #### Agenda ###### ➔ Defining the data encryption scope ➔ Evolution, Trends and Unique features ➔ The behavioral characterization ➔ Behavioral detection based on overlapping ``` implementations ◆ Cross Drive File Enumeration detection ◆ File Footer Writing detection ◆ Encryption Key Randomization detection ◆ Restart Manager API heavy usage detection ➔ Conclusion ``` ----- ## Defining the data encryption scope ----- #### Defining the data encryption scope ###### ➔ Data Encryption characterization requires a dedicated threat ``` model wide enough to cover Ransomware behaviors in a generic way ➔ Four Macro features: ◆ Files And Directories Enumeration ◆ File Encryption ◆ Encryption Parallelization ◆ Encryption Optimization ➔ Selected Ransomware: ◆ Babuk ◆ BlackMatter ◆ Conti ``` ----- # Some months later… ----- ## Evolution, Trends and Unique features ----- #### The shifts in the encryption schemes ###### ➔ Main shift is the adoption of Elliptic-Curve Diffie-Hellman (ECDH) key ``` exchange algorithms instead of RSA as asymmetric encryption -> main difference the private key is never left on the victim host neither in encrypted form ➔ The evolution of the encryption implementation aims to avoid the usage of the CryptoAPI functionalities offered by the Windows operating system ➔ Ransomware developers prefer to use open-source libraries or custom implementation for their symmetric and asymmetric encryption operations (e.g. curve25519-donna, HC-128, custom ChaCha20...) ➔ All l d f ili d th i f ti i d t t th ``` ----- ##### The shift from RSA to ECDH in Asymmetric Encryption: RSA ``` secret.txt ``` ``` secret.txt.encrypted ``` ``` secret text content ``` ``` Generate Embed RSA public Key ``` ``` Data Input Symmetric Data Output Encryption Input Key Symmetric Key Random Key Generate ``` |Random Key|Col2| |---|---| ||| ``` Ransomware.exe ``` ``` Operator ``` ``` RSA Encrypt ``` ----- ##### The shift from RSA to ECDH in Asymmetric Encryption: RSA ``` secret.txt ``` ``` secret.txt.encrypted ``` ``` secret text content ``` ``` Data Input Data Output ``` ``` Generate Embed RSA public Key ``` ``` Ransomware.exe ``` ``` Operator ``` ``` RSA Encrypt ``` ----- ##### The shift from RSA to ECDH in Asymmetric Encryption: RSA ``` secret.txt ``` ``` secret.txt.encrypted ``` ``` secret text content ``` ``` Data Input Symmetric Data Output Encryption Input Key Symmetric Key Data Output ``` ``` Generate ``` ``` RSA public Key ``` ``` Operator ``` |111001100110 100101010110 101001|Col2| |---|---| |Encrypted Symmetric Key|| |t|| ``` Only this ``` ``` Input Key Data Input RSA Decrypt ``` ----- ##### The shift from RSA to ECDH in Asymmetric Encryption: ECDH ``` secret.txt ``` ``` secret.txt.encrypted 111001100110 100101010110 101001 EC Public Key Ephemeral Public Input Key ECDH Key Agreement EC Public Key Ephemeral Private ``` ``` Operator ``` ----- ##### The shift from RSA to ECDH in Asymmetric Encryption: ECDH ``` secret.txt ``` ``` secret.txt.encrypted ``` ``` secret text content ``` ``` Data Input Symmetric Data Output Encryption ``` ``` Input Key Symmetric Key ECDH Embed Ephemeral Key Setup EC Public Key Master Ransomware.exe ``` ``` Key Agreement EC Public Key Ephemeral ``` ``` Destroy Key ``` |EC Ke Agree|DH y ment| |---|---| ``` Operator ``` ----- ##### The shift from RSA to ECDH in Asymmetric Encryption: ECDH ``` secret.txt ``` ``` secret text content ``` ``` Data Input Data Output ``` ``` secret.txt.encrypted 111001100110 100101010110 101001 EC Public Key Ephemeral Main Difference: The symmetric key never touches the disk neither in encrypted form ``` ``` EC Public Key Master ``` ``` ECDH ``` ``` Operator ``` ``` Private Input Key Key ``` ----- #### Automated discovery of internal resources to target ###### ➔ Every Ransomware implementation bundle automated ways to find ``` and seek for relevant resources to encrypt ➔ The common trend identified is to enumerate all local directories and finding the remote shared resources ➔ Unique implementations that perform a more in-depth seek: ◆ BlackMatter uses LDAP queries to retrieve all the computer names in the domain and build a list of the remote machines to encrypt files from ◆ Conti retrieves the network addresses of the machines connected to the network through the ARP table stored locally ``` ----- #### Automated discovery of internal resources to target ###### ➔ Blackmatter automated LDAP discovery: ``` ADsOpenObject(“LDAP://rootDSE”, ..., IID_IADs, &IADs_object) -> IADs_object::Get(..., &defaultNamingContext) -> ADsOpenObject(wcscat(“LDAP://CN=Computers,”, defaultNamingContext.bstrVal, ..., &IID_IADsContainer, &pADsContainer) -> ADsBuildEnumerator(pADsContainer, &ppEnumVariant) -> ADsEnumerateNext(ppEnumVariant, …, defaultNamingContext, ...) -> IAD bj t::G t( “dNSH tN ” &d H tN V i t) ``` ----- #### Growing focus in performance improvements ###### ➔ One interesting evolution identified is the adoption of tasks ``` parallelization in the Ransomware payloads -> The main motivation around that is to shorten the time of reaction of the security team behind the compromised organization ➔ All ransomware implementations analyzed prefer a native multithreading approach over a multiprocessing approach ➔ The main trends observed for the encryption parallelization is the usage of I/O completion ports ``` ----- #### Growing focus in performance improvements ###### ➔ Some unique performance improvements implementations... ➔ Babuk uses a unique approach with Semaphores and custom ``` management of the thread pools and shared data structure. ◆ Less overhead than using completion ports ➔ BlackMatter uses undocumented Windows functions to increase its process class and IO priority ◆ This instructs the kernel to schedule primarily the execution of the threads running in the Ransomware process thus granting a performance improvement ``` ----- #### Automated discovery of internal resources to target ###### ➔ Blackmatter undocumented functions to increase process priority: ----- #### Additional efforts to maximize the encryption damages ###### ➔ Ransomware developers ensure that the disruptive operations ``` carried out by their Encryptor have a higher impact on the targeted systems ➔ The common trend is to kill a set of processes and services starting from a list of “unwanted” names ➔ Moreover, for unknown processes that hold lock conditions on files, the Restart Manager API are used to identify all the processes that prevent the successful encryption of files already in use ``` ----- #### Additional efforts to maximize the encryption damages ###### ➔ Another common feature is the usage of functions to erase ``` volume backups (i.e. shadow copies) ➔ The methods observed: ◆ Vssadmin.exe (delete shadows, resize shadowstorage) Utility to delete or resize the shadow copies ◆ Using COM (IWbemLocator, IWbemContext, IWbemServices) Out-of-process COM objects to interact with the VSS providers through WMI services ``` ----- #### Automated discovery of internal resources to target ###### ➔ Babuk implementation for killing file lock holders: ----- ## The behavioral characterization ----- #### The behavioral characterization ``` Feature Feature ###### ➔ Files And Directories Enumeration ➔ File Encryption ``` ``` ◆ Mount hidden volumes ◆ Local Drive Enumeration ◆ Remote Drive Enumeration ◆ File Enumeration ``` ``` Sub-features ``` ``` ◆ Asymmetric Encryption ◆ Symmetric Encryption ◆ Key Randomization ◆ Encrypted Block Writing ◆ File Footer Writing ``` ``` Feature Feature ###### ➔ Encryption Optimization ➔ Encryption Parallelization ``` ``` ◆ Kill unwanted Services ◆ Kill unwanted Processes ◆ Shadow Copies Deletion ◆ Kill file lock holders ``` ``` Sub-features ``` ``` ◆ Multi threading ◆ Synchronization ``` ----- #### The behavioral characterization ###### ➔ The various Ransomware families analyzed implements the sub``` features in various ways ➔ By collecting all the details about the implementations it’s possible to map the implementations of each sub-features to the corresponding family ➔ The mapping has been based on the NT/Win32 API usage of the implementations ➔ The goal of this mapping is to provide a way to recognize overlapping implementations across families and ease the development of effective detection to identify Ransomware behaviors commonalities ``` ----- #### The behavioral characterization ###### ➔ Results for “Files And Directories Enumeration”: ----- #### The behavioral characterization ###### ➔ Results for “Files Encryption”: ----- #### The behavioral characterization ###### ➔ Results for “Encryption Optimization”: ----- #### The behavioral characterization ###### ➔ Results for “Encryption Parallelization”: ----- ## Behavioral detection based on overlapping implementations ----- #### Behavioral detection based on overlapping implementations ###### ➔ Overlapping sub-features implementations: ----- #### Cross Drive File Enumeration detection ###### ➔ Every Ransomware analyzed performs the sub-feature “File Enumeration” with ``` the same implementation: ◆ FindFirstFileEx(“[DRIVE]:\[PATH]\*”, ...) ◆ FindNextFile() ➔ The usage of the Win32 Api function FindFirstFileEx() combined with the wildcard ‘*’ char appended at the end of each path found on the system does generate a specific IRP at the kernel level: ``` ----- #### Cross Drive File Enumeration detection ###### ➔ A potential problem with this approach is that it could be prone ``` to a high false positive rate ➔ Here is where it comes into play the concept of the “Cross Drive” file enumeration. ◆ Every Ransomware performs a series of operations to identify all the hidden, local and remote drives on the system prior to the file enumeration operation ➔ The IRP_MJ_DIRECTORY_CONTROL IRP is dispatched to multiple logical drives. This makes the operation quite unique and abnormal for usual benign applications ``` ----- #### Cross Drive File Enumeration detection ----- #### File Footer Writing detection ###### ➔ Every Ransomware analyzed performs the sub-feature “File Footer Writing” ``` with the same implementation: ◆ SetFilePointerEx(hFile, …, FILE_END) ◆ WriteFile(hFile, fileFooterStruct, sizeof(fileFooterStruct), …) ➔ The combination of these Win32 Api functions generate a specific IRP with specific characteristics at the kernel level: ``` ----- #### File Footer Writing detection ###### ➔ In a pre operation callback IRM_MJ_WRITE, if the parameter ``` IrpSp->Parameters.Write.ByteOffset is equal to the actual size of the file in which the write is happening ◆ It means that’s an append operation ◆ Then the value IrpSp->Parameters.Write.Length should be stored for further validation ◆ This value represents the actual size of the struct used by the ransomware to append the footer information needed for the decryption ➔ Unfortunately, the file footer struct size differs between Ransomware implementations ◆ We can aggregate the number of append operations that have the same recurring length ◆ This characterizes the behavior of a Ransomware trying to write its own ``` ----- #### File Footer Writing detection ###### ➔ Babuk example of “File Footer Writing” implementation: ----- #### File Footer Writing detection ###### ➔ By monitoring the file writes performed in this way, it is possible ``` to count how many file markers are appended to files ➔ E.g. We can keep track of these file writes with a dictionary data structure where on the key is stored the Length of the write operation and as a value the counter of how many times that write with that size has been appended to a file ➔ When a Ransomware is executed it should be observed that the counter contained in the value of a specific key of the dict is exceeding a threshold ``` ----- #### Restart Manager API heavy usage detection ###### ➔ Restart Manager API usage common implementation: ``` ◆ RmStartSession() ◆ RmRegisterResource(..., &filePath, …) ◆ RmGetList() ➔ Whenever a call to CreateFile() fails to return a valid file handle, the Ransomware assumes the failure is due to some file locking mechanism held by some process ◆ This generates a heavy usage of the Restart Manager APIs ➔ Lowest NT API to monitor by reversing RmGetList() from RstrtMgr.dll: ◆ RmGetList() ◆ CRestartManager::GetAffectedApplications() ◆ CRestartManager::UpdateInternalData() ◆ RmFileFactory::UniqueAffectedPids() ◆ RMRegisteredFile::AffectedPids() ``` ----- #### Restart Manager API heavy usage detection ###### ➔ The invocation of NtQueryInformationFile() from ``` RmGetList() uses an undocumented FILE_INFORMATION_CLASS value of FileProcessIdsUsingFileInformation ➔ Peak usage of this call performed with the FileProcessIdsUsingFileInformation value (0x2F) could be used to characterize the usage of the Restart Manager API specifically by a Ransomware ➔ The detection spot occurs at userland level :( ``` ----- #### Encryption Key Randomization detection ###### ➔ Private keys are generated through PRNG (pseudo random number ``` generator) either for symmetric or asymmetric encryption ◆ This randomization operation is performed for each file encrypted thus generating a high volume usage of the PRNG functionalities ◆ These implementations rely on the Win32 API calls CryptGenRandom() and CryptGenKey() from advapi32.dll ◆ The observed value “dwLen” for the CryptGenRandom() call do overlap between the different implementations ◆ Usually this value is equal to 16 or 32 that match the size of the private keys for the encryption algorithms implemented (so 128 or 256 bits) ➔ Not very generic and robust like others detection methods... ◆ But it can be used as an opportunistic way to detect implementation based on these APIs usage ``` ----- #### Conclusion ###### ➔ Giving insights of what are the core operations ``` characterizing the data encryption stage makes analysis of these complex threats easier ➔ Identifying commonalities in implementations allows to create behavioral indicators based on the side-effects generated by those operations valid for most Ransomware families ➔ The main reason for preferring behavioral indicators over static indicators is because they are much more reliable and harder to evade ➔ TL;DR Behavioral detection is the right approach for ``` ----- #### Special Thanks ###### ➔ SentinelOne’s team (Claudia, Daniel, Andreas) ➔ Idan Weizman ➔ Chuong Dong (@cPeterr) ----- # Thank You! ``` @splinter_code splintercod3@gmail.com ``` -----