### Mem2Img: Memory-Resident Malware Detection via Convolution Neural Network

Aragorn Tseng, Charles Li

-----

- Aragorn Tseng, Malware Researcher
- Charles Li, Chief Analyst

-----

## AGENDA

- Recent Injection Technique used by APT
- Dataset overview
- Mem2Img Framework
- Experiment result
- Saliency map
- Zero shot learning

-----

# Recent Injection Technique used by APT

-----

## UUID Shellcode

- `UuidFromStringA` takes a string-based UUID and converts it to its binary representation. It also takes a pointer to a UUID, through which the converted binary data is returned.

-----

## UUID Shellcode

- By providing a pointer to a heap address, this function can be (ab)used to both decode data and write it to memory without using common functions such as `memcpy` or `WriteProcessMemory`.
- An API that takes a callback (`EnumWindows`) is then used to execute the shellcode.
- This VBA script was used by Lazarus.

-----

#### Callback function to execute shellcode

- The `lpLocaleEnumProc` parameter specifies a callback function!
- By providing the address returned by `HeapAlloc` as the callback, this function can be (ab)used to execute shellcode.
- Many functions that take a callback can be used to execute shellcode.
- In this case, the technique was used in a PE file.

-----

## Phantom DLL Hollowing

- The target DLL is chosen based on the size of its `.text` section, which has to house the reflective payload; the binary can then be executed within a +RX section of that DLL.
- We have found that APT27 used this technique to spread the CobaltStrike Beacon.

-----

## Phantom DLL Hollowing

[Figure labels: wpsupdate.exe; Find target DLL in System32; Find aaclient.dll; Modules; Phantom DLL hollowing]

-----

## Phantom DLL Hollowing

In this case, the DLL used for phantom DLL hollowing is aaclient.dll; the CobaltStrike stager shellcode executes within a +RX section of that DLL.

-----

## Shellcode injection - Waterbear

- Random junk bytes are generated to envelop the real shellcode when decoding.

-----

## Shellcode injection - Waterbear

- `beginthreadex()` acts as a proxy and starts the new thread at `threadstartex()`, instead of at the address where the shellcode is located, as would be the case when calling `CreateThread()` directly.

-----

# Dataset Overview

-----

###### Memory-resident malware used by APT

- APT32 (OceanLotus): Denis backdoor
- APT37: Rokrat RAT
- Tropic Trooper: TClient backdoor
- BlackTech (PLEAD): TSCookie, Capgeld, Waterbear, Kivars
- APT10: Sodamaster, Lodeinfo, P8RAT, CobaltStrike
- Mustang Panda: PlugX
- PhantomIvy
- APT27: Sysupdate, Hyperbro, CobaltStrike
- Winnti: CobaltStrike, ShadowPad
- Darkseoul: Dtrack
- Unknown group: Dropsocks, Dpass
- 21 malware families

-----

###### Cyber Crime Memory-resident Malware

- Emotet
- Formbook
- Dridex
- AgentTesla
- Trickbot
- QuasarRAT (also used in APT)
- 6 malware families

-----

#### How to find memory-resident malware

- Tools: pe-sieve (hollows_hunter), volatility (malfind), Hollowfind
- Data sources: victim's PC, Triage, VirusTotal

-----

## File distribution
[Figure: bar chart of the number of files per class, y-axis from 0 to 800. Classes: cs_beacon_variant, Dpass_loader, TSCookie, IDshell, Manuscrypt, NavRAT, RokRAT, agenttesla, bigpooh, capgeld_lch, capgeld_rat, cs_beacon, cs_stager, cs_stager_loader, denis, dridex, dropsocks, dtrack, emotet, emotet_shellcode, formbook, kivars, plugx_fast, poisonivy, poisonivy_shellcode, polaris_plugx, selina, sodamaster, trickbot, waterbear_x32, waterbear_x64, xRAT]

-----

#### How to deal with Data Imbalance issue

- Use class weights
  - Example: class_1 has 1000 instances and class_2 has 100 instances
  - `class_weights={"class_1": 1, "class_2": 10}`
- SMOTE
- Data augmentation
  - Rotate, Flip, Scale
- Transfer learning
  - VGG16
  - InceptionV3

-----

## Why Transfer Learning

- Some APT memory-resident malware families have only a small set of samples.
- Transfer learning uses knowledge from a learned task to improve the performance on a related task, typically reducing the amount of required training data.
- It allows a model to make predictions for a new domain or task (target domain) using knowledge learned from another dataset or from existing machine learning models (source domain).
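
The class-weight example from the data imbalance slide can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' code: the slide gives only the resulting weights, so the rule used here (largest class size divided by each class size) and the helper name `inverse_frequency_weights` are assumptions chosen because they yield the same numbers.

```python
def inverse_frequency_weights(counts):
    """Weight each class by (largest class size / its own size).

    Assumption: the slide shows only the resulting weights, not a formula;
    this rule is one simple choice that reproduces them.
    """
    largest = max(counts.values())
    return {name: largest / n for name, n in counts.items()}


# The example from the slide: 1000 instances versus 100 instances.
class_weights = inverse_frequency_weights({"class_1": 1000, "class_2": 100})
print(class_weights)  # {'class_1': 1.0, 'class_2': 10.0}
```

With weights like these, a mistake on the rare class costs ten times as much during training as a mistake on the common class, which offsets the 10:1 imbalance.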
-----

[Image grid, one image per family: AgentTesla, Bigpooh, Capgeld_loader, Capgeld_RAT, CobaltStrike beacon, CobaltStrike stager, CobaltStrike stager loader, CobaltStrike variant, Denis RAT, Dpass Loader]

-----

[Image grid, one image per family: Formbook, TSCookie, IDShell, Kivars, Manuscrypt, PoisonIvy, PhantomIvy, PlugX, RokRAT, Selina]

-----

[Image labels: CobaltStrike stager; Non-consistency; Consistency; Denis RAT; Dridex]

-----

# Mem2Img Framework

-----

## Preprocessing Data

- Remove runs of repeated bytes (junk bytes) from the binary, e.g. NULL bytes, 0xFF

-----

## 1D Array to 2D Array

- Pipeline: memory-resident PE or shellcode -> binary-to-decimal conversion -> 1D array -> 2D array -> 8-bit vectors to images
- Image width = image height = sqrt(len(1D array)) + 1
- Example 2D array:

|182|62|251|56|
|---|---|---|---|
|107|30|116|87|
|102|119|84|30|
|…|…|…|…|
|164|245|131|87|

-----

## Three channels of the image

- Red channel: decimal value of each byte
- Green channel: Shannon entropy value of each byte
- Blue channel: local entropy values of the image
- Local entropy is obtained with the entropy function of the skimage library
- Local entropy is computed using the base 2 logarithm and is related to the complexity contained in a given neighborhood
- The filter returns the minimum number of bits needed to encode the local gray level distribution
- The neighborhood disk is set to 10 in the Mem2Img framework

-----

###### Decimal - Red Channel

- The image is filled with the decimal value of each byte.
- Bits: 0011 1110 1011 0110 1111 1011 0011 1000 0101 0111 0111 0111 0111 0100 0110 1011 0110 0110 0001 1110 0101 0100 0001 1110 0010 0100 1001 1111 0101 0011 0101 0111 0000 1110 0000 1100 1100 1100 1111 0100
- Decimal values: 87 119 116 107 102 30 84 30 36 159 86 206 164 245 131 87

###### Shannon Entropy - Green Channel

- Convert to a grayscale image.
- Compute the Shannon entropy byte by byte over the bits of each byte, e.g. 10110110 -> 0.9544.
- The image is filled with the Shannon entropy value of each byte (Value*15).
- Entropy values: 0.9544 0.9544 0.5436 0.9544 0.9544 0.8113 1 0.9544 1 1 0.9544 1 0.9544 0.8113 1 0.9544 0.9544 0.8113 1 0.9544

###### Local Entropy - Blue Channel

- Generate the local entropy image and put its values into the blue channel, so the image is filled with the local entropy value of each byte.
- Local entropy values: 3.1521 3.0935 3.0424 3.0606 3.0398 3.0642 3.0241 2.9824 2.8085 2.7159 2.7506 2.6820 2.5863 2.5259 2.4454 2.2180

-----

## Local Binary Pattern (LBP)

3*3 neighborhood with center pixel 83:

|92|93|81|
|---|---|---|
|93|83|63|
|76|60|77|

Each neighbor is compared with the center: 92 - 83 > 0 gives 0, 76 - 83 < 0 gives 1.

|0|0|1|
|---|---|---|
|0|0|1|
|1|1|1|

- Binary pattern: 0 0 0 1 1 1 1 1
- LBP = 2^4 + 2^3 + 2^2 + 2^1 + 2^0 = 31, which replaces the center pixel.

Circular LBP with P = 8 and R = 1 uses the same neighborhood. When the image is rotated, the neighborhood and its thresholded pattern rotate with it:

|76|93|92|
|---|---|---|
|60|83|93|

|1|0|0|
|---|---|---|
|1|0|0|

-----

##### LBP Rotational Invariance

- Rotating the same pattern (bits 1 and 0) gives the values 225, 240, 120, 60, 30, 15, 135 and 195.
- Rotation mapping: choose the smallest one, 15.

-----

## Data Augmentation

[Image grid with columns: Original, Flip, Rotate, Scale]

-----

## Mem2Img

- Image resize: 224*224*3
- Feature matrix: M*18432
- Local Binary Pattern: M*26

-----

## Mem2Img (cont.)
- M*94746 -> PCA(0.95) -> M*1015 -> Logistic regression -> Predicted result
- Predicted result: PlugX, Waterbear, Denis, CobaltStrike, ...

-----

## CNN Architecture

|Layer|Settings|Output|
|---|---|---|
|Input||224*224*3|
|Conv|3*3, 32 filters, Padding: 2|222*222*32|
|Pool|2*2, Stride: 2|111*111*32|
|Conv|3*3, 64 filters, Padding: 2|109*109*64|
|Pool|2*2, Stride: 2|54*54*64|
|Conv|3*3, 64 filters, Padding: 2|52*52*64|
|Pool|2*2, Stride: 2|26*26*64|
|Conv|3*3, 128 filters, Padding: 2|24*24*128|
|Pool|2*2, Stride: 2|12*12*128|

-----

## Training parameter

- Training : Testing = 5 : 1
- 30-class classification
- 12569 memory block images (after data augmentation)
- CNN:
  - Activation function: ReLU
  - Batch normalization
  - Learning rate decay
  - Training epochs: 32
- Logistic regression
  - Class weight

-----

###### Different Models' Features

|Model|Accuracy|Precision|Recall|F1 Score|
|---|---|---|---|---|
|Mem2Img|98.36%|98.51%|98.36%|98.38%|
|CNN|96.5%|97.09%|96.5%|96.6%|
|Vgg16|96.73%|97.28%|96.7%|96.8%|
|Inception V3|95.8%|96.2%|95.8%|95.8%|
|LBP|84.8%|86.6%|84.8%|84.6%|

-----

## Different image

|Model|Accuracy|Precision|Recall|F1 Score|
|---|---|---|---|---|
|RGB|98.13%|98.3%|98.13%|98.14%|
|RG (without Blue channel : Local Entropy)|92.23%|93.2%|92.23%|92.23%|
|Gray|88.8%|90.3%|88.8%|88.9%|

-----

## Different Algorithm

|Model|Accuracy|Precision|Recall|F1 Score|
|---|---|---|---|---|
|Logistic Regression|98.36%|98.51%|98.36%|98.38%|
|SVM|98.36%|98.44%|98.36%|98.36%|
|Xgboost|94.17%|94.51%|94.17%|94.15%|
|Random Forest|93.7%|95%|93.7%|93.83%|

-----

[Figure: confusion matrix among the 30 malware classes]

-----

## t-SNE

[Figure: t-SNE plot with the Cobaltstrike and PLEAD malware clusters labeled]

-----

## Saliency map

[Figure: saliency maps for Waterbear_x64 and Capgeld_loader. Panels: Original, CNN, VGG16, InceptionV3]

-----

## Saliency map

[Figure: saliency maps for PoisonIvy and PlugX. Panels: Original, CNN, VGG16, InceptionV3]

-----

## Saliency map - Waterbear

- Highlighted region: config block of the Waterbear stager
- Panels: Original, CNN

-----

## Saliency map - Capgeld Loader

- Highlighted region: `.rdata` section of the Capgeld Loader
- Panels: Original, CNN

-----

## Saliency map - Phantom Ivy

- Highlighted region: some shellcode snippets of Phantom Ivy, which match the Yara rules for Phantom Ivy
- Panels: Original, CNN

-----

#### Saliency map - Mustang Panda PlugX

- Highlighted region: stack strings of PlugX
- Panels: Original, CNN

-----

## Grad-cam Analysis

[Figure: raw image and heatmap over raw image for Dridex and Cobaltstrike Beacon. Annotated regions: C2 parsing function and API Spam Bypass; some decode function before the .rdata section; part of the .rdata section and part of the .data section]

-----

## Grad-cam Analysis

[Figure: raw image and heatmap over raw image for Dpass loader and Formbook. Annotated regions: unique strings block; obfuscated stack strings]

-----

## Zero-shot Learning

- The unknown malware is embedded with Mem2Img (after PCA).
- A KD-tree is used to find its 5-10 nearest neighbors.
- Example neighbors: TSCookie, TSCookie, TSCookie, Kivars, Kivars, …
- In this example, the unknown malware may have been modified from TSCookie and may be closely connected to the PLEAD APT group.
- When the same unknown malware is fed into Mem2Img again, its nearest neighbors may be that unknown malware.

-----

## Zero-shot Learning

Unknown samples and their nearest neighbors:

- Jinhospy (used by APT37): [RokRAT, RokRAT, Manuscrypt, Selina, RokRAT]
- plugX_fast: [polaris_plugx, polaris_plugx, poisonivy, poisonivy, poisonivy]
- Plugx_variant: [polaris_plugx, polaris_plugx, polaris_plugx, polaris_plugx, poisonivy]
- TEBShell: [APT10’Cs loader, APT10’Cs loader, …]
- P8RAT: [xRAT, xRAT, xRAT, …]
- Framecacher (used by Chinese APT): [Selina, Selina, Selina, Selina, Selina]

-----

## Adversarial Attack

- Padding junk bytes to make the file larger
- Deliberately putting code from other malware families into the original malware for obfuscation
- Packing the malware files
- Self-modifying code
  - Self-modifying code is code that alters its own instructions while it is executing

-----
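
The nearest-neighbor lookup described in the zero-shot learning slides can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: a brute-force search stands in for the KD-tree (it returns the same exact neighbors, only more slowly), the 2-D points are made-up toy embeddings rather than real Mem2Img features, and the helper names `nearest_families` and `likely_family` are hypothetical.

```python
import heapq
import math
from collections import Counter


def nearest_families(query, embeddings, labels, k=5):
    """Return the labels of the k embeddings closest to `query` (Euclidean).

    Brute-force stand-in for the KD-tree query mentioned in the slides.
    """
    ranked = heapq.nsmallest(
        k, zip(embeddings, labels), key=lambda pair: math.dist(query, pair[0])
    )
    return [label for _, label in ranked]


def likely_family(neighbour_labels):
    """Majority vote over the neighbor labels (an assumed decision rule)."""
    return Counter(neighbour_labels).most_common(1)[0][0]


# Toy 2-D embeddings (hypothetical values, for illustration only).
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0), (5.1, 5.0), (9.0, 0.0)]
labels = ["TSCookie", "TSCookie", "TSCookie", "Kivars", "Kivars", "PlugX"]

unknown = (0.02, 0.01)
neighbours = nearest_families(unknown, embeddings, labels, k=5)
print(neighbours)                 # ['TSCookie', 'TSCookie', 'TSCookie', 'Kivars', 'Kivars']
print(likely_family(neighbours))  # TSCookie
```

The neighbor list is read the same way as on the slides: mostly TSCookie with some Kivars suggests a sample related to the PLEAD tooling, without the classifier ever having been trained on that exact family.

-----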
#### Self-Modifying Code - Waterbear

```
H.\$.H.l$.H.t$ WATAUAVAWH..0.....XH..!.....H......H......QPH1......XI. ....!....P...ko....I_....GqI..@.U...#b7.;...-K(4q..)..%.."......... .Z .u...I..C...M).L..PH..ATY.H1..........%h...X.(.eH.`.......PA\A..$=.... ..O. C.U:...z..........gA0%...g@'.3..|'.|...&......qk...qy1.q..8l...(77"l. ...I...A..;+r...H......r..q...H....H....p...H..H..I..H........A..H..H. .}.......d.X.7tF....]...,....l..........?4-.....}.+G+'........d. ..... ...P...H....h...H.\$XH.l$`H.t$hH..0A_A^A]A\_.H.\$.WH.. 3.H..H..H;.t/.
```

[Figure panels: Before self-modifying; After self-modifying]

-----

## Conclusion

- More and more advanced process injection methods are being used.
- Transfer learning performs well on memory-resident malware classification, especially on small datasets.
- The features extracted by the convolutional network can locate the distinctive regions of a malware sample.
- We have also proposed some ways in which the framework could be attacked (adversarial attacks).
- https://github.com/AragornTseng/Mem2Img

-----

# THANK YOU!

aragorn@teamt5.org
charles@teamt5.org

-----
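
As a supplement to the Mem2Img Framework slides, the first two image channels (byte values for red, per-byte Shannon entropy for green) can be sketched in plain Python. This is a rough illustration, not the authors' code: truncating the square root to an integer, zero-padding the unused pixels, and the helper names are assumptions. The blue channel (local entropy from skimage) and any scaling of the entropy values to pixel intensities are left out.

```python
import math


def bit_entropy(byte):
    """Shannon entropy (base 2) of the 0/1 distribution inside one byte."""
    ones = bin(byte).count("1")
    if ones in (0, 8):
        return 0.0  # all bits identical: no uncertainty
    p = ones / 8
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def to_square(values, fill):
    """Reshape a 1D sequence into width x width rows, padding with `fill`.

    Assumption: width = floor(sqrt(len)) + 1, following the slide formula.
    """
    width = math.isqrt(len(values)) + 1
    padded = list(values) + [fill] * (width * width - len(values))
    return [padded[r * width:(r + 1) * width] for r in range(width)]


def red_green_channels(data):
    """Return (red, green): byte values and per-byte bit entropies as 2D lists."""
    red = to_square(list(data), fill=0)
    green = to_square([bit_entropy(b) for b in data], fill=0.0)
    return red, green


# Four byte values taken from the slide example.
red, green = red_green_channels(bytes([182, 62, 251, 56]))
print(red)                                # [[182, 62, 251], [56, 0, 0], [0, 0, 0]]
print([round(v, 4) for v in green[0]])    # [0.9544, 0.9544, 0.5436]
```

Only five entropy values are possible for an 8-bit byte (0, 0.5436, 0.8113, 0.9544 and 1), so the green channel is a coarse map of how balanced the bits of each byte are.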