### Mem2Img : Memory-Resident Malware Detection via Convolution Neural Network

###### Aragorn Tseng

 Charles Li


-----

###### Aragorn Tseng
 Malware Researcher

###### Charles Li
 Chief Analyst


-----

## AGENDA

###### Recent Injection Technique used by APT

###### Dataset overview

###### Mem2Img Framework

###### Experiment result

###### Saliency map

###### Zero shot learning


-----

# Recent Injection Techniques Used by APT


-----

## UUID Shellcode

###### UuidFromStringA takes a string-based UUID and converts it to its binary representation. It takes a pointer to a UUID, which is used to return the converted binary data.


-----

## UUID Shellcode

 By providing a pointer to a heap address, this function can be (ab)used to both decode data and write it to memory without using common functions such as memcpy or WriteProcessMemory.

 A callback function (EnumWindows) is then used to execute the shellcode.

 This VBA script was used by Lazarus.
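
For analysts who need to recover the bytes hidden in such a UUID list, the decoding step can be reproduced offline. The sketch below is not taken from the deck: it assumes the payload is stored as canonical UUID strings and relies on the fact that the Windows UUID structure stores its first three fields in little-endian order, which Python exposes as `bytes_le`. The helper name `uuid_strings_to_bytes` is hypothetical.

```python
import uuid


def uuid_strings_to_bytes(uuid_strings):
    """Decode a list of UUID strings into the raw bytes they encode.

    bytes_le yields the in-memory layout of a Windows UUID structure
    (first three fields little-endian, last eight bytes unchanged).
    """
    return b"".join(uuid.UUID(text).bytes_le for text in uuid_strings)


if __name__ == "__main__":
    # Harmless demonstration data: the bytes 0x00..0x0f encoded as one UUID.
    sample = ["03020100-0504-0706-0809-0a0b0c0d0e0f"]
    print(uuid_strings_to_bytes(sample).hex())
```

Running a decoder like this over the UUID strings extracted from a suspicious script gives the byte sequence that would have been written to the heap, which can then be inspected statically.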


-----

#### Callback function to execute shellcode

 The lpLocaleEnumProc parameter of EnumSystemLocalesA specifies a callback function. By providing the address returned by HeapAlloc, this function can be (ab)used to execute shellcode.

 Many other callback functions can be used to execute shellcode.

 This case was observed in a PE file.


-----

## Phantom DLL Hollowing

 The target DLL is chosen based on the size of its .text section, which must be able to house the reflective payload; the binary can then be executed within a +RX section in that DLL.

 We have found that APT27 used this technique to deliver CobaltStrike Beacon.


-----

## Phantom DLL Hollowing

###### Diagram labels: wpsupdate.exe; Find target DLL in System32; Find aaclient.dll; Modules; Phantom DLL hollowing


-----

## Phantom DLL Hollowing

###### In this case, the DLL used for phantom DLL hollowing is aaclient.dll; it executes the CobaltStrike stager shellcode within a +RX section in that DLL


-----

## Shellcode injection - Waterbear

 Generate random junk bytes to envelop real shellcode when decoding


-----

## Shellcode injection - Waterbear

 Using _beginthreadex() as a proxy starts the new thread at _threadstartex() instead of at the address where the shellcode is located, as would be the case when using CreateThread() directly


-----

# Dataset Overview


-----

###### Memory Resident malware used by APT

 APT32 (OceanLotus) - Denis backdoor

 APT37 – Rokrat RAT

 Tropic Trooper - TClient backdoor

 BlackTech (PLEAD) – TSCookie, Capgeld, Waterbear, Kivars

 APT10 – Sodamaster, Lodeinfo, P8RAT, CobaltStrike

 Mustang Panda – PlugX

 PhantomIvy

 APT27 – Sysupdate, Hyperbro, CobaltStrike

 Winnti - CobaltStrike, ShadowPad

 Darkseoul – Dtrack

 Unknown group – Dropsocks, Dpass

 21 malware families


-----

###### Cyber Crime Memory-resident Malware

 Emotet

 Formbook

 Dridex

 AgentTesla

 Trickbot

 QuasarRAT (also used by APT groups)

 6 malware families


-----

#### How to find memory-resident malware

###### Tool

 pe-sieve (hollows_hunter)

 volatility (malfind)

 Hollowfind

###### Data source

 Victim’s PC

 Triage

 VirusTotal


-----

## File distribution


###### Bar chart: number of files per class (y-axis from 0 to 800)

 Classes: cs_beacon_variant, Dpass_loader, TSCookie, IDshell, Manuscrypt, NavRAT, RokRAT, agenttesla, bigpooh, capgeld_lch, capgeld_rat, cs_beacon, cs_stager, cs_stager_loader, denis, dridex, dropsocks, dtrack, emotet, emotet_shellcode, formbook, kivars, plugx_fast, poisonivy, poisonivy_shellcode, polaris_plugx, selina, sodamaster, trickbot, waterbear_x32, waterbear_x64, xRAT


-----

#### How to deal with Data Imbalance issue

 Use class weights

 class_1 has 1000 instances and class_2 has 100 instances

 class_weights={"class_1": 1, "class_2": 10}

 SMOTE

 Data augmentation

 Rotate, Flip, Scale

 Transfer learning

 VGG16

 InceptionV3
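
The class-weight example above can be reproduced with a few lines. This is a minimal sketch under one assumption: each weight is the size of the largest class divided by the size of the class in question, which yields exactly the 1 and 10 shown on the slide. Other weighting schemes exist, and the deck does not say which one Mem2Img uses; the helper name `class_weights` is hypothetical.

```python
def class_weights(counts):
    """Weight each class by (largest class size) / (class size)."""
    largest = max(counts.values())
    return {label: largest / size for label, size in counts.items()}


if __name__ == "__main__":
    print(class_weights({"class_1": 1000, "class_2": 100}))
```

A dictionary of this form is what most training libraries expect for a per-class weight parameter, so rare families contribute more to the loss than frequent ones.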


-----

## Why Transfer Learning

 Some APT memory-resident malware families have only a small set of samples

 Transfer learning uses knowledge from a learned task to improve the performance on a related task, typically reducing the amount of required training data.

 It allows models to make predictions for a new domain or task (target domain) using knowledge learned from another dataset or existing machine learning models (source domain).


-----

###### Sample images: AgentTesla, Bigpooh, Capgeld_loader, Capgeld_RAT, CobaltStrike beacon, CobaltStrike stager, CobaltStrike stager loader, CobaltStrike variant, Denis RAT, Dpass Loader


-----

###### Sample images: Formbook, TSCookie, IDShell, Kivars, Manuscrypt, PoisonIvy, PhantomIvy, PlugX, RokRAT, Selina


-----

###### Sample images grouped as "consistency" and "non-consistency": CobaltStrike stager, Denis RAT, Dridex


-----

# Mem2Img Framework


-----

## Preprocessing Data

 Remove runs of repeated bytes (junk bytes) from the binary, e.g. NULL bytes, 0xFF
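
The preprocessing step can be sketched as follows. The slide does not say how long a run must be before it counts as junk, so the `min_run` threshold and the helper name `strip_junk_runs` are assumptions made for illustration only.

```python
from itertools import groupby


def strip_junk_runs(data, min_run=16, junk=(0x00, 0xFF)):
    """Drop runs of at least `min_run` identical junk bytes (0x00 or 0xFF)."""
    kept = bytearray()
    for value, group in groupby(data):
        run_length = len(list(group))
        if value in junk and run_length >= min_run:
            continue  # long padding run: treat as junk and skip it
        kept.extend(bytes([value]) * run_length)
    return bytes(kept)


if __name__ == "__main__":
    block = b"AB" + b"\x00" * 32 + b"CD" + b"\xff" * 4 + b"E"
    print(strip_junk_runs(block))
```

Short runs are kept because a handful of zero bytes is common inside real code and data; only long padding runs are removed.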


-----

## 1D Array to 2D Array
###### Image width = height = sqrt(len(1D array)) + 1

###### Pipeline: Memory-resident PE or Shellcode → Binary-to-Dec Conversion (182 62 251 56 …) → 2D array → 8-bit vectors to Images

###### 2D array

 107 30 116 87

 102 119 84 30

 … … … …

 164 245 131 87

-----

## Three channels of the image

 Red channel: decimal value of each byte

 Green channel: Shannon entropy value of each byte

 Blue channel: local entropy values of the image

 Uses the entropy function of the skimage library

 Local entropy is computed using the base-2 logarithm and is related to the complexity contained in a given neighborhood

 The filter returns the minimum number of bits needed to encode the local gray-level distribution. The disk radius is set to 10 in the Mem2Img framework
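
To make the local entropy idea concrete, here is a pure-Python stand-in for the library filter. It is a sketch, not the skimage implementation: it assumes the neighborhood is every in-image pixel within a disk of the given radius, and it will not necessarily match the library's output at image borders. The function name `local_entropy` is hypothetical.

```python
import math
from collections import Counter


def local_entropy(image, radius):
    """Base-2 entropy of the gray levels inside a disk around each pixel."""
    height, width = len(image), len(image[0])
    # Offsets (dy, dx) that fall inside a disk of the given radius.
    offsets = [(dy, dx)
               for dy in range(-radius, radius + 1)
               for dx in range(-radius, radius + 1)
               if dy * dy + dx * dx <= radius * radius]
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            counts = Counter(image[y + dy][x + dx]
                             for dy, dx in offsets
                             if 0 <= y + dy < height and 0 <= x + dx < width)
            total = sum(counts.values())
            row.append(sum(-(c / total) * math.log2(c / total)
                           for c in counts.values()))
        result.append(row)
    return result


if __name__ == "__main__":
    print(local_entropy([[0, 1], [1, 0]], radius=1))
```

A flat region yields an entropy of 0, while regions with many distinct byte values (for example encrypted or compressed data) yield high values, which is what makes this channel useful for separating code, strings and packed content.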


-----

###### Decimal – Red Channel

 Bytes of the memory block:

 0011 1110 1011 0110 1111 1011 0011 1000

 0101 0111 0111 0111 0111 0100 0110 1011

 0110 0110 0001 1110 0101 0100 0001 1110

 0010 0100 1001 1111 0101 0011 0101 0111

 0000 1110 0000 1100 1100 1100 1111 0100

 Red channel, filled with the decimal value of each byte:

 87 119 116 107

 102 30 84 30

 36 159 86 206

 164 245 131 87

###### Shannon Entropy – Green Channel

 The Shannon entropy is computed byte by byte over the bits of each byte, e.g. 10110110 -> 0.9544

 Green channel, filled with the Shannon entropy value of each byte (Value*15):

 0.9544 0.9544 0.5436 0.9544

 0.9544 0.8113 1 0.9544

 1 1 0.9544 1

 0.9544 0.8113 1 0.9544

 0.9544 0.8113 1 0.9544

###### Local Entropy – Blue Channel

 Convert to a grayscale image, generate the local entropy image, and put the values of the entropy image into the blue channel

 Blue channel, filled with the local entropy value of each byte:

 3.1521 3.0935 3.0424 3.0606

 3.0398 3.0642 3.0241 2.9824

 2.8085 2.7159 2.7506 2.6820

 2.5863 2.5259 2.4454 2.2180
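
The per-byte Shannon entropy used for the green channel can be sketched as below. It treats the eight bits of a byte as a sequence of 0s and 1s and computes the binary entropy of that sequence, which reproduces the values 0.5436, 0.8113, 0.9544 and 1 seen in the table. The function name `bit_entropy` is a hypothetical helper, not part of the Mem2Img code.

```python
import math


def bit_entropy(byte):
    """Shannon entropy (base 2) of the 0/1 distribution within one byte."""
    ones = bin(byte).count("1")
    entropy = 0.0
    for count in (ones, 8 - ones):
        if count:  # skip empty bins, since 0 * log2(0) is taken as 0
            probability = count / 8
            entropy -= probability * math.log2(probability)
    return entropy


if __name__ == "__main__":
    for value in (0b00111110, 0b11111011, 0b01110111, 0b01110100):
        print(f"{value:08b} -> {bit_entropy(value):.4f}")
```

Only the number of set bits matters, so a byte with three set bits and a byte with five set bits map to the same entropy value.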


-----

## Local Binary Pattern(LBP)

###### Basic LBP

|92|93|81|
|---|---|---|
|93|83|63|
|76|60|77|

 Each neighbor is compared with the center pixel (83): 92 – 83 > 0 gives 0, and 76 – 83 < 0 gives 1.

|0|0|1|
|---|---|---|
|0|0|1|
|1|1|1|

 0 0 0 1 1 1 1 1

 LBP = $2^4 + 2^3 + 2^2 + 2^1 + 2^0 = 31$

###### Circular LBP (P = 8, R = 1)

|92|93|81|
|---|---|---|
|93|83|63|
|76|60|77|

 0 0 0 1 1 1 1 1

|76|93|92|
|---|---|---|
|60|83|93|

|1|0|0|
|---|---|---|
|1|0|0|

 0 1 1 1 1 1 0 0

-----

##### LBP Rotational Invariance

###### Pattern: 225

###### Rotation: 240 120 60 30 15 135 195

###### Mapping: choose the smallest one, 15
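
The rotation-invariant mapping can be sketched as a bitwise rotation over the 8-bit pattern followed by taking the minimum. The function names are hypothetical; the numbers are the ones from the slide.

```python
def rotations(code, bits=8):
    """All circular right-rotations of a `bits`-wide pattern."""
    mask = (1 << bits) - 1
    return [((code >> shift) | (code << (bits - shift))) & mask
            for shift in range(bits)]


def rotation_invariant(code, bits=8):
    """Map a pattern to the smallest value among its rotations."""
    return min(rotations(code, bits))


if __name__ == "__main__":
    print(rotations(225))
    print(rotation_invariant(225))
```

All eight rotations of the same pattern map to one value, so the feature does not change when the underlying image patch is rotated.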


-----

## Data Augmentation

###### Example images: Original, Flip, Rotate, Scale
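
The three augmentations can be illustrated on a tiny 2D array. This is only a sketch: the slide does not state the flip axis, rotation angle, or scale factor, so the horizontal flip, 90-degree clockwise rotation, and integer nearest-neighbor upscaling used here are assumptions.

```python
def flip_horizontal(image):
    """Mirror each row left to right."""
    return [row[::-1] for row in image]


def rotate_90_clockwise(image):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def scale_nearest(image, factor):
    """Upscale by an integer factor using nearest-neighbor repetition."""
    return [[pixel for pixel in row for _ in range(factor)]
            for row in image for _ in range(factor)]


if __name__ == "__main__":
    image = [[1, 2],
             [3, 4]]
    print(flip_horizontal(image))
    print(rotate_90_clockwise(image))
    print(scale_nearest(image, 2))
```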


-----

## Mem2Img


###### Pipeline: Image Resize (224*224*3) → M*18432; Local Binary Pattern → M*26


-----

## Mem2Img (cont.)

###### Pipeline: M*94746 → PCA(0.95) → M*1015 → Logistic regression → Predicted result (PlugX, Waterbear, Denis, CobaltStrike, ...)


-----

## CNN Architecture


|Layer|Parameters|Output|
|---|---|---|
|Input||224*224*3|
|Conv|3*3, 32 filters, Padding:2|222*222*32|
|Pool|2*2, Stride:2|111*111*32|
|Conv|3*3, 64 filters, Padding:2|109*109*64|
|Pool|2*2, Stride:2|54*54*64|
|Conv|3*3, 64 filters, Padding:2|52*52*64|
|Pool|2*2, Stride:2|26*26*64|
|Conv|3*3, 128 filters, Padding:2|24*24*128|
|Pool|2*2, Stride:2|12*12*128|
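
The feature-map sizes on this slide can be checked with simple shape arithmetic. The sketch assumes that each 3*3 convolution shrinks the width and height by 2 (no border padding, stride 1), which is what the listed sizes imply; the "Padding:2" label from the slide is not modelled. The layer order of 32, 64, 64 and 128 filters is inferred from the sizes.

```python
def conv_shape(shape, filters, kernel=3):
    """Output shape of a stride-1 convolution without border padding."""
    height, width, _ = shape
    return (height - kernel + 1, width - kernel + 1, filters)


def pool_shape(shape, size=2, stride=2):
    """Output shape of a max-pooling layer."""
    height, width, channels = shape
    return ((height - size) // stride + 1, (width - size) // stride + 1, channels)


def trace_shapes(input_shape, filter_counts):
    """Shapes after each conv and pool layer, starting with the input."""
    shapes = [input_shape]
    for filters in filter_counts:
        shapes.append(conv_shape(shapes[-1], filters))
        shapes.append(pool_shape(shapes[-1]))
    return shapes


if __name__ == "__main__":
    shapes = trace_shapes((224, 224, 3), [32, 64, 64, 128])
    for shape in shapes:
        print(shape)
    height, width, channels = shapes[-1]
    print("flattened features:", height * width * channels)
```

Flattening the final 12*12*128 feature map gives 18432 values per image, which matches the M*18432 feature matrix shown on the Mem2Img pipeline slide.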


-----

## Training parameters

 Training : Testing = 5 : 1

 30-class classification

 12569 memory block images (after data augmentation)

 CNN:

 Activation function: ReLU

 Batch normalization

 Learning rate decay

 Training epochs: 32

 Logistic regression:

 Class weight


-----

###### Different Models' Features

|Model|Accuracy|Precision|Recall|F1 Score|
|---|---|---|---|---|
|Mem2Img|98.36%|98.51%|98.36%|98.38%|
|CNN|96.5%|97.09%|96.5%|96.6%|
|Vgg16|96.73%|97.28%|96.7%|96.8%|
|Inception V3|95.8%|96.2%|95.8%|95.8%|
|LBP|84.8%|86.6%|84.8%|84.6%|


-----

## Different Image Types

|Image|Accuracy|Precision|Recall|F1 Score|
|---|---|---|---|---|
|RGB|98.13%|98.3%|98.13%|98.14%|
|RG (without Blue channel : Local Entropy)|92.23%|93.2%|92.23%|92.23%|
|Gray|88.8%|90.3%|88.8%|88.9%|


-----

## Different Algorithm

|Model|Accuracy|Precision|Recall|F1 Score|
|---|---|---|---|---|
|Logistic Regression|98.36%|98.51%|98.36%|98.38%|
|SVM|98.36%|98.44%|98.36%|98.36%|
|Xgboost|94.17%|94.51%|94.17%|94.15%|
|Random Forest|93.7%|95%|93.7%|93.83%|


-----

###### Confusion matrix among 30 malware classes


-----

## t-SNE

###### Cobaltstrike


###### PLEAD malware


-----

## Saliency map

###### Panels: VGG16, InceptionV3, Original CNN

###### Samples: Waterbear_x64, Capgeld_loader


-----

## Saliency map

###### Original CNN VGG16 InceptionV3


###### PoisonIvy

 PlugX


-----

## Saliency map - Waterbear

###### Config block of the waterbear stager

 Original CNN


-----

## Saliency map - Capgeld Loader

###### .rdata section of the Capgeld Loader

 Original CNN


-----

## Saliency map - PhantomIvy

###### Some shellcode snippets of PhantomIvy

 Original CNN

 Yara rules of PhantomIvy


-----

#### Saliency map - Mustang Panda PlugX

###### Stack strings of PlugX

 Original CNN


-----

## Grad-cam Analysis

###### Dridex

 Raw image Heatmap over raw image

 CobaltStrike Beacon


###### C2 parsing function and API Spam Bypass

 Some decode function before .rdata section

 Part .rdata section and part .data section


-----

## Grad-cam Analysis

###### Dpass loader


###### Unique strings block


###### Raw image Heatmap over raw image

 Obfuscated stack strings


###### formbook


-----

## Zero-shot Learning

###### Unknown malware → Mem2Img embedding (after PCA) → use KD-tree to find the 5-10 nearest neighbors

###### Nearest neighbors: TSCookie, TSCookie, TSCookie, Kivars, Kivars, …

###### The unknown malware may be modified from TSCookie and may have a strong connection to the PLEAD APT group

###### The next time the same unknown malware is input into Mem2Img, its nearest neighbors may be that unknown malware
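
The neighbor lookup can be sketched as follows. To stay dependency-free, this example swaps the KD-tree for a brute-force Euclidean search, which returns the same neighbors on small data; the two-dimensional embeddings and labels are made-up toy values, not Mem2Img outputs, and `nearest_labels` is a hypothetical helper.

```python
import math


def nearest_labels(query, embeddings, labels, k=5):
    """Labels of the k known samples closest to the query embedding."""
    ranked = sorted(range(len(embeddings)),
                    key=lambda index: math.dist(query, embeddings[index]))
    return [labels[index] for index in ranked[:k]]


if __name__ == "__main__":
    # Toy embeddings standing in for PCA-reduced Mem2Img features.
    known = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
             (0.5, 0.5), (0.6, 0.5),
             (5.0, 5.0), (5.0, 6.0)]
    families = ["TSCookie", "TSCookie", "TSCookie",
                "Kivars", "Kivars",
                "PlugX", "PlugX"]
    print(nearest_labels((0.05, 0.05), known, families, k=5))
```

Reading the returned labels as a ranked list, rather than forcing a single predicted class, is what lets an analyst relate an unseen sample to known families.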


-----

## Zero-shot Learning

|Unknown sample|Nearest neighbors|
|---|---|
|Jinhospy used by APT37|RokRAT, RokRAT, Manuscrypt, Selina, RokRAT|
|plugX_fast|polaris_plugx, polaris_plugx, poisonivy, poisonivy, poisonivy|
|Plugx_variant|polaris_plugx, polaris_plugx, polaris_plugx, polaris_plugx, poisonivy|
|TEBShell|APT10’Cs loader, APT10’Cs loader, …|
|P8RAT|xRAT, xRAT, xRAT, …|
|Framecacher used by Chinese APT|Selina, Selina, Selina, Selina, Selina|


-----

## Adversarial Attack

 Padding junk bytes to make the file size larger

 Deliberately putting the code of other malware families into the original malware for obfuscation

 Packing the malware files

 Self-modifying code

 Self-modifying code is code that alters its own instructions while it is executing


-----

#### Self-Modifying Code - Waterbear

```
H.\$.H.l$.H.t$ WATAUAVAWH..0.....XH..!.....H......H......QPH1......XI. ....!....P...ko....I_....GqI..@.U...#b7.;...-K(4q..)..%.."......... .Z
.u...I..C...M).L..PH..ATY.H1..........%h...X.(.eH.`.......PA\A..$=.... ..O. C.U:<w....{.a.....N{.C...qgB.._.z......q....-N.a.b....s.7..&..s.#
t.u..I......I....XYH..H...H...u...x...H..H........E1.E1.H....N........ cO.31c.d.~......[w"S.-......V..`P...U..z....._#..VF.....U.`..&.A-/.}..
H..(...H........E1.E1.H......|......E1.L......L..h...1.H......H.D$PH.. ......"90...U..O.a1miH.Yr0E.4......Y.0=!...).!..O8.Hd..Y......mq......
vH.........(...I..H........1.9+v3L.L$PL..(...A.M.L........I..$H..tY..I j..v>...z..........gA0%...g@'.3..|'.|...&......qk...qy1.q..8l...(77"l.
...I...A..;+r...H......r..q...H....H....p...H..H..I..H........A..H..H. .}.......d.X.7tF....]...,....l..........?4-.....}.+G+'........d. .....
...P...H....h...H.\$XH.l$`H.t$hH..0A_A^A]A\_.H.\$.WH.. 3.H..H..H;.t<D. ...f}....t.q.D...N.....Y.a-.........q...6......m.K..W.[{ygZ.<).y......
C.H.L$@.D$8..D$9..D$:..D$;........D.8H..0D.?.D.?0D..H...|.H.\$0H.. _.H 8L...N4dx......cc...........^..Z{..3.".a.u.|D..eK..,....@..........p..
.\$.UVWH..@D......H......H......H........H........D......H......H..... .m.[.......kZ.|.<1.._W.Z{.)..kt.t.O.Y_.Z{.).....<1..I..Z{.).qkt.t.y...
.......u.H........H...H.........D......H..8...H........H..H....8...D.. C.zH....y...I....Z3.l....]3.jB......./.".5.0.C.|...../.".tt.y.Y..Z{.).
....H..8...H..................D......H..@...H........L.L$`L.D$hH..H... .kt.t.!{......).....<13..s[[{H..V.u.<y=....Z3.m.F.u.<};.....?l.gZ.<)..
.@...D......H..@...H..............A...D......H......H........D.D$`H.T$ !.......z..j<).q.C.._...].ykt........Z?.%..jt....B......5..jt..u.#.|.~
hH........D......H......H..H........H........D.D$`H.L$h3.......D...... ..)...l.<1..I3.Z{.)..kt.t.y.E/+..I./.......C.._._(...N....!.......j..j
H.. ...H........A.....H..H.... ...D......H.. ...H..H........H..tv3.A.. <)...C.._......j5.41.C.|....]..kt.x.1..........j<).y=.1`|[{H..<....p.C
...H........D......H..0...H......L._.H.G.H.L$(H..L.\$ H.D$0H.|$8...... ......]..ht.x.1..........j</...C..Y.o./h."........]._h...ND..M.{1`|[{H
L.L$ L..H..H....0...D......H..0...H........3.A.....H....8...H.L$h3.A.. ...NT......._......j0)...C.._.KI./...]...C.....H"/.".i...A....7l...+..
.....8...H........H........H.\$p.....H..@_^].@SH.. H..H.........H..D.. .1..Y..Z{.)...t.<1....C.yH.gZ6P..0.C..W.;..r.@SH.. H..H.........H..D..
....H..p...H........L..I..A..p...H.C.H.K.3.A.......8...L.[.H..A....... ....H..p...H........L..I..A..p...H.C.H.K.3.A.......8...L.[.H..A.......
....H.. [.H.\$.H.l$.H.t$.WATAUH..p...D......H..X...H......H..3.......D ....H.. [...m.K.~.~k.+[.r#.hp..O..^{H..P.u.<y=....Z3..'.jt.....1d|[{H.
.g.L..$P...A..H....X...D......H..X...H........D.G.H.T$@E3.H...........Only the wait-for-connection function is left..z8/...A......)...,.<1..M..Z{.)..kt.t.}.]_.Z{./h."...q.p.._...Z-.j.Z.
u.3..F...D......H......H........H.T$@H........D......H......H......... D.p...Y{H..Rnv.<y=.V..Z3.i.B.u.<y;.......1..jt....A......./.".i...B..T
|$D&r...l$DH.T$HE3.D.E.H...........t.D......H..`...H.........U.L..$P.. &_..]ge...u......H.....<)..%............J=1..Ed.[{H......=1.......l.-.
.H.L$E..`...D......H..`...H........A.MZ..fD9\$E..A........H.L$ M...... j</p...]..Z{.).9kt.t.%#......].ykt.}.......B..j..5]....F.\........U.2c
..........k..+.@...@...H.T.@......D......H..X...H........L..$`...H.L$ .4._....,...........U.....n4]...C.._..I./....=1..E<+..I./....\2.C.z.~[
A....X...D......H..X...H..........h...D......H..P.....H..i.<.........H .)...,.<1..M..Z{.)..kt.t.}.]_.Z{.1G.jt...VB......./...........P.....j<
.T$@A.....H....P...D......H..P...H........H.T$FH.L$0A.....M..D.d$@D.l$ /h...v..Z{.)...$.<1..M..Z{.)..kt.t.}.]_.Z{./{.,</p...s..Z{.).....|u>/.

```

###### Before self-modifying (left), after self-modifying (right)

-----

## Conclusion

 More and more advanced methods of process injection are being used

 Transfer learning performs well on memory-resident malware classification, especially on small datasets

 The features extracted via the convolutional network can locate the distinctive areas of malware

 We have also proposed some possible methods of adversarial attack

 https://github.com/AragornTseng/Mem2Img


-----

# THANK YOU!

###### aragorn@teamt5.org
 charles@teamt5.org


-----