Defeating APT10 Compiler-level Obfuscations
Takahiro Haruyama
Threat Analysis Unit, Carbon Black
Virus Bulletin 2019

Who am I?
• Takahiro Haruyama (@cci_forensics)
  - Principal Threat Researcher
    - Carbon Black's Threat Analysis Unit (TAU)
  - Reverse-engineering cyber espionage malware
    - linked to PRC/Russia/DPRK
  - Past public research presentations
    - binary diffing, Winnti/PlugX malware research
    - forensic software exploitation, memory forensics

Overview
• Motivation and Approach
• Microcode
• Opaque Predicates
• Control Flow Flattening
• IDA 7.2 Issues and 7.3 Improvements
• Wrap-up

Motivation and Approach

Question
[Figure annotations: "This function just returns the value"; "Opaque Predicates"; "Control Flow Flattening"]

APT10 ANEL [1][2]
• RAT program used by APT10
  - observed only in Japan
• ANEL versions 5.3.0 and later are obfuscated with
  - opaque predicates
  - control flow flattening

Examples
[Figure annotation: "We need an automated de-obfuscation method"]

Motivation and Approach
• Automate ANEL code de-obfuscation
  - The obfuscations looked similar to the ones described in the Hex-Rays blog [3]
  - The IDA plugin HexRaysDeob [4] didn't work
    - It was made for another variant of the obfuscations
    - I investigated the causes, then modified HexRaysDeob to work for ANEL samples [8]

Microcode
• Intermediate representation (IR) used by the IDA Pro decompiler
• Optimized in 9 maturity levels
  - transformed from low-level to high-level IRs [3]
[Figure labels: low; high]

Microcode Explorer [4]
[Figure annotations: "over 150 instructions"; "just 8 instructions"]

Key Structures [5]
[Diagram: mbl_array_t contains mblock_t, mblock_t, mblock_t, ...]
[Diagram: each mblock_t contains minsn_t, minsn_t, minsn_t, ...; each minsn_t has mop_t (left), mop_t (right), mop_t (dest)]
• HexRaysDeob installs two optimizer callbacks: optblock_t and optinsn_t

CFG and Instructions in Microcode Explorer
[Figure labels: CFG (mblock_t); block number; nested instructions (minsn_t); top-level instruction; sub instructions]

Opaque Predicates

Opaque Predicates Summary
• optinsn_t::func replaces an opaque predicate pattern with another expression
  - called from MMAT_ZERO to MMAT_GLBOPT2
• ANEL samples require 2 more patterns and data-flow tracking

Pattern 1: ~(x * (x - 1)) | -2
• In the example,
  - dword_745BB58C = either even or odd
  - dword_745BB58C * (dword_745BB58C - 1) = always even
  - the lowest bit of the bitwise-negated value becomes 1
  - OR with -2 (0xFFFFFFFE) will always produce the value -1
• The pattern x * (x - 1) will be replaced with 2

Pattern 2: read-only global variable >= 10 or < 10
• dword_72DBB588 is always 0
  - it has no initial value (it will be initialized with 0)
  - it is only ever read
• The pattern matching function replaces the global variable with 0
• Other variants
  - the variable - 10 < 0
  - the immediate value can be different from 10 (e.g., 9)

Data-flow tracking for the patterns
• Trace back the minsn_t / mblock_t linked lists
[Figure annotation: "= x * (x - 1) ?"]

Data-flow tracking for the patterns (Cont.)
• optinsn_t::func passes a null mblock_t pointer if an instruction is not top-level
  - Additional code traces from jnz, then passes the pointer to setl
[Figure annotation: "= read-only global variable ?"]
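
The reasoning behind the two opaque predicate patterns above can be checked with ordinary integer arithmetic. The following is a minimal sketch in Python, not the plugin's code: it models 32-bit values by masking, and the helper names (to_signed, pattern1, pattern1_simplified, pattern2_checks) are hypothetical names chosen for illustration.

```python
MASK = 0xFFFFFFFF  # model 32-bit values


def to_signed(value):
    """Interpret the low 32 bits of value as a signed integer."""
    value &= MASK
    return value - (1 << 32) if value & 0x80000000 else value


def pattern1(x):
    """Evaluate ~(x * (x - 1)) | -2 in 32-bit arithmetic."""
    product = (x * (x - 1)) & MASK  # product of consecutive integers: always even
    return to_signed(~product | -2)


def pattern1_simplified():
    """Same expression after x * (x - 1) has been replaced with 2."""
    return to_signed(~2 | -2)


def pattern2_checks(g):
    """Evaluate the Pattern 2 variants for a global variable value g."""
    return [g >= 10, g < 10, to_signed(g - 10) < 0]


# The predicate is -1 for every input, so replacing x * (x - 1) with the
# constant 2 does not change the result.
for x in (0, 1, 2, 7, 0x745BB58C, 0xFFFFFFFF, -5):
    assert pattern1(x) == pattern1_simplified() == -1

# A global that is never written stays 0, so each comparison is constant.
assert pattern2_checks(0) == [False, True, True]
```

Replacing x * (x - 1) with any even constant would give the same result; the slides state that the value 2 is used.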
Control Flow Flattening

Control Flow Flattening: Summary
[Figure]

Control Flow Flattening: block comparison variable
[Figure labels: block comparison variable assignment; block comparison variable comparison]
• The unflattening code translates block comparison variables into block numbers (mblock_t::serial)

Control Flow Flattening: Modifications
• Three main modifications
  - Unflattening in multiple maturity levels
  - Control flow handling with multiple dispatchers
  - Implementation for various jump cases

Unflattening in Multiple Maturity Levels
• The original implementation works in MMAT_LOCOPT
  - due to the "Odd Stack Manipulations" obfuscation
• I had to unflatten the ANEL code in later maturity levels
  - The block comparison variable heavily depends on opaque predicate conditions

Unflattening in Multiple Maturity Levels (Cont.)
• The loop becomes simpler once opaque predicates are broken
• Unflattening in later maturity levels creates another problem
[Figure annotation: In MMAT_LOCOPT, the block comparison variable 0x4624F47C is translated into block #9]

Unflattening in Multiple Maturity Levels (Cont.)
• The block will be eliminated in later maturity levels
• The modified code
  - Links block comparison variables to block addresses in MMAT_LOCOPT
  - Guesses the block numbers in later maturity levels from the block and instruction addresses

Control Flow Handling with Multiple Dispatchers
• The original implementation assumes an obfuscated function has only one control flow dispatcher
• Some functions in the ANEL sample have multiple dispatchers
  - up to seven dispatchers in one function

Control Flow Handling with Multiple Dispatchers (Cont.)
• The modified code
  - catches the hxe_prealloc event, then calls optblock_t::func
    - This event occurs several times in MMAT_GLBOPT1 and MMAT_GLBOPT2
  - utilizes different algorithms
    - control flow dispatcher / first block detection
    - block comparison variable validation

Control Flow Handling with Multiple Dispatchers (Cont.)
• The modified code detects block comparison variable duplications and applies the most likely variable

Implementation for Various Jump Cases: The Originals
• (1) goto case for normal block
• (2) conditional jump case for flattened if-statement block
[Diagram labels: flattened block(s) (dispatcher predecessor); from conditional block; to control flow dispatcher; dispatcher predecessor; nonJcc; endsWithJCC; false; true; flattened blocks]

Implementation for Various Jump Cases: The Originals (Cont.)
[Figure: case (2)]

Implementation for Various Jump Cases: The Additions
• (3) goto N predecessors case
• (4) (2)+(3) combination case
[Diagram labels: dispatcher predecessor; pred 0, pred 1, ..., pred N; nonJcc; endsWithJCC; false; true]

Implementation for Various Jump Cases: The Additions (Cont.)
[Figure: case (3)]

Implementation for Various Jump Cases: The Additions (Cont.)
[Figure: case (4)]

Implementation for Various Jump Cases: The Additions (Cont.)
• (5) Block comparison variables are assigned in the first blocks
  - The modified code reconnects first blocks as successors of the flattened block
• I saw up to three assignments of this kind in one function
[Figure annotation: block #1 will be the successor of block #7]

IDA 7.2 Issues and 7.3 Improvements

Evaluation on IDA 7.2
• Tested ANEL samples
  - 5.4.1 payload [1]
    - 3d2b3c9f50ed36bef90139e6dd250f140c373664984b97a97a5a70333387d18d
  - 5.5.0 rev1 loader DLL [6]
    - f333358850d641653ea2d6b58b921870125af1fe77268a6fdfeda3e7e0fb636d
• The modified tool could deobfuscate 92% of the obfuscated functions that we encountered in the 5.4.1 payload

Evaluation on IDA 7.2 (Cont.)
• The causes of the failures
  - The next block number guessing algorithm failed
  - Propagation of opaque predicate deobfuscation failed
  - No method to handle a conditional jump of a dispatcher predecessor with multiple predecessors
[Figure annotations: "resolved in IDA 7.3"; "resolved in this case"]

IDA 7.3: Propagation of Opaque Predicates Deobfuscation
[Figure labels: 7.2; 7.3; aliased stack slots; always 0xC1A18C30 (signed)]

IDA 7.3: Handling a Conditional Jump of a Dispatcher Predecessor
• All jump cases (1)-(5) can be conditional
  - Cases (2)-(4) require an mblock_t duplication
• IDA 7.3 provides the option
  - clear the flag MBA2_NO_DUP_CALLS
  - use the mbl_array_t::insert_block API, then copy instructions and other information
  - adjust destinations of the blocks passing control to the exit block whose block type is BLT_STOP

Conditional Jump Case (2)
[Figure labels: BLT_1WAY; BLT_2WAY]

Conditional Jump Case (3)
[Figure annotation: preds can be conditional too]

Conditional Jump Case (4)
[Figure annotations: not seen in the tested samples :-); preds can be conditional too]

Workaround in Control Flow Unflattening Failure
• Running the plugin with 0xdead deobfuscates only opaque predicates
  in the currently selected function
    idc.load_and_run_plugin("HexRaysDeob", 0xdead)
    idc.load_and_run_plugin("HexRaysDeob", 0xf001)

Wrap-up
• Compiler-level obfuscations are starting to be observed in the wild
  - Automated deobfuscation is needed
• The modified code is publicly available [7]
  - 1570 insertions(+), 450 deletions(-)
  - It works for almost every obfuscated function of APT10 ANEL on IDA 7.3

Acknowledgement
• Hex-Rays
• Rolf Rolles
• TAU members
  - especially Jared Myers and Brian Baskin

References
• [1] https://www.fireeye.com/blog/threat-research/2018/09/apt10-targeting-japanese-corporations-using-updated-ttps.html
• [2] https://jsac.jpcert.or.jp/archive/2019/pdf/JSAC2019_6_tamada_jp.pdf
• [3] http://www.hexblog.com/?p=1248
• [4] https://github.com/RolfRolles/HexRaysDeob
• [5] https://www.hexblog.com/?p=1232
• [6] https://www.secureworks.jp/resources/at-bronze-riverside-updates-anel-malware
• [7] https://github.com/carbonblack/HexRaysDeob
• [8] https://www.carbonblack.com/2019/02/25/defeating-compiler-level-obfuscations-used-in-apt10-malware/

Questions?
• [Q1] What's the obfuscating compiler?
  - [A1] Not sure, but it may be Obfuscator-LLVM
• [Q2] Does this tool work for other samples with similar obfuscations?
  - [A2] Yes, but only if
    - Q1 is resolved
    - the compiler algorithm and implementation have been thoroughly investigated
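
Appendix sketch: the control flow flattening slides state that the unflattening code translates block comparison variables into block numbers (mblock_t::serial). The toy model below illustrates that idea in plain Python and is not the plugin's implementation. Only the pairing of 0x4624F47C with block #9 comes from the slides; the function name unflatten, the dispatcher serial, and every other constant and block number are hypothetical.

```python
from typing import Dict, Optional

DISPATCHER = 2  # hypothetical serial of the control flow dispatcher block

# Dispatcher comparisons: block comparison variable value -> target block serial.
# 0x4624F47C -> 9 is taken from the slides; the other entries are made up.
DISPATCH_TABLE: Dict[int, int] = {
    0x4624F47C: 9,
    0x11111111: 4,
    0x22222222: 6,
}

# Value each flattened block assigns before jumping back to the dispatcher.
# None marks a block that leaves the function instead of looping back.
ASSIGNMENTS: Dict[int, Optional[int]] = {
    4: 0x4624F47C,
    6: 0x11111111,
    9: None,
    12: 0x33333333,  # value with no dispatcher comparison (unresolved)
}


def unflatten(dispatch_table: Dict[int, int],
              assignments: Dict[int, Optional[int]]) -> Dict[int, Optional[int]]:
    """Map each flattened block to its direct successor block serial."""
    successors: Dict[int, Optional[int]] = {}
    for serial, value in assignments.items():
        if value is None:
            successors[serial] = None  # exit block: no successor
        elif value in dispatch_table:
            successors[serial] = dispatch_table[value]  # bypass the dispatcher
        else:
            successors[serial] = DISPATCHER  # unresolved: keep the original edge
    return successors


result = unflatten(DISPATCH_TABLE, ASSIGNMENTS)
assert result == {4: 9, 6: 4, 9: None, 12: DISPATCHER}
```

Leaving unresolved blocks pointing at the dispatcher is a conservative choice for this sketch: the flattened loop stays intact wherever the mapping is unknown.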