SmokeLoader Triage Published: 2022-08-25 ยท Archived: 2026-04-05 19:07:51 UTC Stage 2 Opaque predicate deobfuscation From this blog we have a simple jmp fix script. import idc ea = 0 while True: ea = min(idc.find_binary(ea, idc.SEARCH_NEXT | idc.SEARCH_DOWN, "74 ? 75 ?"), # JZ / JNZ idc.find_binary(ea, idc.SEARCH_NEXT | idc.SEARCH_DOWN, "75 ? 74 ?")) # JNZ / JZ if ea == idc.BADADDR: break idc.patch_byte(ea, 0xEB) # JMP idc.patch_byte(ea+2, 0x90) # NOP idc.patch_byte(ea+3, 0x90) # NOP `` Once we fix the jmps we need to nop out the junk code between the code to allow IDA to convert this i ```python import idaapi start = 0x00402DDD end = 0x00402EBF ptr = start while ptr <= end: next_ptr = next_head(ptr) junk_bytes = next_ptr - ptr if ida_bytes.get_bytes(ptr, 1) == b'\xeb': idaapi.patch_bytes(ptr, junk_bytes * b'\x90') ptr = next_ptr Or, we could use this excellent script from @anthonyprintup import ida_ua import ida_name import ida_bytes https://research.openanalysis.net/smoke/smokeloader/loader/config/yara/triage/2022/08/25/smokeloader.html Page 1 of 5 def decode_instruction(ea: int) -> ida_ua.insn_t: instruction: ida_ua.insn_t = ida_ua.insn_t() instruction_length = ida_ua.decode_insn(instruction, ea) if not instruction_length: return None return instruction def main(): begin: int = ida_name.get_name_ea(idaapi.BADADDR, "start") end: int = begin + 0xE2 instructions: dict[int, ida_ua.insn_t] = {} # Undefine the current code ida_bytes.del_items(begin, 0, end) # Follow the control flow and create instructions instruction_ea: int = begin while instruction_ea <= end: if instruction_ea not in instructions.keys(): instruction: ida_ua.insn_t = ida_ua.insn_t() instruction_length: int = ida_ua.create_insn(instruction_ea, instruction) else: instruction: ida_ua.insn_t = decode_instruction(instruction_ea) instruction_length: int = instruction.size if not instruction_length: print(f"Failed to create an instruction at address {instruction_ea=:#x}") return # Append the current instruction address to the list instructions[instruction.ip] = instruction # Handle unconditional jumps current_instruction_mnemonic: str = instruction.get_canon_mnem() next_instruction: ida_ua.insn_t | None = decode_instruction(instruction_ea + instruction.size if next_instruction is not None: next_instruction_mnemonic: str = next_instruction.get_canon_mnem() if (current_instruction_mnemonic == "jnz" and next_instruction_mnemonic == "jz") or \ (current_instruction_mnemonic == "jz" and next_instruction_mnemonic == "jnz"): # Unconditional jump detected assert instruction.ops[0].type == ida_ua.o_near instruction_ea = instruction.ops[0].addr ida_ua.create_insn(next_instruction.ip) instructions[next_instruction.ip] = next_instruction https://research.openanalysis.net/smoke/smokeloader/loader/config/yara/triage/2022/08/25/smokeloader.html Page 2 of 5 continue if current_instruction_mnemonic == "jmp": assert instruction.ops[0].type == ida_ua.o_near instruction_ea = instruction.ops[0].addr else: instruction_ea += instruction.size # NOP the remaining instructions for ea in range(begin, end): skip: bool = False for _, instruction in instructions.items(): if ea in range(instruction.ip, instruction.ip + instruction.size): skip = True break if skip: continue # Patch the address ida_bytes.patch_bytes(ea, b"\x90") if __name__ == "__main__": main() After this we can see that the next function address is built using some stack/ret manipulation. Generic Opaque Predicate Patching There is also this nice generic patching script from Alex: nopme.py. Function Decryption Some functions are encrypted. We can find the first one by following the obfuscated control flow until the first call . This call calls into a function which then calls the decryption function. The decryption function takes a size and a offset to the function that needs to be decrypted. The size is placed in the ecx register, and the function offset follows the call. The decryption itself is a single byte xor but the decryption key is moved into the edx register as a full DWORD (we only used the LSB). From this blog we have a simple deobfuscation script updated for our sample. This script didn't perform well for some reason so we ended up manually decrypting the functions! import idc import idautils https://research.openanalysis.net/smoke/smokeloader/loader/config/yara/triage/2022/08/25/smokeloader.html Page 3 of 5 def xor_chunk(offset, n): ea = 0x400000 + offset for i in range(n): byte = ord(idc.get_bytes(ea+i, 1)) byte ^= 0x50 idc.patch_byte(ea+i, byte) def decrypt(xref): call_xref = list(idautils.CodeRefsTo(xref, 0))[0] while True: if idc.print_insn_mnem(call_xref) == 'push' and idc.get_operand_type(call_xref, 0) == idaapi n = idc.get_operand_value(call_xref, 0) break if idc.print_insn_mnem(call_xref) == 'mov' and idc.get_operand_type(call_xref, 1) == idaapi.o n = idc.get_operand_value(call_xref, 1) break call_xref = prev_head(call_xref) n = idc.get_operand_value(call_xref, 0) offset = (xref + 5) - 0x400000 xor_chunk(offset, n) idc.create_insn(offset+0x400000) ida_funcs.add_func(offset+0x400000) xor_chunk_addr = 0x00401118 # address of the xoring function decrypt_xref_list = idautils.CodeRefsTo(xor_chunk_addr, 0) for xref in decrypt_xref_list: decrypt(xref) API Hashing According to this blog we are expecting to see some API hashing using the djb2 algorithm. We can try to find this function by searching for the constant 0x1505. Though the djb2 algorithm is used for the API hashing the malware also encrypts the hashes with a hard coded XOR key. In our sample the key is 0x76186250. Decryption There is a 32-bit and a 64-bit version of stage 3 stored consecutivly in the binary. The data is encrypted with a hard coded 4-byte XOR key, the decryption must be a multiple of four. The trailing bytes (if any) are then decrypted with a single byte XOR. In our sample the DWORD key is 0x76186250 and the single byte key is 0x50. https://research.openanalysis.net/smoke/smokeloader/loader/config/yara/triage/2022/08/25/smokeloader.html Page 4 of 5 Decompression Once the stage 3 data is decrypted it is also decompressed with the LZSA2 algorithm. We matched this with a blog. The LZSA algorithm is detailed on this github Emmanuel Marty/LZSA. Source: https://research.openanalysis.net/smoke/smokeloader/loader/config/yara/triage/2022/08/25/smokeloader.html https://research.openanalysis.net/smoke/smokeloader/loader/config/yara/triage/2022/08/25/smokeloader.html Page 5 of 5