{
	"id": "cded5544-8085-41a1-a6b1-ca0fa166122d",
	"created_at": "2026-04-06T01:30:51.4978Z",
	"updated_at": "2026-04-10T03:21:22.477977Z",
	"deleted_at": null,
	"sha1_hash": "33ab229144b54f339c90d652e9fc4be76adfdcf6",
	"title": "Hiding In PlainSight - Indirect Syscall is Dead! Long Live Custom Call Stacks",
	"llm_title": "",
	"authors": "",
	"file_creation_date": "0001-01-01T00:00:00Z",
	"file_modification_date": "0001-01-01T00:00:00Z",
	"file_size": 579624,
	"plain_text": "Hiding In PlainSight - Indirect Syscall is Dead! Long Live Custom\r\nCall Stacks\r\nArchived: 2026-04-06 00:39:14 UTC\r\nPosted on 29 Jan 2023 by Paranoid Ninja\r\nNOTE: This is a PART II blog on Stack Tracing evasion. PART I can be found here.\r\nThis is the second part of the blog I wrote 3 days back on proxying DLL loads to hide suspicious stack traces\r\nleading to a user allocated RX region. I won’t be going in depth on how stack works, because I already covered\r\nthat in the previous blog which can be accessed from the above link. We previously saw that we can manipulate\r\nthe call and jmp instructions to request windows callbacks into calling LoadLibrary API call. However,\r\nstack tracing detections go far beyond just hunting DLL loads. When you inject a reflective DLL into local or\r\nremote process, you have to call API calls such as VirtualAllocEx / VirtualProtectEx which indirectly calls\r\nNtAllocateVirtualMemory / NtProtectVirtualMemory . However, when you check the call stack of the legitimate\r\nAPI calls, you will notice that WINAPIs like VirtualAlloc/VirtualProtect are mostly called by non-windows\r\nDLL functions. Majority of windows DLLs will call NtAllocateVirtualMemory / NtProtectVirtualMemory\r\ndirectly. Below is a quick example of the callstack for NtProtectVirtualMemory when you call\r\nRtlAllocateHeap .\r\nThis means that since ntdll.dll is not dependent on any other DLL, all functions in ntdll which require playing\r\naround with permissions for memory regions will call the NTAPIs directly. Thus, it means that if we are able to\r\nreroute our NtAllocateVirtualMemory call via a clean stack from ntdll.dll itself, we wont have to worry about\r\ndetections at all. Most red teams rely on indirect syscalls to avoid detections. 
In case of indirect syscalls, you\r\nsimply jump to the address of the syscall instruction after carefully setting up the stack, but the issue here is that\r\nindirect syscalls only change the return address for the syscall instruction in ntdll.dll. The return\r\naddress in this case is the location execution returns to after the syscall is complete.\r\nBut the rest of the stack below the return address will still be suspicious, as those frames originate from the RX region. If\r\nan EDR checks the full stack of the NTAPI, it can easily identify that the call chain eventually leads back to\r\nthe user-allocated RX region. This means a return address in the ntdll.dll region with a stack originating from an RX region\r\nis a 100% anomaly with zero chance of being a false positive. This is an easy win for EDRs that utilize ETW for\r\nsyscall tracing in the kernel.\r\nThus, in order to evade this, I spent some time reversing several ntdll.dll functions and found that with a little bit of\r\nassembly knowledge and an understanding of how Windows callbacks work, we should be able to manipulate the callback into calling\r\nany NTAPI function. For this blog, we will take NtAllocateVirtualMemory as an example, and we will pick the\r\ncode from our Part I blog and modify it. We will use the same API, TpAllocWork , which can\r\nexecute a callback function. But instead of passing a pointer to a string like we did in the case of DLL proxying,\r\nwe will pass a pointer to a structure this time. We will also avoid any global variables this time by making sure\r\nall the necessary information goes within the struct, as we cannot have global variables when we write our\r\nshellcode. 
The definition of NtAllocateVirtualMemory as per MSDN is:\r\n__kernel_entry NTSYSCALLAPI NTSTATUS NtAllocateVirtualMemory(\r\n [in] HANDLE ProcessHandle,\r\n [in, out] PVOID *BaseAddress,\r\n [in] ULONG_PTR ZeroBits,\r\n [in, out] PSIZE_T RegionSize,\r\n [in] ULONG AllocationType,\r\n [in] ULONG Protect\r\n);\r\nThis means we need to pass a pointer to NtAllocateVirtualMemory and its arguments inside a structure to\r\nthe callback, so that our callback can extract this information from the structure and execute the call. We will leave out the\r\narguments that stay static, such as ULONG_PTR ZeroBits , which we always set to zero, and ULONG AllocationType ,\r\nwhich we always set to MEM_RESERVE|MEM_COMMIT , i.e. 0x3000 in hex. Adding in the remaining arguments, the\r\nstructure will look like this:\r\ntypedef struct _NTALLOCATEVIRTUALMEMORY_ARGS {\r\n UINT_PTR pNtAllocateVirtualMemory;\r\n HANDLE hProcess;\r\n PVOID* address;\r\n PSIZE_T size;\r\n ULONG permissions;\r\n} NTALLOCATEVIRTUALMEMORY_ARGS, *PNTALLOCATEVIRTUALMEMORY_ARGS;\r\nWe will then initialize the structure with the required arguments, pass a pointer to it to TpAllocWork , and have it call\r\nour function WorkCallback , which is written in assembly.\r\n#include \u003cwindows.h\u003e\r\n#include \u003cstdio.h\u003e\r\ntypedef NTSTATUS (NTAPI* TPALLOCWORK)(PTP_WORK* ptpWrk, PTP_WORK_CALLBACK pfnwkCallback, PVOID Option\r\ntypedef VOID (NTAPI* TPPOSTWORK)(PTP_WORK);\r\ntypedef VOID (NTAPI* TPRELEASEWORK)(PTP_WORK);\r\ntypedef struct _NTALLOCATEVIRTUALMEMORY_ARGS {\r\n UINT_PTR pNtAllocateVirtualMemory;\r\n HANDLE hProcess;\r\n PVOID* address;\r\n PSIZE_T size;\r\n ULONG permissions;\r\n} NTALLOCATEVIRTUALMEMORY_ARGS, *PNTALLOCATEVIRTUALMEMORY_ARGS;\r\nextern VOID CALLBACK WorkCallback(PTP_CALLBACK_INSTANCE Instance, PVOID Context, PTP_WORK Work);\r\nint main() {\r\n LPVOID allocatedAddress = NULL;\r\n SIZE_T allocatedsize = 0x1000;\r\n NTALLOCATEVIRTUALMEMORY_ARGS 
ntAllocateVirtualMemoryArgs = { 0 };\r\n ntAllocateVirtualMemoryArgs.pNtAllocateVirtualMemory = (UINT_PTR) GetProcAddress(GetModuleHandleA\r\n ntAllocateVirtualMemoryArgs.hProcess = (HANDLE)-1;\r\n ntAllocateVirtualMemoryArgs.address = \u0026allocatedAddress;\r\n ntAllocateVirtualMemoryArgs.size = \u0026allocatedsize;\r\n ntAllocateVirtualMemoryArgs.permissions = PAGE_EXECUTE_READ;\r\n FARPROC pTpAllocWork = GetProcAddress(GetModuleHandleA(\"ntdll\"), \"TpAllocWork\");\r\n FARPROC pTpPostWork = GetProcAddress(GetModuleHandleA(\"ntdll\"), \"TpPostWork\");\r\n FARPROC pTpReleaseWork = GetProcAddress(GetModuleHandleA(\"ntdll\"), \"TpReleaseWork\");\r\n PTP_WORK WorkReturn = NULL;\r\n ((TPALLOCWORK)pTpAllocWork)(\u0026WorkReturn, (PTP_WORK_CALLBACK)WorkCallback, \u0026ntAllocateVirtualMemo\r\n ((TPPOSTWORK)pTpPostWork)(WorkReturn);\r\n ((TPRELEASEWORK)pTpReleaseWork)(WorkReturn);\r\n WaitForSingleObject((HANDLE)-1, 0x1000);\r\n printf(\"allocatedAddress: %p\\n\", allocatedAddress);\r\n getchar();\r\n return 0;\r\n}\r\nNow this is where things get interesting. In the case of DLL proxying, we executed LoadLibrary with only one\r\nargument, i.e. the name of the DLL to load, which is passed in the RCX register. But in the case of\r\nNtAllocateVirtualMemory , we have a total of 6 arguments. This means the first four arguments go into the\r\nfastcall registers, i.e. RCX, RDX, R8 and R9 . However, the remaining two arguments have to be placed on the\r\nstack, above the homing space reserved for the 4 register arguments. Note that the top of our stack currently contains\r\nthe return address into an internal ntdll function, TppWorkpExecuteCallback , at offset 0x130. This is what the\r\ncall stack looks like when the callback function WorkCallback is called.\r\nNow here’s the catch. 
If you modify the top of the stack where the return address lies, add the homing space for the\r\n4 registers and add arguments to it, the whole stack frame will go for a toss and mess up stack unwinding. Thus\r\nwe have to modify the stack without changing the stack frame itself, by only changing the values within the\r\nstack frame. Each stack frame starts and ends at the blue line shown in the image above. Our stack frame for\r\nTppWorkpExecuteCallback has enough space within itself to hold 6 arguments. So our next step is to extract the\r\ndata from our NTALLOCATEVIRTUALMEMORY_ARGS structure and move it to the respective registers and stack slots. When\r\nwe call TpAllocWork , we pass the pointer to the NTALLOCATEVIRTUALMEMORY_ARGS structure as the context for the WorkCallback\r\nfunction, which means our pointer to the structure should be in the RDX register now. Each member of our structure occupies\r\nan 8-byte slot (on x64; the 4-byte ULONG is padded to 8 bytes, and on x86 the pointer-sized members would be 4 bytes). So, we will extract these QWORD values from the structure and\r\nmove them to RCX, RDX, R8, R9 and the remaining values onto the stack, just past the homing space. Under the Windows x64 calling\r\nconvention, the arguments of the MSDN prototype are assigned in order, the first four to registers and the rest to the stack:\r\n__kernel_entry NTSYSCALLAPI NTSTATUS NtAllocateVirtualMemory(\r\n [in] HANDLE ProcessHandle,\r\n [in, out] PVOID *BaseAddress,\r\n [in] ULONG_PTR ZeroBits,\r\n [in, out] PSIZE_T RegionSize,\r\n [in] ULONG AllocationType,\r\n [in] ULONG Protect\r\n);\r\nConverting this logic to assembly would look like:\r\nsection .text\r\nglobal WorkCallback\r\nWorkCallback:\r\n mov rbx, rdx\r\n mov rax, [rbx]\r\n mov rcx, [rbx + 0x8]\r\n mov rdx, [rbx + 0x10]\r\n xor r8, r8\r\n mov r9, [rbx + 0x18]\r\n mov r10, [rbx + 0x20]\r\n mov [rsp+0x30], r10\r\n mov r10, 0x3000\r\n mov [rsp+0x28], r10\r\n jmp rax\r\nTo explain the above code:\r\nWe first back up our pointer to the structure residing in the RDX register into the RBX register. 
We are\r\ndoing this because we are going to overwrite the RDX register with the second argument of\r\nNtAllocateVirtualMemory when we call it\r\nWe move the first 8 bytes from the address in the RBX register ( struct NTALLOCATEVIRTUALMEMORY_ARGS , i.e.\r\nUINT_PTR pNtAllocateVirtualMemory ) to the RAX register, which we will jump to later after adjusting the\r\narguments\r\nWe move the second set of 8 bytes ( HANDLE hProcess ) from the structure to RCX\r\nWe move the third set of 8 bytes, i.e. the pointer to a NULL pointer ( PVOID* address ) stored in the structure,\r\ninto RDX . This is where our allocated address will be written by NtAllocateVirtualMemory\r\nWe zero out the R8 register for the ULONG_PTR ZeroBits argument\r\nWe move the fourth set of 8 bytes ( PSIZE_T size ) from the structure to R9\r\nWe move the 6th argument, i.e. the last argument, which should go to the bottom of all arguments ( ULONG\r\nProtect , i.e. the page permissions ), to R10 and then write it to offset 0x30 from the stack pointer.\r\nTop of the stack = RSP = return address into TppWorkpExecuteCallback , which is 8 bytes\r\nHoming space size for 4 arguments = 4x8 = 32 bytes\r\nSpace for the 5th argument = 8 bytes\r\nThus 8+32 = 40 = 0x28 (return address plus homing space; this is where the 5th, second-to-last argument will go)\r\nThus 8+32+8 = 48 = 0x30 (this is where the 6th and last argument will go)\r\nWe finally move the 5th argument value ( ULONG AllocationType ), i.e. 0x3000 -\r\nMEM_COMMIT|MEM_RESERVE , to the R10 register and then write it to offset 0x28 from RSP\r\nCompiling it all together, this is what it looks like before jumping to NtAllocateVirtualMemory :\r\nThe disassembled code shows the asm instructions we wrote. The current instruction pointer is just after\r\nadjusting the stack and before jumping to NtAllocateVirtualMemory\r\nThe registers show the arguments for NtAllocateVirtualMemory\r\nThe Dump shows the NTALLOCATEVIRTUALMEMORY_ARGS structure in memory. 
Each 8-byte memory block\r\ncorresponds to one member of the structure\r\nThe stack shows the adjusted stack for NtAllocateVirtualMemory\r\nAnd a quick look at the stack after the execution of NtAllocateVirtualMemory shows a valid call stack which can\r\nbe unwound perfectly. You can also see that the syscall for NtAllocateVirtualMemory returned zero, which\r\nmeans the call was successful.\r\nThe stack is as clear as crystal again with no signs of anything malevolent. Note that this is not stack\r\nspoofing, because in our case the stack is unwound fully without crashing. There are many more such API\r\ncalls which can be used for proxying various functions, which I will leave to the readers and their own\r\ncreativity. The upcoming release of BRc4 will use something similar, but with a different set of API calls which are\r\nfully undocumented, and it will be under a different payload option called stealth++ . The full code for this can\r\nbe found in my GitHub repository.\r\nSource: https://0xdarkvortex.dev/hiding-in-plainsight/",
	"extraction_quality": 1,
	"language": "EN",
	"sources": [
		"Malpedia"
	],
	"references": [
		"https://0xdarkvortex.dev/hiding-in-plainsight/"
	],
	"report_names": [
		"hiding-in-plainsight"
	],
	"threat_actors": [],
	"ts_created_at": 1775439051,
	"ts_updated_at": 1775791282,
	"ts_creation_date": 0,
	"ts_modification_date": 0,
	"files": {
		"pdf": "https://archive.orkl.eu/33ab229144b54f339c90d652e9fc4be76adfdcf6.pdf",
		"text": "https://archive.orkl.eu/33ab229144b54f339c90d652e9fc4be76adfdcf6.txt",
		"img": "https://archive.orkl.eu/33ab229144b54f339c90d652e9fc4be76adfdcf6.jpg"
	}
}