TTPs: JmpNoCall
A proof of concept demonstration of custom payload and implant implementations that results in clean call stack execution of malicious code
Over the past couple of weeks, there has been some interesting work regarding call stack tracing evasion by @NinjaParanoid. His technique used some cool APIs and callback functions to achieve clean call stacks and reduce detectability.
The problem is that most implant implementations execute code out of allocated RX sections of memory. EDR can detect this when API calls or syscalls return into those sections.
That got me interested in the topic, but I wanted a more customized solution. A sufficiently advanced threat actor is likely using custom payloads to execute tailored actions specific to their campaign, so we're going to use a different technique to achieve clean call stacks.
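To make the detection idea concrete, here is a toy model of the check from the defender's side. It is only a sketch: the `Region` type, the `image_backed` flag, and the addresses are all hypothetical, and real EDR products use their own telemetry and heuristics. The assumption is that a return address is treated as suspicious when it does not land inside memory mapped from a module image.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    start: int          # first address in the region
    end: int            # one past the last address (exclusive)
    image_backed: bool  # assumption: True if mapped from a module on disk


def region_for(addr: int, regions: list) -> Optional[Region]:
    """Return the region containing addr, or None if it is unmapped."""
    for region in regions:
        if region.start <= addr < region.end:
            return region
    return None


def suspicious_frames(return_addresses: list, regions: list) -> list:
    """Flag return addresses that land outside image-backed memory."""
    flagged = []
    for addr in return_addresses:
        region = region_for(addr, regions)
        if region is None or not region.image_backed:
            flagged.append(addr)
    return flagged


# Hypothetical memory layout, for illustration only.
REGIONS = [
    Region(0x7FF600001000, 0x7FF600010000, True),  # a module's .text section
    Region(0x2000000, 0x2001000, False),           # allocated RX memory
]
```

In this model, a call stack whose return addresses all fall inside the first region produces no findings, which is the property the rest of this post is trying to achieve.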
The technique I developed uses assembly ramps to jmp to our functions, without using the "call" instruction. We're going to do this by using a combination of inline assembly, an assembly onRamp, and a custom payload.
Note: for demonstration purposes, the allocated section of memory we'll use in this writeup uses RWX permissions, but the final code available on my GitHub implements this technique with RX permissions.
So to start, we have to develop a way to get the address we want to return to at run time. We develop the following code and run it in x64dbg to validate that we are capturing the correct address:
And x64dbg shows us that our technique works!
We can build a rudimentary assembly onRamp to transfer execution to our payload:
We can implement an implant that uses this ramp and a standard msfvenom calc payload like so:
But even though we can achieve payload execution, we are not able to recover cleanly.
This is because the msfvenom payload we're using does not clean up the stack and return properly. There's another issue with this payload. The msfvenom payload uses several "call" opcodes that are going to be problematic for our call stack sanitization, no matter how clever we are with our onRamp.
Luckily for us, there is a robust calc payload implementation developed by @0xboku that we can use and customize with nasm so that we can achieve clean call stack execution.
Now that we have a robust method of payload execution, we can build custom on/off ramps to achieve clean call stack code execution, and customize our payload to leverage the ramps.
This custom payload only has two "call" instructions, so we should be able to quickly patch those to achieve code execution without valid call stack traces! We also see that in its existing implementation, we execute our payload.
If we take a look at the x64 calling convention, we can see that several registers are listed as nonvolatile, which means we can expect any function call we make to preserve the value(s) stored in those registers.
Now that we know that, we also know that our payload does not use the r13 and r15 registers at any point, which makes them the perfect places to store the values we need in order to return.
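The register choice can be written down as a small set calculation. This is a sketch, not part of the original tooling: the nonvolatile list follows the Microsoft x64 calling convention for general-purpose registers, while the `PAYLOAD_USES` set is a hypothetical stand-in for whatever an audit of the payload's assembly turns up (the post only tells us that r14 is used and that r13 and r15 are not).

```python
# General-purpose registers the Microsoft x64 calling convention
# requires a callee to preserve.
NONVOLATILE_GPRS = {"rbx", "rbp", "rdi", "rsi", "rsp", "r12", "r13", "r14", "r15"}

# Assumption: the stack and frame pointers are never candidates for storage.
RESERVED = {"rsp", "rbp"}


def free_nonvolatile(used_by_payload: set) -> list:
    """Nonvolatile registers that the payload never touches."""
    return sorted(NONVOLATILE_GPRS - RESERVED - set(used_by_payload))


# Hypothetical audit result; only r14 (used) and r13/r15 (unused)
# are taken from the post itself.
PAYLOAD_USES = {"rax", "rcx", "rdx", "rbx", "rsi", "rdi", "r8", "r9", "r12", "r14"}

print(free_nonvolatile(PAYLOAD_USES))
```

With that hypothetical audit, the only candidates left are r13 and r15, which matches the choice made here.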
Once we've built the on ramp, we need to make a couple of modifications to our payload in order to retain its functionality.
For the time being, we'll use an interrupt offRamp.
The first is replacing the "call r14" instruction with a push + jmp instruction pair.
The second change is to patch up the other call instruction. All the changes are visible in the screenshot below. It's important to remember that you'll have to change some prologues in order to keep the stack organized after removing the call instructions.
And if we run this code, we see that it works!
Now, our offRamp function needs to clean up the stack. Currently the stack looks like this:
But we know that we can pop everything off the stack until rsp = r13, so let's implement that in our offRamp function.
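The cleanup rule is easy to reason about with a toy model before writing it in assembly. The sketch below is Python, not the offRamp itself, and it assumes that r13 holds the stack pointer value saved on the way in. The stack is modeled as a dictionary of hypothetical 8-byte slots, and each pop moves rsp up by 8.

```python
def off_ramp_cleanup(stack: dict, rsp: int, r13: int):
    """Pop 8-byte slots until rsp equals the saved value in r13.

    Returns the final rsp and the list of discarded values.
    """
    if rsp > r13 or (r13 - rsp) % 8 != 0:
        # The loop would never terminate; the saved value is not
        # reachable by popping whole slots.
        raise ValueError("r13 is not reachable from rsp by popping")
    discarded = []
    while rsp != r13:
        discarded.append(stack[rsp])  # value a pop would remove
        rsp += 8                      # pop grows rsp by one slot
    return rsp, discarded


# Hypothetical stack contents, for illustration only.
STACK = {
    0x1000: "leftover A",
    0x1008: "leftover B",
    0x1010: "return address",
}

final_rsp, dropped = off_ramp_cleanup(STACK, rsp=0x1000, r13=0x1010)
```

After the loop, rsp points at the slot that was on top of the stack when r13 was saved, which is the state needed before returning to main().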
Now when we land on the interrupt, our implant is ready to return to main().
Our final version of offRamp looks like this:
Based on our source code, we know that if we succeeded, we should see the "Exiting implant…" message in our console. And we have a working program!
Scrutinizing our stack throughout program execution, we can see that x64dbg correctly shows that we made this call from our payload at 0x…2850, but we're returning to a regular .text section of memory!
This methodology can be a little clunky, but it provides a lot of non-standard functionality that may not be immediately obvious. The onRamp() function takes in a return_address variable that could have been pulled from the stack because we properly call onRamp(). That technique is certainly viable, but by deliberately passing in our desired return_address as a function argument we can actually return anywhere that we want and thereby obfuscate control flow analysis of our program.
The technique is not perfect, and it requires custom payloads that work in concert with the onRamp/offRamp functions in order to function properly. This technique will probably never gain significant mainstream attention because of those limitations. However, it's still a cool technique and it's something very possible to implement using the methods above.