TTPs: Embedding Payloads with MSFVenom (x86)

An in-depth analysis of the mechanics behind embedded payloads using MSFVenom

Part One: Introduction

Msfvenom is a popular payload generation tool that provides a lot of capability as part of the Metasploit Framework. One of the lesser-known features of this tool is its ability to embed payloads into existing executables. This additional capability can mask the payload from target end users and facilitate initial or lateral access. In this article, we'll take a look at how this tool works, why it works, and how to make this tool less detectable in your engagements.

Part Two: Getting Started

Before we get started, we're going to need a "good" program that we'll embed an msfvenom payload into later. You're free to develop your own, or use the template here:

/* 
* By 0xTriboulet
* "good.exe" program
* 12/31/22
* compile with: i686-w64-mingw32-g++ good.cpp -o good.exe -Wl,-subsystem,windows -ansi
*/

#include <stdio.h>
#include <windows.h> // lowercase so the include also resolves on case-sensitive filesystems

#pragma comment(lib, "user32.lib")

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, 
    LPSTR lpCmdLine, int nCmdShow) {
	MessageBox(NULL, "This is a safe program!", "Safe!", 0x0);
	return 0;
}

We're using x86 in this example because the documentation states that this should allow our payload to run on a new thread, and that will help us see a couple of pretty important things.

Now that we have our good program, let's move it over to our Kali machine so that the fun part can begin.

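Msfvenom makes it super easy to embed payloads. I used a standard calc payload, but any payload of your choosing should work just the same. In the command below, -x names the template executable, -k keeps the template's original behavior and runs the payload in a new thread, -f sets the output format, and -o sets the output file.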

msfvenom -p windows/exec CMD="calc.exe" -x good.exe -f exe -k -o even_better.exe EXITFUNC=thread

Now if we run even_better.exe, we'll see that we get both our safe dialog box and our calculator payload running.

Part Three: Where the fun really starts

Let's open good.exe and even_better.exe in Binary Ninja and really get started.

Using Binary Ninja's decompiler, we quickly find WinMain of our good.exe program.

Normally we wouldn't put too much weight on the output of a decompiler, but the functionality of our program is straightforward enough for a decompiler to piece together, so we see some pretty reliable results.

Let's take a look at even_better.exe. If you were paying attention to the last few screenshots, you noticed that even_better.exe is smaller than good.exe. That's interesting. Shouldn't even_better.exe have MORE code since we injected a payload into it? Let's look around inside even_better.exe and see what exactly is happening.

We once again find our MessageBox, but this time the WinMain function doesn't have a name. In fact, looking at these two programs side by side, we notice that ALL the symbol names were stripped from our program when we injected our payload.

If we recompile good.exe with the "-s" option, the compiler will strip a lot of information out of our program and actually give us an executable coming in at 15KB, smaller than the one we got from msfvenom. Now that we've solved that mystery, let's move on.

After some quick poking around, we find the payload in even_better.exe.

We're not going to go over how the payload works in this article. What we're interested in is HOW eip got to this point, and we have a hint immediately above our payload's location, where we see the strings "kernel32" and "CreateThread".

Part Four: Good vs Even Better

If we take a look at the _start() function in good.exe, it's pretty barebones. It has minimal functionality and really just serves as a trampoline into some startup functions for our program.

But in even_better.exe, there's a whole lot more going on.

Rather than just jumping to ___tmainCRTStartup, this function loads kernel32 and gets the address of CreateThread.

The next part of the code looks a little funky, so it's worth talking about in detail. If you take a close look at the screenshot above, the first thing our function does is "pushad". This instruction pushes all eight general-purpose registers onto the stack, saving their state before the rest of the function runs.

Based on the x86 calling conventions, we know that once we return from our call to GetProcAddress, the address we're interested in will be stored in eax. That means when we call eax at 0x00412029, we're going to be calling CreateThread.
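A changed entry point should also be visible in the PE headers, without a disassembler. The sketch below is a minimal, portable PE32 header reader that reports the image base, the entry point RVA, and the section table for an image already loaded into a byte buffer. The names SectionInfo, PeSummary, readLe, and summarizePe32 are hypothetical helpers made up for this illustration, and it only handles 32-bit images like the ones in this article.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper types for this sketch; they are not part of any library.
struct SectionInfo {
    std::string name;
    uint32_t virtualAddress;
    uint32_t virtualSize;
};

struct PeSummary {
    uint32_t imageBase;
    uint32_t entryPointRva;
    std::vector<SectionInfo> sections;
};

// Read an n-byte little-endian value at offset off, with bounds checking.
static uint32_t readLe(const std::vector<uint8_t>& b, std::size_t off, std::size_t n) {
    assert(n >= 1 && n <= 4);
    if (off > b.size() || n > b.size() - off) throw std::runtime_error("truncated image");
    uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v |= static_cast<uint32_t>(b[off + i]) << (8 * i);
    }
    return v;
}

// Summarize the headers of a 32-bit (PE32) image held in memory.
PeSummary summarizePe32(const std::vector<uint8_t>& b) {
    if (readLe(b, 0, 2) != 0x5A4D) throw std::runtime_error("missing MZ signature");
    const std::size_t pe = readLe(b, 0x3C, 4);            // e_lfanew
    if (readLe(b, pe, 4) != 0x00004550) throw std::runtime_error("missing PE signature");

    const std::size_t coff = pe + 4;                       // COFF file header (20 bytes)
    const uint32_t sectionCount = readLe(b, coff + 2, 2);
    const uint32_t optSize = readLe(b, coff + 16, 2);
    const std::size_t opt = coff + 20;                     // optional header
    if (readLe(b, opt, 2) != 0x10B) throw std::runtime_error("not a PE32 image");

    PeSummary s;
    s.entryPointRva = readLe(b, opt + 16, 4);              // AddressOfEntryPoint
    s.imageBase = readLe(b, opt + 28, 4);                  // ImageBase (PE32 layout)

    const std::size_t table = opt + optSize;               // section table, 40 bytes per entry
    for (uint32_t i = 0; i < sectionCount; ++i) {
        const std::size_t h = table + 40u * i;
        SectionInfo sec;
        sec.virtualSize = readLe(b, h + 8, 4);
        sec.virtualAddress = readLe(b, h + 12, 4);         // also proves the 8 name bytes are in range
        for (std::size_t j = 0; j < 8 && b[h + j] != 0; ++j) {
            sec.name.push_back(static_cast<char>(b[h + j]));
        }
        s.sections.push_back(sec);
    }
    return s;
}
```

Reading good.exe and even_better.exe into byte vectors and comparing the two summaries should show whether the entry point moved and which section it now falls in. Adding imageBase to entryPointRva gives the address a disassembler will display for the entry point.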

So far, our understanding of this function looks like this:

Load kernel32.dll -> GetProcAddress of CreateThread -> Call CreateThread

That's a pretty decent understanding, but we're missing a couple of key steps.

At 0x00412018, we load the address of our payload into edx. Next, a bunch of zeros get pushed onto the stack. That might look weird if you're used to working in x86_64 assembly, where the first few arguments go in registers, but in 32-bit x86 Windows API calls the arguments are passed on the stack. Additionally, the CreateThread function allows several of its arguments to be passed in as null or 0x0. If we look at the definition of CreateThread on MSDN, we see that the arguments are pushed in the reverse of their declared order.

The first push we see, at 0x41201e, is the optional lpThreadId argument.

The second push is the dwCreationFlags argument, which is listed as mandatory but accepts 0x0 as a valid input.

Then the third push is the lpParameter, which is also optional.

The fourth push puts the address of our payload onto the stack as the lpStartAddress of CreateThread. That all makes sense as valid input to CreateThread.

The last two pushes, dwStackSize and lpThreadAttributes, are null, and we don't really have to worry about them. The only potentially confusing thing is that dwStackSize can be passed in as 0x0, which tells CreateThread to use the default stack size.
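To make the reversed order concrete, here is a small portable sketch that models the stack as a vector and replays the six pushes described above. Stack32 and pushCreateThreadArgs are hypothetical names invented for this illustration, and the payload address passed in is a placeholder, not a value taken from the binary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of a 32-bit stack: the most recent push is the dword at [esp].
class Stack32 {
public:
    void push(uint32_t value) { slots_.push_back(value); }

    // Value at [esp + 4*n], counting from the most recent push.
    uint32_t at(std::size_t n) const {
        assert(n < slots_.size());
        return slots_[slots_.size() - 1 - n];
    }

    std::size_t depth() const { return slots_.size(); }

private:
    std::vector<uint32_t> slots_;
};

// Push CreateThread's six arguments right to left, mirroring the order of the
// pushes walked through above. payloadAddress is a hypothetical placeholder.
Stack32 pushCreateThreadArgs(uint32_t payloadAddress) {
    Stack32 st;
    st.push(0);               // lpThreadId         (6th parameter, pushed first)
    st.push(0);               // dwCreationFlags    (5th): 0 = run immediately
    st.push(0);               // lpParameter        (4th)
    st.push(payloadAddress);  // lpStartAddress     (3rd)
    st.push(0);               // dwStackSize        (2nd): 0 = default size
    st.push(0);               // lpThreadAttributes (1st parameter, pushed last)
    return st;
}
```

Just before the call instruction, at(0) holds lpThreadAttributes and at(2) holds lpStartAddress, so reading upward from the top of the stack gives the arguments in their declared order. The call itself then pushes a return address on top, which this model leaves out.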

What IS important is that after CreateThread starts our payload on a new thread, we return to this function and restore the registers to their initial state. Once we've done that, our program jumps to its proper initialization instructions and continues normal execution.

If we take a peek at the source code for Metasploit available on GitHub, we'll see that this is exactly the functionality we can expect out of the segment injector.

Part Five: Making this sneakier

Now that we've seen how and why this works, let's take a look at how detectable this is. Once again, I'll be using ThreatCheck to establish my baseline for detectability and work through any issues we might find.

This first scan tells us that this method of redirecting code flow is detected right off the bat. The detected bytes correspond to the code we just reviewed!

Thankfully, breaking this signature proves relatively easy once we modify the string from "kernel32" to "KERnel32".

Part Six: Conclusion

In this writeup, we saw the steps taken by msfvenom to embed a payload into an existing executable and saw the effects of an embedded payload in action. This is a robust capability that the Metasploit Framework provides to the security industry, though the stubs used to deliver the payload are signatured.

We also saw some hints as to why the documentation states the "-k" option is less reliable on newer systems. On x86_64 specifically, the pushad and popad instructions are not available in 64-bit mode. This means that saving and restoring the registers is at the discretion of the developer. This makes automated code injection techniques like the one discussed in this article more complicated, and less likely to succeed while retaining the original program's functionality.

Future work in this area should focus on improving the reliability of embedded payloads, both in the execution of the payload itself and in retaining the original program's functionality, as well as in more robust methods and techniques of overcoming AV detection.