1.3 Buffer Overflow
Last updated
Last updated
Buffer overflow is a condition in a program where a function attempts to copy more data into a buffer than it can hold. The extra data that cannot be stored then replaces or overrides another piece of data on the stack.
Suppose the computer allocates a buffer of 40 bytes (or pieces) of memory to store 10 integers (4 bytes per integer).
An attacker sends the computer 11 integers (a total of 44 bytes) as input.
Whatever was in the location after the ten 40 bytes (allocated for our buffer) gets overwritten with the 11th integer of our input.
Remember that the stack grows backward. Therefore, the data in the buffer is copied from the lowest memory addresses to the highest memory addresses.
One of the vulnerable functions in C is the strcpy()
function, which allows buffer overflows to happen because it has no limit on how big the copied data should be. Using this function, we can overwrite the stack and alter the code flow. The safer alternative to strcpy()
is strncpy()
.
In the provided example, it seems there are some typos and missing parts in the code. I'll correct and complete it for better understanding:
Now, let's review the stack frame process:
Push the function parameters.
Call the function.
Execute the prologue (which updates EBP and ESP to create the new stack frame).
Allocate local variables.
Before:
After:
As a penetration tester, you have the ability to substitute the EIP with the address of the payload you intend to execute. Familiarity with the memory addresses of specific registers is crucial for this task.
Furthermore, the accompanying Python helper script generates a payload consisting of 'A' characters and concatenates a particular address ("\x48\x15\x40") as the return address. This prepared payload is subsequently employed in a command to execute the susceptible program. It is crucial to emphasize that this illustration is intended solely for educational purposes and ethical hacking scenarios. Unsanctioned exploitation of software vulnerabilities is against the law. Always verify that you possess the necessary permissions before engaging in any penetration testing activities.
These operations are examples of potential vulnerabilities to buffer overflows:
strcpy
strcat
gets / fgets
scanf / fscanf
printf
vsprintf
memcpy
Any function that performs the following operations may be susceptible to buffer overflows:
Fails to properly validate inputs before processing
Does not check input boundaries
It's worth emphasizing that buffer overflows are primarily associated with unsafe languages. Interpreted languages like C#, Visual Basic, .Net, Java, etc., inherently provide greater safety against such vulnerabilities.
Furthermore, various buffer operations can potentially trigger buffer overflows. These operations include:
User input
Data loaded from a disk
Data from the network
Manual identification of buffer overflows can be time-consuming, but there are techniques to streamline the process, such as:
Employing dynamic analysis tools like fuzzers or tracers, which track all executions and data flow, aiding in the identification of problems.
Leveraging fuzzing, a software testing technique that involves supplying invalid or unexpected data as input to a program. Fuzzing helps identify issues such as memory hogging, CPU hogging, and crashes. Fuzzing frameworks, such as Peach Fuzzing Platform, Sulley, Sfuzz, and FileFuzz, can be used in this context.
It's essential to handle these techniques responsibly and ethically, ensuring that proper permissions are obtained before engaging in any penetration testing activities.
In the previous example, finding the right offset to overwrite the EIP address was straightforward. However, in a real exploitation scenario, the required number of characters to crash an application may vary.
To identify the correct offset, various methods like binary search can be employed, or tools can be utilized for efficiency. Two such tools are:
pattern_create and pattern_offset (Ruby):
pattern_create
takes a number as input and outputs a pattern of that length.
When this pattern is used as input for the vulnerable target application, an error message will indicate where the error occurred, typically the value of the EIP register being overwritten.
pattern_offset
receives a memory address and outputs the offset required to overwrite that address.
Mona (Python) - Immunity Debugger plugin:
Mona functions similarly to pattern_create
and pattern_offset
.
It is a plugin for Immunity Debugger and aids in the identification of offsets needed for exploitation.
These tools assist in automating the process of determining the offset for EIP overwrite, saving time and effort during the exploit development phase.
Now that we know the correct size of our payload, the next step is to overwrite the EIP with a value. This value will be utilized by the RET instruction to facilitate the return. The objective is to return to our shellcode, triggering its execution.
Since the address in the stack dynamically changes, we cannot rely on it to build the exploit. Instead, we aim to locate a JMP ESP (or CALL ESP) instruction in a fixed memory location. This allows the program, upon return, to execute the JMP ESP (or CALL ESP) instruction, automatically jumping to the area where our shellcode is stored.
In environments where ASLR is not enabled, kernel32.dll functions are located at fixed addresses in memory. We can leverage this to perform a JMP ESP or CALL ESP to the process address space within kernel32.dll. This technique enables a secure jump to and from kernel32.dll, ensuring execution at the address in ESP that holds the first line of our shell code.
Various tools and techniques can be used to detect the address of a CALL/JMP ESP. One approach involves disassembling the .dll and searching for the desired instruction. Tools like Immunity Debugger (or IDA) can be used for this purpose. Additionally, findjmp2 and Mona are other tools that can help locate CALL ESP and JMP ESP instructions.
Example: In the context of goodpwd.exe, the variable "command" is composed of junk bytes + EIP + shellcode. It's essential to note the inclusion of NOPs (\x90) at the beginning of the shellcode. This ensures that when the JMP ESP is executed, the program starts with NOPs and then progresses to execute the actual shellcode.
Program flow:
| | . . . | | |-|------------------------------| | |
(2)| junk |<-(1) We start injecting here| |I| junk | | |I| . . . | | |I| junk | | |I| junk | | |I| junk (EBP was here) | | |I-->| \x3B\x7D\x26\x77 |
(3)-> (kernelbase.dll) 0x77267D3B; JMP ESP| |ESP->| shellcode (\x90\x90\x90\x90) |<--I
(4) EIP will go here, because
(3)| | | shellcode (\x90\x90\x90\x90) | | | | shellcode (...) | | |
Use Windows XP to simulate buffer overflow without DEP and ASLR protection
EMET (Enhanced Mitigation Experience Toolkit) (Manual): EMET is a utility designed to prevent software vulnerabilities from being successfully exploited. It offers various mitigation technologies, including DEP, ASLR, SEHOP, and more. EMET can enhance system security or selectively disable these mitigations for testing purposes.
Goal: Introduce randomness to executables, libraries, and stack in the process address space, making it challenging for attackers to predict memory addresses.
ASLR Activation: OS loads executables at different locations in memory every time (on each boot).
Verification: Tools like Process Explorer or Immunity Debugger with Mona can be used to check ASLR status.
Non-Randomized Module:
Identify a module without ASLR and utilize a simple JMP/CALL ESP from that module.
Brute Force:
Overwrite the return pointer with plausible addresses until reaching the shellcode.
Success depends on exploit tolerance to variations in address space.
NOP-Sled:
Create a large area of NOPs in the shellcode, increasing chances to jump to the intended area.
Other Methods:
Techniques requiring expertise in reverse engineering and exploit writing.
ASLR and DEP: Achieve maximum defense when both are correctly implemented.
Detailed technical information on implementation here.
Goal: Prevent execution of code from non-executable memory pages.
Return Oriented Programming (ROP):
Utilize small instruction groups (gadgets) in the program's stack, creating a chain of instructions.
Gadgets end with RET, avoiding DEP restrictions.
Examples include calling memory protection functions or executing commands.
ASLR: Introduce randomness to kernel API addresses, making DEP bypass harder.
Security Implementation: Stack Canary places a random integer (canary) before the return instruction to detect buffer overflows.
Retrieve or Guess Canary:
Include canary value in the payload.
David Litchfield's method bypasses canary without guessing or retrieving the value.
SafeSEH:
Introduces a security measure to prevent bypassing by overwriting the Exception Handler Structure (SEH).
Learn more here and find a detailed article on bypassing stack canary here.