1.3 Buffer Overflow

What is a Binary Exploitation?

What is a Binary Exploitation?

Binary exploitation refers to the process of taking advantage of vulnerabilities in binary code, typically executable files, to gain unauthorized access, control, or manipulate a computer system. Binary exploitation is a common technique employed in the field of cybersecurity, particularly in the realm of offensive security and penetration testing.

Here's a breakdown of key concepts related to binary exploitation:

  1. Binary Code: Computers execute instructions in the form of binary code, which consists of sequences of 0s and 1s. Executable files, such as those with extensions like .exe or .elf, contain binary code that the computer's processor can execute.

  2. Vulnerabilities: Binary exploitation often involves identifying and exploiting vulnerabilities in software. These vulnerabilities can range from buffer overflows and format string vulnerabilities to insecure input handling or poor memory management.

  3. Exploits: An exploit is a piece of code or a series of commands that takes advantage of a specific vulnerability to compromise a system's security. Exploits are crafted to manipulate the binary code in a way that allows an attacker to achieve their objectives, such as gaining unauthorized access or executing arbitrary code.

  4. Buffer Overflow: One common type of vulnerability is a buffer overflow, where an attacker overflows a program's buffer to overwrite adjacent memory, potentially leading to the execution of malicious code.

  5. Shellcode: In binary exploitation, shellcode is a small piece of code that is often injected into a vulnerable program to spawn a shell. This shell provides the attacker with a command-line interface and control over the compromised system.

  6. ROP (Return-Oriented Programming): ROP is a technique used in binary exploitation where the attacker combines short sequences of existing code (gadgets) to perform specific actions without injecting new code. This is often used to bypass security mechanisms like Data Execution Prevention (DEP).

Binary exploitation is a skill often honed by security professionals and ethical hackers to identify and address security weaknesses in software. However, it can also be misused by malicious actors for unauthorized access and attacks. It's crucial for software developers to be aware of secure coding practices to minimize the risk of vulnerabilities in their applications.

Buffer Overflow (BoF)

Buffer overflow is a condition in a program where a function attempts to copy more data into a buffer than it can hold. The extra data that cannot be stored then replaces or overrides another piece of data on the stack.

Suppose the computer allocates a buffer of 40 bytes (or pieces) of memory to store 10 integers (4 bytes per integer).

An attacker sends the computer 11 integers (a total of 44 bytes) as input.

Whatever was in the location after the ten 40 bytes (allocated for our buffer) gets overwritten with the 11th integer of our input.

Remember that the stack grows backward. Therefore, the data in the buffer is copied from the lowest memory addresses to the highest memory addresses.

One of the vulnerable functions in C is the strcpy() function, which allows buffer overflows to happen because it has no limit on how big the copied data should be. Using this function, we can overwrite the stack and alter the code flow. The safer alternative to strcpy() is strncpy().

In the provided example, it seems there are some typos and missing parts in the code. I'll correct and complete it for better understanding:

#include <string.h>

int main(int argc, char** argv) {
    argv[1] = (char*)"AAABBBCCCDDDEEEFFFGGGHHHI";
    char buffer[10];
    strncpy(buffer, argv[1], sizeof(buffer));
    return 0;
}

Now, let's review the stack frame process:

  1. Push the function parameters.

  2. Call the function.

  3. Execute the prologue (which updates EBP and ESP to create the new stack frame).

  4. Allocate local variables.

Before:

...
Other local variables
Buffer[10]
EBP
Function return address (EIP)
Parameters of function
Local variables of main
Return address of main
Parameters of main
...

After:

...
Other local variables
AAAB
BBBC
CCDD
DEEE
EFFG
GGGH
HHI
...

As a penetration tester, you have the ability to substitute the EIP with the address of the payload you intend to execute. Familiarity with the memory addresses of specific registers is crucial for this task.

Furthermore, the accompanying Python helper script generates a payload consisting of 'A' characters and concatenates a particular address ("\x48\x15\x40") as the return address. This prepared payload is subsequently employed in a command to execute the susceptible program. It is crucial to emphasize that this illustration is intended solely for educational purposes and ethical hacking scenarios. Unsanctioned exploitation of software vulnerabilities is against the law. Always verify that you possess the necessary permissions before engaging in any penetration testing activities.

Finding BoF

These operations are examples of potential vulnerabilities to buffer overflows:

  • strcpy

  • strcat

  • gets / fgets

  • scanf / fscanf

  • printf

  • vsprintf

  • memcpy

Any function that performs the following operations may be susceptible to buffer overflows:

  • Fails to properly validate inputs before processing

  • Does not check input boundaries

It's worth emphasizing that buffer overflows are primarily associated with unsafe languages. Interpreted languages like C#, Visual Basic, .Net, Java, etc., inherently provide greater safety against such vulnerabilities.

Furthermore, various buffer operations can potentially trigger buffer overflows. These operations include:

  • User input

  • Data loaded from a disk

  • Data from the network

Manual identification of buffer overflows can be time-consuming, but there are techniques to streamline the process, such as:

  1. Utilizing static analysis tools like splint, Cppcheck, etc., especially if you have access to the source code. These tools can detect not only buffer overflows but also other types of errors.

  2. Employing dynamic analysis tools like fuzzers or tracers, which track all executions and data flow, aiding in the identification of problems.

  3. Leveraging fuzzing, a software testing technique that involves supplying invalid or unexpected data as input to a program. Fuzzing helps identify issues such as memory hogging, CPU hogging, and crashes. Fuzzing frameworks, such as Peach Fuzzing Platform, Sulley, Sfuzz, and FileFuzz, can be used in this context.

It's essential to handle these techniques responsibly and ethically, ensuring that proper permissions are obtained before engaging in any penetration testing activities.

Static Analysis Tools

http://www.splint.org/
http://cppcheck.sourceforge.net/

Fuzzing Frameworks

http://peachfuzzer.com/
https://github.com/OpenRCE/sulley
https://github.com/orgcandman/Simple-Fuzzer

Exploiting BoF

Finding the Right Offset

In the previous example, finding the right offset to overwrite the EIP address was straightforward. However, in a real exploitation scenario, the required number of characters to crash an application may vary.

To identify the correct offset, various methods like binary search can be employed, or tools can be utilized for efficiency. Two such tools are:

  1. pattern_create and pattern_offset (Ruby):

    • pattern_create takes a number as input and outputs a pattern of that length.

    • When this pattern is used as input for the vulnerable target application, an error message will indicate where the error occurred, typically the value of the EIP register being overwritten.

    • pattern_offset receives a memory address and outputs the offset required to overwrite that address.

  2. Mona (Python) - Immunity Debugger plugin:

    • Mona functions similarly to pattern_create and pattern_offset.

    • It is a plugin for Immunity Debugger and aids in the identification of offsets needed for exploitation.

These tools assist in automating the process of determining the offset for EIP overwrite, saving time and effort during the exploit development phase.

Overwriting the EIP

Now that we know the correct size of our payload, the next step is to overwrite the EIP with a value. This value will be utilized by the RET instruction to facilitate the return. The objective is to return to our shellcode, triggering its execution.

Since the address in the stack dynamically changes, we cannot rely on it to build the exploit. Instead, we aim to locate a JMP ESP (or CALL ESP) instruction in a fixed memory location. This allows the program, upon return, to execute the JMP ESP (or CALL ESP) instruction, automatically jumping to the area where our shellcode is stored.

In environments where ASLR is not enabled, kernel32.dll functions are located at fixed addresses in memory. We can leverage this to perform a JMP ESP or CALL ESP to the process address space within kernel32.dll. This technique enables a secure jump to and from kernel32.dll, ensuring execution at the address in ESP that holds the first line of our shell code.

Various tools and techniques can be used to detect the address of a CALL/JMP ESP. One approach involves disassembling the .dll and searching for the desired instruction. Tools like Immunity Debugger (or IDA) can be used for this purpose. Additionally, findjmp2 and Mona are other tools that can help locate CALL ESP and JMP ESP instructions.

Example: In the context of goodpwd.exe, the variable "command" is composed of junk bytes + EIP + shellcode. It's essential to note the inclusion of NOPs (\x90) at the beginning of the shellcode. This ensures that when the JMP ESP is executed, the program starts with NOPs and then progresses to execute the actual shellcode.

Program flow:

| | . . . | | |-|------------------------------| | |

(2)| junk |<-(1) We start injecting here| |I| junk | | |I| . . . | | |I| junk | | |I| junk | | |I| junk (EBP was here) | | |I-->| \x3B\x7D\x26\x77 |

(3)-> (kernelbase.dll) 0x77267D3B; JMP ESP| |ESP->| shellcode (\x90\x90\x90\x90) |<--I

(4) EIP will go here, because

(3)| | | shellcode (\x90\x90\x90\x90) | | | | shellcode (...) | | |

Use Windows XP to simulate buffer overflow without DEP and ASLR protection

Security Implementations

Helpful Tool

EMET (Enhanced Mitigation Experience Toolkit) (Manual): EMET is a utility designed to prevent software vulnerabilities from being successfully exploited. It offers various mitigation technologies, including DEP, ASLR, SEHOP, and more. EMET can enhance system security or selectively disable these mitigations for testing purposes.

ASLR (Address Space Layout Randomization)

Goal: Introduce randomness to executables, libraries, and stack in the process address space, making it challenging for attackers to predict memory addresses.

  • ASLR Activation: OS loads executables at different locations in memory every time (on each boot).

  • Verification: Tools like Process Explorer or Immunity Debugger with Mona can be used to check ASLR status.

Bypassing Techniques

  1. Non-Randomized Module:

    • Identify a module without ASLR and utilize a simple JMP/CALL ESP from that module.

  2. Brute Force:

    • Overwrite the return pointer with plausible addresses until reaching the shellcode.

    • Success depends on exploit tolerance to variations in address space.

  3. NOP-Sled:

    • Create a large area of NOPs in the shellcode, increasing chances to jump to the intended area.

  4. Other Methods:

    • Techniques requiring expertise in reverse engineering and exploit writing.

Protective Measures

  • ASLR and DEP: Achieve maximum defense when both are correctly implemented.

  • Detailed technical information on implementation here.

Data Execution Prevention (DEP)

  • Goal: Prevent execution of code from non-executable memory pages.

Bypassing Technique

  • Return Oriented Programming (ROP):

    • Utilize small instruction groups (gadgets) in the program's stack, creating a chain of instructions.

    • Gadgets end with RET, avoiding DEP restrictions.

    • Examples include calling memory protection functions or executing commands.

Protective Measures

  • ASLR: Introduce randomness to kernel API addresses, making DEP bypass harder.

Stack Canary and SafeSEH

  • Security Implementation: Stack Canary places a random integer (canary) before the return instruction to detect buffer overflows.

Bypassing Technique

  • Retrieve or Guess Canary:

    • Include canary value in the payload.

    • David Litchfield's method bypasses canary without guessing or retrieving the value.

Protective Measures

  • SafeSEH:

    • Introduces a security measure to prevent bypassing by overwriting the Exception Handler Structure (SEH).

    • Learn more here and find a detailed article on bypassing stack canary here.

Labs and Exercises

Other Resources

Last updated