1.1 Architecture Foundamentals

Architecture Foundamentals

CPU, ISA, and Assembly

The CPU serves as the component responsible for executing a program's machine code.

CPU instructions are typically represented in hexadecimal (HEX) format. Due to its intricacy, using it in its natural form is impractical for humans.

Hence, the code is translated into a more readable form known as assembly language (ASM). Two widely used assemblers are NASM (Netwide Assembler) and MASM (Microsoft Macro Assembler). In this case, we will be using NASM.

Here is an example of machine language code accompanied by its corresponding assembly code:

helloworld.exe:
Dissasembly of section .text:
00401500 <_main>:
  401500:  55                    push ebp
  401501:  89 e5                 mov ebp, esp
  401503:  83 e4 f0              and esp, 0xffffff0
  401506:  83 ec 10              sub esp, 0x10
  401509:  e8 72 09 00 00        call 401e80 <___main>
  40150e:  c7 04 24 00 40 40 00  mov DWORD PTR [esp],0x404000
  401515:  e8 de 10 00 00        call 4025f8 <_puts>
  40151a:  b8 00 00 00           mov eax,0x0
  40151f:  c9                    leave
  401520:  c3                    ret

Note: Every CPU possesses its own Instruction Set Architecture (ISA).

The ISA constitutes the set of instructions that a programmer or compiler must comprehend and utilize to accurately write a program tailored for that specific CPU and machine. In essence, ISA encompasses what a programmer can perceive: memory, registers, instructions, etc. It furnishes all the essential information for those interested in writing a program in that machine language.

One of the widely adopted ISAs is the x86 instruction set, or architecture, which originated from the Intel 8086. The term 'x86' denotes 32-bit processors, while 'x64' (also known as x86_64 or AMD64) signifies the 64-bit versions.

Registers

The bit count, whether 32 or 64, pertains to the width of the CPU registers.

Every CPU is equipped with a predefined set of registers that it accesses as needed. Registers can be visualized as temporary variables employed by the CPU to retrieve and store data.

Registers are broadly categorized into two types: specific-purpose CPU registers and general-purpose registers (GPRs).

The table below provides a summary of the eight general-purpose registers for the x86 architecture:"

x86 Naming ConventionNamePurpose

eax

Accumulator

Used in arithmetic operation

ecx

Counter

Used in shift/rotate instruction and loops

edx

Data

Used in arithmetic operation and I/O

ebx

Base

Used as a pointer to data

esp

Stack Pointer

Pointer to the top of the stack

ebp

Base Pointer

Pointer to the base of the stack (aka Stack Base Pointer)

esi

Source Index

Used as a pointer to a source in stream operation

edi

Destination

Used as a pointer to a destination in stream operation

The naming convention of the old 8-bit CPU had 16-bit register divided into 2 parts:

  • In the 16-bit naming convention, the combination of the Low (L) and High (H) parts is replaced with an X. However, for Stack Pointer, Base Pointer, Source, and Destination registers, the Low part is simply removed.

  • In the 32-bit representation, the register acronym is prefixed with E, indicating extended. In the 64-bit representation, E is replaced with R.

Summary:

Note : size is in the table not in proportion of the actual register size
------------------------------------------------------------
| Register |Accumulator|  Counter  |    Data   |    Base   |
|----------|-----------|-----------|-----------|-----------|
| 64-bit   |    RAX    |    RCX    |    RDX    |    RBX    |
| 32-bit   |   |  EAX  |   |  ECX  |   |  EDX  |   |  EBX  |
| 16-bit   |     | AX  |     | CX  |     | DX  |     | BX  |
| 8-bit    |     |AH|AL|     |CH|CL|     |DH|DL|     |BH|BL|
------------------------------------------------------------
| Register | Stack Ptr |  Base Ptr |  Source   |Destination|
|----------|-----------|-----------|-----------|-----------|
| 64-bit   |    RSP    |    RBP    |    RSI    |    RDI    |
| 32-bit   |   |  ESP  |   |  EBP  |  |   ESI  |   |  EDI  |
| 16-bit   |     | SP  |     | BP  |     | SI  |     | DI  |
| 8-bit    |     | |SPL|     | |BPL|     | |SIL|     | |DIL|
------------------------------------------------------------

In addition to these general-purpose registers, another register crucial for our purposes is the EIP, following the x86 naming convention. The Instruction Pointer (EIP) plays a pivotal role in controlling program execution by storing a pointer to the address of the next instruction (machine code) that will be executed.

Process Memory

The process memory is segmented into four regions: Text, Data, Heap, and Stack.

0 (Lower memory addresses)

| .text | Instructions |-------|
| .data | Initialized variables |
| BSS   | Uninitialized variables |
| Heap  | 
|       |
| ^     |
| Stack |
0xFFFFFFFF (Higher memory addresses)
  • Text (Code) Region: This fixed segment of the program contains instructions and is marked as read-only since the program should not alter during execution.

  • Data Region: It is further divided into initialized data, which includes static and globally declared variables that are predefined and modifiable, and uninitialized data (BSS) for variables initialized to zero or without explicit initialization (e.g., static int t).

  • Heap: Located after the BSS segment, the Heap dynamically allocates memory during execution. The program can request additional space in memory through system calls like brk and sbrk, commonly used by malloc, realloc, and free.

  • Stack: Positioned as the last region in memory, the Stack is particularly significant for our purposes. It is crucial in managing program execution.

The Stack

The stack functions as a Last-In-First-Out (LIFO) block of memory situated in the higher part of memory. Conceptually, the stack can be envisioned as an array employed for preserving a function's return address, passing function arguments, and holding local variables.

The ESP register (Stack Pointer) serves the purpose of indicating the top of the stack. It undergoes modification each time a value is pushed onto (PUSH) or popped off (POP) the stack.

Crucially, the stack grows in a downward direction, toward lower memory addresses.

https://miro.medium.com/v2/resize:fit:1400/1*tQ9Y11OdaMnhXwbfCF-edA.gif

PUSH

When a PUSH instruction is executed, the ESP register undergoes modification.

  • Starting Value: Initially, the ESP points to the top of the stack.

  • Process: During the PUSH process, the ESP subtracts 4 (in 32-bit) or 8 (in 64-bit) and then writes the data to the memory address at the updated ESP. Subsequently, the ESP is updated to point to the new top of the stack. It's crucial to note that since the stack grows downward, subtracting 4 or 8 is necessary. Failure to do so would overwrite the current location pointed to by ESP (the top), resulting in data loss.

POP

When a PUSH instruction is executed, the ESP register undergoes modification.

  • Starting Value: Initially, the ESP points to the top of the stack, which is the previous ESP value plus 4.

  • Process: The POP operation, in contrast to PUSH, retrieves data from the top of the stack and stores it in a specified register. Consequently, the data at the address location in ESP (representing the top of the stack) is retrieved and stored, usually in another register.

After a POP operation, the ESP value is incremented, typically by 4 in x86 or by 8 in x64.

Importantly, it should be noted that values in the stack are not deleted or zeroed; they are only dereferenced.

Procedures and functions

Procedures and function alter the normal flow of the process. When a function or procedure terminates, it returns control to the statement or instruction that called the function.

Stack Frames

The stack consists of logical stack frames (portions/areas of the stack) that are PUSH ed when calling a function and POP ped when returning a value.

When a subroutine, such a a function or procedure, is started, a stack frame is created and assigned to the current ESP location (top of the stack); this allows the subroutine to operate independently in its own location in the stack.

When subroutine ends, 2 things happen:

  1. The program receives the parameters passed from the subroutine

  2. The Instruction Pointer (EIP) is reset to the location at the time of the initial call.

In other words, the stack frame keeps track of the location where each subroutine should return the control when it terminates.

This process has 3 main operations:

  1. When a function is called, the arguments [(in brackets)] need to be evaluated

  2. The control flow jumps to the body of the function, and the program executes its code

  3. Once the function ends, a return statement is encountered, the program returns to the function call (the next statement in the code)

Example:

  1. Simple Example

int b()){
  return 0;
}
int a(){
  b();
  return 0;
}
int main(){
  a();
  return 0;
}

Memory Address

initial

main() calls a()

a() calls b()

return from b()

return from a()

Lower

Frame for b()

Frame for a()

Frame for a()

Frame for a()

Frame for main()

Frame for main()

Frame for main()

Frame for main()

Frame for main()

Higher

  1. Parameterized function

void functest(int a, int b, int c){
  int test1 = 55;
  int test2 = 56;
}

int main(int arc, char *argv[]){
  int x = 11;
  int z = 12;
  int y = 13;
  functest(30,31,32);
  return 0;
}
Memory AddressStep 1Step 2Step 3Step 4

Lower

old EBP

old EIP

old EIP

old EIP

argc

argc

argc

argc

argv

argv

argv

argv

...

...

...

...

...

Higher

Step 1: Upon program initiation, the function main() parameters (argc, argv) are pushed onto the stack, following a right-to-left order.

Step 2: The program proceeds to CALL the main() function. During this process, the processor PUSHes the content of the EIP (Instruction Pointer) onto the stack, indicating the address of the next instruction after the CALL. This is crucial for proper continuation upon returning from the called function.

Step 3: Control transitions from the caller (the instruction executing the function call, typically the operating system) to the callee (the called function, in this case, the main() function).

Step 4: Inside the main() function, a new stack frame must be established. This stack frame is defined by the EBP (Base Pointer) and the ESP (Stack Pointer). To preserve the information of the old stack frame and avoid its loss upon returning, the current EBP is saved on the stack. Without this step, there would be no reference to the information from the previous stack frame—the function that invoked main(). After storing its value, the EBP is updated and points to the top of the stack.

Prologue

The steps outlined in Example 2 are referred to as the prologue—a sequence of instructions that occurs at the commencement of a function and is common to all functions. Once the callee gains control, it executes the following instructions:

  1. push ebp: This instruction saves the old base pointer onto the stack.

  2. mov ebp, esp: The value of the stack pointer (ESP) is copied to the base pointer (EBP). This action establishes a new stack frame on top of the existing stack. The base of the new stack frame is positioned above the old stack frame.

  3. sub esp, X: The Stack Pointer (top of the stack) is moved by decreasing its value. This step is essential to allocate space for local variables that will be introduced.

The resultant stack configuration at the conclusion of this process is as follows:

FT

56

55

Old ebp (main)

M

old EIP

30

31

32

...

13 (y)

12 (z)

11 (x)

old EBP

OS

old EIP

argc

argv

...

FT : Functest stack M : Main stack OS : OS stack frame

Epilogue

When a code executes a return statement, control reverts to the previous procedure, leading to the destruction of the current stack. This process is known as the epilogue.

The operations performed during the epilogue are as follows:

  1. Return control to the caller.

  2. Replace the stack pointer with the current base pointer: This involves restoring the stack pointer's value to its state before the prologue. It is achieved by popping the base pointer from the stack.

  3. Return to the caller by popping the instruction pointer from the stack: This pointer, stored in the stack, is popped, and the program counter jumps to its location.

The following code snippet represents the epilogue:

codeleave
ret

Alternatively, it can be expressed as:

codemov esp, ebp
pop ebp
ret

When functest() concludes, the mov esp, ebp instruction copies the value of ebp into esp, effectively relocating esp. The pop ebp instruction pops the value from the top of the stack (pointed to by the new esp) and stores it in ebp. As the top of the stack points to the memory address where the old EBP is stored (the EBP of the caller), the caller's stack frame is restored.

The ret instruction pops the value from the top of the stack, which represents the old EIP (the next instruction after the caller). It then jumps to that location, relinquishing control back to the caller. The RET instruction only affects the EIP and ESP registers.

Endianess

Endianness is the way of representing (storing) values in memory.

In Big-Endian representation, the least significant byte (LSB) is stored at the highest memory address, while the most significant byte is stored at the lower memory address.

Example: For the value 0x12345678, it would be represented as follows in memory:

Memory Address:  0x1000   0x1001   0x1002   0x1003
                 +--------+--------+--------+--------+
Stored Value:    |  0x12  |  0x34  |  0x56  |  0x78  |
                 +--------+--------+--------+--------+

In this example, the byte 0x12 (the most significant byte) is stored at the lowest memory address (0x1000), and the byte 0x78 (the least significant byte) is stored at the highest memory address (0x1003).

Small-Endian: In small-endian representation, the least significant byte (LSB) is stored at the lower memory address, while the most significant byte is stored at the highest memory address.

Example:

Memory Address: 0028FEBC Data: 12 34 56 78

In this example:

  • Byte at 0028FEBC is 0x12 (the least significant byte).

  • Byte at 0028FEBC + 1 is 0x34.

  • Byte at 0028FEBC + 2 is 0x56.

  • Byte at 0028FEBC + 3 is 0x78 (the most significant byte).

So, the value 0x12345678 is stored in memory in a small-endian representation. When reading from left to right, you'd see the byte sequence 12 34 56 78.

NOP

NOP (No Operation) Instruction: NOP, which stands for "No Operation," is an assembly language instruction that essentially does nothing when executed. When a program encounters a NOP, it simply moves on to the next instruction in sequence. In Intel x86 CPUs, NOP instructions are represented with the hexadecimal value 0x90.

NOP Sled: A NOP sled, short for "No Operation sled," is a technique commonly used during the exploitation process of buffer overflows. Its primary purpose is to fill a significant portion of the stack (or other areas of memory) with NOP instructions. The NOP sled allows for a degree of flexibility in reaching a specific target instruction.

In the context of buffer overflows, a common scenario involves overwriting a return address on the stack. By placing a NOP sled before the actual malicious payload, the program's execution can slide through the NOPs until reaching the intended payload. This compensates for variations in the exact memory location of the injected code, providing a more robust and reliable exploitation technique.

Security Implementations

ASLR (Address Space Layout Randomization)

The primary objective of ASLR is to introduce randomness to the memory address space for executables, libraries, and stacks. This randomization enhances security by making it challenging for attackers to predict memory addresses accurately, thereby causing exploits to fail and resulting in crashes.

When ASLR is enabled, the operating system loads the same executables at different locations in memory each time a program is executed. This variability adds a layer of complexity for potential attackers attempting to exploit security vulnerabilities.

It's crucial to recognize that ASLR might not be applied uniformly across all modules within a process. Even if ASLR is enabled for a specific process, certain Dynamic Link Libraries (DLLs) in the address space may lack this protection, potentially exposing the process to ASLR bypass attacks.

For verifying the ASLR status on different programs, tools such as Process Explorer can be utilized.

https://learn.microsoft.com/en-us/sysinternals/downloads/process-explorer

Enhanced Mitigation Experience Toolkit (EMET)

Windows provides an additional tool to address exploitation concerns known as the Enhanced Mitigation Experience Toolkit (EMET). EMET is designed to provide supplementary layers of security by applying various mitigation techniques to help safeguard against both known and unknown exploits. It adds an extra defense layer, especially for legacy software or systems facing potential vulnerabilities.

https://msrc.microsoft.com/blog/2010/09/the-enhanced-mitigation-experience-toolkit-2-0-is-now-available/

DEP (Data Execution Prevention):

DEP is a security measure implemented in both hardware and software to thwart the execution of code from memory pages not explicitly designated as executable. By preventing the execution of injected code from non-executable regions, DEP adds a layer of defense that significantly complicates buffer overflow exploitations.

https://support.microsoft.com/en-us/help/875352/a-detailed-description-of-the-data-execution-prevention-dep-feature-in

Stack Cookies (Canary):

A stack cookie, also known as a canary, is a security mechanism that involves placing a specific value next to the return address on the stack.

During the function prologue, a value is loaded into this location, and in the epilogue, it ensures that the value remains unchanged. Consequently, when the epilogue runs, it verifies the presence and correctness of the value.

If the value is not as expected, it signals a potential buffer overflow. This is because buffer overflows typically involve overwriting data in the stack, including the canary value. The stack cookie thus acts as a sentinel, helping to detect and mitigate buffer overflow attacks.

Other Resources

Last updated