💻 Computer Science 101

I am a firm believer that theory (although sometimes boring) is an indispensable building block in understanding technical concepts, especially in a somewhat low-level field like binary exploitation.

In this chapter, we will cover the relevant fundamentals that will serve as the foundation for exploiting the challenges I'll be presenting and help us understand what’s under the hood and why we’re doing what we’re doing.

Computer architecture

A computer’s architecture is made up of various components: the CPU, RAM, storage, input/output devices, and the buses that connect everything. However, our focus will be on the two key players for binary exploitation: the CPU (the brain) and RAM (the scratchpad).

The CPU

The CPU is where all the processing happens. It follows instructions from programs, using small storage spots called registers to handle data quickly. Every program is essentially a series of instructions that the CPU executes step by step.

Registers

Registers are small, fast memory regions located within the processor that hold data. Low-level instructions use registers because they are extremely fast, being directly integrated into the processor.

Registers typically hold values (such as 0x41, which is the hexadecimal representation of the character 'A'), addresses (pointers to other memory locations), or results from mathematical operations.

There are two types of registers: general-purpose and reserved registers. In the x86_64 architecture, two important reserved registers are RIP and RSP (EIP and ESP on 32-bit x86). RIP holds the address of the next instruction to execute, while RSP holds the address of the top of the stack.

Differences Between x86 and x86_64 CPU Architectures

It's also very important to understand that the CPU itself can have different architectures. Examples include x86 (32-bit) and x86_64 (64-bit).

One key difference is that registers in x86 architecture can only store data that is 32 bits long (4 bytes), while in the 64-bit architecture, registers can store data that is 64 bits long (8 bytes).

In the 64-bit architecture, although registers can store 64 bits of data, the memory space available to user programs uses only the lower 47 bits of an address. This means that the highest possible address in user space is 0x00007fffffffffff. Addresses that respect this limit are called canonical addresses.

The RAM

RAM is the computer's temporary workspace, holding data and instructions while a program runs. Unlike storage, RAM is fast but temporary: when the power is off, everything in RAM disappears. Each piece of data in RAM has a memory address for easy access.

The connection between the CPU and the RAM

  • Programs are loaded from storage into RAM.

  • The CPU fetches instructions from RAM, processes them, and stores the results back in RAM.

  • Other components play supporting roles: storage handles long-term data, and input/output devices manage user interaction.

Number systems

Number systems are essential in computing and mathematics, providing different ways to represent values. The three most common systems are decimal, binary, and hexadecimal.

Decimal (Base 10)

The decimal system is the standard system for denoting integer and non-integer numbers. It uses ten digits: 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9.

Binary (Base 2)

The binary system uses only two digits: 0 and 1. It is the foundation of all binary code used in computer systems.

Hexadecimal (Base 16)

The hexadecimal system uses sixteen symbols: 0-9 for values zero to nine and A-F for values ten to fifteen. It is often used in programming and computer science for its compact representation of binary values.

Binary numbers are often prefixed with 0b, while hexadecimal numbers are typically prefixed with 0x, as seen in the address 0x00007fffffffffff.

We can convert from one base to another using the corresponding mathematical formulas, or simply use a calculator or the Python interpreter. We'll see how this is done in the upcoming chapters.

Machine code

You've probably heard this before: computers only understand binary (1s and 0s). These binary instructions, called machine code, are what the computer's CPU processes directly.

However, writing programs directly in binary would be extremely difficult for humans. That’s why we use higher-level languages like C, which are much more readable and easier to work with.

For example, consider this simple C program:

#include <stdio.h>

int main(void) {
    puts("Very cool.");
    return 0;
}

Before this code can actually run on your computer, it has to go through several steps to transform into machine code, which the CPU understands.

Compilation and execution workflow

  1. C source code: You write the program in a high-level language like C.

  2. Compilation: The compiler takes this human-readable code and performs several tasks like syntax checking, type checking, and optimization. The end result is an object file, a lower-level version of your program that is not yet an executable.

  3. Linking: The object file is then passed to the linker, which combines it with other code, like libraries or system functions (for example, the puts function). The linker outputs the final executable file (think of a .exe file on Windows or an ELF binary on Linux, which usually has no file extension).

  4. Loading: When you run the executable, the loader takes care of putting the program into memory and setting it up so it can be executed by the CPU.

Linker

A linker’s job is to connect your code with external libraries. For example, if you use the printf function in your code, the linker finds printf in glibc (the GNU C Library) and links your compiled code to the actual implementation of printf in that library.

On Linux, dynamically linked executables use the Procedure Linkage Table (PLT) and Global Offset Table (GOT) to resolve calls to shared library functions, ensuring that the correct address is used at runtime.

The common linker for Linux is GNU ld, which handles this process behind the scenes to produce the final executable.

Loader

The loader is responsible for loading the executable into memory and setting up the program's environment. It maps different sections of the executable to memory, such as the .data section (for initialized global variables), the .bss section (for uninitialized global variables), and the .text section (for the compiled code).

Each section gets specific permissions: the .text section is marked as readable (so the CPU can read instructions) and executable (to allow execution). Meanwhile, the .data and .bss sections are marked as readable and writable to allow the program to modify global variables. The loader ensures these permissions are correctly set before the program starts running.

[Figure: C source code compilation and execution workflow]

Assembly language

At its core, the executable file contains machine code, but it's nearly impossible for humans to read or write directly. This is where assembly language comes in. Assembly acts as a middle layer between human-readable code and machine code. It uses instructions such as MOV, ADD, and JMP that correspond directly to binary instructions.

Assembly is still quite low-level, but it's far easier to understand than raw binary. In fact, you can write entire programs in Assembly! For example, someone even managed to create the classic Snake game using only Assembly.

Executing a binary

Now that we have established the role of the loader, let's take a look at the order in which the different segments are loaded into memory, along with a brief overview of each one, as this is very important. The order is as follows:

  1. Dynamic Libraries

    When the program is loaded, the dynamic linker (for example ld.so) loads any dynamically linked libraries (such as libc) into memory. These libraries are mapped into the process's memory space before the actual code begins executing.

  2. Text Segment

    This segment contains the executable code of the program, including all functions and instructions. It is typically read-only to prevent modification during execution, ensuring security and stability.

  3. BSS Segment (Uninitialized global and static variables)

  4. Data Segment (Initialized global and static variables)

  5. Stack

    The stack is used to store function call information, such as:

    • Arguments passed to functions

    • Return address (the address to return to after a function call)

    • Saved base pointer (EBP/RBP), which marks the beginning of the stack frame for a function

    • Local variables (variables defined within a function)

    The stack grows downward (from high memory to low memory).

  6. Heap

    This is where dynamically allocated memory resides (using malloc(), calloc(), new, etc.). The heap grows upward (from low memory to high memory) as memory is allocated.

  • The exact order might vary slightly depending on the specific operating system and linker implementation.

  • When a binary is statically linked, all necessary libraries are embedded within the executable itself, eliminating the need for dynamic libraries to be loaded at runtime.

The stack

Finally, we've reached a crucial part of binary exploitation: the stack! This is where most of the magic happens.

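I'll be referencing the definition from CTF101's "The Stack" page, as they explain it very well. Note that they picture the stack with low addresses at the bottom, so their "bottom of the stack" is what this chapter calls the top of the stack (the address held in ESP/RSP).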

Definition

In computer architecture, the stack is a hardware manifestation of the stack data structure (a Last In, First Out queue).

In x86, the stack is simply an area in RAM that was chosen to be the stack - there is no special hardware to store stack contents. The esp/rsp register holds the address in memory where the bottom of the stack resides. When something is pushed to the stack, esp decrements by 4 (or 8 on 64-bit x86), and the value that was pushed is stored at that location in memory. Likewise, when a pop instruction is executed, the value at esp is retrieved (i.e. esp is dereferenced), and esp is then incremented by 4 (or 8).

N.B. The stack "grows" down to lower memory addresses!

Conventionally, ebp/rbp contains the address of the top of the current stack frame, and so sometimes local variables are referenced as an offset relative to ebp rather than an offset to esp. A stack frame is essentially just the space used on the stack by a given function.

Elements

Within a stack frame you usually have this generic order of elements, listed from the first pushed (highest memory address) to the last pushed (lowest memory address):

  1. Function arguments (such as argc and argv, pushed in reverse order)

  2. The return pointer (EIP/RIP)

    Often referred to as the saved EIP (on 32-bit systems) or saved RIP (on 64-bit systems), this is the address the program will jump back to after a function finishes executing. When a function is called, the current value of EIP/RIP, which points to the next instruction in the calling function, is saved onto the stack as the return address. This saved address tells the program where to resume in the calling function once the called function completes. Although it's just an address on the stack, it's commonly called the saved EIP/RIP since it holds the instruction pointer value that will be restored when the function returns.

  3. The base pointer (EBP/RBP)

  4. Function local variables (space for them is reserved by the function itself, and the compiler decides their exact layout)

Additionally, don't forget about the stack pointer (ESP/RSP) always pointing to the top of the stack.

[Figure: Stack frame structure in x86 architecture]

Calling conventions

In older x86 calling conventions (like cdecl), arguments are indeed pushed onto the stack, usually from right to left (reverse order) like we've mentioned.

However, in modern x86_64 calling conventions (like the SysV ABI for Linux), the first six integer or pointer arguments are passed through registers (rdi, rsi, rdx, rcx, r8, and r9, in that order) instead of being pushed onto the stack. Any further arguments are passed on the stack.

Function calls

Now, imagine a scenario where main() calls func1(), and func1() in turn calls func2(). Assume each function has 2 arguments of 4 bytes each, a return address, a saved base pointer, and 2 local variables of 4 bytes each, for a total of 24 bytes per stack frame on a 32-bit system (a purely theoretical case, of course). The stack will look something like this:

[Figure: Stack frames in the case of multiple function calls]

GOT and PLT

🚧

Endianness

🚧

These three images below are from the amazing book "Hacking: The Art of Exploitation, 2nd Edition" by Jon Erickson.

The 4 bytes at the address held in EIP (c7, 45, fc, 00), when interpreted as a little-endian 32-bit value, equal 0x00FC45C7.

These bytes are the start of the machine code for the instruction mov DWORD PTR [ebp-4], 0x0, which we can see here using msf-nasm_shell.
