Fundamentals of Software Exploitation

Course Description

C/C++ software is notoriously difficult to develop securely and therefore highly susceptible to exploitable bugs. In the hands of a talented exploit developer, the right vulnerability can be used to force an otherwise harmless application into executing malicious code.

In this course, students will learn how to analyze 64-bit Linux executables and develop Python-based exploits against them. As students progress, the focus will shift towards bypassing the exploit mitigations commonly used by modern applications.

Students will walk away from this course with a solid understanding of the industry workflow behind identifying, analyzing, and exploiting a broad spectrum of C-based vulnerabilities.


Learning Outcomes

  • Acquire an intimate understanding of C-language internals, and confidence reading or debugging x86-64 assembly code
  • Develop exploits that can bypass ubiquitous exploit mitigations
    • Stack Cookies, Data Execution Prevention, ASLR, PIE
  • Master contemporary exploitation techniques
    • Shellcoding, ROP, Stack Pivoting, Information leaks
  • Identify exploitable vulnerability patterns in C code
    • Heap Overflows, UAF, Integer Issues, Race Conditions
  • Possess the skills necessary to perform independent vulnerability research against applications without source code

Suggested Prerequisites

  • Working knowledge of C & Python
  • Basic familiarity with the Linux ecosystem
    • ELF files, GDB, the Linux command line
  • Prior exposure to low-level concepts is a plus
    • Assembly code, registers, endianness, syscalls
  • Experience using disassemblers is helpful but not required
    • IDA, Binary Ninja, Ghidra, Cutter, Radare

  C Fundamentals

Your ability to read C code and recognize the pitfalls it exposes developers to is a critical component of this course. This chapter serves as an optional 'review' to strengthen your understanding of C fundamentals, with a particular focus on the language features we find most important for this course.

The lessons in this chapter help smooth the learning curve of reverse engineering and the rest of the course.

  • A review of exploitation-relevant C language concepts
  • Program Input and Output
  • Mapping C variables to low-level memory layout
  • Integer Types, Arithmetic, and Bitwise operations
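
As a rough illustration of mapping C variables to memory, the sketch below uses Python's ctypes and struct modules to inspect how a C structure might be laid out. The Record structure and its fields are hypothetical and exist only for this example; the offsets assume a typical ABI such as x86-64 Linux.

```python
import ctypes
import struct


# Hypothetical C struct, used only for illustration:
#   struct record { char tag; uint32_t length; uint16_t flags; };
class Record(ctypes.Structure):
    _fields_ = [
        ("tag", ctypes.c_char),
        ("length", ctypes.c_uint32),
        ("flags", ctypes.c_uint16),
    ]


def describe_layout(struct_type):
    """Return (name, offset, size) for each field of a ctypes structure."""
    layout = []
    for name, _ctype in struct_type._fields_:
        field = getattr(struct_type, name)
        layout.append((name, field.offset, field.size))
    return layout


# A 32-bit value stored in little-endian byte order, as on x86-64.
little_endian = struct.pack("<I", 0x11223344)

layout = describe_layout(Record)
total_size = ctypes.sizeof(Record)
```

Under these assumptions, length does not start at offset 1: alignment padding is placed after tag, which is exactly the kind of detail that matters when reasoning about what an out-of-bounds write will touch.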

  Reverse Engineering

Reverse engineering is the process of examining a compiled executable as a means of understanding how it works. This is an essential skill for exploit-development and requires intimate knowledge of assembly code.

This chapter will walk through the essential concepts of x86-64 assembly, reverse engineering, and assembly-level debugging.

  • Introduction to x86-64 assembly
  • Utilizing disassemblers and debuggers in reversing
  • Analyzing program functionality without source-code access
  • Reverse engineering of simple algorithms ('keygenning')
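
To give a feel for 'keygenning', the sketch below assumes a serial check has already been recovered from a disassembly and re-implemented in Python. The mixing routine and its constants are invented for this example and do not come from any real program.

```python
MASK32 = 0xFFFFFFFF  # emulate a 32-bit register


def _mix(name: str) -> int:
    """Hypothetical mixing routine, as it might be recovered from assembly."""
    acc = 0x1337
    for ch in name.encode():
        acc = ((acc << 5) ^ ch) & MASK32
        acc = (acc + 0x9E3779B9) & MASK32
    return acc


def check_serial(name: str, serial: int) -> bool:
    """Model of the validation the target program performs."""
    return serial == (_mix(name) ^ 0xDEADBEEF)


def keygen(name: str) -> int:
    """Produce the serial the check accepts for a given name."""
    return _mix(name) ^ 0xDEADBEEF
```

Once the algorithm is understood, the keygen is simply the check run "forwards": it computes the value the program compares against instead of guessing it.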

  Memory Corruption

Because C is a relatively low-level language, C programs are prone to bugs that cause memory corruption. Memory corruption will often cause a program to misbehave or crash in unexpected ways, but it also serves as the basis on which most of the binary exploitation field is built.

This chapter will introduce the concepts behind classical binary exploitation through the exploitation of simple buffer overflows and stack-based memory corruption.

  • Stack concepts, RSP, RBP, frame allocation/deallocation
  • How local function variables are stored/accessed in memory
  • Introduction to memory corruption, buffer overflows
  • Control flow and calling convention on x86-64 Linux
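
The sketch below is a toy model of a stack frame, not a real exploit: it assumes a hypothetical function with a 16-byte local buffer, followed in memory by the saved RBP and the return address, and shows how an unchecked copy can reach the return address. All addresses are made up.

```python
import struct

BUF_SIZE = 16  # hypothetical: char buf[16];


def make_frame(return_address: int) -> bytearray:
    """Model a frame as [buf][saved RBP][return address], low to high addresses."""
    saved = struct.pack("<QQ", 0x7FFFFFFFE000, return_address)
    return bytearray(BUF_SIZE) + bytearray(saved)


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Mimic a strcpy-style copy that ignores the size of buf."""
    frame[:len(data)] = data


def saved_return(frame: bytearray) -> int:
    """Read the return address slot, located after buf and the saved RBP."""
    return struct.unpack_from("<Q", frame, BUF_SIZE + 8)[0]


frame = make_frame(0x401136)
unchecked_copy(frame, b"A" * 8)  # fits inside buf
intact = saved_return(frame)

# 16 bytes of buf + 8 bytes of saved RBP, then a new return address.
unchecked_copy(frame, b"A" * 24 + struct.pack("<Q", 0x4011D6))
hijacked = saved_return(frame)
```

The second copy writes 32 bytes into a 16-byte buffer, so the value the function would "return" to is now attacker-chosen.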

  Shellcoding

Shellcoding is the practice of crafting small pieces of "malicious" assembly-code that can be injected into a running program by an exploit. By hijacking control flow, an exploit can "jump" to its maliciously injected "shellcode".

This chapter will teach you about writing the most common types of shellcode payloads and challenge you to tailor shellcode to fit simple input constraints.

  • Redirecting control flow into regions of injected code
  • Learn how to perform x86-64 Linux system calls in assembly
  • Writing shellcode that can operate under common constraints
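
A common input constraint is that certain byte values (for example NUL or newline) cannot appear in the payload. The helper below is a minimal sketch for spotting such bytes; the function name and the default forbidden set are assumptions for this example. The two byte strings are alternative encodings for loading 60 (the x86-64 Linux exit syscall number) into RAX.

```python
def find_bad_bytes(payload: bytes, forbidden: bytes = b"\x00\n") -> list:
    """Return (offset, value) pairs for every forbidden byte in the payload."""
    return [(i, b) for i, b in enumerate(payload) if b in forbidden]


# mov eax, 60  -> the 32-bit immediate embeds NUL bytes
MOV_EAX_60 = bytes.fromhex("b83c000000")

# push 60; pop rax  -> same effect on RAX, no NUL bytes
PUSH_POP_60 = bytes.fromhex("6a3c58")

mov_report = find_bad_bytes(MOV_EAX_60)
push_pop_report = find_bad_bytes(PUSH_POP_60)
```

Choosing a different instruction sequence with the same effect is one typical way of tailoring shellcode to a constraint.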

  Stack Cookies

Stack cookies were introduced in the early 2000s as one of the first commonly available exploit-mitigation technologies. Inserted by the compiler, stack cookies are designed to detect stack-based buffer overflows and abort execution early in an effort to prevent successful exploitation.

This chapter will explore how impactful cookies were in making most stack-based buffer overflows unexploitable, while analyzing scenarios where it is possible to bypass this mitigation.

  • Introduces the concept of exploit mitigations, checksec
  • Analyze how "Stack Cookies" killed simple buffer overflows
  • Learn about the limitations of cookies and how to bypass them
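
The following toy model sketches the idea behind a stack cookie check. It assumes a frame laid out as buffer, cookie, saved RBP, return address, and models the abort as a returned string; the layout, names, and addresses are illustrative only. The leading zero byte mirrors a convention used by some implementations and is an assumption here.

```python
import os
import struct

BUF_SIZE = 16  # hypothetical: char buf[16];


def new_cookie() -> bytes:
    """Random 8-byte cookie; the first byte is kept zero in this model."""
    return b"\x00" + os.urandom(7)


def run_function(cookie: bytes, data: bytes, return_address: int = 0x401136):
    """Copy data into the frame without bounds checks, then verify the cookie."""
    frame = (bytearray(BUF_SIZE) + bytearray(cookie)
             + bytearray(struct.pack("<QQ", 0, return_address)))
    frame[:len(data)] = data  # unchecked copy
    if bytes(frame[BUF_SIZE:BUF_SIZE + 8]) != cookie:
        return "abort"  # models the compiler-inserted failure path
    return struct.unpack_from("<Q", frame, BUF_SIZE + 16)[0]


cookie = new_cookie()
normal = run_function(cookie, b"A" * 8)
smashed = run_function(cookie, b"A" * 40)

# If the cookie value has been disclosed, the overflow can write it back.
with_leak = run_function(
    cookie, b"A" * BUF_SIZE + cookie + b"B" * 8 + struct.pack("<Q", 0x4011D6)
)
```

A blind linear overflow trips the check, while an overflow that preserves the cookie does not, which is why information leaks matter for bypassing this mitigation.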

  Return Oriented Programming

Data Execution Prevention (DEP) was the second major exploit-mitigation made broadly available. DEP was designed to disarm code injection techniques (such as shellcoding) by ensuring that memory marked as "data" could not be executed by the CPU.

This chapter will teach you about the challenges DEP imposes on exploit-development. In turn, we show how the exploitation technique known as Return Oriented Programming (ROP) is used in virtually all contemporary exploits to bypass this mitigation.

  • Introduces Data Execution Prevention (DEP/NX)
  • How to bypass DEP using Return Oriented Programming (ROP)
  • Finding gadgets, writing ROP chains, ret2libc, ret2system
  • How to stack pivot out of constrained ROP scenarios
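
A ROP chain is, at its core, a sequence of 64-bit values placed where return addresses are expected. The sketch below packs a three-entry chain and walks it with a tiny simulator. Every address is hypothetical; in practice they come from analyzing the specific target binary.

```python
import struct

# Hypothetical addresses, invented for this example.
POP_RDI_RET = 0x401263  # gadget: pop rdi; ret
BIN_SH = 0x404060       # address of a "/bin/sh" string
SYSTEM = 0x401050       # address of system()


def build_chain(*qwords: int) -> bytes:
    """Pack each value as a little-endian 64-bit word."""
    return b"".join(struct.pack("<Q", q) for q in qwords)


def simulate(chain: bytes) -> dict:
    """Walk the chain the way successive `ret` instructions would."""
    words = list(struct.unpack(f"<{len(chain) // 8}Q", chain))
    state = {"rdi": None, "called": None}
    while words:
        target = words.pop(0)            # ret: next word becomes RIP
        if target == POP_RDI_RET:
            state["rdi"] = words.pop(0)  # pop rdi consumes the next word
        else:
            state["called"] = target     # control reaches a function
            break
    return state


chain = build_chain(POP_RDI_RET, BIN_SH, SYSTEM)
result = simulate(chain)
```

No injected code is executed: the chain only reuses instructions already present in the program, which is why DEP alone does not stop it.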

  IoT Mission

You've learned so much, so let's take a bit of a break and put your skills to the test. We've discovered an open network service exposing a house full of "Internet of Things" devices. Can you pwn them all?

This mission is a comprehensive test that challenges you to apply all the skills you have learned so far.

  Address Space Layout Randomization

Address Space Layout Randomization (ASLR) is the third major exploit-mitigation found in most modern software. It works by randomizing the layout of runtime memory each time a binary is executed. As a result, binaries that employ both DEP and ASLR often require two vulnerabilities to exploit reliably.

This chapter will explain how ASLR works and how it can be bypassed with a few different exploitation techniques.

  • Introduces Address Space Layout Randomization (ASLR)
  • Leaking sensitive information from runtime memory
  • Using leaks or partial pointer overwrites to bypass ASLR
  • Other scenario / context-specific weaknesses in ASLR
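
Because ASLR shifts a whole module by a single random amount, one leaked pointer is often enough to locate everything else in that module. The arithmetic can be sketched as follows; the offsets and the leaked value are invented for this example and would differ for every library build.

```python
PAGE_SIZE = 0x1000

# Hypothetical offsets, as would be read from one specific library build.
PUTS_OFFSET = 0x80E50
SYSTEM_OFFSET = 0x50D70


def rebase(leaked_puts: int) -> int:
    """Recover the module base from a leaked pointer to a known symbol."""
    base = leaked_puts - PUTS_OFFSET
    if base % PAGE_SIZE != 0:
        # Modules are mapped at page boundaries, so this signals a bad leak or offset.
        raise ValueError("computed base is not page-aligned")
    return base


leak = 0x7F3A1C280E50  # made-up leaked address
base = rebase(leak)
system_address = base + SYSTEM_OFFSET
```

The page-alignment check also hints at why partial pointer overwrites can work: the low 12 bits of an address are not affected by the randomization.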

  Heap Exploitation

Real-world applications often make extensive use of dynamic memory allocations. Dynamic memory is most commonly managed by the developer via calls to malloc/free. These allocations are stored in a region of memory generally referred to as "the heap".

Dynamic memory introduces a number of interesting, highly exploitable vulnerability patterns commonly found in real-world code. In this chapter, you will learn how the inherent behavior of the heap can be (ab)used in binary exploitation.

  • Discuss the fundamentals of dynamic memory and the heap
  • Exploiting heap overflows and Use-After-Free (UAF) vulns
  • Manipulating heap memory layouts to facilitate exploitation
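
The toy allocator below sketches why a Use-After-Free can be dangerous. It assumes a simplified model in which freed chunks of a given size are reused last-in, first-out; real allocators are considerably more complex, and the ToyHeap class is purely illustrative.

```python
class ToyHeap:
    """Tiny allocator model: freed chunks of a size are reused LIFO."""

    def __init__(self):
        self.memory = {}      # chunk id -> bytearray
        self.free_lists = {}  # size -> list of freed chunk ids
        self.next_id = 0

    def malloc(self, size: int) -> int:
        bucket = self.free_lists.get(size, [])
        if bucket:
            return bucket.pop()  # reuse the most recently freed chunk
        chunk = self.next_id
        self.next_id += 1
        self.memory[chunk] = bytearray(size)
        return chunk

    def free(self, chunk: int) -> None:
        size = len(self.memory[chunk])
        self.free_lists.setdefault(size, []).append(chunk)


heap = ToyHeap()
session = heap.malloc(16)
heap.memory[session][:5] = b"guest"
heap.free(session)                 # 'session' is now a dangling reference

note = heap.malloc(16)             # same size, so the freed chunk comes back
heap.memory[note][:5] = b"admin"   # attacker-controlled data

stale_view = bytes(heap.memory[session][:5])  # use after free
```

The stale reference and the new allocation alias the same memory, so data written through one is observed through the other.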

  Miscellaneous Bug Classes

Binary exploitation encompasses many fairly well-formalized techniques, but ultimately comes down to applying an adversarial mindset towards software. The best security researchers are quick to recognize obscure edge cases that trip up code that otherwise works "most of the time".

In this chapter, we'll cover a mix of different edge cases and how they often lead to broken assumptions or exploitable states of execution in real-world software.

  • Integer issues: under/overflows, truncation, signedness
  • Double fetch style vulnerabilities
  • Uninitialized memory bugs
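
Python integers do not overflow, so the sketch below uses ctypes to emulate fixed-width C integers and show overflow, truncation, and signedness confusion. The helper names and the length-check scenario are assumptions made for this example.

```python
import ctypes


def as_u8(value: int) -> int:
    """Value as it would be stored in a uint8_t."""
    return ctypes.c_uint8(value).value


def as_i32(value: int) -> int:
    """Value as it would be stored in an int32_t."""
    return ctypes.c_int32(value).value


def as_u64(value: int) -> int:
    """Value as it would be stored in a uint64_t (e.g. size_t on x86-64)."""
    return ctypes.c_uint64(value).value


# Overflow: INT32_MAX + 1 wraps to INT32_MIN in this two's-complement model.
wrapped = as_i32(0x7FFFFFFF + 1)

# Truncation: storing 0x110 in 8 bits keeps only the low byte.
truncated = as_u8(0x110)

# Signedness: a hypothetical signed length of -1 passes a `length < 64`
# check, then becomes enormous when reinterpreted as an unsigned size.
length = -1
passes_check = length < 64
size_seen_by_copy = as_u64(length)
```

Each case is a mismatch between what the programmer assumed a value could be and what the fixed-width type actually holds.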

  Race Conditions

Multi-threaded code is challenging for programmers and security researchers alike. It is no longer enough to understand what a single block of code is doing; you must also consider the side effects that may arise from an unknown number of other blocks of code executing at the same time.

Race conditions manifest when two or more threads do not properly synchronize access to a shared resource (e.g., memory). This chapter will teach you about the basics of race conditions and test your ability to identify and exploit them.

  • Introduces simple multi-threaded race-conditions
  • Leverage race conditions to force exploitable states
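
The sketch below models a check-then-act race. Two threads each verify that a balance covers a withdrawal before either one performs it. A barrier is used to force that interleaving so the outcome is reproducible; in real code the window would be timing-dependent. The scenario and names are hypothetical.

```python
import threading


def run_race(start_balance: int = 100, amount: int = 100) -> int:
    """Run two racing withdrawals and return the final balance."""
    state = {"balance": start_balance}
    barrier = threading.Barrier(2, timeout=5)
    update_lock = threading.Lock()

    def withdraw():
        if state["balance"] >= amount:      # check (outside the lock: the bug)
            barrier.wait()                  # force both threads past the check
            with update_lock:
                state["balance"] -= amount  # act

    threads = [threading.Thread(target=withdraw) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return state["balance"]


final_balance = run_race()
```

Both checks succeed against the original balance, so the invariant "balance never goes negative" is broken even though each update is individually locked. Holding the lock across both the check and the update would close the window.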

  Infiltration Mission

Are you ready to put your skills to the test? We have a sensitive operation that we think you'll be perfect for...

This final mission wraps up the course with four challenging binaries that simulate breaching a secure facility. It serves as a comprehensive test of the course material in its entirety.