Commit b987aea1 authored by Joshua Moerman

Merge branch 'master' of gitlab.science.ru.nl:moerman/rers-2016

parents f1f63ffb b6eb53e8
......@@ -438,11 +438,18 @@ Therefore, the expected (mean) length of such a sequence is $E(l) = 1/p = x$, an
%Therefore, alternative methods for finding counterexamples might yield better results.
\subsection{Fuzzing} \label{sec:fuzzing}
A \emph{fuzzer} is a program that applies a set of tests (i.e.\ input sequences) to a target program, and then iteratively \emph{mutates} (i.e.\ modifies) these tests to monitor if `something interesting' happens.
A \emph{mutation-based fuzzer} is a program that applies a set of tests (i.e.\ input sequences) to a target program, and then iteratively mutates (i.e.\ modifies) these tests to monitor if `something interesting' happens.
This could be that the target program crashes, or that its output changes.
The \emph{American Fuzzy Lop} (AFL) fuzzer \citep{afl-website} is interesting for its approach of combining mutation-based test case generation with \emph{code coverage} monitoring.
The AFL fuzzer supports programs written in C, C++, or Objective-C, and there are variants that allow fuzzing programs written in Python, Go, Rust or OCaml.
These programs can be provided as source code, or as compiled binaries (by either gcc or clang).
AFL supports programs written in C, C++, or Objective-C, and there are variants that allow fuzzing programs written in Python, Go, Rust or OCaml.
AFL works on instrumented binaries of these programs, and supports compile-time or runtime instrumentation.
The tool is bundled with a modified version of gcc (afl-gcc) that can add instrumentation at compile time.
The compile-time instrumentation has the best performance, but requires the source code of the target program to be available.
When the source code is not available, AFL applies runtime instrumentation, which uses emulation (QEMU or Intel Pin) to achieve the result.
This, however, is 2--5{$\times$} slower than compile-time instrumentation \citep{afl-website}.
%This, however, adds a significant overhead.
%Therefore, we do not consider AFL to be a purely white-box tool.
%A screenshot of AFL's interface is shown in \autoref{fig:afl-interface}.
%
......@@ -479,9 +486,9 @@ In the next paragraphs, we will describe in more detail how coverage is measured
\subsubsection*{Measuring coverage}
If a mutated test case results in a higher coverage of the target program, the test case is seen as valuable.
The intuition behind this is simple: we want to cover as much of the target program's code as possible, which gives us the highest chance of discovering all behaviour (i.e.\ interesting test queries) of the program.
%The intuition behind this is simple: we want to cover as much of the target program's code as possible, which gives us the highest chance of discovering all behaviour (i.e.\ interesting test queries) of the program.
In order to measure this coverage, AFL uses either compile-time or runtime instrumentation of the control flow of the program (branches, jumps, etc.), to identify which parts of the target program are used in a given test.
In order to measure this coverage, AFL uses instrumentation of the control flow of the program (branches, jumps, etc.), to identify which parts of the target program are used in a given test.
Using this knowledge, AFL can decide which test cases cover behaviour not previously seen in other test cases, simply by comparing the result of the instrumentation.
Internally, coverage is measured by using a so-called \emph{trace bitmap}, which is a \SI{64}{\kilo\byte} array of memory shared between the fuzzer and the instrumented target.
......@@ -498,22 +505,17 @@ This causes every edge in the control flow to be represented by a different byte
Note that because the size of the bitmap is finite and the values that represent locations in the code are random, the bitmap is probabilistic: there is a chance that collisions will occur.
This is especially the case when the bitmap fills up, which can happen when fuzzing large programs with many edges in their control flow.
AFL can detect and resolve this situation by applying instrumentation on fewer edges in the target or by increasing the size of the bitmap.
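The bookkeeping described above can be illustrated with a small simulation. This is a minimal sketch, not AFL's actual implementation: it assumes the edge-indexing scheme given in AFL's technical documentation (the index is the XOR of the current block identifier and the shifted previous one), and the names \texttt{assign\_block\_ids}, \texttt{trace} and \texttt{covers\_new\_edges} are hypothetical helpers introduced only for this illustration.

```python
import random

MAP_SIZE = 1 << 16  # 64 kB trace bitmap, one byte per slot


def assign_block_ids(blocks, seed=0):
    """Give every basic block a random identifier (hypothetical helper).

    In real instrumentation these values are fixed at compile time.
    """
    rng = random.Random(seed)
    return {block: rng.randrange(MAP_SIZE) for block in blocks}


def trace(path, ids):
    """Simulate one test run: fill a fresh bitmap for the executed path."""
    bitmap = bytearray(MAP_SIZE)
    prev = 0
    for block in path:
        cur = ids[block]
        index = cur ^ prev                        # one slot per (prev, cur) edge
        bitmap[index] = (bitmap[index] + 1) % 256  # hit counter wraps like a byte
        prev = cur >> 1                            # shift so A->B differs from B->A
    return bitmap


def covers_new_edges(bitmap, seen):
    """True if the run touched a slot that no earlier run has touched."""
    return any(hit and not old for hit, old in zip(bitmap, seen))
```

Because identifiers are random and the map is finite, two different edges may land in the same slot; in this sketch such a collision simply makes the two edges indistinguishable, which is the probabilistic behaviour noted above.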
A trace bitmap for the program in \autoref{fig:fsm} is shown in \autoref{fig:afl-cfg}.
%A trace bitmap for the program in \autoref{fig:fsm} is shown in \autoref{fig:afl-cfg}.
%We see that the input sequence 10 20 20 covers all edges of the program control flow.
\begin{figure}[ht]
\centering
\includegraphics[width=0.8\textwidth]{figures/vending-cfg}
\caption[Visualisation of AFL instrumentation of control flow]{The (instrumented) control flow of the program in \autoref{fig:fsm} (left), and a visual representation of the trace bitmap for several tests (right).
When a test is in execution, the instrumentation fills the bitmap to keep track of the edges that are used.
Every edge in the control flow graph is represented by a position in the bitmap, which is either taken (coloured) or not taken (uncoloured).}
\label{fig:afl-cfg}
\end{figure}
As previously mentioned, AFL supports compile-time or runtime instrumentation.
The compile-time instrumentation has the best performance, but requires the source code of the target program to be available.
When the source code is not available, AFL applies runtime instrumentation, which uses emulation (QEMU or Intel Pin) to achieve the result.
This, however, is 2--5{$\times$} slower than compile-time instrumentation \citep{afl-website}.
%\begin{figure}[ht]
% \centering
% \includegraphics[width=0.8\textwidth]{figures/vending-cfg}
% \caption[Visualisation of AFL instrumentation of control flow]{The (instrumented) control flow of the program in \autoref{fig:fsm} (left), and a visual representation of the trace bitmap for several tests (right).
% When a test is in execution, the instrumentation fills the bitmap to keep track of the edges that are used.
% Every edge in the control flow graph is represented by a position in the bitmap, which is either taken (coloured) or not taken (uncoloured).}
% \label{fig:afl-cfg}
%\end{figure}
\subsubsection*{Mutation strategies}
At the core of AFL is its `engine' to generate new test cases.
......