===================================
Technical "whitepaper" for afl-fuzz
===================================

This document provides a quick overview of the guts of American Fuzzy Lop.
See README for the general instruction manual; and for a discussion of
motivations and design goals behind AFL, see historical_notes.txt.

0) Design statement
-------------------

American Fuzzy Lop does its best not to focus on any singular principle of
operation and not to be a proof-of-concept for any specific theory. The tool
can be thought of as a collection of hacks that have been tested in practice,
found to be surprisingly effective, and implemented in the simplest, most
robust way I could think of at the time.

Many of the resulting features are made possible thanks to the availability of
lightweight instrumentation that served as a foundation for the tool, but this
mechanism should be thought of merely as a means to an end. The only true
governing principles are speed, reliability, and ease of use.

1) Coverage measurements
------------------------

The instrumentation injected into compiled programs captures branch (edge)
coverage, along with coarse branch-taken hit counts. The code injected at
branch points is essentially equivalent to:

  cur_location = <COMPILE_TIME_RANDOM>;
  shared_mem[cur_location ^ prev_location]++;
  prev_location = cur_location >> 1;

The cur_location value is generated randomly to simplify the process of
linking complex projects and keep the XOR output distributed uniformly.

The shared_mem[] array is a 64 kB SHM region passed to the instrumented binary
by the caller. Every byte set in the output map can be thought of as a hit for
a particular (branch_src, branch_dst) tuple in the instrumented code.

The size of the map is chosen so that collisions are sporadic with almost all
of the intended targets, which usually sport between 2k and 10k discoverable
branch points:

   Branch cnt | Colliding tuples | Example targets
  ------------+------------------+-----------------
        1,000 | 0.75%            | giflib, lzo
        2,000 | 1.5%             | zlib, tar, xz
        5,000 | 3.5%             | libpng, libwebp
       10,000 | 7%               | libxml
       20,000 | 14%              | sqlite
       50,000 | 30%              | -

At the same time, its size is small enough to allow the map to be analyzed
in a matter of microseconds on the receiving end, and to effortlessly fit
within L2 cache.

This form of coverage provides considerably more insight into the execution
path of the program than simple block coverage. In particular, it trivially
distinguishes between the following execution traces:

  A -> B -> C -> D -> E (tuples: AB, BC, CD, DE)
  A -> B -> D -> C -> E (tuples: AB, BD, DC, CE)

This aids the discovery of subtle fault conditions in the underlying code,
because security vulnerabilities are more often associated with unexpected
or incorrect state transitions than with merely reaching a new basic block.

The reason for the shift operation in the last line of the pseudocode shown
earlier in this section is to preserve the directionality of tuples (without
this, A ^ B would be indistinguishable from B ^ A) and to retain the identity
of tight loops (otherwise, A ^ A would be obviously equal to B ^ B).

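The injected snippet can be exercised in miniature to confirm this property.
The sketch below is not AFL's actual implementation - the map size matches the
real 64 kB region, but the block IDs and the trace-driving harness are made up
for illustration. It shows that the two example traces above populate
different cells of the map:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAP_SIZE (1 << 16)   /* 64 kB, as in the real shared memory region */

static uint8_t shared_mem[MAP_SIZE];
static uint16_t prev_location;

/* Equivalent of the injected snippet; cur_location would normally be a
   compile-time random constant baked into each branch point. */
static void visit(uint16_t cur_location) {
  shared_mem[cur_location ^ prev_location]++;
  prev_location = cur_location >> 1;
}

/* Replay a whole execution trace given as a list of block IDs. */
static void run_trace(const uint16_t *blocks, int n) {
  prev_location = 0;
  for (int i = 0; i < n; i++) visit(blocks[i]);
}
```

Running A -> B -> C -> D -> E and A -> B -> D -> C -> E through run_trace()
with any distinct block IDs yields maps that differ, which is exactly what
plain block coverage would fail to notice.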
The absence of simple saturating arithmetic opcodes on Intel CPUs means that
the hit counters can sometimes wrap around to zero. Since this is a fairly
unlikely and localized event, it's seen as an acceptable performance trade-off.

2) Detecting new behaviors
--------------------------

The fuzzer maintains a global map of tuples seen in previous executions; this
data can be rapidly compared with individual traces and updated in just a
couple of dword- or qword-wide instructions and a simple loop.

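A byte-wise sketch of that comparison is shown below. This is simplified for
clarity: the real afl-fuzz keeps the global map inverted (the "virgin" bits
start as all-ones) and scans it 32 or 64 bits at a time, which is where the
speed comes from. The logic, however, is the same - flag and record any map
bit not seen in prior runs:

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE (1 << 16)

/* Bits already observed across all prior runs; a simplified stand-in for
   the fuzzer's global "virgin" map. */
static uint8_t seen[MAP_SIZE];

/* Returns 1 if the trace contains any tuple bit never observed before,
   updating the global map as a side effect. */
static int has_new_bits(const uint8_t *trace) {
  int ret = 0;
  for (int i = 0; i < MAP_SIZE; i++)
    if (trace[i] & ~seen[i]) {      /* bit set in trace, never seen before */
      seen[i] |= trace[i];
      ret = 1;
    }
  return ret;
}
```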
When a mutated input produces an execution trace containing new tuples, the
corresponding input file is preserved and routed for additional processing
later on (see section #3). Inputs that do not trigger new local-scale state
transitions in the execution trace (i.e., produce no new tuples) are
discarded, even if their overall control flow sequence is unique.

This approach allows for a very fine-grained and long-term exploration of
program state while not having to perform any computationally intensive and
fragile global comparisons of complex execution traces, and while avoiding the
scourge of path explosion.

To illustrate the properties of the algorithm, consider that the second trace
shown below would be considered substantially new because of the presence of
new tuples (CA, AE):

  #1: A -> B -> C -> D -> E
  #2: A -> B -> C -> A -> E

At the same time, with #2 processed, the following pattern will not be seen
as unique, despite having a markedly different overall execution path:

  #3: A -> B -> C -> A -> B -> C -> A -> B -> C -> D -> E

In addition to detecting new tuples, the fuzzer also considers coarse tuple
hit counts. These are divided into several buckets:

  1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+

To some extent, the number of buckets is an implementation artifact: it allows
an in-place mapping of an 8-bit counter generated by the instrumentation to
an 8-position bitmap relied on by the fuzzer executable to keep track of the
already-seen execution counts for each tuple.

Changes within the range of a single bucket are ignored; transition from one
bucket to another is flagged as an interesting change in program control flow,
and is routed to the evolutionary process outlined in the section below.

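The counter-to-bitmap mapping can be written out directly. This sketch spells
out the bucket boundaries listed above as a chain of comparisons; the real
code achieves the same effect with a precomputed 256-entry lookup table
applied in place to the whole map:

```c
#include <assert.h>
#include <stdint.h>

/* Map a raw 8-bit hit counter to one bit of an 8-position bitmap, one bit
   per bucket: 1, 2, 3, 4-7, 8-15, 16-31, 32-127, 128+. */
static uint8_t count_class(uint8_t cnt) {
  if (cnt == 0)   return 0;
  if (cnt == 1)   return 1;    /* bucket: 1       */
  if (cnt == 2)   return 2;    /* bucket: 2       */
  if (cnt == 3)   return 4;    /* bucket: 3       */
  if (cnt <= 7)   return 8;    /* bucket: 4-7     */
  if (cnt <= 15)  return 16;   /* bucket: 8-15    */
  if (cnt <= 31)  return 32;   /* bucket: 16-31   */
  if (cnt <= 127) return 64;   /* bucket: 32-127  */
  return 128;                  /* bucket: 128+    */
}
```

With this in place, a loop going from 47 to 48 iterations maps to the same
bit, while a block going from one execution to two does not - matching the
sensitivity described in the next paragraph.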
The hit count behavior provides a way to distinguish between potentially
interesting control flow changes, such as a block of code being executed
twice when it was normally hit only once. At the same time, it is fairly
insensitive to empirically less notable changes, such as a loop going from
47 cycles to 48. The counters also provide some degree of "accidental"
immunity against tuple collisions in dense trace maps.

The execution is policed fairly heavily through memory and execution time
limits; by default, the timeout is set at 5x the initially-calibrated
execution speed, rounded up to 20 ms. The aggressive timeouts are meant to
prevent dramatic fuzzer performance degradation by descending into tarpits
that, say, improve coverage by 1% while being 100x slower; we pragmatically
reject them and hope that the fuzzer will find a less expensive way to reach
the same code. Empirical testing strongly suggests that more generous time
limits are not worth the cost.

3) Evolving the input queue
---------------------------

Mutated test cases that produced new state transitions within the program are
added to the input queue and used as a starting point for future rounds of
fuzzing. They supplement, but do not automatically replace, existing finds.

In contrast to more greedy genetic algorithms, this approach allows the tool
to progressively explore various disjoint and possibly mutually incompatible
features of the underlying data format, as shown in this image:

  http://lcamtuf.coredump.cx/afl/afl_gzip.png

Several practical examples of the results of this algorithm are discussed
here:

  http://lcamtuf.blogspot.com/2014/11/pulling-jpegs-out-of-thin-air.html
  http://lcamtuf.blogspot.com/2014/11/afl-fuzz-nobody-expects-cdata-sections.html

The synthetic corpus produced by this process is essentially a compact
collection of "hmm, this does something new!" input files, and can be used to
seed any other testing processes down the line (for example, to manually
stress-test resource-intensive desktop apps).

With this approach, the queue for most targets grows to somewhere between 1k
and 10k entries; approximately 10-30% of this is attributable to the discovery
of new tuples, and the remainder is associated with changes in hit counts.

The following table compares the relative ability to discover file syntax and
explore program states when using several different approaches to guided
fuzzing. The instrumented target was GNU patch 2.7.3 compiled with -O3 and
seeded with a dummy text file; the session consisted of a single pass over the
input queue with afl-fuzz:

    Fuzzer guidance | Blocks  | Edges   | Edge hit | Highest-coverage
    strategy used   | reached | reached | cnt var  | test case generated
  ------------------+---------+---------+----------+---------------------------
   (Initial file)   | 156     | 163     | 1.00     | (none)
                    |         |         |          |
   Blind fuzzing S  | 182     | 205     | 2.23     | First 2 B of RCS diff
   Blind fuzzing L  | 228     | 265     | 2.23     | First 4 B of -c mode diff
   Block coverage   | 855     | 1,130   | 1.57     | Almost-valid RCS diff
   Edge coverage    | 1,452   | 2,070   | 2.18     | One-chunk -c mode diff
   AFL model        | 1,765   | 2,597   | 4.99     | Four-chunk -c mode diff

The first entry for blind fuzzing ("S") corresponds to executing just a single
round of testing; the second set of figures ("L") shows the fuzzer running in a
loop for a number of execution cycles comparable with that of the instrumented
runs, which required more time to fully process the growing queue.

Roughly similar results have been obtained in a separate experiment where the
fuzzer was modified to compile out all the random fuzzing stages and leave just
a series of rudimentary, sequential operations such as walking bit flips.
Because this mode would be incapable of altering the size of the input file,
the sessions were seeded with a valid unified diff:

    Queue extension | Blocks  | Edges   | Edge hit | Number of unique
    strategy used   | reached | reached | cnt var  | crashes found
  ------------------+---------+---------+----------+------------------
   (Initial file)   | 624     | 717     | 1.00     | -
                    |         |         |          |
   Blind fuzzing    | 1,101   | 1,409   | 1.60     | 0
   Block coverage   | 1,255   | 1,649   | 1.48     | 0
   Edge coverage    | 1,259   | 1,734   | 1.72     | 0
   AFL model        | 1,452   | 2,040   | 3.16     | 1

As noted earlier on, some of the prior work on genetic fuzzing relied on
maintaining a single test case and evolving it to maximize coverage. At least
in the tests described above, this "greedy" approach appears to confer no
substantial benefits over blind fuzzing strategies.

4) Culling the corpus
---------------------

The progressive state exploration approach outlined above means that some of
the test cases synthesized later on in the game may have edge coverage that
is a strict superset of the coverage provided by their ancestors.

To optimize the fuzzing effort, AFL periodically re-evaluates the queue using a
fast algorithm that selects a smaller subset of test cases that still cover
every tuple seen so far, and whose characteristics make them particularly
favorable to the tool.

The algorithm works by assigning every queue entry a score proportional to its
execution latency and file size; and then selecting lowest-scoring candidates
for each tuple.

The tuples are then processed sequentially using a simple workflow:

  1) Find next tuple not yet in the temporary working set,

  2) Locate the winning queue entry for this tuple,

  3) Register *all* tuples present in that entry's trace in the working set,

  4) Go to #1 if there are any missing tuples in the set.

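The four steps above amount to a greedy set-cover pass. The toy model below
is not the real data structures - tuple coverage is a tiny per-entry array
and the score is a bare integer - but it walks the same workflow: pick the
next uncovered tuple, find the lowest-scoring entry that hits it, and mark
everything in that entry's trace as covered:

```c
#include <assert.h>

#define N_TUPLES 8

/* One queue entry: which tuples its trace covers, plus a score where lower
   is better (i.e., faster execution and a smaller file). */
struct entry { unsigned char cov[N_TUPLES]; int score; int favored; };

static void cull(struct entry *q, int n) {
  unsigned char in_set[N_TUPLES] = {0};
  for (int i = 0; i < n; i++) q[i].favored = 0;
  for (int t = 0; t < N_TUPLES; t++) {
    if (in_set[t]) continue;                 /* step 1: next missing tuple */
    int best = -1;
    for (int i = 0; i < n; i++)              /* step 2: lowest-scoring entry */
      if (q[i].cov[t] && (best < 0 || q[i].score < q[best].score))
        best = i;
    if (best < 0) continue;                  /* tuple not covered by anyone */
    q[best].favored = 1;
    for (int t2 = 0; t2 < N_TUPLES; t2++)    /* step 3: register all tuples */
      if (q[best].cov[t2]) in_set[t2] = 1;
  }                                          /* step 4: the loop itself */
}
```

Note how an entry whose coverage is a strict subset of a cheaper entry's
never becomes favored, which is precisely the redundancy the pass removes.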
The generated corpus of "favored" entries is usually 5-10x smaller than the
starting data set. Non-favored entries are not discarded, but they are skipped
with varying probabilities when encountered in the queue:

  - If there are new, yet-to-be-fuzzed favorites present in the queue, 99%
    of non-favored entries will be skipped to get to the favored ones.

  - If there are no new favorites:

    - If the current non-favored entry was fuzzed before, it will be skipped
      95% of the time.

    - If it hasn't gone through any fuzzing rounds yet, the odds of skipping
      drop down to 75%.

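The decision tree above fits in a few lines. The sketch below mirrors the
listed probabilities; the flag names and the externally supplied random roll
are illustrative, not the real queue bookkeeping:

```c
#include <assert.h>

/* Decide whether to skip a queue entry. rand_100 is a uniform roll in
   [0,100); pending_favorites and was_fuzzed come from queue state. */
static int skip_entry(int favored, int pending_favorites,
                      int was_fuzzed, int rand_100) {
  if (favored) return 0;                        /* favored entries always run */
  if (pending_favorites) return rand_100 < 99;  /* 99% skip */
  if (was_fuzzed) return rand_100 < 95;         /* 95% skip */
  return rand_100 < 75;                         /* 75% skip */
}
```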
Based on empirical testing, this provides a reasonable balance between queue
cycling speed and test case diversity.

Slightly more sophisticated but much slower culling can be performed on input
or output corpora with afl-cmin. This tool permanently discards the redundant
entries and produces a smaller corpus suitable for use with afl-fuzz or
external tools.

5) Trimming input files
-----------------------

File size has a dramatic impact on fuzzing performance, both because large
files make the target binary slower, and because they reduce the likelihood
that a mutation would touch important format control structures, rather than
redundant data blocks. This is discussed in more detail in perf_tips.txt.

The possibility that the user will provide a low-quality starting corpus aside,
some types of mutations can have the effect of iteratively increasing the size
of the generated files, so it is important to counter this trend.

Luckily, the instrumentation feedback provides a simple way to automatically
trim down input files while ensuring that the changes made to the files have no
impact on the execution path.

The built-in trimmer in afl-fuzz attempts to sequentially remove blocks of data
with variable length and stepover; any deletion that doesn't affect the checksum
of the trace map is committed to disk. The trimmer is not designed to be
particularly thorough; instead, it tries to strike a balance between precision
and the number of execve() calls spent on the process, selecting the block size
and stepover to match. The average per-file gains are around 5-20%.

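The core of the trimming loop can be sketched as follows. Two caveats: the
real trimmer varies the block size and stepover across passes (this version
uses a single fixed block size), and trace_checksum() here is a made-up
stand-in for "run the instrumented target and hash the trace map" - it
pretends the target ignores '.' filler bytes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for running the target: FNV-1a over the bytes the
   hypothetical parser actually looks at (everything except '.'). */
static uint32_t trace_checksum(const uint8_t *buf, int len) {
  uint32_t h = 2166136261u;
  for (int i = 0; i < len; i++)
    if (buf[i] != '.') { h ^= buf[i]; h *= 16777619u; }
  return h;
}

/* Try removing fixed-size blocks; keep any deletion that leaves the trace
   checksum intact, roll back the rest. Returns the new length. */
static int trim(uint8_t *buf, int len, int block) {
  uint8_t saved[64];
  assert(block <= (int)sizeof(saved));       /* sketch-only limitation */
  uint32_t want = trace_checksum(buf, len);
  int pos = 0;
  while (pos + block <= len) {
    memcpy(saved, buf + pos, block);
    memmove(buf + pos, buf + pos + block, len - pos - block);
    if (trace_checksum(buf, len - block) == want) {
      len -= block;                          /* commit the deletion */
    } else {
      memmove(buf + pos + block, buf + pos, len - pos - block);
      memcpy(buf + pos, saved, block);       /* roll back, move on */
      pos += block;
    }
  }
  return len;
}
```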
The standalone afl-tmin tool uses a more exhaustive, iterative algorithm, and
also attempts to perform alphabet normalization on the trimmed files. The
operation of afl-tmin is as follows.

First, the tool automatically selects the operating mode. If the initial input
crashes the target binary, afl-tmin will run in non-instrumented mode, simply
keeping any tweaks that produce a simpler file but still crash the target. If
the target is non-crashing, the tool uses an instrumented mode and keeps only
the tweaks that produce exactly the same execution path.

The actual minimization algorithm is:

  1) Attempt to zero large blocks of data with large stepovers. Empirically,
     this is shown to reduce the number of execs by preempting finer-grained
     efforts later on.

  2) Perform a block deletion pass with decreasing block sizes and stepovers,
     binary-search-style.

  3) Perform alphabet normalization by counting unique characters and trying
     to bulk-replace each with a zero value.

  4) As a last resort, perform byte-by-byte normalization on non-zero bytes.

Instead of zeroing with a 0x00 byte, afl-tmin uses the ASCII digit '0'. This
is done because such a modification is much less likely to interfere with
text parsing, so it is more likely to result in successful minimization of
text files.

The algorithm used here is less involved than some other test case
minimization approaches proposed in academic work, but requires far fewer
executions and tends to produce comparable results in most real-world
applications.

6) Fuzzing strategies
---------------------

The feedback provided by the instrumentation makes it easy to understand the
value of various fuzzing strategies and optimize their parameters so that they
work equally well across a wide range of file types. The strategies used by
afl-fuzz are generally format-agnostic and are discussed in more detail here:

  http://lcamtuf.blogspot.com/2014/08/binary-fuzzing-strategies-what-works.html

It is somewhat notable that especially early on, most of the work done by
afl-fuzz is actually highly deterministic, and progresses to random stacked
modifications and test case splicing only at a later stage. The deterministic
strategies include:

  - Sequential bit flips with varying lengths and stepovers,

  - Sequential addition and subtraction of small integers,

  - Sequential insertion of known interesting integers (0, 1, INT_MAX, etc).

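The first of those passes - a walking bit flip with a one-bit stepover - can
be sketched as below. The callback is a placeholder for "run the target and
check the trace map for new behavior"; here it is just an arbitrary predicate
so the sketch stays self-contained:

```c
#include <assert.h>
#include <stdint.h>

/* Flip a single bit, counting bits from the most significant bit of the
   first byte - the same mutation is applied twice to undo itself. */
static void flip_bit(uint8_t *buf, int bit) {
  buf[bit >> 3] ^= 128 >> (bit & 7);
}

/* Walking single-bit flips over the whole buffer. Returns how many of the
   mutated inputs the callback flagged as interesting. */
static int walking_bitflips(uint8_t *buf, int len,
                            int (*interesting)(const uint8_t *, int)) {
  int hits = 0;
  for (int bit = 0; bit < len * 8; bit++) {
    flip_bit(buf, bit);                 /* mutate */
    if (interesting(buf, len)) hits++;  /* would be: run target, keep if new */
    flip_bit(buf, bit);                 /* restore the original input */
  }
  return hits;
}

/* Example predicate: flag inputs whose first byte became 0xff. */
static int first_byte_ff(const uint8_t *b, int n) {
  (void)n;
  return b[0] == 0xff;
}
```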
The purpose of opening with deterministic steps is related to their tendency to
produce compact test cases and small diffs between the non-crashing and
crashing inputs.

With deterministic fuzzing out of the way, the non-deterministic steps include
stacked bit flips, insertions, deletions, arithmetics, and splicing of
different test cases.

The relative yields and execve() costs of all these strategies have been
investigated and are discussed in the aforementioned blog post.

For the reasons discussed in historical_notes.txt (chiefly, performance,
simplicity, and reliability), AFL generally does not try to reason about the
relationship between specific mutations and program states; the fuzzing steps
are nominally blind, and are guided only by the evolutionary design of the
input queue.

That said, there is one (trivial) exception to this rule: when a new queue
entry goes through the initial set of deterministic fuzzing steps, and tweaks
to some regions in the file are observed to have no effect on the checksum of
the execution path, they may be excluded from the remaining phases of
deterministic fuzzing - and the fuzzer may proceed straight to random tweaks.
Especially for verbose, human-readable data formats, this can reduce the
number of execs by 10-40% or so without an appreciable drop in coverage. In
extreme cases, such as normally block-aligned tar archives, the gains can be
as high as 90%.

Because the underlying "effector maps" are local to every queue entry and
remain in force only during deterministic stages that do not alter the size or
the general layout of the underlying file, this mechanism appears to work very
reliably and proved to be simple to implement.

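An effector map of this kind can be sketched as a per-byte flag array. The
probe used here (inverting each byte once) and the checksum callback are
simplifications for illustration - in afl-fuzz the map is populated as a side
effect of the bit-flip stages rather than by a dedicated pass:

```c
#include <assert.h>
#include <stdint.h>

/* Mark the bytes whose modification changes the trace checksum; later
   deterministic stages would only touch marked bytes. The cksum callback
   stands in for running the instrumented target. */
static void build_effector_map(uint8_t *buf, int len, uint8_t *eff,
                               uint32_t (*cksum)(const uint8_t *, int)) {
  uint32_t base = cksum(buf, len);
  for (int i = 0; i < len; i++) {
    uint8_t orig = buf[i];
    buf[i] ^= 0xff;                      /* crude probe: invert the byte */
    eff[i] = (cksum(buf, len) != base);  /* 1 = byte influences the path */
    buf[i] = orig;
  }
}

/* Example target: only the first two bytes (a pretend magic value) matter. */
static uint32_t magic_cksum(const uint8_t *b, int n) {
  return (n >= 2) ? (uint32_t)(b[0] | (b[1] << 8)) : 0;
}
```

For a verbose text format, most bytes would come back unmarked, which is
where the 10-40% exec savings quoted above come from.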
7) Dictionaries
---------------

The feedback provided by the instrumentation makes it easy to automatically
identify syntax tokens in some types of input files, and to detect that certain
combinations of predefined or auto-detected dictionary terms constitute a
valid grammar for the tested parser.

A discussion of how these features are implemented within afl-fuzz can be found
here:

  http://lcamtuf.blogspot.com/2015/01/afl-fuzz-making-up-grammar-with.html

In essence, when basic, typically easily-obtained syntax tokens are combined
together in a purely random manner, the instrumentation and the evolutionary
design of the queue together provide a feedback mechanism to differentiate
between meaningless mutations and ones that trigger new behaviors in the
instrumented code - and to incrementally build more complex syntax on top of
this discovery.

The dictionaries have been shown to enable the fuzzer to rapidly reconstruct
the grammar of highly verbose and complex languages such as JavaScript, SQL,
or XML; several examples of generated SQL statements are given in the blog
post mentioned above.

Interestingly, the AFL instrumentation also allows the fuzzer to automatically
isolate syntax tokens already present in an input file. It can do so by looking
for runs of bytes that, when flipped, produce a consistent change to the
program's execution path; this is suggestive of an underlying atomic comparison
to a predefined value baked into the code. The fuzzer relies on this signal
to build compact "auto dictionaries" that are then used in conjunction with
other fuzzing strategies.

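The "consistent change" heuristic can be sketched as a scan for the longest
run of bytes that all perturb the trace checksum in the same way. This is a
simplified model, not the actual afl-fuzz detector; the GIF-header target
below is a made-up example of an atomic magic-value comparison:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Find the longest run of bytes where flipping each byte moves the trace
   checksum away from the baseline to one identical new value - suggestive
   of an atomic comparison against a magic token. Returns the run length
   and stores its starting offset in *start. */
static int find_token(uint8_t *buf, int len, int *start,
                      uint32_t (*cksum)(const uint8_t *, int)) {
  uint32_t base = cksum(buf, len);
  uint32_t prev = base;
  int run = 0, best = 0, best_start = 0;
  for (int i = 0; i < len; i++) {
    uint8_t orig = buf[i];
    buf[i] ^= 0xff;                     /* probe this byte */
    uint32_t c = cksum(buf, len);
    buf[i] = orig;
    if (c != base && (run == 0 || c == prev)) run++;
    else run = (c != base) ? 1 : 0;     /* run broken; maybe a new one */
    prev = c;
    if (run > best) { best = run; best_start = i - run + 1; }
  }
  *start = best_start;
  return best;
}

/* Example target: all that matters is whether the file starts with "GIF8". */
static uint32_t gif_cksum(const uint8_t *b, int n) {
  return (n >= 4 && memcmp(b, "GIF8", 4) == 0) ? 0xAAAA : 0xBBBB;
}
```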
8) De-duping crashes
--------------------

De-duplication of crashes is one of the more important problems for any
competent fuzzing tool. Many of the naive approaches run into problems; in
particular, looking just at the faulting address may lead to completely
unrelated issues being clustered together if the fault happens in a common
library function (say, strcmp, strcpy); while checksumming call stack
backtraces can lead to extreme crash count inflation if the fault can be
reached through a number of different, possibly recursive code paths.

The solution implemented in afl-fuzz considers a crash unique if either of two
conditions is met:

  - The crash trace includes a tuple not seen in any of the previous crashes,

  - The crash trace is missing a tuple that was always present in earlier
    faults.

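The two rules boil down to maintaining a union and an intersection of all
prior crash traces. A compact sketch, with a deliberately tiny map and
made-up bookkeeping names:

```c
#include <assert.h>
#include <stdint.h>

#define MAP_SIZE 16                     /* tiny map for illustration */

static uint8_t seen_any[MAP_SIZE];      /* union of all crash traces */
static uint8_t seen_all[MAP_SIZE];      /* intersection of all crash traces */
static int n_crashes;

/* A crash is unique if its trace adds a tuple to the union or removes a
   tuple from the intersection; both sets are updated as a side effect. */
static int crash_is_unique(const uint8_t *trace) {
  int unique = (n_crashes == 0);
  for (int i = 0; i < MAP_SIZE; i++) {
    if (trace[i] && !seen_any[i]) unique = 1;   /* brand-new tuple */
    if (!trace[i] && seen_all[i]) unique = 1;   /* always-present tuple gone */
  }
  for (int i = 0; i < MAP_SIZE; i++) {
    seen_any[i] |= (trace[i] != 0);
    if (n_crashes == 0) seen_all[i]  = (trace[i] != 0);
    else                seen_all[i] &= (trace[i] != 0);
  }
  n_crashes++;
  return unique;
}
```

Because both sets only tighten over time, the check becomes progressively
harder to satisfy - the self-limiting effect described next.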
The approach is vulnerable to some path count inflation early on, but exhibits
a very strong self-limiting effect, similar to the execution path analysis
logic that is the cornerstone of afl-fuzz.

9) Investigating crashes
------------------------

The exploitability of many types of crashes can be ambiguous; afl-fuzz tries
to address this by providing a crash exploration mode where a known-faulting
test case is fuzzed in a manner very similar to the normal operation of the
fuzzer, but with a constraint that causes any non-crashing mutations to be
thrown away.

A detailed discussion of the value of this approach can be found here:

  http://lcamtuf.blogspot.com/2014/11/afl-fuzz-crash-exploration-mode.html

The method uses instrumentation feedback to explore the state of the crashing
program to get past the ambiguous faulting condition and then isolate the
newly-found inputs for human review.

On the subject of crashes, it is worth noting that in contrast to normal
queue entries, crashing inputs are *not* trimmed; they are kept exactly as
discovered to make it easier to compare them to the parent, non-crashing entry
in the queue. That said, afl-tmin can be used to shrink them at will.

10) The fork server
-------------------

To improve performance, afl-fuzz uses a "fork server", where the fuzzed process
goes through execve(), linking, and libc initialization only once, and is then
cloned from a stopped process image by leveraging copy-on-write. The
implementation is described in more detail here:

  http://lcamtuf.blogspot.com/2014/10/fuzzing-binaries-without-execve.html

The fork server is an integral aspect of the injected instrumentation and
simply stops at the first instrumented function to await commands from
afl-fuzz.

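The server side of that handshake can be sketched as a small loop. The
descriptors and one-byte protocol here are invented for illustration - the
real implementation hardcodes descriptors 198/199 and exchanges richer state
- but the shape is the same: wait for a command, fork a copy-on-write child,
report its exit status back:

```c
#include <assert.h>
#include <sys/wait.h>
#include <unistd.h>

/* Toy fork-server loop: after the one-time expensive startup, sit and wait
   for a 'G' ("go") byte from the fuzzer, fork a fresh copy of the already-
   initialized image for each run, and write the child's status back. */
static void fork_server(int ctl_fd, int st_fd, void (*target)(void)) {
  char cmd;
  while (read(ctl_fd, &cmd, 1) == 1 && cmd == 'G') {
    pid_t child = fork();                /* cheap: copy-on-write clone */
    if (child == 0) { target(); _exit(0); }
    int status;
    waitpid(child, &status, 0);
    write(st_fd, &status, sizeof(status));
  }
}

static void dummy_target(void) { /* execve()/linking cost already paid */ }
```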
With fast targets, the fork server can offer considerable performance gains,
usually between 1.5x and 2x. It is also possible to:

  - Use the fork server in manual ("deferred") mode, skipping over larger,
    user-selected chunks of initialization code. This requires very modest
    code changes to the targeted program, and with some targets, can
    produce 10x+ performance gains.

  - Enable "persistent" mode, where a single process is used to try out
    multiple inputs, greatly limiting the overhead of repetitive fork()
    calls. This generally requires some code changes to the targeted program,
    but can improve the performance of fast targets by a factor of 5 or more
    - approximating the benefits of in-process fuzzing jobs while still
    maintaining very robust isolation between the fuzzer process and the
    targeted binary.

11) Parallelization
-------------------

The parallelization mechanism relies on periodically examining the queues
produced by independently-running instances on other CPU cores or on remote
machines, and then selectively pulling in the test cases that, when tried
out locally, produce behaviors not yet seen by the fuzzer at hand.

This allows for extreme flexibility in fuzzer setup, including running synced
instances against different parsers of a common data format, often with
synergistic effects.

For more information about this design, see parallel_fuzzing.txt.

12) Binary-only instrumentation
-------------------------------

Instrumentation of black-box, binary-only targets is accomplished with the
help of a separately-built version of QEMU in "user emulation" mode. This also
allows the execution of cross-architecture code - say, ARM binaries on x86.

QEMU uses basic blocks as translation units; the instrumentation is implemented
on top of this and uses a model roughly analogous to the compile-time hooks:

  if (block_address > elf_text_start && block_address < elf_text_end) {

    cur_location = (block_address >> 4) ^ (block_address << 8);
    shared_mem[cur_location ^ prev_location]++;
    prev_location = cur_location >> 1;

  }

The shift-and-XOR-based scrambling in the second line is used to mask the
effects of instruction alignment.

The start-up of binary translators such as QEMU, DynamoRIO, and PIN is fairly
slow; to counter this, the QEMU mode leverages a fork server similar to that
used for compiler-instrumented code, effectively spawning copies of an
already-initialized process paused at _start.

First-time translation of a new basic block also incurs substantial latency. To
eliminate this problem, the AFL fork server is extended by providing a channel
between the running emulator and the parent process. The channel is used
to notify the parent about the addresses of any newly-encountered blocks and to
add them to the translation cache that will be replicated for future child
processes.

As a result of these two optimizations, the overhead of the QEMU mode is
roughly 2-5x, compared to 100x+ for PIN.

13) The afl-analyze tool
------------------------

The file format analyzer is a simple extension of the minimization algorithm
discussed earlier on; instead of attempting to remove no-op blocks, the tool
performs a series of walking byte flips and then annotates runs of bytes
in the input file.

It uses the following classification scheme:

  - "No-op blocks" - segments where bit flips cause no apparent changes to
    control flow. Common examples may be comment sections, pixel data within
    a bitmap file, etc.

  - "Superficial content" - segments where some, but not all, bitflips
    produce some control flow changes. Examples may include strings in rich
    documents (e.g., XML, RTF).

  - "Critical stream" - a sequence of bytes where all bit flips alter control
    flow in different but correlated ways. This may be compressed data,
    non-atomically compared keywords or magic values, etc.

  - "Suspected length field" - small, atomic integer that, when touched in
    any way, causes a consistent change to program control flow, suggestive
    of a failed length check.

  - "Suspected cksum or magic int" - an integer that behaves similarly to a
    length field, but has a numerical value that makes the length explanation
    unlikely. This is suggestive of a checksum or other "magic" integer.

  - "Suspected checksummed block" - a long block of data where any change
    always triggers the same new execution path. Likely caused by failing
    a checksum or a similar integrity check before any subsequent parsing
    takes place.

  - "Magic value section" - a generic token where changes cause the type
    of binary behavior outlined earlier, but that doesn't meet any of the
    other criteria. May be an atomically compared keyword or so.