Summary:
This adds an optimized debug build mode, which is compiled with
optimizations, and without assertions, but still has tracing enabled.
Reviewed By: kren1
Differential Revision: D16069452
fbshipit-source-id: 445cfa329
Summary:
The report output got disturbed by the change from predicate to
relational Domain, and the tricky control of printing simplified
states. After this diff by default states are printed in full, and in
simplified form with `-t State_domain.pp_simp`.
Also includes some minor output improvements.
Reviewed By: kren1
Differential Revision: D16059780
fbshipit-source-id: b33289887
Summary:
Trivial renamings to use the standard "libFuzzer" name instead of "lib
fuzzer".
Reviewed By: kren1
Differential Revision: D16067881
fbshipit-source-id: 3ff2a8f86
Summary:
On function return add the computed summary (pre/post) condition to a
hashtable.
Reviewed By: jberdine
Differential Revision: D16052136
fbshipit-source-id: 0c5c9bafb
Summary:
Outputting the list of bitcode inputs when no output flag is ok for
`sledge buck bitcode` but does not make sense when it is composed as
part of other commands. So only output to stdout if `-` is given as
the output file name.
Reviewed By: kren1
Differential Revision: D16059782
fbshipit-source-id: abac9c36f
Summary:
To easily monitor and track changes to the help generated by the
command line interface, generate it in full and add it to the repo.
Reviewed By: kren1
Differential Revision: D16059783
fbshipit-source-id: be15f9943
Summary:
This diff enhances `-function-summaries` to remember the frame computed by
the solver and actually execute the function using the summary. Upon
return the frame is added back on the computed post condition.
Reviewed By: ngorogiannis
Differential Revision: D15900318
fbshipit-source-id: 8bb56b771
Summary:
This diff is preparation for function summarization and focuses on
function calls and function summary precondition computation.
It introduces `-function-summaries` flag behind most of functionality is
hidden, when enabled on each call
* A function summary is computed by quantifying all the non-formal/global variables
and removing all the segments that are not reachable from them
* `pre` and `foot` are computed from function summary and the calling context
by replacing formals with actuals again.
* A solver is asked if `pre` entails `foot` and a frame is printed if it
does
Currently this only works for formulas without disjunctions, so when
function summaries are enabled, that state is first moved to dnf and then
the call is done for each disjunct.
Reviewed By: ngorogiannis
Differential Revision: D15898928
fbshipit-source-id: 49d32504c
Summary: For buck targets that contain at least one of the substrings in `buck-target-pattern` option in config, change the buck target to add `_sledge` suffix.
Reviewed By: jberdine
Differential Revision: D15920018
fbshipit-source-id: 44c242e99
Summary: And fix test Makefile to call the C++ compiler on .cpp files.
Reviewed By: kren1
Differential Revision: D15972426
fbshipit-source-id: 719de755f
Summary:
Simplify all conversions between castable types to the identity. The
backend treats castable types as equal, so distinguishing conversions
between them is incomplete.
Reviewed By: kren1
Differential Revision: D15972427
fbshipit-source-id: fa09859ac
Summary:
The entry block contains all locals of the entire function, as
required by the backend. This makes the manipulation of the locals of
each block redundant.
This diff moves the locals from the entry block to the function
itself, removes the Locals frames of the Control.Stack, and adds a
locals field to Return frames.
This is part cleanup and part preparation for removing the
Control.Stack.
Reviewed By: ngorogiannis
Differential Revision: D15963503
fbshipit-source-id: 523ebc260
Summary:
Adds `-mergefunc` and `-dce` passes to `Frontend.translate` to match
the `buck link` flow with `opt`
Reviewed By: ngorogiannis
Differential Revision: D15938641
fbshipit-source-id: 128cb89cd
Summary:
The current handling of the formal return variable scope is not
correct. Since it is passed as an actual argument to the return
continuation, it is manipulated as if it was a local variable of the
caller. However, its scope is not ended with the caller's locals,
leading to clashes.
This diff reworks the passing of return values to avoid this problem,
mainly by introducing a notion of temporary variables during parameter
passing. This essentially has the effect of taking a function spec
{ P } f(x) { λv. Q }
and generating a "temporary" variable v, applying the post λv. Q to it
to obtain the pre-state for the call to the return continuation
k(v). Being a temporary variable just means that it goes out of scope
just after parameter passing. This amounts to a long-winded way of
applying the post-state to the formal parameter of the return
continuation without violating scopes or SSA.
This diff also separates the manipulation of the symbolic states as they
proceed from:
1. the pre-state before the return instruction;
2. the exit-state after the return instruction (including the binding
of the returned value to the return formal variable);
3. the post-state, where the locals are existentially quantified; and
4. the return-state, which is expressed in terms of actual args
instead of formal parameters.
Also in support of summarization, formal return and throw parameters
are no longer tracked on the analyzer's stack.
Note that these changes involve changing the locals of blocks and
functions to no longer include the formal parameters.
Reviewed By: kren1
Differential Revision: D15912148
fbshipit-source-id: e41dd6e42
Summary:
The solver couldn't deal with `∃ a,b . a = b` , so this diff adds
a special case to deal with it.
Reviewed By: ngorogiannis
Differential Revision: D15897953
fbshipit-source-id: d841d3557
Summary:
:
This patch adds several passes that reduce the amount of bitcode
making sledge's job easier, more info:
https://llvm.org/docs/Passes.html
`-mergefunc`
This pass merges functions that do the same thing, this can be because of
templating or casts (ie. same functionality but on 32bit and 64bit ints,
which is the same in machine code). More details at
http://llvm.org/docs/MergeFunctions.html
Note that this pass is currently not available through C/OCaml API.
`-constmerge`
This merges constants that have the same value, this is possible to do
when the constants are internalized.
`-argpromotion`
```
This pass promotes “by reference” arguments to be “by value” arguments.
In practice, this means looking for internal functions that have pointer
arguments. If it can prove, through the use of alias analysis, that an
argument is only loaded, then it can pass the value into the function
instead of the address of the value. This can cause recursive
simplification of code and lead to the elimination of allocas
(especially in C++ template code like the STL).
```
`-ipsccp`
```
Sparse conditional constant propagation and merging, which can be
summarized as:
Assumes values are constant unless proven otherwise
Assumes BasicBlocks are dead unless proven otherwise
Proves values to be constant, and replaces them with constants
Proves conditional branches to be unconditional
```
`-deadargelim`
Removes dead arguments of internal functions, good to run after other inter-procedural
passes. Seems to crash llvm if run directly after `ipsccp`.
Note that while this might look like doing full link-time optimisation,
we are actually picking relatively cheap optimisations that mostly look
at globals and walk their use chains. The main reason link-time
optimisations are expensive is due to inlining and then running the full
optimisation again from there.
Reviewed By: jberdine
Differential Revision: D15851408
fbshipit-source-id: be7191683
Summary:
This diff introduces a `-lib-fuzz` flag to `buck link`, which links in a
simple main that calls the LLVMFuzzerTestOneInput function, which is the
entry point of libFuzzer fuzzer.
Reviewed By: jberdine, jvillard
Differential Revision: D15821512
fbshipit-source-id: cff731ed3
Summary:
Previous change to allow bitcasts in call instructions was too strict
and did not allow for indirect calls.
Reviewed By: jberdine
Differential Revision: D15803262
fbshipit-source-id: 40d828b59
Summary:
Currently printing symbolic heaps is unreadable, because there are too
many quantified variables, that are mostly just equal to other
variables.
This diff tries to replace all variables in an equivalence class with a
single variable and remove the unneccesary variables.
It also introduces two modes for printing state domains:
`-t +State_domain.pp_full` prints the state domain as is
`-t +State_domain.pp` uses the simplification before printing.
Reviewed By: jberdine
Differential Revision: D15738748
fbshipit-source-id: 7c85b580e
Summary:
Print pre- and post- conditions (aka, summaries) when analyzer hits a
function return
- plumbing the precondition through the analyzer
so that it is available when return is hit
Reviewed By: jberdine
Differential Revision: D15713725
fbshipit-source-id: b10b6206f
Summary:
This diff adds a `__llair_alloc` intrinsic which is modeled
as a non-failing malloc. Using it instead of `malloc` increases
the readbility of symbolic heaps, because it removes all the cases
where malloc failed.
Note that `assert(malloc())` does not have the desired effect.
Reviewed By: ngorogiannis
Differential Revision: D15778817
fbshipit-source-id: d02784077
Summary:
Some call instructions in LLVM bitcast the function,
for example
`%call = call i32 (i64, ...) bitcast (i32 (...)* @__llair_alloc to i32 (i64, ...)*)(i64 %conv)`
This would cause sledge to crash in LLVM when build with assertions.
Reviewed By: jberdine
Differential Revision: D15779003
fbshipit-source-id: c273f92db
Summary:
This diff adds a formal parameter to each non-void-returning function
to name the return value, and similarly a formal parameter for the
thrown exception value. These are interpreted as call-by-reference
parameters, so that they can be constrained in formulas to e.g. be
equal to the return value, and are still in scope when the function
returns, and so can be passed to the return block. Prior to
summarizing functions, this means that these formals need to be
tracked on the analyzer's control stack.
This will be needed to express function specs/summaries in terms of
formals, and fixes a bug where in some cases return values were not
tracked correctly.
Reviewed By: kren1
Differential Revision: D15738026
fbshipit-source-id: fff2d107c
Summary:
Previously the locals of a function were computed after backpatching
the blocks in its cfg. This resulted in loss of sharing, and incorrect
locals if queried through the parent of a block.
Reviewed By: kren1
Differential Revision: D15738027
fbshipit-source-id: d7e70530a
Summary:
Disable exceptional control flow
- treat throw as unreachable
- confidence in the correctness of the frontend's treatment of
exception handling is very low, and making summaries that are
expressive enough to talk about exceptions is a complication
that isn't needed for the first iteration
To facilitate, start on a struct that holds all the CL options.
Reviewed By: jberdine, jvillard
Differential Revision: D15713601
fbshipit-source-id: ee92dfbd8
Summary:
This diff adapts the test scripts to the new sledge CLI, and reworks
them to enable checking changes with respect to a baseline. In
particular, now
```
make -C test
```
has exit code 0 if the current test results match the expected ones,
and otherwise prints the diff. Also,
```
make -C test promote
```
promotes the current test results to the new baseline.
Reviewed By: kren1
Differential Revision: D15706573
fbshipit-source-id: 0cbf3231e
Summary:
Include cxa_default_handlers.cpp to bring in definitions for
__cxa_terminate_handler and __cxa_unexpected_handler.
Reviewed By: kren1
Differential Revision: D15712980
fbshipit-source-id: f536930a8
Summary:
Sometimes calls to the `abort` C stdlib function appear as `invoke`
instructions in LLVM. They should be translated to the LLAIR abort
instruction just like the non-raising `call abort` case.
Reviewed By: kren1
Differential Revision: D15706574
fbshipit-source-id: 1509ed0e3
Summary:
Most binary and ternary operations have the same type as their
arguments, so try to compute the type of arguments in these cases.
Reviewed By: kren1
Differential Revision: D15706576
fbshipit-source-id: 4749d6e32
Summary:
When LLVM is built with assertions, it crash
`add_sym` if you try to get the global scope of a non global value.
This patch special cases add_sym, to just do nothing when `llv` is
an `UndefValue`.
Also enhances debuging printout of transalte to include the number of
functions and globals.
Reviewed By: jvillard
Differential Revision: D15669447
fbshipit-source-id: 4b5483810
Summary:
The entry point functions are used in a couple of places, this
puts them in a single source of truth in the config file.
Reviewed By: jvillard
Differential Revision: D15651976
fbshipit-source-id: a572e8d4d
Summary:
This adds a globalopt optimization pass to sledge.
Consider code like:
```
const char *a_string = "I'm a string";
int an_int = 0;
int c() {
return an_int;
}
int main() {
char *c1 = a_string;
return c();
}
```
When compiled there are 2 levels of indirection. For example
`return an_int` Get's compiled as
```
%0 = load i32, i32* an_int1
ret i32 %0
```
Global opt reduces this (if `an_int` is internal) to just
` ret i32 0`.
Similarly and more importantly
`c1 = a_string;` get's compiled into
```
@.str = private unnamed_addr constant [13 x i8] c"I'm a string\00"
a_string = dso_local global i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i32 0, i32 0)
%c1 = alloca i8*, align 8
%0 = load i8*, i8** a_string, align 8, !dbg !25
store i8* %0, i8** %c1, align 8, !dbg !24
```
So there is a level of indirection between `c1` and `.str` where the string is stored.
With global opt, this gets reduced to:
```
@.str = private unnamed_addr constant [13 x i8] c"I'm a string\00"
%c1 = alloca i8*, align 8
store i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i64 0, i64 0), i8** %c1, align 8, !dbg !23
```
and `a_string` variable gets deleted.
On sledge this has the effect of reducing the complexity of the symbolic heap significantly.
Without this optimisation, running
`sledge.dbg llvm analyze -trace Domain.call global_vars.bc`
Gives prints the following segments:
```
∧ %.str -[)-> ⟨13,{}⟩
* %a_string -[)-> ⟨8,%.str⟩
* %an_int -[)-> ⟨4,0⟩
* %c1 -[)-> ⟨8,%.str⟩
* %retval -[)-> ⟨4,0⟩
```
So there are `an_int` and `a_string` segments, which are redundant.
with the optimisation, the heap looks like:
`∧ %.str -[)-> ⟨13,{}⟩ * %c1 -[)-> ⟨8,%.str⟩ * %retval -[)-> ⟨4,0⟩`,
Where we only have the `.str` segment and the `c1` segment, which are the two we need.
Reviewed By: ngorogiannis
Differential Revision: D15649195
fbshipit-source-id: 5f71e56e8
Summary:
Do not implicitly open `Trace`, which shadows `Import.fail`, and
degrades uncaught exceptions. Opening `Trace` was a mistake.
Reviewed By: kren1
Differential Revision: D15653730
fbshipit-source-id: d65277af5
Summary:
Change command line interface to include buck and llvm integration as
separate subcommands.
Reviewed By: kren1
Differential Revision: D15614567
fbshipit-source-id: b7618571b
Summary:
Use the new LLVM bindings to handle internalization.
We would like to use global dead code elimination (gdce) to remove all the code not reachable from an entry point. However in normal compilation most functions aren't globally dead (they can be linked to). At the point of running sledge we won't be linking anymore, therefore we can internalize (make invisible outside of the compilation unit) all the symbols. In that case the whole program is dead with respect to gdce, therefore we need to preserve the entry point as external.
Previously we could only preserve `main` as the entry point (through a boolean flag). This patch uses a newer API that lets us preserve functions that satisfy a given predicate function. This enables us to have arbitrary entry points (not just main). Currently only the 3 entry points from `src/control.ml` are used, but this patch makes it easy to change.
Reviewed By: ngorogiannis
Differential Revision: D15561461
fbshipit-source-id: 88e054411
Summary:
LLVM.global_initializer casues a cast assertion failure if the value
passed to it is not a GlobalVariable. So we first check if it is a
GlobalVariable and only then ask for an initiliazer.
This is hidden if LLVM is built without assertions.
Reviewed By: jberdine
Differential Revision: D15601632
fbshipit-source-id: e9db23a12
Summary:
Sledge does not terminate on programs with recursion, because
functions get "infinitely inlined" and therefore recursion is not
treated as retreating edge.
This patch bounds the number of times the same function can "inlined"
to respect the bound (`-b` option). On each call we check the number of
occurances of the called function in the call stack. If that is higher
than the bound, we skip it.
Reviewed By: jvillard
Differential Revision: D15577134
fbshipit-source-id: 4cd3b62c6