Summary:
In some cases inlining pure expressions into their use sites causes
code blowup. This diff changes the frontend to inline expressions only
if there is a single use, and otherwise adds a move instruction.
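A minimal sketch of the single-use rule, written in Python for illustration rather than in the frontend's own language. The function names `count_uses` and `translate` and the tuple encoding of instructions are hypothetical, and the sketch assumes each register is assigned at most once within the block.

```python
from collections import Counter

def count_uses(instrs):
    """Count how often each register is read as an operand."""
    uses = Counter()
    for _dest, _op, args in instrs:
        uses.update(a for a in args if isinstance(a, str))
    return uses

def translate(instrs):
    """Inline a pure definition only if it has exactly one use.

    instrs: list of (dest, op, args); args are register names or constants.
    Returns the list of emitted moves as (dest, expression) pairs.
    """
    uses = count_uses(instrs)
    env, moves = {}, []
    for dest, op, args in instrs:
        # Substitute any operand whose definition was inlined earlier.
        expr = (op, tuple(env.get(a, a) for a in args))
        if uses[dest] == 1:
            env[dest] = expr            # single use: inline at the use site
        else:
            moves.append((dest, expr))  # shared or unused here: emit a move
    return moves
```

Here a register used twice keeps its name and gets one move, so the expression is not duplicated at each use.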
Reviewed By: ngorogiannis
Differential Revision: D17071770
fbshipit-source-id: d866a0622
Summary:
This has been out of date since arithmetic was changed from a purely
uninterpreted treatment to having a solver.
Reviewed By: jvillard
Differential Revision: D16985159
fbshipit-source-id: 39e42069c
Summary:
While SSA can be useful for code transformation purposes, it offers
little for semantic static analyses. Essentially, such analyses
explore the dynamic semantics of code, and the *static* single
assignment property does not buy much. For example, once an execution
visits a loop body that assigns a variable, there are multiple
assignments that the analysis must deal with. This leads to the need
to treat blocks as if they assign all their local variables, renaming
to avoid name clashes a la Floyd's assignment axiom. That is fine, but
it makes it much more involved to implement a version that is
economical, renaming only when necessary. Additionally,
the scoping constraints of SSA are cumbersome and significantly
complicate interprocedural analysis (where there is a long history of
incorrect proof rules for procedures, and SSA pushes the
interprocedural analysis away from being able to use known-good
ones). So this diff changes Llair from a functional SSA form to a
traditional imperative language.
Reviewed By: jvillard
Differential Revision: D16905898
fbshipit-source-id: 0fd835220
Summary:
Before this diff symbolic execution of instructions assumed that
assigned variables were unconstrained in the precondition. This is
ensured by symbolic execution of control flow, which renames all local
variables of a block when it is entered.
This diff changes symbolic execution of instructions to rename
modified variables that appear in the precondition when necessary, and
accounts for the modified variable occurrence condition on the frame
rule. This will enable more economically renaming variables, as most
of the time it is not needed.
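A rough sketch of renaming only when necessary, in Python for illustration. The function `rename_modified` and the `v_<n>` naming scheme for fresh variables are hypothetical; the only idea taken from the description above is that a modified variable needs renaming just when it occurs in the precondition.

```python
import itertools

def rename_modified(pre_vars, modified):
    """Return a substitution for modified variables that occur in the precondition.

    pre_vars: variables occurring free in the precondition.
    modified: variables assigned by the instruction.
    """
    taken = set(pre_vars) | set(modified)
    candidates = (f"v_{i}" for i in itertools.count())
    subst = {}
    # Variables that are modified but absent from the precondition are skipped.
    for var in sorted(set(modified) & set(pre_vars)):
        fresh = next(n for n in candidates if n not in taken)
        taken.add(fresh)
        subst[var] = fresh
    return subst
```

When the two sets are disjoint the substitution is empty, which is the common case the summary refers to.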
Reviewed By: jvillard
Differential Revision: D16905893
fbshipit-source-id: 3a53525d7
Summary:
Each variable now contains its type, alongside its name. This is more
uniform than in LLVM, where the name is usually paired with a type, but
not always. For example, the register type of the result of an
extractvalue is left implicit.
Reviewed By: jberdine
Differential Revision: D16984630
fbshipit-source-id: 1c3bc4985
Summary:
HOL now lets us omit quotations on Datatypes and make them look more
like the other new-style HOL definitions.
Reviewed By: jberdine
Differential Revision: D16983934
fbshipit-source-id: f8ef3abb5
Summary:
This sketches out how translation can be approached. It is partially
based on the Sledge code.
For basic blocks, it isn't based on the Sledge code, but just on my own
thoughts as a starting point. Essentially, we are trying to build up
larger expressions, and so not assigning to temporary registers that
don't live past the end of the block. This does remove sharing, so a
fancier approach could check for multiple uses of end-of-block dead
registers, or look at the sizes of expressions. The approach should be
flexible enough to accommodate such changes.
Fix icmp syntax
Using finite maps is elegant in the semantics, but awkward for writing
the translation function. Refactor the mappings from labels to functions
and from labels to blocks to use association lists instead.
To remove phi nodes, the translation takes every edge in the control
flow graph and makes a new basic block that contains a single parallel
move instruction that corresponds to the action of the phi node of the
target block.
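The edge-splitting step can be sketched as follows, in Python for illustration. The dictionary encoding of the CFG, the `"pred->succ"` labels for the new blocks, and the name `remove_phis` are all assumptions made for this example; non-phi instructions are omitted.

```python
def remove_phis(blocks):
    """Split every CFG edge with a block holding one parallel move.

    blocks: {label: {"phis": {dest: {pred_label: source}}, "succs": [labels]}}
    Returns {label: {"move": {dest: source}, "succs": [labels]}}.
    """
    out = {}
    for label, blk in blocks.items():
        new_succs = []
        for succ in blk["succs"]:
            edge = f"{label}->{succ}"
            phis = blocks[succ]["phis"]
            # Parallel move: every source is read before any dest is written,
            # matching the simultaneous semantics of the target's phi nodes.
            move = {dest: srcs[label] for dest, srcs in phis.items()}
            out[edge] = {"move": move, "succs": [succ]}
            new_succs.append(edge)
        out[label] = {"move": {}, "succs": new_succs}
    return out
```

Each original block now jumps to the new edge block, which performs the move and then jumps to the original target.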
Reviewed By: jberdine
Differential Revision: D16831051
fbshipit-source-id: 005663e26
Summary:
The AST is not complete on expressions, but it should have most of the
important features.
The representation is in some ways very different from the OCaml
implementation, because the OCaml code uses mutation to build the CFG as
an actual pointer graph in memory, and also because the expression
representation is optimised for the backend. For the former, it should
be easy to see that the AST here is isomorphic, representing the CFG
with finite maps from block labels. The correspondence is less clear in
the latter case, but the point here is not to model or verify
implementation optimisations, but to give a semantics to llair as a
language.
Reviewed By: jberdine
Differential Revision: D16807132
fbshipit-source-id: b0f64b3ec
Summary:
Change loc_var (for local variable) to reg (for register) because
loc_var looks too much like a location tagged variable.
Reviewed By: jberdine
Differential Revision: D16827920
fbshipit-source-id: 5b11f1065
Summary:
There could very well still be bugs in the semantics, since the
invariant here doesn't say all that much, and it completely ignores
local registers. But most trivial things and typos are probably fixed.
Reviewed By: jberdine
Differential Revision: D16803281
fbshipit-source-id: 48ba2523b
Summary:
Made progress on the sanity checking lemma (that the step relation
preserves some simple invariants on the state). Proved the Ret
instruction case of the state invariant lemma. To do this, I fixed a few
bugs in the definition, and strengthened the invariants.
Reviewed By: jberdine
Differential Revision: D16786900
fbshipit-source-id: 6fa8cb170
Summary:
Global variables need allocating and initialising before the machine can
start. The definition here shouldn't constrain how and where they are
allocated. For example, they don't all need to have separate
allocations. We also tag allocated blocks so that the allocation for a
global can never be deallocated.
Start working on a sanity checking invariant on states.
Reviewed By: jberdine
Differential Revision: D16735068
fbshipit-source-id: 0d5e60e7a
Summary:
Start working on a simple model of LLVM with the ultimate goal of
handling relevant and/or tricky aspects of LLVM and LLAIR and then
formalising the translation from LLVM to LLAIR.
This is a complete initial model of everything that we are interested in
except for exceptions, which should be tricky. Also no thought has gone
into the treatment of poison and the undefined value, so the treatment
is naive, which is at least partially justified because we are
interested in the semantics of LLVM IR after the optimisation passes
have run.
Include some sanity checking theorems.
Reviewed By: jberdine
Differential Revision: D16731885
fbshipit-source-id: fd53949fe
Summary:
Currently bitcode produced with `sledge buck link` can have missing
symbols that are clearly defined in the source. For example consider a
symbol `awesome_function` that is defined in the libraries linked in but
not in the produced binary (despite being reachable from main).
`llvm-nm` of the bitcode produced by `llvm-link` might look like:
```
U awesome_function
t awesome_function.1892
```
So our `awesome_function` is undefined, and its definition is called
`awesome_function.1892` for some reason and is local. I think this is
because symbols get internalized too early, then get renamed and are
somehow lost. Not sure why `llvm-link` behaves this way sometimes.
This patch removes internalization from `llvm-link` and puts it into `opt`, where it doesn't cause problems.
Reviewed By: jvillard
Differential Revision: D16494153
fbshipit-source-id: aad9053a4
Summary:
`__llair_alloc` is meant to be a drop-in non-failing replacement for
`malloc`. Currently `__llair_alloc(1)` allocates 8 bytes instead of 1 as
`malloc(1)` would. This is because handling of `__llair_alloc` was
merged with handling of `new`. This patch reverts changes to handling of
`new` in D15778817 and adds a new case for `__llair_alloc`.
Reviewed By: jvillard
Differential Revision: D16356865
fbshipit-source-id: 3878d87c3
Summary:
When using summaries we first garbage collect the precondition and then
ask the solver to infer the frame of the precondition with respect to
the garbage-collected footprint.
Currently if the solver fails to show the frame, we just give it an
empty frame. This is bad, because if garbage collection removed some
segments, they don't get added back on.
This patch throws an exception instead to be very explicit when the
solver cannot show the frame in this case.
Reviewed By: ngorogiannis
Differential Revision: D16339587
fbshipit-source-id: b88d0689c
Summary:
The actual implementation of folly::usingJEMalloc() tests if malloc is
jemalloc using internal knowledge of the jemalloc implementation of
malloc. This internal behavior is not reflected in the analyzer's
spec, so the detection fails.
Additionally, folly::usingJEMalloc is implemented using mallctl to
query internal state of jemalloc. Depending on the key string passed
to mallctl, it might return a pointer to jemalloc internal state, or a
scalar, which means that the spec needs to essentially allocate that
state in those cases.
Since the jemalloc detection fails, and the analyzer is not always
able to reason precisely about string equality, this diff models
folly::usingJEMalloc directly (as nondet).
Reviewed By: kren1
Differential Revision: D16059776
fbshipit-source-id: 7e7156d7d
Summary:
It seems that functions internalized by llvm no longer have valid
mangled names, and instead have a `.<int>` suffix. This diff removes
these unpredictable suffixes when checking if a called function is a
specified/modeled intrinsic.
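A small sketch of the suffix stripping, in Python for illustration. The helper names and the regular expression are assumptions; only the `.<int>` suffix shape comes from the description above.

```python
import re

# A trailing dot followed by one or more decimal digits, e.g. ".1892".
_INT_SUFFIX = re.compile(r"\.\d+$")

def strip_internalized_suffix(name):
    """Drop a trailing `.<int>` suffix, leaving other names unchanged."""
    return _INT_SUFFIX.sub("", name)

def is_modeled(name, modeled):
    """Check a called function against the set of modeled intrinsic names."""
    return strip_internalized_suffix(name) in modeled
```

Only a purely numeric final component is removed, so a name such as `a.b` is left intact.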
Reviewed By: kren1
Differential Revision: D16059781
fbshipit-source-id: a4b9f6c73
Summary:
A frame inference query `Minuend ⊢ ∃xs. Subtrahend` returns a
`∃zs. Remainder` formula such that `Minuend ⊢ ∃xs. Subtrahend *
∃zs. Remainder` when successful. Currently if the subtrahend is itself
existentially quantified, its existentials are treated trivially: they
must witness themselves. This diff allows the solver to find witnesses
as the `xs`. They are still existentially quantified in the remainder,
so clients that need to constrain them should still name them before
calling the solver.
Reviewed By: kren1
Differential Revision: D16269630
fbshipit-source-id: 65136edd1
Summary:
Add a global merge pass that merges globals into a single big global. It
replaces uses of the merged globals with offsets into the big global.
Function summarisation is a big beneficiary of this, as it greatly reduces
the number of implicit formals (i.e. globals).
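The offset computation behind such a merge can be sketched as below, in Python for illustration. The name `merge_globals` and the fixed 8-byte alignment are assumptions for the example and are not taken from the pass itself.

```python
def merge_globals(sizes, align=8):
    """Lay out globals back to back inside one big global.

    sizes: {name: size in bytes}, in layout order.
    Returns (offsets, total_size); each use of a global `g` would then be
    rewritten to the address of the big global plus offsets[g].
    """
    offsets, cursor = {}, 0
    for name, size in sizes.items():
        # Round the cursor up to the next multiple of `align`.
        cursor = (cursor + align - 1) // align * align
        offsets[name] = cursor
        cursor += size
    return offsets, cursor
```

After the rewrite a function refers to one global instead of many, which is what shrinks the set of implicit formals.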
Reviewed By: jvillard
Differential Revision: D16260098
fbshipit-source-id: 1b936f02f
Summary:
Fix a bug where summaries would be created even if the summarisation
option is disabled.
Reviewed By: jvillard
Differential Revision: D16259761
fbshipit-source-id: f7319ef03
Summary:
If function summaries are enabled, calling a function first tries to
apply a summary. If successful, it jumps directly to the return site of
the call. Otherwise it proceeds as before.
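The control flow described above might look roughly like this, sketched in Python for illustration. The function `call`, its callback parameters `apply_summary` and `enter_callee`, and the tagged return values are all hypothetical and stand in for the analyzer's real machinery.

```python
def call(callee, pre, summaries, apply_summary, enter_callee, enabled=True):
    """Try stored summaries first; otherwise enter the callee as before.

    summaries: {function name: list of summaries}.
    apply_summary(summary, pre) returns a post state, or None if the
    summary does not apply to this calling state.
    """
    if enabled:
        for summary in summaries.get(callee, ()):
            post = apply_summary(summary, pre)
            if post is not None:
                # Summary applied: skip the body, continue at the return site.
                return ("return_site", post)
    # No applicable summary (or summaries disabled): execute the callee.
    return ("callee_entry", enter_callee(callee, pre))
```

With `enabled=False` the summary table is never consulted, so behaviour matches the analysis without summaries.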
Reviewed By: jvillard
Differential Revision: D16201251
fbshipit-source-id: cec52e0e5
Summary:
Define a new function summary type and compute it on function return.
As an intermediate step, also apply the just-computed summary to the
function's precondition so it can be compared to what was actually computed.
Reviewed By: jvillard
Differential Revision: D16149833
fbshipit-source-id: b826c17e8
Summary:
Fix a crash that occurs when the subtrahend has an existential variable
that was renamed, as in the test.
The crash is due to an assertion in `Sh.exists` that says only variables
in the vocabulary can be existentially quantified out.
The problem was the `Sh.exists` call in Solver.ml:611, where `ws`
(existentials of the subtrahend) are not present in the vocabulary of
the remainder. This is because the remainder "inherits" the vocabulary of
the minuend.
This fix simply extends the vocabulary of the minuend with `ws`, which
means the remainder has the correct vocabulary. This should have no
externally visible effect as `ws` are then existentially quantified out.
Another option would be to try to change all the `excise_seg` functions
to keep the vocabulary, but that looked annoying to implement.
Reviewed By: jvillard
Differential Revision: D16201423
fbshipit-source-id: b88c3abc4
Summary:
Add a `-color` option to sledge that prints variables that are
existentially bound in bold.
Reviewed By: ngorogiannis
Differential Revision: D16088750
fbshipit-source-id: bd21cb8a0
Summary:
This fixes two bugs:
* All local variables would get existentially quantified out, which meant
the local variables of the caller couldn't be restored properly.
* The frame was added back on after the formals were killed, which meant
that if the frame talked about formals (in pure for example), those
formals would remain free variables.
Reviewed By: ngorogiannis
Differential Revision: D16091157
fbshipit-source-id: dfe12ed82
Summary:
This fixes two issues with function summarization when calling a
function multiple times.
* Previously on return, the actuals wouldn't get added back in, so
their names would be "lost" (that is, existentially quantified out).
This patch adds the formals-to-actuals equalities back on return,
before quantifying the formals out.
* Previously the entry state of the function would be lost if there were
multiple calls to other functions.
Reviewed By: jberdine
Differential Revision: D16071656
fbshipit-source-id: 9df7b1d4b
Summary:
Currently alarms are reported to stdout while the debug trace is
written to stderr. This makes synchronizing the two difficult. With
this diff, the alarm reports can also be included in the debug trace,
and analysis can be stopped when an alarm is encountered by tracing
the `Stop` module, e.g.:
```
sledge -trace Report+Stop.on_invalid_access
```
Reviewed By: kren1
Differential Revision: D16072611
fbshipit-source-id: 32c3639a2
Summary:
There are many assumptions on the behavior of mutexes, condition
variables, etc. in the implementation of the cxxabi with threads
support. So compile with `_LIBCXXABI_HAS_NO_THREADS` defined to select
the much simpler code paths for the single-threaded case.
Reviewed By: kren1
Differential Revision: D16069454
fbshipit-source-id: 9f975e0e6
Summary:
Each call to __cxa_allocate_exception, in practice, is shortly
followed by raising an exception. With -skip-throw, execution will not
proceed past the throw. Since the concrete implementation of
__cxa_allocate_exception and the following initialization of the
exception object is very low-level code that plays tricks, the
analyzer has trouble with it. So model __cxa_allocate_exception as
unreachable to avoid (needlessly) executing that code and potentially
failing spuriously.
Reviewed By: kren1
Differential Revision: D16069451
fbshipit-source-id: bea1dae09
Summary:
Allow intrinsics to return an inconsistent state, to specify that they
do not return.
Reviewed By: kren1
Differential Revision: D16069453
fbshipit-source-id: deb5d2a22
Summary:
This adds an optimized debug build mode, which is compiled with
optimizations, and without assertions, but still has tracing enabled.
Reviewed By: kren1
Differential Revision: D16069452
fbshipit-source-id: 445cfa329
Summary:
The report output got disturbed by the change from predicate to
relational Domain, and the tricky control of printing simplified
states. After this diff by default states are printed in full, and in
simplified form with `-t State_domain.pp_simp`.
Also includes some minor output improvements.
Reviewed By: kren1
Differential Revision: D16059780
fbshipit-source-id: b33289887
Summary:
Trivial renamings to use the standard "libFuzzer" name instead of "lib
fuzzer".
Reviewed By: kren1
Differential Revision: D16067881
fbshipit-source-id: 3ff2a8f86
Summary:
On function return add the computed summary (pre/post) condition to a
hashtable.
Reviewed By: jberdine
Differential Revision: D16052136
fbshipit-source-id: 0c5c9bafb
Summary:
Outputting the list of bitcode inputs when no output flag is given is ok
for `sledge buck bitcode` but does not make sense when it is composed as
part of other commands. So only output to stdout if `-` is given as
the output file name.
Reviewed By: kren1
Differential Revision: D16059782
fbshipit-source-id: abac9c36f
Summary:
To easily monitor and track changes to the help generated by the
command line interface, generate it in full and add it to the repo.
Reviewed By: kren1
Differential Revision: D16059783
fbshipit-source-id: be15f9943
Summary:
This diff enhances `-function-summaries` to remember the frame computed by
the solver and actually execute the function using the summary. Upon
return the frame is added back onto the computed postcondition.
Reviewed By: ngorogiannis
Differential Revision: D15900318
fbshipit-source-id: 8bb56b771