Summary:
- Fix clamping of percentage change of memory quantities
- Improve sorting of coverage changes relative to unchanged error
results
- Filter out results for invalid LLVM code
- Fix printing multiple status values
- Fix passing args to sledge, different subcommands need different
args
- Fix cleaning commands
- Add an unknown error status to the report if sledge exits without
producing a status in the report
- Internalize globals for tests, as otherwise C++ tests are unreadable
due to the runtime system models
Reviewed By: jvillard
Differential Revision: D25756567
fbshipit-source-id: 3b4c003f4
Summary:
When a stem has an occurrence of an existential, say x, and x is
substituted out by different witnesses, say a and b, in two disjuncts,
then the connection to other occurrences of a and b are lost.
This diff fixes this problem by propagating the solved variables that
survive normalization down to subformulas, and not removing those
variables from subformulas.
Reviewed By: jvillard
Differential Revision: D25756583
fbshipit-source-id: dabda743f
Summary:
Trade a bit more code for lowering complexity from linear to
logarithmic.
Reviewed By: jvillard
Differential Revision: D25756569
fbshipit-source-id: 83ad575fe
Summary:
Composing two substitutions does not need to require that their
domains are disjoint.
Leave disjointness check for composing a single mapping just to check
expected usage.
Reviewed By: jvillard
Differential Revision: D25756557
fbshipit-source-id: 04e92b864
Summary:
The overall operation of the quantifier elimination algorithm in
Sh.simplify is to first descend through the disjunctive structure of a
symbolic heap formula, process the leaves, and then proceed back up
the formula. At each point on the way back up, the first-order solver
is used to compute a solution substitution representing witnesses of
existential variables. This substitution is then used to normalize the
entire subformula at that point. Note that this results in
substituting and normalizing each subformula a number of times
proportional to its depth in the disjunctive structure.
This diff removes this redundancy and substitutes through each
subformula once. This is done by changing from an overall bottom-up to
top-down algorithm, and involves composing solution substitutions
rather than substituting multiple times. This change also heavily
relies on the strengthened Context.apply_and_elim operation, where the
bottom-up many-substitutions approach relies on the internal details
of repeated substitutions.
Reviewed By: jvillard
Differential Revision: D25756549
fbshipit-source-id: ca7cd814a
Summary:
The current implementation of quantifier elimination in Sh.simplify is
tightly coupled with the details of what the Context operations
support. In particular, successfully eliminating variables with
Context.elim effectively relies on being given a context that has been
transformed by Context.apply_subst. These operations are sound
independently, but achieving the desired result is delicate.
To simplify this situation, this diff refactors the tightly coupled
usage into a Context.apply_and_elim operation that hides the details
of the interaction inside the Context module. This enables an accurate
specification of apply_and_elim to be given much more simply than can
be done for the separate operations. This also simplifies the
implementation of Sh.simplify.
Reviewed By: jvillard
Differential Revision: D25756577
fbshipit-source-id: b344b3da6
Summary:
The implementation of quantifier elimination in Sh.simplify currently
relies on a subtle implementation detail of Context.apply_subst to
remove some identity mappings. Now that Context.elim has been
strengthened and generalized, it can be used to to give a much clearer
implementation, that is also more robust to representation changes in
Context.
Reviewed By: jvillard
Differential Revision: D25756554
fbshipit-source-id: 6be0de2f3
Summary:
The current implementation of Context.elim crudely removes oriented
equations from terms involving the given variables. This is easy to
use in a way that violates the representation invariants of Context,
as well as destroys completeness. This diff resolves this by making
Context.elim remove the desired variables by rearranging the existing
equality classes, in particular promoting a new representative term
when the existing representative is to be removed. Also, since this
basic approach is incomplete for interpreted terms, they are detected
and not removed. As a result, the interface changes to return the set
of variables that have been removed.
Reviewed By: jvillard
Differential Revision: D25756573
fbshipit-source-id: 0eead9281
Summary:
Context.fold_uses_of should enumerate the transitive subterms of a
term rather than only the immediate subterms.
Reviewed By: jvillard
Differential Revision: D25756553
fbshipit-source-id: a3911d9f5
Summary:
Previously Sh had a representation invariant which ensured that
existentials of a subformula did not shadow universals of its
superformulas. The current implementation of Sh.freshen_nested_xs
mistakenly still assumes this invariant, but it no longer holds. This
diff fixes freshen_nested_xs to freshen nested existentials with
respect to not only ancestor universals in addition to existentials.
Reviewed By: jvillard
Differential Revision: D25756551
fbshipit-source-id: 54c48b067
Summary:
Sh.simplify propagates first-order constraints over the formula's
propositional structure, and can reveal inconsistency. It is therefore
sometimes beneficial to check consistency after simplifying rather
than before.
Reviewed By: jvillard
Differential Revision: D25756585
fbshipit-source-id: 2a205c965
Summary:
Simplifying a symbolic heap can reveal inconsistencies, either of
individual disjuncts or (therefore) of toplevel formulas. This is due
to the first-order contexts being propagated from a formula to its
descendents, and for the intersection of the contexts of disjunctions
being propagated to their parent. The later stages of eliminating
redundant quantifiers, etc. do not reveal further
inconsistencies. Therefore this diff moves the inconsistency detection
earlier, to just after the contexts are strengthened. This is clearer,
and also a minor optimization.
Reviewed By: jvillard
Differential Revision: D25756564
fbshipit-source-id: d3acf875b
Summary:
Sh operations usually detect inconsistency where needed. But some
operations such as Context.inter must treat the unsat case
specially. This diff increases robustness by not relying on Sh to
detect inconsistency in order to get the correct context or pure
constraints, or to use the fast-paths through Context or Formula
operations.
Reviewed By: jvillard
Differential Revision: D25756550
fbshipit-source-id: 091439ff6
Summary:
Sh.is_false checks if the pure constraints are inconsistent with the
first-order consequences of the spatial constraints. This involves
some, though not very expensive, logical reasoning. But is it not
checking only testing for the representation of Sh.false_, which one
might expect from its name. This renaming also makes room for the
operation that does test for the representation of Sh.false_.
Reviewed By: jvillard
Differential Revision: D25756580
fbshipit-source-id: 30510f45a
Summary:
Context.apply_subst erroneously resets everything in the
representation of a context except the solution substitution when
applying a substitution.
Reviewed By: jvillard
Differential Revision: D25756548
fbshipit-source-id: 949a34ced
Summary:
This is a simple CFG transformation that compresses jumps to jumps
into jumps to the final destination. LLVM's simplifycfg pass seemingly
should have eliminated these, but does not.
Reviewed By: jvillard
Differential Revision: D25196728
fbshipit-source-id: 3288dbd8d
Summary:
Currently the symbolic execution options, obtained from the command
line flags and preanalysis, are passed to the Control entry points as
a record, and then passed around within Control as needed. This diff
simplifies the code in Control by replacing the record mechanism with
an additional functor parameter containing the options. The main
benefit is that the bound option no longer needs to be part of the
analysis state.
This change does make the code in Sledge_cli slightly more
complicated, as it now performs a functor application using a
first-class module computed from the command line flags.
Reviewed By: jvillard
Differential Revision: D25196735
fbshipit-source-id: 80e5d5d62
Summary:
Revise the control-flow exploration scheduling algorithm to fix
several issues. The main difference is to change the priority queue to
keep the control edges on the frontier of exploration in sync with the
states that are waiting to be propagated. This fixes several sorts of
issue where the decision of which control and state joins to perform
was unexpected / wrong. Part of keeping the frontier edges and waiting
states in sync is that the waiting states are associated not only with
a destination block, but the stack of that block. This fixes several
issues.
Combined, these changes lead to the algorithm only attempting joins
for which the pointwise max join on depth maps is correct (with the
caveat of no mathematical proof yet).
Reviewed By: jvillard
Differential Revision: D25196733
fbshipit-source-id: db007fe1f
Summary:
The first-order solver sometimes needs to generate fresh variables to
express the solution of equations. It needs to ensure that these
generated variables do not clash. Before this diff, there was a
confusion where new variables were fresh with respect to only the
current set of "universal" variables. This is wrong, and this diff
adds the full set of variables instead.
Reviewed By: jvillard
Differential Revision: D25196732
fbshipit-source-id: afc56834a
Summary:
It can happen that the actual return variable, which is the name of
the variable in the caller's scope that receives the returned value,
clashes with the formals of the callee. The scope of the return
formula was wrong in this case, as the actual return was outscoped via
existential quantification with the rest of the formals after the
return values was passed.
Reviewed By: jvillard
Differential Revision: D25196727
fbshipit-source-id: 94cf25418
Summary:
A context with valid but extraneous equations was being kept. The
extra equations are valid, but might violate vocabulary invariants.
Reviewed By: jvillard
Differential Revision: D25196734
fbshipit-source-id: a8001a075
Summary:
Currently there is a symbolic execution option to ignore exceptional
control flow. This hack does not fit well, and it is unclear how much
backend functionality should take it into consideration. This diff
removes this option and replaces it with an option during model
compilation. This has the advantage of clarifying and simplifying the
backend, with the disadvantage of no longer supporting switching
between exceptions and no-exceptions modes at analysis time. Since the
possibility of ignoring exceptional control flow is due to it not being
ready yet, this is a good trade to make.
Reviewed By: jvillard
Differential Revision: D25146148
fbshipit-source-id: 1f1299ee1
Summary:
Use the equality class information in the symbolic state to resolve
callees of indirect calls.
Reviewed By: jvillard
Differential Revision: D25146160
fbshipit-source-id: a1c39bbe1
Summary:
The callee function of a Call can often be resolved
statically. Currently this is resolution is only done dynamically
during symbolic execution by checking if the callee expression is a
function name and looking up the function in the program. This is
wasted and redundant work. Also, the static resolution code is
duplicated in all the domains.
This diff resolves this by resolving known callees statically at
translation time. This involves:
- add an ICall terminator that is the same as Call is currently
- change Call to use a func callee instead of Exp.t
- make callee field mutable since recursive calls can create cycles
- change the Llair.Term.call constructor to return a thunk to perform
the backpatching once the callee has been translated
- modify the Frontend
+ to determine whether to emit Call or ICall depending on whether
the callee in LLVM is already a Function
+ to record the LLVM function -- backpatch thunk pairs encountered during translation
+ record the mapping of LLVM to LLAIR functions during translation
+ to enumerate the calls to backpatch after all functions have been
translated, and find the LLAIR function corresponding to each LLVM
function and backpatch the call to use it as the callee
+ to handle direct calls to undefined functions, when backpatching
translate such function declarations into undefined functions
Reviewed By: jvillard
Differential Revision: D25146152
fbshipit-source-id: 47d2ca1ff
Summary:
Rename the existing exec_intrinsic that works for calls to intrinsic
functions to exec_intrinsic_func to make room for an exec_intrinsic
that works on intrinsic instructions.
Reviewed By: jvillard
Differential Revision: D25146166
fbshipit-source-id: 80ae3aac9
Summary:
Currently intrinsics are treated as functions, with Call instructions
to possibly-undefined functions with known names. This diff adds an
Intrinsic instruction form to express these more directly and:
- avoid the overhead of intrinsics needing to end blocks
- avoid the overhead of the function call machiery
- avoid the complexity of doing the string name lookup to find their
specs, repeatedly
This diff only adds the backend support, the added Intrinsic
instructions are not yet generated by the frontend.
Reviewed By: jvillard
Differential Revision: D25146155
fbshipit-source-id: f24024183
Summary:
Previously, when LLAIR was in SSA form, blocks took parameters just
like functions, and it was sometimes necessary to partially apply a
block to some of the parameters. For example, blocks to which function
calls return would need to accept the return value as an argument, and
sometimes immediately jump to another block passing the rest of the
arguments as well. These "trampoline" blocks were partial applications
of the eventual block to all but the final, return value,
argument.
This partial application mechanism meant that function parameters and
arguments were represented as a stack, with the first argument at the
bottom, that is, in reverse order.
Now that LLAIR is free of SSA, this confusion is no longer needed, and
this diff changes the representation of function formal parameters and
actual arguments to be in the natural order. This also brings Call
instructions in line with Intrinsic instructions, which will make
changing the handling of intrinsics from Calls to Intrinsic less
error-prone.
Reviewed By: jvillard
Differential Revision: D25146163
fbshipit-source-id: d3ed07a45
Summary:
It was possible for the scope of a local to be incorrectly restored
when entering it for the first time in a caller after is was shadowed
by a callee. This could happen because scope management in the
analysis relies on renaming variables to adjust the vocabulary of
symbolic states. This was usually done, but optimizations of renaming
with a substitution whose domain is disjoint from the vocabulary of a
formula inadvertantly violated this vocabulary-adjustment
assumption. (Yes, this is too easy to get wrong.)
Reviewed By: jvillard
Differential Revision: D25146162
fbshipit-source-id: 30f2d657f
Summary:
These are not simply handled by `@deriving` since:
- These types are recursive and so some fields need to be ignored
- Some are uniquely identified by one of their fields
- Some fields are mutable, but set only during construction, and need
to be considered by compare, equal, hash, but ppx_hash refuses.
Reviewed By: jvillard
Differential Revision: D24989064
fbshipit-source-id: 7f8d699e5
Summary:
The use of realpath on paths obtained from debug info and the current
working directory is application-usage-specific behavior that does not
belong in the backend library. This diff moves these uses to the
frontend and cli, respectively. Also, the use of realpath in the
frontend is memoized along the same lines as the other frontend
translation functions.
This was also the last use of `core` in the `sledge` library, so the
dependency is moved to `sledge_cli` and `sledge_report`.
Reviewed By: ngorogiannis
Differential Revision: D24989070
fbshipit-source-id: c21b275f5
Summary:
Instructions that have multiple specs sometimes have inconsistent
footprints for one or more of their specifications. In such cases, the
handling of the existential and universal vocabularies was sometimes
incorrect before this diff. In particular, the ghosts of a spec need
not appear in the pure approximation of the footprint, which the
previous code incorrectly assumed. Also, the unit of disjunction is
`false` with an empty vocabulary, not the precondition's vocabulary as
the previous code incorrectly used.
Reviewed By: ngorogiannis
Differential Revision: D24989067
fbshipit-source-id: eca3bff55
Summary:
The `seq` name of this field refers to the expected sort of it's
value, where the others refer to the role they play. So rename
seq (for sequence) to cnt (for contents).
Reviewed By: ngorogiannis
Differential Revision: D24951507
fbshipit-source-id: fd6640517
Summary: It is too easy to mix up multiple arguments of the same type.
Reviewed By: jvillard
Differential Revision: D24934116
fbshipit-source-id: 6e595b26e
Summary:
Global constants have reliable types, and their sizes can be used
instead of storing the size of the initializer separately.
Reviewed By: jvillard
Differential Revision: D24934114
fbshipit-source-id: 2426ab5be
Summary:
In practice this has not been observed to matter so far, since
treating `Splat N` as interpreted or uninterpreted does not matter
when `N` is a literal constant, and code seen so far only uses `Splat`
for zero initializers or memset with literal constant bytes.
Reviewed By: jvillard
Differential Revision: D24934118
fbshipit-source-id: 213e9724e
Summary:
0 and Splat 0 need to be treated the same since code relies on knowing
that 0 consists of all-0 bytes, and extracting a subsequence of a
Splat 0 yields 0. For example, initializing a struct to all-zeros and
then reading a member of pointer type out of it needs to produce the
null pointer. Therefore 0 and Splat 0 are redundant representations,
and all uses of Splat need to be updated to also handle 0.
This unfortunately leads to some near code duplication that seems to
be necessary. The issue is that 0 and Splat 0 are, from the backend's
perspective, constants in two distinct theories. Since 0 is chosen
over Splat 0 as the representation, the sequence theory solver needs
to treat 0 as if it was Splat 0, which duplicates some code handling
the general Splat cases.
Reviewed By: jvillard
Differential Revision: D24920758
fbshipit-source-id: 7c02be62b
Summary:
Global variables and function names in LLAIR are constant and so do
not need to be handled like normal assignable or shadowable
variables. This diff does this by changing the translation from LLAIR
to FOL to map globals and functions to uninterpreted constants instead
of variables.
Reviewed By: jvillard
Differential Revision: D24886571
fbshipit-source-id: efb8c9f49
Summary:
Localizing the entry of a procedure needs the globals (that the
procedure uses), but later creating a summary does not.
Reviewed By: jvillard
Differential Revision: D24886570
fbshipit-source-id: 8a7b18c58
Summary:
The computation of provable reachability through the heap currently
uses a set of variables whose values are either determined by the
desired roots or by the heap constraints. This requires globals to be
treated as variables. In preparation for distinguishing globals from
variables, this diff changes the reachability computation to use a set
of atomic terms instead of variables.
Reviewed By: jvillard
Differential Revision: D24886573
fbshipit-source-id: c0e6763b6
Summary:
No functional change, only simplifiying and making easier to
generalize.
Reviewed By: jvillard
Differential Revision: D24886572
fbshipit-source-id: e487b815d
Summary:
Distinguish expressions that name globals from registers. This leads
to clearer code, and globals are semantically distinct from general
registers. In particular, they are constant, so any machinery for
handling assignment does not need to consider them. This diff only
adds the distinction to LLAIR, it is not pushed through to FOL, which
will come later.
Reviewed By: jvillard
Differential Revision: D24846676
fbshipit-source-id: 3aca025bf
Summary:
This module represents the definition of a global constant, rather
than the global itself.
Reviewed By: jvillard
Differential Revision: D24846673
fbshipit-source-id: d47e67984
Summary:
Distinguish expressions that name functions from registers. This leads
to clearer code, and function names are semantically distinct from
general registers. In particular, they are constant, so any machinery
for handling assignment does not need to consider them. Unlike general
globals, they never have initializer expressions, and in particular
not recursive initializers. This diff only adds the distinction to
LLAIR, it is not pushed through to FOL, which will come later.
Reviewed By: jvillard
Differential Revision: D24846672
fbshipit-source-id: 2101f353f
Summary:
LLVM and Llair use a form of records, in particular for values of
constant structs and arrays. In Llair, these use standard `select` and
`update` operations a la McCarthy's theory of functional arrays, with
a compact `record` operation for constructing complete records. This
is fine and logically well-understood. The issue is that once
constructed, these values are accessed using instructions that (may)
operate over byte-ranges, rather than struct member indices. The
backend uses a theory of sequences to represent such values (the
contents of memory). So some code depends on high fidelity
interoperation between these two views.
This diff resolves this by removing the record theory from the backend
and instead encoding them using the sequence theory. The approach
taken keeps records in Llair and translates them to sequences in
Llair_to_Fol. This choice is made since the encoding into the sequence
theory involves terms that do not have types that are expressible in
terms of the source types. In particular, `(update r i e)`, is encoded
as the concatenation of the prefix of `r` up to the offset of index
`i`, followed by `e` (possibly with padding), and then the suffix of
`r` from index `i+1` on. The prefix and suffix sequences do not
necessarily have source-expressible types.
Reviewed By: jvillard
Differential Revision: D24800866
fbshipit-source-id: e7238c558
Summary:
The support for recursive references to globals from within their
initializers is enough to handle all the cases of recursive structs
that have been encountered so far. Therefore this diff removes the
complication of recursive records entirely.
Reviewed By: jvillard
Differential Revision: D24772955
fbshipit-source-id: f59f06257
Summary:
It happens so seldomly that it is not worth it to optimistically
assume that linking will make opaque types sized. In particular, it is
incongruent for `Typ.is_sized` to hold and then `Typ.size_of` to
raise.
Reviewed By: jvillard
Differential Revision: D24772956
fbshipit-source-id: 96a72a5cf
Summary:
This information is needed to mediate between index-based
operations (such as on records) and offset-based operations (such as
load/store). Since it is fragile to recompute, the approach here is to
query llvm during translation and store the result.
Reviewed By: jvillard
Differential Revision: D24772954
fbshipit-source-id: ad22c3ecf
Summary: Do not fail when resolving the realpath of a debug info path.
Reviewed By: jvillard
Differential Revision: D24746237
fbshipit-source-id: b9dc35176
Summary:
Change Arith.map to not descend through non-interpreted arithmetic
operators. For example, in `2×(x × y) + 3×z + 4`, `map ~f` will apply
`f` to the subterms `x × y` and `z`, but not `x` or `y`.
The logical notion of "subterm" that is needed by the solver does not
coincide with the representation. This is essentially due to not
"flattening" or "purifying" terms. That is, traditionally `x × y`
would not be permitted as an indeterminate of a polynomial. Instead, a
new variable would need to be introduced: `v = x × y` and then the
polynomial would be expressed as `2×v + 3×z + 4`. Taking maximal
non-interpreted subterms as the definition of "subterm" results in
subterms in the non-flattened representation that are equivalent to
those that would result from flattening the representation.
Reviewed By: jvillard
Differential Revision: D24746235
fbshipit-source-id: d8fcf46a1
Summary:
The implementation of Arithmetic relies on the partial projection from
terms to polynomials. This diff enables it to also embed polynomials
back into terms.
Reviewed By: jvillard
Differential Revision: D24746223
fbshipit-source-id: b6010e7b7
Summary:
Add a distinction between interpreted and uninterpreted arithmetic
terms, and use it in Context.classify. This enables correctly
classifying non-linear terms such as `x × y` as uninterpreted.
Reviewed By: ngorogiannis
Differential Revision: D24746228
fbshipit-source-id: 1a4b0e3bd
Summary:
It was possible for the scope of existentials to be violated. In
particular, before this diff the order of re-quantifying the
existentials and conjoining the non-eliminated equations from the
solution of solving for existentials was wrong.
Reviewed By: ngorogiannis
Differential Revision: D24746231
fbshipit-source-id: d96cc60a6
Summary:
In the process of computing `Context.solve`, fresh variables can be
generated. Not all of these end up in the final solution
substitution. Currently all of the freshly generated variables are
returned to the client, which leads to extraneous existentials. This
diff trims the returned fresh variables to only those that appear in
the final solution.
Reviewed By: ngorogiannis
Differential Revision: D24746241
fbshipit-source-id: 59a2f221b
Summary:
Since floats of any width are interpreted the same (as exact rationals
where possible and uninterpreted constants otherwise), this does not
introduce additional infidelity.
Reviewed By: da319
Differential Revision: D24746225
fbshipit-source-id: bc8e7bdb9
Summary:
Since non-integral address spaces are not currently supported anyhow,
this does not introduce additional infidelity.
Reviewed By: da319
Differential Revision: D24746234
fbshipit-source-id: 1f6887a78
Summary: Just to make the source and destination types of the conversion more clear.
Reviewed By: da319
Differential Revision: D24746239
fbshipit-source-id: 592c7d0f1
Summary:
Since Context treats only equality directly, formulas involving other
literals can normalize to false when the context is not unsat. This
diff changes Sh.star to check this case, and return the canonical
false symbolic heap.
Reviewed By: da319
Differential Revision: D24746227
fbshipit-source-id: 50a51b8a6
Summary:
When exceptions are used due to the lack of goto, use `raise_notrace`
instead of `raise` to avoid the overhead of populating the backtrace.
Reviewed By: ngorogiannis
Differential Revision: D24630525
fbshipit-source-id: c5051d9c4