Summary:
Currently the free variables of the equality relation of a formula are
contained in the free variables of the rest of the formula, so Sh.fv
ignores them. Propagating equality facts across the star-or structure
of a formula, as necessary for quantifier elimination, breaks this
invariant. Some use cases, such as detecting which variables survive
applying a witness substitution, need to ignore the variables that
appear in the equality relation. This diff adds an argument to
conditionally ignore the variables in the equality relations.
Reviewed By: ngorogiannis
Differential Revision: D19580430
fbshipit-source-id: 2d417d89b
Summary:
In an equation such as `x = ⟨n,a⟩`, `x` is implicitly an aggregate of
size `n` (or else the equation is ill-typed). Make this explicit by
normalizing such equations to e.g. `⟨|⟨n,a⟩|,x⟩ = ⟨n,a⟩`.
Reviewed By: ngorogiannis
Differential Revision: D19358546
fbshipit-source-id: 77f67a0da
Summary:
Strengthen Equality.solve_for_vars so that it will solve cases such as
```
∃ a. ⟨n,a⟩ = ⟨m,b⟩^⟨o,c⟩
```
Reviewed By: ngorogiannis
Differential Revision: D19356324
fbshipit-source-id: a57625ba6
Summary:
Due to paying closer attention to which terms are already normalized
by the accumulated solution substitution.
Reviewed By: ngorogiannis
Differential Revision: D19356323
fbshipit-source-id: e162414a0
Summary:
The function transforming terms passed to Sh.map might produce trivial
constraints, filter them out.
Reviewed By: ngorogiannis
Differential Revision: D19286628
fbshipit-source-id: e3d9926ce
Summary:
In order for Equality.solve to generate fresh variables, it needs to
be passed the universal context with respect to which variables must
be chosen fresh.
Reviewed By: ngorogiannis
Differential Revision: D19286630
fbshipit-source-id: ebbedd954
Summary:
Aggregate args of Extract and Concat must be aggregate terms, in
particular, not variables. This maintains the property that the size
of any aggregate can be computed.
Reviewed By: ngorogiannis
Differential Revision: D19286625
fbshipit-source-id: 1af1e4183
Summary:
The byte-array theory used for the contents of memory is strong
enough to express all the constraints arising during symbolic
execution without the ability to extract a slice out of a byte-array.
However, without Extract, it is not possible to solve some equations
for some variables, for example solving ⟨n,α⟩^⟨m,β⟩ = ⟨l,γ⟩ for α.
Solving such equations is needed for quantifier elimination and to
formulate the byte-array theory as a Shostak theory.
Reviewed By: ngorogiannis
Differential Revision: D19286632
fbshipit-source-id: 07dc112d0
Summary:
Revise the ad hoc treatment to drop tautologous constraints such as
∃x,y. x = y to only apply when x and y do not appear elsewhere.
Reviewed By: ngorogiannis
Differential Revision: D19282633
fbshipit-source-id: 7fb9951ec
Summary:
Equality.solve_for_var_contexts can produce solution substitutions
that witness existentials in more cases than the
single-existential-occurrence heuristic, and is more conservative in
cases where the latter is incomplete.
Reviewed By: ngorogiannis
Differential Revision: D19282634
fbshipit-source-id: f86f0a9cb
Summary:
```
val trim : bound:Var.Set.t -> Var.Set.t -> t -> t
(** [trim bound kills subst] is [subst] without mappings that mention
[kills] or [bound ∩ fv x] for removed entries [x ↦ u] *)
```
To be used for existential witnessing and quantifier elimination.
Reviewed By: ngorogiannis
Differential Revision: D19282644
fbshipit-source-id: d981a1cb4
Summary:
```
val solve_for_vars : Var.Set.t list -> t -> Subst.t
(** [solve_for_vars \[v₁;…\] r] is a solution substitution that is
entailed by [r] and consists of oriented equalities [x ↦ u] such that
[fv x ⊈ vᵢ ⊇ fv u] where [i] is minimal such that [vᵢ]
distinguishes [fv x] and [fv u], if one exists. *)
```
To be used for existential witnessing and quantifier elimination.
Reviewed By: ngorogiannis
Differential Revision: D19282636
fbshipit-source-id: c5b006cea
Summary:
It is necessary to normalize subterms of Memory and Concat terms or
else Equality.entails_eq is incomplete. They ought to be Interpreted,
but the solver for the byte-array theory is not yet ready for that.
Reviewed By: ngorogiannis
Differential Revision: D19282635
fbshipit-source-id: c06b6ca6d
Summary:
The equality relation is implied by the pure part, so cannot involve
more variables. Also, Sh.invariant checks that the equality relation
does not contain unbound variables.
Reviewed By: ngorogiannis
Differential Revision: D19282641
fbshipit-source-id: 21dd37a3b
Summary:
If `Term.solve_zero_eq` is passed `for_`, then that subterm is solved
for.
Reviewed By: ngorogiannis
Differential Revision: D19282647
fbshipit-source-id: 5d5b76af5
Summary:
Match the `x` suffix naming convention of Term pp functions that take
a classification function.
Reviewed By: ngorogiannis
Differential Revision: D19282639
fbshipit-source-id: fc340e4bc
Summary:
Equality relies on the result of solving an equation to be a "solution
substitution". In constrast to unconstrained Map's, solution
substitutions are idempotent and have constraints on the terms that
may appear in their domain (they must be "maximal solvables", that is,
variables or uninterpreted function applications, which would be
variables if explicit "variable abstraction" was done).
This diff factors out the manipulation of concrete Map's into a
Equality.Subst module, and uses these for the result of `solve`.
Reviewed By: ngorogiannis
Differential Revision: D19282637
fbshipit-source-id: 4fc825e59
Summary:
In preparation for constructing solution substitutions in solve, which
are closely tied to Equality.
Reviewed By: ngorogiannis
Differential Revision: D19282640
fbshipit-source-id: ca0f8ae29
Summary:
Identifying and separating one of the monomials in a polynomial, and
solving an equality for it, is much more dependent on the
representation of polynomial terms than the rest of solve.
Reviewed By: ngorogiannis
Differential Revision: D19282645
fbshipit-source-id: 645191ae0
Summary:
The exposed constructors for Memory and Concat Terms are only used in
a very special idiom: to construct an equality between a single Memory
chunk and the Concat of multiple Memory chunks. This diff specializes
and simplifies by exposing a Term.eq_concat constructor for this
idiom, and removes the underlying Term.memory and Term.concat
constructors.
Reviewed By: ngorogiannis
Differential Revision: D19221866
fbshipit-source-id: 4842737d2
Summary:
Trace.infok is like Trace.info but accepts a polymorphic printf
continuation instead of directly taking a format string and its
args. This is useful to write wrappers such as:
```
let trace k = [%Trace.infok k]
```
Reviewed By: ngorogiannis
Differential Revision: D19221883
fbshipit-source-id: 88e939b26
Summary:
The size of Splats is redundant, as they always appear as subterms of
a Memory chunk or a heap segment, both of which are sized.
Reviewed By: ngorogiannis
Differential Revision: D19221870
fbshipit-source-id: 74044d1ad
Summary:
Now that they are uncurried, congruence closure does not need the
order of subterms to be preserved. Sorting them reduces redundancy in
case the same equality in different orders is encountered, and
improved printing.
Reviewed By: ngorogiannis
Differential Revision: D19221875
fbshipit-source-id: c6bf4ccad
Summary:
Equality.classes was assuming a simpler representation, and was
incomplete as a result.
The 'representative' map is not kept in a normalized form, where
subterms are necessarily representatives. Therefore, applying the
representative map to subterms of terms in a class can reveal new
elements of the class. This mirrors how the `lookup` function in
`normalize` works.
Reviewed By: ngorogiannis
Differential Revision: D19221868
fbshipit-source-id: 4a2ed6d3f
Summary:
Reduce redundancy by printing adjacent segments as if they had been
concatenated together.
Reviewed By: ngorogiannis
Differential Revision: D19221881
fbshipit-source-id: 613105864
Summary:
Also, previous code was sometimes inconsistent regarding whether to
enumerate all subterms or only toplevel terms.
Reviewed By: ngorogiannis
Differential Revision: D19221873
fbshipit-source-id: e8644098b
Summary: It is easier to understand the order of args with diff_inter.
Reviewed By: ngorogiannis
Differential Revision: D19221869
fbshipit-source-id: b29ac83c8
Summary:
Add some test cases from Reuss and Shankar for equality that are
mishandled by Shostak's original algorithm.
Reviewed By: ngorogiannis
Differential Revision: D19221880
fbshipit-source-id: a6f9d51e3
Summary:
This diff enables parsing and auto-formatting documentation
comments (aka docstrings).
I have looked at this entire diff and manually made some changes to
improve the formatting. In some cases it looked like it would take too
much time, or benefit from someone more familiar with the code doing
it, and I instead disabled auto-formatting docstrings in those files.
Also, there are some source files where the docstrings are invalid,
and some where the structure detected by the parser appears not to
match what was intended. Auto-formatting has been disabled for these
files.
Reviewed By: ezgicicek
Differential Revision: D18755888
fbshipit-source-id: 68d72465d
Summary:
OCaml 4.08 supports a form of signature-local bindings, to that a type
can be defined in order to be used in other definitions, without
being part of the signature itself.
Reviewed By: ngorogiannis, jvillard
Differential Revision: D18736380
fbshipit-source-id: 0bb043de6
Summary:
OCaml 4.08 has a new warning (66) on unused `open!` statements. This
has a suboptimal interaction with `ppx_let`'s `let%map_open` which
leads to triggering the warning if any of a group of such let bindings
does not need the open.
In this case, the refactor is easy.
But, warning 66 is very dubious, so also just switch it off.
Reviewed By: jvillard
Differential Revision: D18708466
fbshipit-source-id: 77618ab6e
Summary:
It seems to be effectively unmaintained, as it still doesn't support
4.08.
Reviewed By: jvillard
Differential Revision: D18708467
fbshipit-source-id: dcb3361fc
Summary:
Term.solve makes the assumption that all distinct normalized constants
denote distinct values. This is fragile at best, and it is better to
enumerate the cases where solve discovers inconsistency.
Reviewed By: jvillard
Differential Revision: D18459619
fbshipit-source-id: 71f52557c
Summary:
Equality.or_ assumed a simpler representation of equality relations,
and was incomplete as a result.
Reviewed By: jvillard
Differential Revision: D18298138
fbshipit-source-id: cf91229f6
Summary:
The treatment of type conversions is too complicated, non-uniform,
etc. This diff attempts to simplify things by separating integer to
integer conversions, which are interpreted, from others, which are
essentially just uninterpreted functions. Integer conversions are now
handled using two expression and term forms: Signed and
Unsigned. These each interpret their argument as either a signed or
unsigned number of a given bitwidth:
```
| Signed of {bits: int}
(** [Ap1 (Signed {bits= n}, dst, arg)] is [arg] interpreted as an
[n]-bit signed integer and injected into the [dst] type. That is,
it two's-complement--decodes the low [n] bits of the infinite
two's-complement encoding of [arg]. The injection into [dst] is a
no-op, so [dst] must be an integer type with bitwidth at least
[n]. *)
| Unsigned of {bits: int}
(** [Ap1 (Unsigned {bits= n}, dst, arg)] is [arg] interpreted as an
[n]-bit unsigned integer and injected into the [dst] type. That
is, it unsigned-binary--decodes the low [n] bits of the infinite
two's-complement encoding of [arg]. The injection into [dst] is a
no-op, so [dst] must be an integer type with bitwidth greater than
[n]. *)
| Convert of {src: Typ.t}
(** [Ap1 (Convert {src}, dst, arg)] is [arg] converted from type [src]
to type [dst], possibly with loss of information. The [src] and
[dst] types must be [Typ.convertible] and must not both be
[Integer] types. *)
```
Reviewed By: ngorogiannis
Differential Revision: D18298140
fbshipit-source-id: 690f065b4
Summary:
Extend the APRON-backed interval analysis to handle a wider range
of LLAIR expressions.
Reviewed By: jvillard
Differential Revision: D17858072
fbshipit-source-id: c50f5bf20
Summary:
In some cases the result of an integer conversion needs to be
truncated by a bit.
Differential Revision: D18271179
fbshipit-source-id: e80740045
Summary:
Add a new interval abstract domain. This domain uses the APRON
numerical analysis library to keep track of the range of values held
by llair variables where possible. This works by translating LLAIR
expressions into APRON tree expressions, so only handles the
subset of the LLAIR expression language that can be embedded.
Note also that function summarization is not yet implemented.
Future commits will add summarization and improve coverage of
LLAIR's expression language.
Reviewed By: jberdine
Differential Revision: D17763517
fbshipit-source-id: 826ce4cc5
Summary:
The Used globals (pre-)analysis produces results queried by
Control. This diff adds a type definition for these and moves the
query into the Used_globals module.
Reviewed By: bennostein
Differential Revision: D17856879
fbshipit-source-id: 0211b82d7
Summary:
To avoid code explosion, the frontend emits move instructions for
expressions with more than one use. This diff relaxes this slightly by
allowing duplication of casts.
Reviewed By: bennostein
Differential Revision: D17856384
fbshipit-source-id: 6f6c496ef
Summary:
The frontend translation of exceptional control flow is untrusted
enough that it makes sense to disable it by default.
Reviewed By: bennostein
Differential Revision: D16061018
fbshipit-source-id: 65dca36ae
Summary:
The CFG of a function is implicit in the blocks themselves, so it is
possible to remove the explicit represention as a vector of
blocks. The only uses are fold or iter, and since the cycles are
detected during construction, these can be simple depth-first
traversals.
Reviewed By: bennostein
Differential Revision: D17821845
fbshipit-source-id: fc7a02151
Summary:
Fix a bug where the actual return variable was not scoped correctly in
cases where its name clashed with a local or formal of the
callee. Also comment and simplify to attempt to make more
understandable.
Reviewed By: bennostein
Differential Revision: D17801944
fbshipit-source-id: 286739241
Summary:
Some code that is otherwise benignly scalar still uses the
ExtractElement and InsertElement vector operations, so translate them
as if they were array operations.
Reviewed By: ngorogiannis
Differential Revision: D17801949
fbshipit-source-id: 89f3666bd
Summary:
By some unfortunate logic, OCaml often decides to use
`sexp_list`/`sexp_option` instead of just `list`/`option`. Sometimes
these get copy/pasted in interface files.
It would be good to tell OCaml not to do that in the first place but in
the meantime: this diff.
Reviewed By: ngorogiannis
Differential Revision: D17907938
fbshipit-source-id: 7546834a2
Summary:
For test scripting purposes, when the analysis finishes successfully,
report the number of alarms.
Reviewed By: ngorogiannis
Differential Revision: D17801947
fbshipit-source-id: 1660866df
Summary:
In a spec, it currently may be that foot.us does not contain xs. So
exec_specs needs to extend the vocabulary of foot before existentially
quantifying out xs.
Reviewed By: ngorogiannis
Differential Revision: D17801933
fbshipit-source-id: 7b4b9262a
Summary:
Previously it was added to the locals before calling Dom.call, but
this results in the scope of freturn ending too early.
Reviewed By: ngorogiannis
Differential Revision: D17801939
fbshipit-source-id: 739ec8981
Summary:
Some globals have 'appending' linkage, where linking modules results
in appending the arrays from each module. These can appear even when
empty, leading to useless and somewhat troublesome 0-length arrays. So
drop them.
Reviewed By: ngorogiannis
Differential Revision: D17801927
fbshipit-source-id: d2dc180d7
Summary:
While BitCasts are the identity function on the bitwise
representation, they are not necessarily so in the semantics or the
logical representation. So be more conservative about eliding them in
the Exp language. Those that are actually semantic identities are
still omitted in the Term language.
Reviewed By: ngorogiannis
Differential Revision: D17801950
fbshipit-source-id: bf9ae57b5
Summary:
The analyzer (currently) hard-codes some assumptions about sizes of
basic types such as Typ.bool, Typ.siz, etc. Check that these
assumptions are satisfied by the input llvm datalayout, and give
reasonable error messages otherwise.
Reviewed By: ngorogiannis
Differential Revision: D17801941
fbshipit-source-id: 4fe484ee0
Summary:
Now that expression types and type sizes can be computed, it is not
necessary to store the sizes of globals separately.
Reviewed By: ngorogiannis
Differential Revision: D17801932
fbshipit-source-id: f746e506b
Summary:
- The `Llvm_target.DataLayout.size_in_bits` needs to be used for checking casts
e.g. it is ok to `bitcast <16 x i1> to i16`: they both have 16 bits, but they have sizes 16 vs 2 bytes
- The `Llvm_target.DataLayout.abi_size` needs to be used for the size of memory blocks containing values
e.g. for the size of memory segments containing the initial values of globals
- The example above shows that we can't compute the byte size from the bit size without knowing the target specific datalayout
- So we need both in each sized type
- Also add checks that Convert exps and terms are not no-ops
- Simplifications of size manipulating code
Reviewed By: ngorogiannis
Differential Revision: D17801928
fbshipit-source-id: 8c8ce6128
Summary:
In order to type-check casts, it is necessary to have the size of each
sized type. This size information is also useful in a few other places.
Reviewed By: bennostein
Differential Revision: D17801931
fbshipit-source-id: f8ef53276
Summary:
This is needed since expressions distinguish between the integer or
pointer zero value and zero-initialized array/tuple/struct aggregates
based on type, and the backend distinguishes them semantically.
Reviewed By: bennostein
Differential Revision: D17801938
fbshipit-source-id: ac8665e65
Summary:
Linking can lead to opaque types becoming identified with a known
types. Assertions in various places that types should be sized can be
triggered by such opaque types. Until there is a distinction between
processing fully-linked versus incomplete code, these checks need to
be relaxed to permit opaque types where sized ones are expected.
Reviewed By: bennostein
Differential Revision: D17801929
fbshipit-source-id: c5e62f7c8
Summary: Integer terms need to compare higher than any monomial.
Reviewed By: bennostein
Differential Revision: D17725607
fbshipit-source-id: c64fd52d5
Summary:
Also weaken definition of Typ.castable to permit casting between
floats and ints of the same size.
Reviewed By: bennostein
Differential Revision: D17725611
fbshipit-source-id: 5e8114e26
Summary:
Typ.equivalent is currently defined the same as Typ.castable, but
conceptually they are different and castable needs to be
weakened. They are different since for example it is possible to cast
from an i64 to a f64, but those types denote different sets of values
in the semantics, and the bitcast is modeled using a conversion
function.
Reviewed By: bennostein
Differential Revision: D17725615
fbshipit-source-id: 973574f2a
Summary:
For function calls where the callee is a cast expression, previous the
wrong type would be used for the callee. This could lead to crashes in
llvm, or asserting in sledge.
Reviewed By: bennostein
Differential Revision: D17725610
fbshipit-source-id: 938b49a49
Summary:
Some called functions are represented in llvm as a global variable
with e.g. external linkage, and so they do not appear as
'functions'. It is still valid to call such functions, though the
analyzer does not know their definitions.
Reviewed By: bennostein
Differential Revision: D17725609
fbshipit-source-id: 333d19c0d
Summary:
Improve Trace.fail to log the error and raise informative exceptions.
Eliminate the confusion between Import.fail and Trace.fail by removing
Import.fail.
Reviewed By: bennostein
Differential Revision: D17725608
fbshipit-source-id: 79fdfbd86
Summary:
By default all functions except those specified as entry points in the
config file are "internalized". Internal functions are removed if they
are not called. It is sometimes necessary to disable internalization,
e.g. to analyze the llvm tests.
Reviewed By: bennostein
Differential Revision: D17725614
fbshipit-source-id: 4b13501f5
Summary:
Sometimes the models for the C/C++ runtime and standard libraries are
not needed. Furthermore, sometimes, e.g. when analyzing llvm tests,
trying to link them fails.
Reviewed By: bennostein
Differential Revision: D17725616
fbshipit-source-id: 76a4bcf90
Summary:
The `(t, unit) result` type is no more informative than `t option` and
less convenient.
Reviewed By: bennostein
Differential Revision: D17665244
fbshipit-source-id: fa969d8b7
Summary:
This puts the mediation between Exp and Term together in Sh_domain
rather than being spread across the two.
Reviewed By: bennostein
Differential Revision: D17665235
fbshipit-source-id: edf277d45
Summary:
The move instruction takes a vector of assignments to perform in
parallel, so generalize exec_move from one to a vector.
Reviewed By: bennostein
Differential Revision: D17665248
fbshipit-source-id: 52aae5ff9
Summary:
Extend the encoding using `id` from 0 indicating a program variable to
also -1 indicating a global program variable.
Reviewed By: bennostein
Differential Revision: D17665229
fbshipit-source-id: 848b8a31e
Summary:
The sorting of heap blocks when printing formulas was broken by the
change to the direct representation of polynomials.
Reviewed By: bennostein
Differential Revision: D17665246
fbshipit-source-id: 4ebea9f20
Summary: It is not necessary to have both < and >=, and similarly for <= and >.
Reviewed By: bennostein
Differential Revision: D17665232
fbshipit-source-id: 01b3511f5
Summary:
Now that terms operate over unbounded, signed, integers rather than
bounded integers, and Boolean operations are treated uniformly with
bitwise operations, it is not necessary to propagate types throughout
arithmetic term manipulation.
Reviewed By: bennostein
Differential Revision: D17665257
fbshipit-source-id: 5236b101d
Summary:
Z.numbits ignores the sign, which allows 2^(N - 1) as representable
within N bits, while it is not. So check explicitly.
Reviewed By: bennostein
Differential Revision: D17665231
fbshipit-source-id: 0d3940517
Summary:
Instead of having separate signed and unsigned operations, use the
signed operations applied to explicit conversion of the arguments
using an unsigned integer interpretation.
Reviewed By: bennostein
Differential Revision: D17665267
fbshipit-source-id: 0b3271e71
Summary:
Add an Extract term form to interpret an integer with given signedness
and bitwidth.
Reviewed By: bennostein
Differential Revision: D17665263
fbshipit-source-id: 1d8917f3c
Summary:
Be more explicit about semantics of unsigned vs. signed conversions,
and fix a few related corner cases.
Reviewed By: bennostein
Differential Revision: D17665268
fbshipit-source-id: 67fecdf34
Summary:
With terms using unbounded two's complement arithmetic, it is not
necessary to special-case 1-bit integers as Booleans.
Reviewed By: ngorogiannis
Differential Revision: D17665228
fbshipit-source-id: a2f280fc3
Summary:
Remove the guards that prevent normalizing in some cases where the
corresponding instruction in LLVM would produce a poison
value. Usefully tracking poison values will be more involved.
Reviewed By: ngorogiannis
Differential Revision: D17665230
fbshipit-source-id: 59fb25042
Summary:
Revise program expressions based on the changed constraints now that
Term is separate from Exp. In particular:
- Add types to all application, indicating how the operation
interprets its arguments
- Change to a simpler uncurried form
- Remove now-unneeded normalizations
Reviewed By: bennostein
Differential Revision: D17665236
fbshipit-source-id: 1bcf2efd6
Summary:
Boolean and bitwise negation of `e` is represented using `-1 xor
e`. Since Equality can only maintain and propagate equality
constraints, Boolean negation `-1 xor b` is normalized to `b =
false`. This diff delays this normalization from being part of
expression construction to part of symbolic heap formula
construction. This makes the normalization done as part of expression
construction independent of the distinction between bitwise and
boolean operations.
Reviewed By: bennostein
Differential Revision: D17665254
fbshipit-source-id: 0a0722865