Summary:
Make the relationship between Sh.is_empty and Sh.pure_approx stronger
and more precise. In particular:
> If [is_empty q], then [pure_approx q] is equivalent to
> [pure (pure_approx q)].
This enables replacing Solver.excise_pure with a simpler pure_entails
function. In particular, the heavy reliance on normalization of pure
formulas to true or false literals is eliminated, and only pure
entailment is needed.
Reviewed By: jvillard
Differential Revision: D22571146
fbshipit-source-id: 2fca64a61
Summary:
Generalize Fol interface to allow checking if a context implies any
formula, rather than restricting to only equalities.
Reviewed By: jvillard
Differential Revision: D22571144
fbshipit-source-id: 726bd87fd
Summary:
There is nothing specific to the Ses representation in the
implementation, and no uses within Ses.
Reviewed By: jvillard
Differential Revision: D22571150
fbshipit-source-id: 8952f0301
Summary:
In Ses, the constant term of a polynomial is represented as a
redundant multiplication by 1. Fix Fol.of_ses to recognize and
normalize this.
Reviewed By: jvillard
Differential Revision: D22571131
fbshipit-source-id: 3e1a12e5f
Summary:
The Ses constructors might simplify terms when called from
Fol.to_ses. Fix Fol.ses_map to account for this.
Reviewed By: jvillard
Differential Revision: D22571151
fbshipit-source-id: 1d573ac5f
Summary:
There is nothing specific to the Ses representation in the
implementation, and no uses within Ses.
Reviewed By: jvillard
Differential Revision: D22455725
fbshipit-source-id: 6f0059873
Summary:
In preparation for more smoothly interoperating with ICS's functional
array theory.
Reviewed By: jvillard
Differential Revision: D22401039
fbshipit-source-id: 4de39c38a
Summary:
It is not needed or very meaningful since the addition of type-based
disambiguation.
Reviewed By: jvillard
Differential Revision: D22401035
fbshipit-source-id: 31996f946
Summary:
The first-order context is induced by the pure part, so no need to
compare it.
Reviewed By: jvillard
Differential Revision: D22381645
fbshipit-source-id: 29fff13a3
Summary:
In order to allow implementations of the single Fol interface using
multiple backend first-order logic solvers, add explicit definitions
of terms and formulas in the Fol module, and implement Context in
terms of them.
The Fol interface supports freely mixing Terms and Formulas, in
particular there is `Term.ite : cnd:Formula.t -> thn:Term.t ->
els:Term.t -> Term.t` which allows Formulas to appear in Terms. The
Fol implementation performs enough normalization to enable using an
internal representation of terms that is strictly partitioned into
"theory terms" and "formulas", which are stratified below "conditional
terms" and then below "general terms". This partitioning and
stratification enables using backend solvers that do not support
mixing formulas in terms.
Reviewed By: jvillard
Differential Revision: D22170506
fbshipit-source-id: a014ee7d7
Summary:
The eventual aim is for the conversion of Llair to Fol to be external
to Fol. Fol should not need to depend on Llair, nor vice versa. This
is not yet possible, but a step forward is to move the conversion
functions into separate modules outside the core Fol modules.
Reviewed By: ngorogiannis
Differential Revision: D22170522
fbshipit-source-id: 4860b4c07
Summary:
The `Subst` module is not used by `Ses`, so move it out of the
Ses-internal APIs.
Reviewed By: ngorogiannis
Differential Revision: D22170513
fbshipit-source-id: a189a1440
Summary:
Change Term and Equality rename functions to accept a partial
application of Var.Subst.apply to the renaming substitution.
Reviewed By: ngorogiannis
Differential Revision: D22170520
fbshipit-source-id: 003d8b27e
Summary:
The convention is for modules that are intended to be `open`ed, that
define syntax and infix operations, are named `Import`. This diff
combines the `Option.Monad_infix` and `Option.Monad_syntax` modules
into `Option.Import` to follow this convention.
Reviewed By: ngorogiannis
Differential Revision: D22170507
fbshipit-source-id: 44378fd56
Summary:
Exponents in polynomials are represented by rationals for code reuse
purposes, but only integral exponents are supported. Strengthen the
invariant check to clarify this.
Reviewed By: ngorogiannis
Differential Revision: D22170517
fbshipit-source-id: 81ae38f4e
Summary:
Refactor `Equality.ppx_classes_diff` into `diff_classes` to construct
the difference between classes maps explicitly, and change
`ppx_classes` to accept such a difference instead of computing it
internally.
This more flexibly composable interface allows elimination of extra
calls to `Equality.entails` to check if the difference is empty.
Reviewed By: jvillard
Differential Revision: D22038488
fbshipit-source-id: c19c18fc8
Summary:
The list interpreted as a conjunction of individual terms in `Sh.pure`
is now redundant with `Term.And`. This patch removes the redundant
list.
Reviewed By: jvillard
Differential Revision: D22035852
fbshipit-source-id: 49c01a078
Summary:
With the current handling of fresh variable generation during symbolic
execution, it is now possible to delay generating fresh variables in
individual small axioms until the precondition is known. In
particular, the existential variables of the precondition formula can
be bound, and then the small axiom can be generated with variables
fresh with respect to them. Previously, the small axioms were
generated with fresh variables that could later clash with the
precondition's existentials, necessitating renaming. This
double-freshening is now eliminated.
Reviewed By: jvillard
Differential Revision: D21974022
fbshipit-source-id: f217bfb9f
Summary:
When fresh variables are generated to name the overwritten value in an
assignment, they should be included in the ghost variables of the
resulting small axiom. This change should have been included in the
elimination of SSA.
Also strengthen assertion checking of small specs during symbolic
execution.
Reviewed By: jvillard
Differential Revision: D21974019
fbshipit-source-id: a66d8dac6
Summary:
Currently the symbolic execution code in `Exec` manually threads
universal and existential variable contexts through virtually every
function. It is easy to mistakenly pass on a context that is not the
latest-extended one, or to forget to add generated variables to the
contexts.
This patch adds a state monad, `Fresh`, to manage the generation of
fresh variables in `Exec`. This is a standard state monad where the
state is two sets of variables: those to which fresh variables must be
chosen fresh, and those which have been generated. This yields an
abstraction where an `'a Fresh.t` value represents a value of type
`'a` which may contain as-yet-unnamed variables, and `Fresh.gen ~wrt
a` generates names that are fresh with respect to `wrt` for all
unnamed variables in `a`, and yields the set of generated variables
together with `a` expressed in terms of those variables.
Reviewed By: jvillard
Differential Revision: D21974018
fbshipit-source-id: 1917e82c0
Summary: Minor code simplification and optimization of `extend_us` in no-op case.
Reviewed By: jvillard
Differential Revision: D21974021
fbshipit-source-id: a8b12b564
Summary:
Change the `Var.Subst` `freshen` and `restrict` constructors to return
the domain and range of the substitution explicitly. Clients generally
need to compute them immediately, and they are at least partially
constructed during the initial substitution construction anyhow. This
may be an incidental minor optimization.
This allows removing the `apply_set` operation, as it's use can be
handled directly from the domain and range sets.
This also allows `Sh.rename` to be split into a function that assumes
that the substitution is restricted to the vocabulary of the formula,
and a wrapper that does this restriction and calls through. This
allows `Sh.freshen_xs` to be simplified slightly, and avoids some
redundant restriction, domain, and range computations.
Reviewed By: jvillard
Differential Revision: D21974017
fbshipit-source-id: aa8b3db24
Summary:
`Domain_sh.from_call.subst` is a substitution that replaces shadowed
variables with fresh ones, which is constructed by `Domain_sh.call`
and used by `Domain_sh.retn`, after inverting it. This patch changes
the stored substitution to the inverted one, and renames it to
`unshadow` for clarity.
After this change, the stored substitutions have the property that
they map variables to program variables. This is desirable since it
avoids the question about the uninverted substitution of whether the
variables in the range of the uninverted substitution are "fresh".
Reviewed By: jvillard
Differential Revision: D21974020
fbshipit-source-id: d469c89f9
Summary:
Logically there is nothing specific to memory contents (as
byte-arrays) or aggregate (struct/array) values, the theory is for
sequences of non-fixed sized elements.
Reviewed By: jvillard
Differential Revision: D21721019
fbshipit-source-id: b2b730a50
Summary:
`Term.eq_concat` is not primitive and complicates the `Term`
interface. Move it to a couple clients as a convenience wrapper.
Reviewed By: jvillard
Differential Revision: D21721026
fbshipit-source-id: 0d74aa251
Summary:
Previously `null` and `zero` had different sorts/types, but now they
are equivalent.
Reviewed By: jvillard
Differential Revision: D21721023
fbshipit-source-id: 485219f6a
Summary:
Multiplication by a constant is primitive in the linear arithmetic
solver, while general multiplication is not, so for clarity and
predictability, use constants where possible.
Reviewed By: jvillard
Differential Revision: D21721020
fbshipit-source-id: 3497d06c9
Summary:
It is suboptimal for `Sh` to destruct terms with detailed knowledge of
their representation. So add `Term.const_of` to obtain the constant
summand of a polynomial term.
Reviewed By: jvillard
Differential Revision: D21721022
fbshipit-source-id: 4af858896
Summary:
It is suboptimal for `Sh` to destruct terms with detailed knowledge of
their representation. So add `Term.d_int` to destruct an integer term.
Reviewed By: jvillard
Differential Revision: D21721024
fbshipit-source-id: 5f13794b6
Summary:
It is suboptimal for `Sh` to destruct terms with detailed knowledge of
their representation. So add `Term.disjuncts` to obtain the toplevel
disjuncts of a term.
Reviewed By: jvillard
Differential Revision: D21721016
fbshipit-source-id: 809da9b1b
Summary:
It has no dependencies on the rest of the sledge codebase and might be
more generally useful.
Reviewed By: jvillard
Differential Revision: D21720980
fbshipit-source-id: b4f061e73
Summary:
Now that the frontend translates LLVM's undef to nondet instructions,
not expressions, Nondet in Exp and Term are not needed.
Reviewed By: jvillard
Differential Revision: D21720969
fbshipit-source-id: e8acaf432
Summary:
Instead of relying on Exp.nondet to encode the semantics of LLVM's
undef, translate each to a per-function unique register, with a nondet
assignment to it prior to each use. This avoids the need for
Exp.nondet, which is ill-formed in the sense that expressions denote
values, not sets of values (with particular constraints on what ways
in which the choice must be angelic vs demonic). This change
essentially allows the backend to be sane, and makes it the frontend's
problem to deal with LLVM's undef.
This treatment, like the treatment based on Exp.nondet, is expected to
result in LLAIR code with different semantics of undef compared to the
semantics of LLVM described in the
[LangRef](https://llvm.org/docs/LangRef.html#undefined-values). In
particular, the LLVM LangRef states
> An ‘undef’ “variable” can arbitrarily change its value over its
> “live range”. This is true because the variable doesn’t actually
> have a live range. Instead, the value is logically read from
> arbitrary registers that happen to be around when needed, so the
> value is not necessarily consistent over time.
To model this ability of undef to arbitrarily change its value over
its live range, it is likely that additional nondet assignments would
need to be added. Exactly where it not currently known.
Reviewed By: jvillard
Differential Revision: D21720976
fbshipit-source-id: 90c2a0d26
Summary:
Refactor frontend translation of LLVM values, opcodes, etc. to support
emitting not only a LLAIR expression, but also a sequence of
instructions to be prefixed onto the uses of the resulting expression.
This is currently unused, as all prefixes are empty. Later, it will be
used to translate e.g. `undef` to `r := nondet(); r`.
Reviewed By: jvillard
Differential Revision: D21720972
fbshipit-source-id: b89bb57de
Summary:
Refer to Llair modules using `Llair.` qualifier, except for in
`Frontend`, which makes so much use of `Llair` that it is now opened
(`Llair` only contains types and modules, so `open` is safe).
Reviewed By: jvillard
Differential Revision: D21720979
fbshipit-source-id: dd42075d9
Summary:
The term representing an exp should not rely on more info than is
carried by the exp.
Reviewed By: jvillard
Differential Revision: D21720989
fbshipit-source-id: b65bf3678
Summary:
It is now possible to not spew dune files all over the repo, and opam
files aren't needed either.
Reviewed By: jvillard
Differential Revision: D21720978
fbshipit-source-id: 553e1d154
Summary:
```
val ( let@ ) : ('a -> 'b) -> 'a -> 'b
(** [let@ x = e in b] is equivalent to [e @@ fun x -> b], that is,
[e (fun x -> b)] *)
```
Reviewed By: jvillard
Differential Revision: D21721025
fbshipit-source-id: d8efdebbe
Summary:
Rather than compute the size of the llair type of the llair
initializer expression, compute the size of the llvm initializer
directly.
Reviewed By: ngorogiannis
Differential Revision: D21720982
fbshipit-source-id: 4364baf38
Summary:
Rather than translate an llvm type to a llair one and then compute the
size of the llair type, obtain the size of the llvm type directly.
Reviewed By: ngorogiannis
Differential Revision: D21720968
fbshipit-source-id: ad98112a7
Summary:
In LLVM it is possible for struct constant values to be directly
recursive, with no pointer dereference to close the cycle. These
appear for example as the values of vtables from C++ code.
Currently such recursive records in the Exp and Term languages are
represented as genuinely cyclic values. Compared to a standard term
representation, the presence of cyclic values is a significant
complication everywhere. Since the backend solver does not do anything
such as induction over these, they have to be treated as essentially
atomic.
This patch changes the representation to a standard non-recursive tree
term structure. Instead of cyclic references, an explicit constructor
is added for the "non-tree edges", which simply indicates which
ancestor record value to which the recursive reference points.
There is a potential issue with this representation, since for
mutually recursive records, the representation is not canonical: it
chooses one of the records in the cycle to start from and expresses
the cycles relative to that. Currently the choice of representation is
dictated by the frontend. For the case of vtables, the frontend
translates globals in the same order they appear in the LLVM IR, so
the representation choice is fixed.
It may turn out that other potential uses require more reasoning
support in the backend solver, which would involve a theory of
equality of record values induced by equating the representations
resulting from different rotations of the cycle of records.
Reviewed By: jvillard
Differential Revision: D21441533
fbshipit-source-id: 0c5a11378
Summary:
Having `val size_of : Typ.t -> t` in the signature of `Term` and `val
size_of : t -> t` in the signature of `Exp` gives the impression that
`Term` and `Exp` know something about `Typ`. But they don't, those
functions are only trivial convenience wrappers, and only have a few
uses, so just inline them to clarify that it is `Typ` that knows about
the sizes of types.
Reviewed By: jvillard
Differential Revision: D21441535
fbshipit-source-id: 09b135a8c
Summary:
Should have been included in:
22cbec493 [sledge] Rename contexts/profiles to be more conventional
Reviewed By: jvillard
Differential Revision: D21441538
fbshipit-source-id: 92b591171
Summary:
Dune does not track the ocamlformat binary as a dependency of the
`fmt` rules, so when testing or upgrading ocamlformat, the existing
`fmt` target is not reliable.
Reviewed By: jvillard
Differential Revision: D21441539
fbshipit-source-id: 40bea4447
Summary:
Add a path condition to each symbolic state, represented in sledge's arithmetic domain. This gives a precise account of arithmetic constraints. In particular, it is relation and thus is more robust in the face of inter-procedural analysis.
This is gated behind a flag for now as there are performance issues with the new arithmetic.
Reviewed By: jberdine
Differential Revision: D20393947
fbshipit-source-id: b780de22a
Summary:
Show that the SSA restrictions imply that the blocks can be
topologically sorted so that any use of a variable follows its
definition in the list of blocks. This showed a slight flaw in my
definition of SSA form. It used to treat program counters up to
equality. However, PCs that point to a block header contain the location
that control came from (so that the Phi instructions can be executed),
but the SSA restrictions shouldn't pay attention to that, so now the
definition of SSA introduces a equivalence on PCs that ignores that
additional information.
Reviewed By: jberdine
Differential Revision: D21066873
fbshipit-source-id: 735575a1f
Summary:
Introduce an explicit assumption that all of the instructions are ones
that are implemented in the translation so far, rather than just
cheating proof cases.
Reviewed By: jberdine
Differential Revision: D20891352
fbshipit-source-id: 37598bd8f
Summary:
Separate into separate files the theorems that are just about the
translation (mostly about the structure of the variable->expression
mapping that the translation builds) from theorems about the translation
and the semantics.
Also move the stuff about dominator_ordered into the SSA Theory, since
it only makes sense for SSA programs, but doesn't have anything to do
with the translation.
Reviewed By: jberdine
Differential Revision: D20673124
fbshipit-source-id: 9d8b08164
Summary:
The LLVM->LLAIR translation keeps a mapping of variables to expressions.
Previously, the invariants related to that mapping were kept in the
state relation, and so the proof needed show that they were preserved
along execution traces. This wasn't obvious as the state changes in
non-SSA ways during evaluation, but the correctness of the mappings is
heavily based on the program being in SSA form. This change separates
out the invariants, and the proof uses the final mapping that the
compiler builds, which contains all of the relevant bindings that might
be needed during execution.
Reviewed By: jberdine
Differential Revision: D20625109
fbshipit-source-id: d4c2dfe19
Summary:
Treat the remainder of dividing a rational by an integer as if the
rational was an integer division.
Reviewed By: jvillard
Differential Revision: D21042515
fbshipit-source-id: b5d42ddec
Summary:
The partial treatment of Mul and Div terms can simplify some cases,
but since it is only a partial treatment that is not producing a
normal form, in other cases the "simplification" results in large and
non-canonical terms. It is safer to leave them uninterpreted.
Reviewed By: jvillard
Differential Revision: D21042521
fbshipit-source-id: 04fc37f1a
Summary:
The heights of And and Or terms can grow high. This interacts poorly
with some unoptimized Equality operations such as normalization that
do some processing at every subterm.
Reviewed By: jvillard
Differential Revision: D21042518
fbshipit-source-id: 55e6acbb1
Summary:
These are map and folding map that perform a cycle-preserving
pre-order transformation.
Reviewed By: jvillard
Differential Revision: D20877974
fbshipit-source-id: 251288228
Summary:
The current notion of "simplified" function symbols, which are treated
as a hybrid between interpreted and uninterpreted, has no logical
basis. Normalization is now strong enough, due to stronger handling of
the changing carrier set, that the "simplified" classification can be
removed.
Reviewed By: jvillard
Differential Revision: D20726961
fbshipit-source-id: 9962ea323
Summary:
It is possible for the canonical form of a term in (the carrier set
of) a relation to be a term that is not in the relation. It is also
possible for this term to be equal to a term in the relation. It is
the job of the lookup subroutine of normalization to find these
equations.
The current implementation of entails_eq is incomplete in this case,
in particular, when an uninterpreted term has an interpreted subterm,
which itself has a subterm that is not a representative.
This diff strengthens normalization (and hence it's callers such as
entails_eq) to handle such cases. This makes lookup work slightly
harder. An alternative would be to extend the carrier with new terms
from normalization to the carrier and closing the relation, and then
re-normalizing. That would lead to much inflated representation sizes,
and is inefficient.
Reviewed By: jvillard
Differential Revision: D20612565
fbshipit-source-id: 3b7534a62
Summary:
This diff adds enough interpretation of Mul and Div terms to be able
to exclude them from the domain of solution substitutions. While
non-linear arithmetic is still treated very incompletely, this change
increases the propagation power of the equality constraints that are
deduced. Mainly, this appears to be enough to avoid operations that
are semantically equivalence-preserving such as solve_for_vars from
producing equalities that are unprovable from their inputs.
Reviewed By: jvillard
Differential Revision: D20863528
fbshipit-source-id: fca74cba3
Summary:
For non-linear polynomial equations, the solver can currently choose
to solve for a variable that occurs in a non-linear subterm, leading
to a non-idepotent solution substitution. For example, `x - y + (x ×
z) = 0` could be solved for `x`, leading to `x ↦ y - (x × z)` where
`x` is solved in terms of itself. This diff refines Term.solve_zero_eq
to avoid such cyclic solutions.
Reviewed By: jvillard
Differential Revision: D20637482
fbshipit-source-id: 6d7df85c3
Summary:
Strengthen the canonizer for division and rational constant terms. It
is important to avoid constructing `Q.t` values with 0 denominators,
as they do not represent real numbers, and the algebraic manipulation
done by the rest of the solver is incorrect in that case.
Additionally strengthen the Term representation invariants to check
that all coefficients and exponents are valid real rational numbers.
Also normalize division by an integer or rational to multiplication by
a rational.
Reviewed By: jvillard
Differential Revision: D20663964
fbshipit-source-id: 210962fe0
Summary:
Equations of the form `a = b` where `a` is a proper subterm of `b` are
possible when uninterpreted functions are involved. Internally,
Equality does not eagerly substitute `b` for `a`, but external clients
can repeatedly `Equality.normalize` terms and thereby incrementally
blow up the sizes of terms.
This diff uses the heights of uninterpreted terms to choose equality
class representatives to avoid such blow-ups, by orienting equations
so that tall terms are represented by short terms, so that repeated
normalization cannot increase term height indefinitely.
Reviewed By: jvillard
Differential Revision: D20785632
fbshipit-source-id: ff4c5bacd
Summary:
The types are constants and need not be re-checked for equality during
normalization, etc.
Reviewed By: jvillard
Differential Revision: D20863526
fbshipit-source-id: 1adde5ee0
Summary:
Ensure all entry points check the representation invariant before
returning, and strengthen it to check the constraints on preference
between representative terms, and to check that the relation is
closed.
Reviewed By: jvillard
Differential Revision: D20612566
fbshipit-source-id: b345397c4
Summary:
Change the implementation of `Equality.Subst.compose` to preserve
physical equality if the result is `Subst.equal` to the first
argument.
Reviewed By: jvillard
Differential Revision: D20752671
fbshipit-source-id: 4641a298a
Summary:
Currently there is an implicit assumption in Sh.simplify that
variables do not occur free in an equality relation which makes no
constraint on their values. This diff adds a step to formula
simplification that explicitly removes eliminated existentials from
equality relations.
Reviewed By: jvillard
Differential Revision: D20726960
fbshipit-source-id: 7109aa479
Summary:
Fix the crash in
```
(And_eq () (Var (id 10) (name v))
(Mul (((Var (id 8) (name v)) 1) ((Var (id 9) (name v)) 1)))
((xs ()) (sat true) (rep ())))
```
The solver for interpreted functions relies on the solution
substitutions containing mappings from variables to interpreted
applications, and never in the reverse. When solving equations
involving polynomials, this constraint is specifically
established. But for equalities involving only monomials, it could
happen that e.g. `x` was chosen as the representative of `a × b`,
which violates this constraint.
Reviewed By: jvillard
Differential Revision: D20596422
fbshipit-source-id: 69b026f03
Summary: Term.compare already ignores Var names, make Term.equal do so as well.
Reviewed By: jvillard
Differential Revision: D20663961
fbshipit-source-id: 59e7aa880
Summary:
Generate output on stderr containing lines such as:
```
solve_for_vars time: 0.105 ms 0.871 ms 79 calls
```
which indicates that the current maximal time of a single
`solve_for_vars` query is `0.105 ms` and so far the total time spent in
`solve_for_vars` over `79` queries is `0.871 ms`.
At program exit, there is also a line for each timer indicating the
max query time and final accumulated duration and number of queries:
```
solve_for_vars time: 0.173 ms 17.242 ms 1108 calls
```
Additionally, replay sexp are dumped interactively. A query exceeding
1 sec is dumped, then one exceeding 2 sec, then 4 sec, etc.
Reviewed By: jvillard
Differential Revision: D20852160
fbshipit-source-id: 0a316891e
Summary:
Rename and clarify documentation to indicate that the stronger
specification wrt physical equality is due to requiring the mapped
function to be an endofunction.
Reviewed By: jvillard
Differential Revision: D20863524
fbshipit-source-id: 74dabeb5c
Summary:
Otherwise there is an alias `'a t` for `'a option` polluting the
global namespace, which causes e.g. merlin to produce confusing types.
Reviewed By: jvillard
Differential Revision: D20831349
fbshipit-source-id: dff7b4f15
Summary:
Many Equality functions accept and return a set of variables, document
their function.
Reviewed By: ngorogiannis
Differential Revision: D20596492
fbshipit-source-id: b91ae7197
Summary:
Passing info that affects the build through environment variables is
bad since it is not very visible, e.g. in log files. Also, those
settings are not recorded in .merlin files, leading to merlin using
the default configuration.
This diff changes ppx_trace to use a 'cookie' instead of an
environment variable, and configures dune to set this cookie based on
the selected profile. This has the benefit of using command line
arguments such as `--cookie 'ppx_trace_enabled="1"'` which appear in
build logs and .merlin files.
Reviewed By: jvillard
Differential Revision: D20590515
fbshipit-source-id: d2daceaa3
Summary:
Previously building would spew:
```
Warning 58: no cmx file was found in path for module Build_info__Build_info_data, and its interface was not compiled with -opaque
```
This is from compilation of the generated `build_info_data.ml-gen`. It
does not have an interface, so I don't how to resolve this other than
disabling the warning.
See also https://github.com/ocaml/dune/issues/3277 .
Reviewed By: jvillard
Differential Revision: D20589879
fbshipit-source-id: fcd86fb3b
Summary:
Now that the containers use functorial interfaces with representations
that do not include closures for the comparison functions, it is not
necessary to enable closure support when Marshaling the IR between the
frontend and backend. This should be slightly faster, but more
importantly, it means that the serialized form is stable across
changes to the analyzer that do not change the representation of the
IR types, and in particular, the dbg and opt binaries can use the same
serialized form.
Reviewed By: jvillard
Differential Revision: D20589880
fbshipit-source-id: 63f07335e
Summary:
If both the library and binary are named `sledge`, then compilation
works fine and resolves the names correctly, but merlin gets confused
and cannot find the sledge library modules used in the bin sources. So
the binary and library need different names. The name of the library
gets exposed to clients, while the name of the binary only determines
the name of some files in the _build directory, which can be renamed
as desired. Therefore, use the `sledge` name for the library module,
and rename the binary module to `sledge_cli`, but still have dune
install it as `sledge.exe`.
Reviewed By: jvillard
Differential Revision: D20589431
fbshipit-source-id: 14b65907d
Summary:
Integer `div` and `rem` are a pair of functions that satisfy the
division rule, where the result of `div` is truncated toward zero:
```
| Div (** Division, for integers result is truncated toward zero *)
| Rem
(** Remainder of division, satisfies [a = b * div a b + rem a b] and
for integers [rem a b] has same sign as [a], and [|rem a b| < |b|] *)
```
Reviewed By: jvillard
Differential Revision: D20584626
fbshipit-source-id: fa02a3a98
Summary:
This diff adds wrappers for Equality entry points that augment
exceptions escaping out of Equality with an sexp that can be given
back to Equality.replay in order to reexecute the function call that
failed.
A (bogus) example could be raising:
```
((Failure "domains intersect: ((u8) %x_5)")
(And_eq () (Ap1 (Unsigned (bits 8)) (Var (id 5) (name x)))
(Var (id 6) (name y))
((xs ()) (sat true)
(rep (((Var (id 6) (name y)) (Var (id 5) (name x))))))))
```
and then calling `Equality.replay {|(And_eq ...)|}`.
Reviewed By: jvillard
Differential Revision: D20583753
fbshipit-source-id: 80d855950
Summary:
It is more convenient, and harder to misunderstand, if the verbose
tracing setting comes commented-out after the default non-verbose
setting.
Reviewed By: jvillard
Differential Revision: D20583755
fbshipit-source-id: 06ecb0e9a
Summary:
Work on the containers revealed that 1b8746d21 was premature, and this
diff is in part a revert of that. The objectives for the global
namespace are:
- self-consistent as much as possible
- rich data type operations
- does not require maintaining lots of tedious library wrapping code
- has Marshalable containers
- has containers with functorial interfaces
For these aims, it's best to stick with Core. Base isn't enough to
define functorial interfaces for collections (without a lot of tedious
wrapping code to keep in sync manually), and since a few modules not
in Core_kernel are needed anyhow.
Reviewed By: ngorogiannis
Differential Revision: D20583756
fbshipit-source-id: d939be7d0
Summary:
This diff defines Map as a functorover the underlying implementation
of Core.Map. This results in map values that are just trees, with no
comparison function closures, and with the same interface (almost) and
underlying data structure implementation as Core.Map.
Reviewed By: ngorogiannis
Differential Revision: D20583758
fbshipit-source-id: 5a4997b51
Summary:
This diff defines Set as a functorover the underlying implementation
of Core.Set. This results in set values that are just trees, with no
comparison function closures, and with the same interface (almost) and
underlying data structure implementation as Core.Set.
Reviewed By: ngorogiannis
Differential Revision: D20583754
fbshipit-source-id: 79aa7477a
Summary:
- Rename to `combine_adjacent` for clarity
- Reimplement without needing to filter out a dummy value, by keeping
track of the numer of adjacent pairs that have been combined
together
- Use this counting to replace the probably-confusing lazy copy with
explicit eager copy at the right time
- Remove the now-unneeded dummy value argument
Reviewed By: jvillard
Differential Revision: D20482752
fbshipit-source-id: db80bbbe3
Summary:
`dune build fmt` applies to all contexts, but we only need one, so
use `dune build @_build/dbg/fmt` instead. Also swallow stderr as it's
useless with `--auto-promote`.
Reviewed By: ngorogiannis
Differential Revision: D20482767
fbshipit-source-id: a7642f65d
Summary:
The term "vector" evokes expectations of being automatically growable,
and these are just immutable arrays.
Reviewed By: ngorogiannis
Differential Revision: D20482762
fbshipit-source-id: 0cd2c9c23
Summary:
The base containers have inconvenient interfaces due to lacking
support for functors, which also leads to the representation of values
of containers including closures for the comparison functions. This
causes problems when `Marshal`ing these values.
This diff is one step toward not using the base containers.
Reviewed By: ngorogiannis
Differential Revision: D20482756
fbshipit-source-id: 0312c422d
Summary:
The base containers have inconvenient interfaces due to lacking
support for functors, which also leads to the representation of values
of containers including closures for the comparison functions. This
causes problems when `Marshal`ing these values.
This diff is one step toward not using the base containers.
Reviewed By: ngorogiannis
Differential Revision: D20482763
fbshipit-source-id: f55f91bf2
Summary:
Move definitions of profiles from `dune-workspace` to `dune` since it
seems to be ok to use profiles in `dune-workspace` that are only
defined in `dune`. Since the `dune` but not `dune-workspace` file is
used when building as a vendored dependency, this seems to be
preferable.
Also, change the `opt` profile into a wildcard, which seems to be
preferable from the vendored-building perspective.
Also, set library public names such that including `sledge` in the
`libraries` phrase of `dune` files works.
Reviewed By: ngorogiannis
Differential Revision: D20376179
fbshipit-source-id: f2b634716
Summary:
Define a `ppx_sledge` ppx rewriter that composes all the ppx rewriters
used, so that the list needs to be specified only once.
Reviewed By: jvillard
Differential Revision: D20322874
fbshipit-source-id: f15540b7f
Summary:
It seems to currently not be possible to make dune `preprocess`
stanzas vary across contexts, without essentially generating the dune
files. A consequence is that it is not possible to pass
context-dependent command line arguments to ppx rewriters. So add
support for an environment variable instead.
Reviewed By: jvillard
Differential Revision: D20322871
fbshipit-source-id: f3f3ea413
Summary:
Move files, adjust build system, etc.
This also separates out the ppx_trace conditional compilation debug
tracing machinery into an independent package and library.
Reviewed By: jvillard
Differential Revision: D20322876
fbshipit-source-id: a50522462
Summary:
`Reg.demangle` is implemented by calling the `_cxa_demangle` C++
runtime system function. This will be linked into the sledge binary,
due to being linked with llvm, but will not necessarily be available
in the sledge library. So make it a dynamically-set function to avoid
calling an undefined function from the library.
Reviewed By: jvillard
Differential Revision: D20323791
fbshipit-source-id: bda9afd37
Summary:
The convenience symlinks from sledge/bin to the built binaries of
various build modes under sledge/_build/_install conflict with
upcoming code rearrangements, so remove them.
Reviewed By: jvillard
Differential Revision: D20322872
fbshipit-source-id: 1a7c99ae0
Summary:
Formulate the canonizer for Extract from Concat terms uniformly as a
concatenation of extracts.
Reviewed By: jvillard
Differential Revision: D20303064
fbshipit-source-id: a45bc45dd
Summary:
Change constructor for solver goals to enforce variable context
conditions, and simplify other context manipulations that are now
unneeded.
Reviewed By: jvillard
Differential Revision: D20248543
fbshipit-source-id: a255c792b
Summary:
In a few places, implicitly witnessed existential variables could
remain in the existential context. This led to weakness in the solver,
where occurrences bound by the existential context would be not known
to be constrained by their witnesses.
Reviewed By: jvillard
Differential Revision: D20248542
fbshipit-source-id: 44f62839c
Summary: In preparation for enforcing invariants in the constructor.
Reviewed By: jvillard
Differential Revision: D20248541
fbshipit-source-id: 41f7d36e5
Summary:
Program (and global) variables are only distinct when considering
their string names, but logical variables need only their ids.
Reviewed By: jvillard
Differential Revision: D20214528
fbshipit-source-id: f7892c3ad
Summary:
When extracting from a concatenation, drop a prefix of the concat with
length equal to the offset of the extraction:
```
(α₀^…^αᵢ^…) [0+n₀+…+nᵢ₋₁, l) ==> (αᵢ^…)[0,l) where nₓ ≡ |αₓ|
```
Reviewed By: jvillard
Differential Revision: D20192874
fbshipit-source-id: cd015aa36
Summary:
Fix unstated assumptions Sh.or_ made on the universal and existential
contexts of disjuncts.
Reviewed By: jvillard
Differential Revision: D20192873
fbshipit-source-id: 945623e57
Summary:
This diff changes `Sh.simplify` from a logically-weakening syntactic
simplification to an equivalence-preserving rewrite. The
implementation is based on `Equality.solve_for_vars` which is also
used by `Solver` to witness existential variables.
Reviewed By: jvillard
Differential Revision: D20120274
fbshipit-source-id: 5e11659ea
Summary:
Just fix accumulated mis-formatting that is swallowed by the
inline-test promotion implementation.
Reviewed By: ngorogiannis
Differential Revision: D20120262
fbshipit-source-id: 0e387dc55
Summary:
Add `Sh.pp_raw` which is closer to the representation, for use when
tracing `Sh` operations.
Reviewed By: ngorogiannis
Differential Revision: D20120281
fbshipit-source-id: e3b1b531a
Summary:
Due to strengthened existential witnessing, the incomplete ad hoc
witness guessing is no longer needed.
Reviewed By: ngorogiannis
Differential Revision: D20120277
fbshipit-source-id: 8ee1656dd
Summary:
Strengthen computation of solution substitutions used for existential
witnessing by using the solver for the memory contents theory. This
uses a generalization of the equation solver implementation which
accepts a predicate used as a filter for equations added to the
solution substitution. When used for solving for a given set of
variables, this filter excludes equations which do not meet the
desired variable conditions.
Reviewed By: jvillard
Differential Revision: D20120275
fbshipit-source-id: 4203d5e41
Summary:
Strengthen existential quantifier witnessing to enable witnessing an
existential with a term containing another existential if no universal
witness is available. Additionally, strengthen existential witnessing
to enable terms of interpreted theories to witness existential
variables.
Also strengthen and simplify the representation invariant checking for
existential witnessing code.
Reviewed By: jvillard
Differential Revision: D20120271
fbshipit-source-id: 4c44fe9ef
Summary:
Handle the case the universal context of a goal does not stay in sync
with that of the minuend.
The need for this indicates that there is some problematic redundancy
in the representation of solver goals.
Reviewed By: ngorogiannis
Differential Revision: D20120268
fbshipit-source-id: 44a4d6260
Summary:
It can happen that canonizing subterms can change the classification
of a term e.g. to the literal true. In such cases, it is not useful or
correct use `Equality.lookup` (which expects to only be used on
uninterpreted applications) to search for some other equation equal to
the literal and use its representative instead.
Reviewed By: ngorogiannis
Differential Revision: D20120279
fbshipit-source-id: 3e2160233
Summary: When equating Concat terms, drop any common prefix or suffix.
Reviewed By: ngorogiannis
Differential Revision: D20120264
fbshipit-source-id: afdeb990e
Summary: No code change, only reordering definitions in prep for later changes.
Reviewed By: ngorogiannis
Differential Revision: D20120263
fbshipit-source-id: b312dfc9a
Summary: Add `Equality.and_term` and replace most of `Sh.pure` with it.
Reviewed By: ngorogiannis
Differential Revision: D20029742
fbshipit-source-id: 07c2f1fe6
Summary:
Replace `Equality.Subst.trim` with `partition_valid` which has a
logical specification (and unsurprisingly fixes some corner case
bugs):
```
val partition_valid : Var.Set.t -> t -> t * Var.Set.t * t
(** Partition ∃xs. σ into equivalent ∃xs. τ ∧ ∃ks. ν where ks and ν
are maximal where ∃ks. ν is universally valid, xs ⊇ ks and ks ∩
fv(τ) = ∅. *)
```
Reviewed By: ngorogiannis
Differential Revision: D20004974
fbshipit-source-id: 5cb3b3835
Summary:
Sh.var_strength determines, in part, if an existential variable has
only a single occurrence. The objective of this is to determine that a
variable is merely a placeholder and not used to express additional
constraints. For this, it suffices to check the weaker condition that
there is a single occurrence in each branch of a DNF expansion.
Reviewed By: ngorogiannis
Differential Revision: D19973779
fbshipit-source-id: 2c90c61f4
Summary:
The result of promoting expect tests is not necessarily formatted, so
try to reduce iterations by formatting after testing.
Reviewed By: ngorogiannis
Differential Revision: D20004973
fbshipit-source-id: 53af4c034
Summary:
Changing `pps` to `staged_pps` was needed for `import_ppx`, but it has
since been removed.
Reviewed By: jvillard
Differential Revision: D19973778
fbshipit-source-id: 5e2d83157
Summary:
Currently the free variables of the equality relation of a formula are
contained in the free variables of the rest of the formula, so Sh.fv
ignores them. Propagating equality facts across the star-or structure
of a formula, as necessary for quantifier elimination, breaks this
invariant. Some use cases, such as detecting which variables survive
applying a witness substitution, need to ignore the variables that
appear in the equality relation. This diff adds an argument to
conditionally ignore the variables in the equality relations.
Reviewed By: ngorogiannis
Differential Revision: D19580430
fbshipit-source-id: 2d417d89b
Summary:
Add a field to LLAIR variables to indicate whether they are global or
local. Update the LLVM semantics for constant expression evaluation to
be relational, so that it doesn't have to have an answer for references
to undefined globals.
Reviewed By: jberdine
Differential Revision: D19446312
fbshipit-source-id: 9bbfd180e
Summary:
In an equation such as `x = ⟨n,a⟩`, `x` is implicitly an aggregate of
size `n` (or else the equation is ill-typed). Make this explicit by
normalizing such equations to e.g. `⟨|⟨n,a⟩|,x⟩ = ⟨n,a⟩`.
Reviewed By: ngorogiannis
Differential Revision: D19358546
fbshipit-source-id: 77f67a0da
Summary:
Strengthen Equality.solve_for_vars so that it will solve cases such as
```
∃ a. ⟨n,a⟩ = ⟨m,b⟩^⟨o,c⟩
```
Reviewed By: ngorogiannis
Differential Revision: D19356324
fbshipit-source-id: a57625ba6
Summary:
Due to paying closer attention to which terms are already normalized
by the accumulated solution substitution.
Reviewed By: ngorogiannis
Differential Revision: D19356323
fbshipit-source-id: e162414a0
Summary:
The function transforming terms passed to Sh.map might produce trivial
constraints, filter them out.
Reviewed By: ngorogiannis
Differential Revision: D19286628
fbshipit-source-id: e3d9926ce
Summary:
In order for Equality.solve to generate fresh variables, it needs to
be passed the universal context with respect to which variables must
be chosen fresh.
Reviewed By: ngorogiannis
Differential Revision: D19286630
fbshipit-source-id: ebbedd954
Summary:
Aggregate args of Extract and Concat must be aggregate terms, in
particular, not variables. This maintains the property that the size
of any aggregate can be computed.
Reviewed By: ngorogiannis
Differential Revision: D19286625
fbshipit-source-id: 1af1e4183
Summary:
The byte-array theory used for the contents of memory is strong
enough to express all the constraints arising during symbolic
execution without the ability to extract a slice out of a byte-array.
However, without Extract, it is not possible to solve some equations
for some variables, for example solving ⟨n,α⟩^⟨m,β⟩ = ⟨l,γ⟩ for α.
Solving such equations is needed for quantifier elimination and to
formulate the byte-array theory as a Shostak theory.
Reviewed By: ngorogiannis
Differential Revision: D19286632
fbshipit-source-id: 07dc112d0
Summary:
Revise the ad hoc treatment to drop tautologous constraints such as
∃x,y. x = y to only apply when x and y do not appear elsewhere.
Reviewed By: ngorogiannis
Differential Revision: D19282633
fbshipit-source-id: 7fb9951ec
Summary:
Equality.solve_for_var_contexts can produce solution substitutions
that witness existentials in more cases than the
single-existential-occurrence heuristic, and is more conservative in
cases where the latter is incomplete.
Reviewed By: ngorogiannis
Differential Revision: D19282634
fbshipit-source-id: f86f0a9cb
Summary:
```
val trim : bound:Var.Set.t -> Var.Set.t -> t -> t
(** [trim bound kills subst] is [subst] without mappings that mention
[kills] or [bound ∩ fv x] for removed entries [x ↦ u] *)
```
To be used for existential witnessing and quantifier elimination.
Reviewed By: ngorogiannis
Differential Revision: D19282644
fbshipit-source-id: d981a1cb4
Summary:
```
val solve_for_vars : Var.Set.t list -> t -> Subst.t
(** [solve_for_vars \[v₁;…\] r] is a solution substitution that is
entailed by [r] and consists of oriented equalities [x ↦ u] such that
[fv x ⊈ vᵢ ⊇ fv u] where [i] is minimal such that [vᵢ]
distinguishes [fv x] and [fv u], if one exists. *)
```
To be used for existential witnessing and quantifier elimination.
Reviewed By: ngorogiannis
Differential Revision: D19282636
fbshipit-source-id: c5b006cea
Summary:
It is necessary to normalize subterms of Memory and Concat terms or
else Equality.entails_eq is incomplete. They ought to be Interpreted,
but the solver for the byte-array theory is not yet ready for that.
Reviewed By: ngorogiannis
Differential Revision: D19282635
fbshipit-source-id: c06b6ca6d
Summary:
The equality relation is implied by the pure part, so cannot involve
more variables. Also, Sh.invariant checks that the equality relation
does not contain unbound variables.
Reviewed By: ngorogiannis
Differential Revision: D19282641
fbshipit-source-id: 21dd37a3b
Summary:
If `Term.solve_zero_eq` is passed `for_`, then that subterm is solved
for.
Reviewed By: ngorogiannis
Differential Revision: D19282647
fbshipit-source-id: 5d5b76af5
Summary:
Match the `x` suffix naming convention of Term pp functions that take
a classification function.
Reviewed By: ngorogiannis
Differential Revision: D19282639
fbshipit-source-id: fc340e4bc
Summary:
Equality relies on the result of solving an equation to be a "solution
substitution". In constrast to unconstrained Map's, solution
substitutions are idempotent and have constraints on the terms that
may appear in their domain (they must be "maximal solvables", that is,
variables or uninterpreted function applications, which would be
variables if explicit "variable abstraction" was done).
This diff factors out the manipulation of concrete Map's into a
Equality.Subst module, and uses these for the result of `solve`.
Reviewed By: ngorogiannis
Differential Revision: D19282637
fbshipit-source-id: 4fc825e59
Summary:
In preparation for constructing solution substitutions in solve, which
are closely tied to Equality.
Reviewed By: ngorogiannis
Differential Revision: D19282640
fbshipit-source-id: ca0f8ae29
Summary:
Identifying and separating one of the monomials in a polynomial, and
solving an equality for it, is much more dependent on the
representation of polynomial terms than the rest of solve.
Reviewed By: ngorogiannis
Differential Revision: D19282645
fbshipit-source-id: 645191ae0
Summary:
The exposed constructors for Memory and Concat Terms are only used in
a very special idiom: to construct an equality between a single Memory
chunk and the Concat of multiple Memory chunks. This diff specializes
and simplifies by exposing a Term.eq_concat constructor for this
idiom, and removes the underlying Term.memory and Term.concat
constructors.
Reviewed By: ngorogiannis
Differential Revision: D19221866
fbshipit-source-id: 4842737d2
Summary:
Trace.infok is like Trace.info but accepts a polymorphic printf
continuation instead of directly taking a format string and its
args. This is useful to write wrappers such as:
```
let trace k = [%Trace.infok k]
```
Reviewed By: ngorogiannis
Differential Revision: D19221883
fbshipit-source-id: 88e939b26
Summary:
The size of Splats is redundant, as they always appear as subterms of
a Memory chunk or a heap segment, both of which are sized.
Reviewed By: ngorogiannis
Differential Revision: D19221870
fbshipit-source-id: 74044d1ad
Summary:
Now that they are uncurried, congruence closure does not need the
order of subterms to be preserved. Sorting them reduces redundancy in
case the same equality in different orders is encountered, and
improved printing.
Reviewed By: ngorogiannis
Differential Revision: D19221875
fbshipit-source-id: c6bf4ccad
Summary:
Equality.classes was assuming a simpler representation, and was
incomplete as a result.
The 'representative' map is not kept in a normalized form, where
subterms are necessarily representatives. Therefore, applying the
representative map to subterms of terms in a class can reveal new
elements of the class. This mirrors how the `lookup` function in
`normalize` works.
Reviewed By: ngorogiannis
Differential Revision: D19221868
fbshipit-source-id: 4a2ed6d3f
Summary:
Reduce redundancy by printing adjacent segments as if they had been
concatenated together.
Reviewed By: ngorogiannis
Differential Revision: D19221881
fbshipit-source-id: 613105864
Summary:
Also, previous code was sometimes inconsistent regarding whether to
enumerate all subterms or only toplevel terms.
Reviewed By: ngorogiannis
Differential Revision: D19221873
fbshipit-source-id: e8644098b
Summary: It is easier to understand the order of args with diff_inter.
Reviewed By: ngorogiannis
Differential Revision: D19221869
fbshipit-source-id: b29ac83c8
Summary:
Add some test cases from Reuss and Shankar for equality that are
mishandled by Shostak's original algorithm.
Reviewed By: ngorogiannis
Differential Revision: D19221880
fbshipit-source-id: a6f9d51e3
Summary:
This diff enables parsing and auto-formatting documentation
comments (aka docstrings).
I have looked at this entire diff and manually made some changes to
improve the formatting. In some cases it looked like it would take too
much time, or benefit from someone more familiar with the code doing
it, and I instead disabled auto-formatting docstrings in those files.
Also, there are some source files where the docstrings are invalid,
and some where the structure detected by the parser appears not to
match what was intended. Auto-formatting has been disabled for these
files.
Reviewed By: ezgicicek
Differential Revision: D18755888
fbshipit-source-id: 68d72465d
Summary:
OCaml 4.08 supports a form of signature-local bindings, to that a type
can be defined in order to be used in other definitions, without
being part of the signature itself.
Reviewed By: ngorogiannis, jvillard
Differential Revision: D18736380
fbshipit-source-id: 0bb043de6
Summary:
Otherwise it is difficult to tell the difference between compilation
errors from previous versus current builds.
Reviewed By: ngorogiannis
Differential Revision: D18736376
fbshipit-source-id: 2e583f4ba
Summary:
OCaml 4.08 has a new warning (66) on unused `open!` statements. This
has a suboptimal interaction with `ppx_let`'s `let%map_open` which
leads to triggering the warning if any of a group of such let bindings
does not need the open.
In this case, the refactor is easy.
But, warning 66 is very dubious, so also just switch it off.
Reviewed By: jvillard
Differential Revision: D18708466
fbshipit-source-id: 77618ab6e
Summary:
It seems to be effectively unmaintained, as it still doesn't support
4.08.
Reviewed By: jvillard
Differential Revision: D18708467
fbshipit-source-id: dcb3361fc
Summary:
Term.solve makes the assumption that all distinct normalized constants
denote distinct values. This is fragile at best, and it is better to
enumerate the cases where solve discovers inconsistency.
Reviewed By: jvillard
Differential Revision: D18459619
fbshipit-source-id: 71f52557c
Summary:
Equality.or_ assumed a simpler representation of equality relations,
and was incomplete as a result.
Reviewed By: jvillard
Differential Revision: D18298138
fbshipit-source-id: cf91229f6
Summary:
The treatment of type conversions is too complicated, non-uniform,
etc. This diff attempts to simplify things by separating integer to
integer conversions, which are interpreted, from others, which are
essentially just uninterpreted functions. Integer conversions are now
handled using two expression and term forms: Signed and
Unsigned. These each interpret their argument as either a signed or
unsigned number of a given bitwidth:
```
| Signed of {bits: int}
(** [Ap1 (Signed {bits= n}, dst, arg)] is [arg] interpreted as an
[n]-bit signed integer and injected into the [dst] type. That is,
it two's-complement--decodes the low [n] bits of the infinite
two's-complement encoding of [arg]. The injection into [dst] is a
no-op, so [dst] must be an integer type with bitwidth at least
[n]. *)
| Unsigned of {bits: int}
(** [Ap1 (Unsigned {bits= n}, dst, arg)] is [arg] interpreted as an
[n]-bit unsigned integer and injected into the [dst] type. That
is, it unsigned-binary--decodes the low [n] bits of the infinite
two's-complement encoding of [arg]. The injection into [dst] is a
no-op, so [dst] must be an integer type with bitwidth greater than
[n]. *)
| Convert of {src: Typ.t}
(** [Ap1 (Convert {src}, dst, arg)] is [arg] converted from type [src]
to type [dst], possibly with loss of information. The [src] and
[dst] types must be [Typ.convertible] and must not both be
[Integer] types. *)
```
Reviewed By: ngorogiannis
Differential Revision: D18298140
fbshipit-source-id: 690f065b4
Summary:
LLAIR changed how it represents integer-to-integer conversions, and this
updates the semantics and proofs to show that the new way is correct.
Reviewed By: jberdine
Differential Revision: D18448616
fbshipit-source-id: b657fcd20
Summary:
The old version of simp_convert in LLAIR had a bug, but the sanity
theorem didn't catch it because it didn't enforce that the result fit
into the size it should have. This updates to newer version of
simp_convert and adds a theorem that the result fits.
Reviewed By: jberdine
Differential Revision: D18346833
fbshipit-source-id: 533c836bf
Summary:
Extend the APRON-backed interval analysis to handle a wider range
of LLAIR expressions.
Reviewed By: jvillard
Differential Revision: D17858072
fbshipit-source-id: c50f5bf20
Summary:
In some cases the result of an integer conversion needs to be
truncated by a bit.
Differential Revision: D18271179
fbshipit-source-id: e80740045
Summary:
Improve the invariants to show that phi instructions are correctly
translated. It remains to show that the invariants can be established
when jumping to the start of a block
Reviewed By: jberdine
Differential Revision: D18228272
fbshipit-source-id: 4330b4781
Summary:
Add a new interval abstract domain. This domain uses the APRON
numerical analysis library to keep track of the range of values held
by llair variables where possible. This works by translating LLAIR
expressions into APRON tree expressions, so only handles the
subset of the LLAIR expression language that can be embedded.
Note also that function summarization is not yet implemented.
Future commits will add summarization and improve coverage of
LLAIR's expression language.
Reviewed By: jberdine
Differential Revision: D17763517
fbshipit-source-id: 826ce4cc5
Summary:
Add some theorems establishing the correspondence between the
implementation of the Convert operation in OCaml and the definition of
Convert in the semantics. Essentially, the OCaml version is in terms of
extracting certain ranges of bits, whereas the semantics is in terms of
integer arithmetic (addition, modulus, and exponentiation)
Reviewed By: jberdine
Differential Revision: D18113878
fbshipit-source-id: c318596d0
Summary:
This commit adds truncation, sign extension and zero extension to LLVM
and the Convert instruction to LLAIR.
The LLVM instructions use HOL's build-in word/int and word/num
conversions. Sanity-checking theorems prove that zero-extending leaves
the value of the word unchanged when considered as an unsigned value,
and that sign-extending leaves the value unchanged when considered as a
signed value.
The llair semantics for Convert uses the truncate_2comp function which
converts an integer to another integer as though they were represented
in 2's complement. e.g. truncate_2comp 255 16 = 255, truncate_2comp
255 8 = -1, truncate_2comp -3 2 = 1
Reviewed By: jberdine
Differential Revision: D18058833
fbshipit-source-id: df9de480c
Summary:
The old syntactic invariant in prog_ok was in the wrong direction,
saying that all labels in a phi instruction have to exist, rather than
saying that when we jump to a new block, the label of the block we came
from must be in all of the phi instructions.
Reviewed By: jberdine
Differential Revision: D18058832
fbshipit-source-id: d2ad33b04
Summary:
This required some minor tweaks to how the semantics encode values into
and out of byte lists. The remaining problems have to do with how LLVM
globals are translated into llair. At the moment, llair semantic's state
keeps a mapping for globals to their addresses, following the LLVM
semantics. However, it is not used because the translation (following
the code in frontend.ml) translates LLVM globals into llair locals,
which the llair semantics isn't set up to handle.
Reviewed By: jberdine
Differential Revision: D17930787
fbshipit-source-id: 06c6368e0
Summary:
The Used globals (pre-)analysis produces results queried by
Control. This diff adds a type definition for these and moves the
query into the Used_globals module.
Reviewed By: bennostein
Differential Revision: D17856879
fbshipit-source-id: 0211b82d7
Summary:
To avoid code explosion, the frontend emits move instructions for
expressions with more than one use. This diff relaxes this slightly by
allowing duplication of casts.
Reviewed By: bennostein
Differential Revision: D17856384
fbshipit-source-id: 6f6c496ef
Summary:
The frontend translation of exceptional control flow is untrusted
enough that it makes sense to disable it by default.
Reviewed By: bennostein
Differential Revision: D16061018
fbshipit-source-id: 65dca36ae
Summary:
The CFG of a function is implicit in the blocks themselves, so it is
possible to remove the explicit represention as a vector of
blocks. The only uses are fold or iter, and since the cycles are
detected during construction, these can be simple depth-first
traversals.
Reviewed By: bennostein
Differential Revision: D17821845
fbshipit-source-id: fc7a02151
Summary:
Fix a bug where the actual return variable was not scoped correctly in
cases where its name clashed with a local or formal of the
callee. Also comment and simplify to attempt to make more
understandable.
Reviewed By: bennostein
Differential Revision: D17801944
fbshipit-source-id: 286739241
Summary:
Some code that is otherwise benignly scalar still uses the
ExtractElement and InsertElement vector operations, so translate them
as if they were array operations.
Reviewed By: ngorogiannis
Differential Revision: D17801949
fbshipit-source-id: 89f3666bd
Summary:
Previously, the LLVM semantics could be stuck where the LLAIR semantics
was not yet stuck, but would become stuck (at the same place) after
taking a step. This was due to LLVM using the traditional definition of
stuck states: any state from which there are no transitions. However,
LLAIR cannot do that because it might get stuck in the middle of a block
that contains several visible stores. We don't want to consider the
whole block stuck, nor can we finish it. Thus, the LLAIR definition of
stuckness is when the state has the stuck flag set which happens when
stopping in the middle of a block after encountering a stuck
instruction. Now LLVM takes the same approach.
Reviewed By: jberdine
Differential Revision: D17855085
fbshipit-source-id: a094d25d5
Summary:
Add an argument to the Exit instruction. Update the LLVM semantics to
execute the Exit instruction and store the result in an "exited"
component of the state. (Previously it just noticed that it was stuck
about to do an Exit.)
With exiting treated uniformly, now in the proof that for every LLVM
trace, there is a llair trace that simulates it, all of the cheats
except for 1 are just cases that I haven't got to yet. However, the last
cheat is for the situation where the LLVM program gets stuck and the
llair program doesn't. For example, the following two line LLVM program
gets stuck because r2 is not assigned (ignoring for the moment the static
restriction that LLVM is in SSA form).
r1 := r2
Exit(0)
The compilation to llair omits the assignment and so we get a llair
program that doesn't get stuck:
Exit(0)
The key question is whether the static restrictions are sufficient to
ensure that no expression that might be omitted can get stuck.
Reviewed By: jberdine
Differential Revision: D17737589
fbshipit-source-id: bc6c01a1b
Summary:
If the LLVM to llair translation keeps a mapping from register r to
expression e, then for each register r' mentioned in e, there must be an
assignment to r' that dominates the entire live range of r. Thus, where
ever r might be replaced by e, the value of r' will be the same as it
was when the initial assignment to r occurred. Maintaining this
invariant relies on the LLVM being in SSA form.
Reviewed By: jberdine
Differential Revision: D17710288
fbshipit-source-id: fd3eaa57d
Summary:
This is work in progress; many of the cheats aren't true. In particular,
the definition of stuck/complete/partial traces in LLVM and llair don't
quite match up and need some modification. Also, the state relation
isn't strong enough; it will need to include information about registers
used in the expressions of the LLVM register to llair expression
mapping. But the overall shape of the proof is ok and so it can be
used to poke at various local aspects of the translation, such as
individual instructions.
Reviewed By: jberdine
Differential Revision: D17631604
fbshipit-source-id: 743b5d64d
Summary:
By some unfortunate logic, OCaml often decides to use
`sexp_list`/`sexp_option` instead of just `list`/`option`. Sometimes
these get copy/pasted in interface files.
It would be good to tell OCaml not to do that in the first place but in
the meantime: this diff.
Reviewed By: ngorogiannis
Differential Revision: D17907938
fbshipit-source-id: 7546834a2
Summary:
For test scripting purposes, when the analysis finishes successfully,
report the number of alarms.
Reviewed By: ngorogiannis
Differential Revision: D17801947
fbshipit-source-id: 1660866df
Summary:
In a spec, it currently may be that foot.us does not contain xs. So
exec_specs needs to extend the vocabulary of foot before existentially
quantifying out xs.
Reviewed By: ngorogiannis
Differential Revision: D17801933
fbshipit-source-id: 7b4b9262a
Summary:
Previously it was added to the locals before calling Dom.call, but
this results in the scope of freturn ending too early.
Reviewed By: ngorogiannis
Differential Revision: D17801939
fbshipit-source-id: 739ec8981
Summary:
Some globals have 'appending' linkage, where linking modules results
in appending the arrays from each module. These can appear even when
empty, leading to useless and somewhat troublesome 0-length arrays. So
drop them.
Reviewed By: ngorogiannis
Differential Revision: D17801927
fbshipit-source-id: d2dc180d7
Summary:
While BitCasts are the identity function on the bitwise
representation, they are not necessarily so in the semantics or the
logical representation. So be more conservative about eliding them in
the Exp language. Those that are actually semantic identities are
still omitted in the Term language.
Reviewed By: ngorogiannis
Differential Revision: D17801950
fbshipit-source-id: bf9ae57b5
Summary:
The analyzer (currently) hard-codes some assumptions about sizes of
basic types such as Typ.bool, Typ.siz, etc. Check that these
assumptions are satisfied by the input llvm datalayout, and give
reasonable error messages otherwise.
Reviewed By: ngorogiannis
Differential Revision: D17801941
fbshipit-source-id: 4fe484ee0
Summary:
Now that expression types and type sizes can be computed, it is not
necessary to store the sizes of globals separately.
Reviewed By: ngorogiannis
Differential Revision: D17801932
fbshipit-source-id: f746e506b
Summary:
- The `Llvm_target.DataLayout.size_in_bits` needs to be used for checking casts
e.g. it is ok to `bitcast <16 x i1> to i16`: they both have 16 bits, but they have sizes 16 vs 2 bytes
- The `Llvm_target.DataLayout.abi_size` needs to be used for the size of memory blocks containing values
e.g. for the size of memory segments containing the initial values of globals
- The example above shows that we can't compute the byte size from the bit size without knowing the target specific datalayout
- So we need both in each sized type
- Also add checks that Convert exps and terms are not no-ops
- Simplifications of size manipulating code
Reviewed By: ngorogiannis
Differential Revision: D17801928
fbshipit-source-id: 8c8ce6128
Summary:
In order to type-check casts, it is necessary to have the size of each
sized type. This size information is also useful in a few other places.
Reviewed By: bennostein
Differential Revision: D17801931
fbshipit-source-id: f8ef53276
Summary:
This is needed since expressions distinguish between the integer or
pointer zero value and zero-initialized array/tuple/struct aggregates
based on type, and the backend distinguishes them semantically.
Reviewed By: bennostein
Differential Revision: D17801938
fbshipit-source-id: ac8665e65
Summary:
Linking can lead to opaque types becoming identified with a known
types. Assertions in various places that types should be sized can be
triggered by such opaque types. Until there is a distinction between
processing fully-linked versus incomplete code, these checks need to
be relaxed to permit opaque types where sized ones are expected.
Reviewed By: bennostein
Differential Revision: D17801929
fbshipit-source-id: c5e62f7c8
Summary: Integer terms need to compare higher than any monomial.
Reviewed By: bennostein
Differential Revision: D17725607
fbshipit-source-id: c64fd52d5
Summary:
Also weaken definition of Typ.castable to permit casting between
floats and ints of the same size.
Reviewed By: bennostein
Differential Revision: D17725611
fbshipit-source-id: 5e8114e26
Summary:
Typ.equivalent is currently defined the same as Typ.castable, but
conceptually they are different and castable needs to be
weakened. They are different since for example it is possible to cast
from an i64 to a f64, but those types denote different sets of values
in the semantics, and the bitcast is modeled using a conversion
function.
Reviewed By: bennostein
Differential Revision: D17725615
fbshipit-source-id: 973574f2a
Summary:
For function calls where the callee is a cast expression, previous the
wrong type would be used for the callee. This could lead to crashes in
llvm, or asserting in sledge.
Reviewed By: bennostein
Differential Revision: D17725610
fbshipit-source-id: 938b49a49
Summary:
Some called functions are represented in llvm as a global variable
with e.g. external linkage, and so they do not appear as
'functions'. It is still valid to call such functions, though the
analyzer does not know their definitions.
Reviewed By: bennostein
Differential Revision: D17725609
fbshipit-source-id: 333d19c0d
Summary:
Improve Trace.fail to log the error and raise informative exceptions.
Eliminate the confusion between Import.fail and Trace.fail by removing
Import.fail.
Reviewed By: bennostein
Differential Revision: D17725608
fbshipit-source-id: 79fdfbd86
Summary:
By default all functions except those specified as entry points in the
config file are "internalized". Internal functions are removed if they
are not called. It is sometimes necessary to disable internalization,
e.g. to analyze the llvm tests.
Reviewed By: bennostein
Differential Revision: D17725614
fbshipit-source-id: 4b13501f5
Summary:
Sometimes the models for the C/C++ runtime and standard libraries are
not needed. Furthermore, sometimes, e.g. when analyzing llvm tests,
trying to link them fails.
Reviewed By: bennostein
Differential Revision: D17725616
fbshipit-source-id: 76a4bcf90
Summary:
The `(t, unit) result` type is no more informative than `t option` and
less convenient.
Reviewed By: bennostein
Differential Revision: D17665244
fbshipit-source-id: fa969d8b7
Summary:
This puts the mediation between Exp and Term together in Sh_domain
rather than being spread across the two.
Reviewed By: bennostein
Differential Revision: D17665235
fbshipit-source-id: edf277d45
Summary:
The move instruction takes a vector of assignments to perform in
parallel, so generalize exec_move from one to a vector.
Reviewed By: bennostein
Differential Revision: D17665248
fbshipit-source-id: 52aae5ff9
Summary:
Extend the encoding using `id` from 0 indicating a program variable to
also -1 indicating a global program variable.
Reviewed By: bennostein
Differential Revision: D17665229
fbshipit-source-id: 848b8a31e
Summary:
The sorting of heap blocks when printing formulas was broken by the
change to the direct representation of polynomials.
Reviewed By: bennostein
Differential Revision: D17665246
fbshipit-source-id: 4ebea9f20
Summary: It is not necessary to have both < and >=, and similarly for <= and >.
Reviewed By: bennostein
Differential Revision: D17665232
fbshipit-source-id: 01b3511f5
Summary:
Now that terms operate over unbounded, signed, integers rather than
bounded integers, and Boolean operations are treated uniformly with
bitwise operations, it is not necessary to propagate types throughout
arithmetic term manipulation.
Reviewed By: bennostein
Differential Revision: D17665257
fbshipit-source-id: 5236b101d
Summary:
Z.numbits ignores the sign, which allows 2^(N - 1) as representable
within N bits, while it is not. So check explicitly.
Reviewed By: bennostein
Differential Revision: D17665231
fbshipit-source-id: 0d3940517
Summary:
Instead of having separate signed and unsigned operations, use the
signed operations applied to explicit conversion of the arguments
using an unsigned integer interpretation.
Reviewed By: bennostein
Differential Revision: D17665267
fbshipit-source-id: 0b3271e71
Summary:
Add an Extract term form to interpret an integer with given signedness
and bitwidth.
Reviewed By: bennostein
Differential Revision: D17665263
fbshipit-source-id: 1d8917f3c
Summary:
Be more explicit about semantics of unsigned vs. signed conversions,
and fix a few related corner cases.
Reviewed By: bennostein
Differential Revision: D17665268
fbshipit-source-id: 67fecdf34
Summary:
With terms using unbounded two's complement arithmetic, it is not
necessary to special-case 1-bit integers as Booleans.
Reviewed By: ngorogiannis
Differential Revision: D17665228
fbshipit-source-id: a2f280fc3
Summary:
Remove the guards that prevent normalizing in some cases where the
corresponding instruction in LLVM would produce a poison
value. Usefully tracking poison values will be more involved.
Reviewed By: ngorogiannis
Differential Revision: D17665230
fbshipit-source-id: 59fb25042
Summary:
Revise program expressions based on the changed constraints now that
Term is separate from Exp. In particular:
- Add types to all application, indicating how the operation
interprets its arguments
- Change to a simpler uncurried form
- Remove now-unneeded normalizations
Reviewed By: bennostein
Differential Revision: D17665236
fbshipit-source-id: 1bcf2efd6
Summary:
Boolean and bitwise negation of `e` is represented using `-1 xor
e`. Since Equality can only maintain and propagate equality
constraints, Boolean negation `-1 xor b` is normalized to `b =
false`. This diff delays this normalization from being part of
expression construction to part of symbolic heap formula
construction. This makes the normalization done as part of expression
construction independent of the distinction between bitwise and
boolean operations.
Reviewed By: bennostein
Differential Revision: D17665254
fbshipit-source-id: 0a0722865
Summary:
Splat, Memory, and Concat expressions are never used. Only the term
forms are needed.
Reviewed By: bennostein
Differential Revision: D17665259
fbshipit-source-id: cbfd7650d
Summary:
There are a number if issues with using the same type for expressions
in code and in formulas. One is that the type systems of the two
should be different. Another is that conflating the two compromises
the ability of Llair to correctly express aspects such as integer
overflow, floating point rounding, etc. Also, it could be beneficial
to have more source locations for program expressions than makes sense
for terms.
This diff simply unshares Exp, leading to a copy named Term. Likewise,
Reg is now a copy of Var. Simplifications to come.
Reviewed By: bennostein
Differential Revision: D17665250
fbshipit-source-id: 4359a80d5
Summary:
The generation of names for the function formal return and throw
parameters is not central to LLAIR, but a detail of the frontend,
since they are generated only because LLVM does not already have such
names.
Reviewed By: ngorogiannis
Differential Revision: D17665240
fbshipit-source-id: 684cbae92
Summary:
Using a type of keys richer than strings, which are the unique symbol
names at the C/LLVM level, is unnecessary.
Reviewed By: ngorogiannis
Differential Revision: D17665262
fbshipit-source-id: 6b8c31146
Summary:
The convenience wrappers for operations on signed 1-bit integers
represented by Z.t are not specific to Exp.
Reviewed By: ngorogiannis
Differential Revision: D17665252
fbshipit-source-id: d4b58e2a6
Summary:
Now that the relation domain construction is factored out and
generalized.
Reviewed By: ngorogiannis
Differential Revision: D17665253
fbshipit-source-id: eb156ce6b
Summary:
Since version 2, none of the `opam pin` modes work reasonably well for
the standard llvm build procedure. As a workaround to prevent opam
from making several copies of the build directory when pinning, adjust
to move the llvm build and install directories out of the llvm source
tree.
Reviewed By: bennostein
Differential Revision: D17665242
fbshipit-source-id: ac84a4b0b
Summary:
Since the correcteness of the mapping from LLVM to llair depends on
LLVM being SSA, we need to formalise what that means. We also prove that
the domination relation is a strict partial order, which will probably
be helpful when reasoning about the translation.
Reviewed By: jberdine
Differential Revision: D17631456
fbshipit-source-id: a00eb3f87
Summary:
The LLVM semantics and translation was not consistently treating the
1-bit word value condition as signed or unsigned.
Reviewed By: jberdine
Differential Revision: D17605766
fbshipit-source-id: 77edf63b7
Summary:
Previously the LLVM semantics did the phi instructions at the head of a
block as part of executing the branch into that block. This looked a bit
weird, but had the advantage that the semantics knew which block was
being jumped from, which is necessary to run the phi instructions.
However, it meant that the rules for doing phi instructions would need
to show up with each branching construct. It was also annoying for the
LLVM->llair proof, since the phis are removed and their effect happens as
a distinct step from the branch.
Here we add a distinct Phi_ip instruction pointer to indicate that the
phi instructions at the start of the block should execute next, and then
be incremented to the usual numeric instruction pointer that points to
the non-phi instructions. The Phi_ip contains the identity of the
previous block.
Reviewed By: jberdine
Differential Revision: D17452416
fbshipit-source-id: 78fef7cca
Summary:
Give the llair semantics observable side effects (writes to global
variables) and a semantic function mirroring the LLVM semantics. Start
sketching out the LLVM/llair translation equivalence proof in a top-down
way from the obvious statement of equality of the semantics.
Reviewed By: jberdine
Differential Revision: D17399654
fbshipit-source-id: 2170678a8
Summary:
The simple LLVM semantics steps one instruction at a time, but the
generated llair does whole blocks at a time, since many individual LLVM
instructions can become a single llair expression. We add a bigger-step
LLVM semantics that does whole blocks at a time (except that it also
stops at function calls, since those end blocks in llair). The steps in
this bigger-step semantics should be at the same granularity as the
llair steps, making it easier to verify the translation.
We add a notion of observation to the LLVM semantics (right now, just
global variable writes) and use that to define two top-level semantic
functions, which we prove to be equivalent.
Reviewed By: jberdine
Differential Revision: D17396016
fbshipit-source-id: ee632fb92
Summary:
This diff allows domains to specify which abstract states can or can't
be merged together by the worklist. In particular, this is needed for
relational domains to ensure that Hoare triples are joined only when
they share a precondition.
Reviewed By: jberdine
Differential Revision: D17571148
fbshipit-source-id: d9345fdc9
Summary:
This diff adds a "-prenalyze-globals" flag to all analyze targets
which, when set, computes used-globals sets for all reachable
functions and then uses that information to track only relevant
global variables at calls in the main analysis.
Reviewed By: jberdine, jvillard
Differential Revision: D17526746
fbshipit-source-id: 1a114285c
Summary:
Fixes a bug in Llair.Frontend.xlate_value where the l-val register
of LLVM instruction calls was being marked as global.
Reviewed By: jberdine
Differential Revision: D17570458
fbshipit-source-id: e1b5924e2
Summary:
Fixes a bug where are all calls are treated as intrinsics in used
globals analysis, since exec_intrinsic is invoked at _all_ calls
to determine which are intrinsic, not only at call sites known to
target intrinsics.
Reviewed By: jberdine
Differential Revision: D17499406
fbshipit-source-id: 41f7621f2
Summary:
While the symbolic heap analysis ends its search upon hitting the
bound on recursion depth, the used-globals analysis should instead
simply skip recursive calls beyond the depth. Note that this is
unsound for arbitrary abstract domains, however, and the flag
controlling this feature should be used with caution.
Note that procedure calls are still not handled correctly, since
Used_globals.exec_intrinsic does not properly check whether callees
are intrinsic. A forthcoming commit will fix that, as well.
Reviewed By: jberdine
Differential Revision: D17479753
fbshipit-source-id: aa92e0ef3
Summary:
Include global variables used in function callees in used globals
analysis. Also adds support for arbitrary changes to symbolic
state while resolving callees in other analyses.
Reviewed By: jberdine
Differential Revision: D17479352
fbshipit-source-id: e3cd9f179
Summary:
Replace custom version reporting support using a shell script with
code using dune's Build_info API.
Note that after this diff, the executables under _build/<context> are
not version-stamped, but those under _build/_install are. The symlinks
in bin point to the latter, stamped, exes.
Reviewed By: bennostein
Differential Revision: D16985446
fbshipit-source-id: 7afac87be
Summary:
Adds an abstract domain to track global variable usages, as well as supporting
changes to the frontend, IR and CLI. This analysis will support optimizations
to the main symbolic-heap analysis, but for now can be invoked independently
through the `-domain` flag on `analyze` targets of the Sledge executable.
Reviewed By: jberdine
Differential Revision: D17422212
fbshipit-source-id: 74bed0a76
Summary:
Generalize the lifting from State_domain (i.e. symbolic heaps) to Sh_domain (i.e. relations over symbolic heaps).
Also, extract abstract-domain-related code into its own module/directory.
Reviewed By: jberdine
Differential Revision: D17319007
fbshipit-source-id: cefbd1393
Summary: Add support for future development of new abstract domains by eliminating hard-wired dependencies from the worklist into the symbolic heap domain. Also includes an implementation of a trivial unit domain and a CLI flag to enable its use, for debugging purposes.
Reviewed By: jberdine
Differential Revision: D17281681
fbshipit-source-id: 5858fd420
Summary:
This includes a few changes and corrections to the semantics, to support
the translation. This initial attempt to reason about LLVM -> llair
showed three things that needed repair in the semantics, in addition to
various bugs. We address them as follows.
Refactor llair semantics to have only a single kind of flat value:
integers that fit into specified bit widths. Operations on size values
(e.g., offsets, indices and the like) can just take an integer and
ignore its number of bits. Pointers can just be considered integers that
fit into a certain size given by the constant pointer_size. Later on we
can consider making this a parameter to the model.
Change the generic memory model interface to use numbers rather than
words as the generic encoding of a large value. This makes it more
useful for llair where words are not used.
Pay more careful attention to signed/unsigned issues. Neither LLVM nor
llair have a concept of signed vs unsigned value. Instead individual
operations interpret bit patterns in various ways, some of which are
ambiguous in the LLVM manual. For example, since getelementpointer's
indices are explicitly said to be interpreted as signed 2's complement,
we should probably do the same for insertvalue and extractvalue. However
it is not clear how the argument to alloca is to be interpreted. For now
we assume signed.
Reviewed By: jberdine
Differential Revision: D17164133
fbshipit-source-id: 31a8af635
Summary:
Not everything is here yet, and there is some confusion on what to do
about the size values. However, the semantics has the right general
shape and will be a nice starting point for thinking about the details.
Reviewed By: jberdine
Differential Revision: D17111041
fbshipit-source-id: cc75651c6
Summary:
The translation from LLVM to llair now builds expressions up across
blocks, following the implementation. This is easy to do because of the
dominance restrictions in SSA, but might be difficult to reason
about.
Reviewed By: jberdine
Differential Revision: D17111040
fbshipit-source-id: a8e99147d
Summary:
LLVM and llair have similar memory models, and we don't want to
duplicate any definitions or theorems. This adds a new memory model
theory which should be understandable in its own right. A heap is a
mapping from addresses to bytes, alongside a set of valid addresses, and
intervals that have been allocated already. Primitives are defined for
allocating and de-allocating as well as reading and writing chuncks of
bytes.
There is also a generic type of structured values, and functions for
converting them to/from byte arrays.
Reviewed By: jberdine
Differential Revision: D17074470
fbshipit-source-id: bdab6089f
Summary:
In some cases inlining pure expressions into their use sites causes
code blowup. This diff changes the frontend to inline expressions only
if there is a single use, and otherwise adds a move instruction.
Reviewed By: ngorogiannis
Differential Revision: D17071770
fbshipit-source-id: d866a0622
Summary:
This has been out of date since arithmetic was changed from a purely
uninterpreted treatment to having a solver.
Reviewed By: jvillard
Differential Revision: D16985159
fbshipit-source-id: 39e42069c
Summary:
While SSA can be useful for code transformation purposes, it offers
little for semantic static analyses. Essentially, such analyses
explore the dynamic semantics of code, and the *static* single
assignment property does not buy much. For example, once an execution
visits a loop body that assigns a variable, there are multiple
assignments that the analysis must deal with. This leads to the need
to treat blocks as if they assign all their local variables, renaming
to avoid name clashes a la Floyd's assignment axiom. That is fine, but
it makes it much more involved to implement a version that is
economical with respect to renaming only when necessary. Additionally
the scoping constraints of SSA are cumbersome and significantly
complicate interprocedural analysis (where there is a long history of
incorrect proof rules for procedures, and SSA pushes the
interprocedural analysis away from being able to use known-good
ones). So this diff changes Llair from a functional SSA form to a
traditional imperative language.
Reviewed By: jvillard
Differential Revision: D16905898
fbshipit-source-id: 0fd835220
Summary:
Before this diff symbolic execution of instructions assumed that
assigned variables were unconstrained in the precondition. This is
ensured by symbolic execution of control flow, which renames all local
variables of a block when it is entered.
This diff changes symbolic execution of instructions to rename
modified variables that appear in the precondition when necessary, and
accounts for the modified variable occurrence condition on the frame
rule. This will enable more economically renaming variables, as most
of the time it is not needed.
Reviewed By: jvillard
Differential Revision: D16905893
fbshipit-source-id: 3a53525d7
Summary:
Each variable now contains its type, alongside its name. This is more
uniform than in LLVM, where the name is usually paired with a type, but
not always, for example, the register type of the result of an
extractvalue is left implicit.
Reviewed By: jberdine
Differential Revision: D16984630
fbshipit-source-id: 1c3bc4985
Summary:
HOL now lets us omit quotations on Datatypes and make them look more
like the other new-style HOL definitions.
Reviewed By: jberdine
Differential Revision: D16983934
fbshipit-source-id: f8ef3abb5
Summary:
This sketches out how translation can be approached. It is partially
based on the Sledge code.
For basic blocks, isn't based on the Sledge code, but just my own
thoughts as a starting point. Essentially, we are trying to build up
larger expressions, and so not assigning to temporary registers that
don't live past the end of the block. This does remove sharing, so a
fancier approach could check for multiple uses of end-of-block dead
registers, or look at the sizes of expressions. The approach should be
flexible enough to accommodate such changes.
Fix icmp syntax
Using finite maps is elegant in the semantics, but awkward for writing
the translation function. Refactor the mappings from labels to functions
and from labels to blocks to use association lists instead.
To remove phi nodes, the translation takes every edge in the control
flow graph and makes a new basic block that contains a single parallel
move instruction that corresponds to the action of the phi node of the
target block.
Reviewed By: jberdine
Differential Revision: D16831051
fbshipit-source-id: 005663e26
Summary:
The AST is not complete on expressions, but it should have most of the
important features.
The representation is in some ways very different from the OCaml
implementation, because the OCaml code uses mutation to build the CFG as
an actual pointer graph in memory, and also because the expression
representation is optimised for the backend. For the former, it should
be easy to see that the AST here is isomorphic, representing the CFG
with finite maps from block labels. The correspondence is less clear in
the latter case, but the point here is not to model or verify
implementation optimisations, but to give a semantics to llair as a
language.
Reviewed By: jberdine
Differential Revision: D16807132
fbshipit-source-id: b0f64b3ec
Summary:
Change loc_var (for local variable) to reg (for register) because
loc_var looks too much like a location tagged variable.
Reviewed By: jberdine
Differential Revision: D16827920
fbshipit-source-id: 5b11f1065
Summary:
There could very well still be bugs in the semantics, since the
invariant here doesn't say all that much, and it completely ignores
local registers. But most trivial things and typos are probably fixed.
Reviewed By: jberdine
Differential Revision: D16803281
fbshipit-source-id: 48ba2523b
Summary:
Made progress on the sanity checking lemma (that the step relation
preserves some simple invariants on the state). Proved the Ret
instruction case of the state invariant lemma. To do this, I fixed a few
bugs in the definition, and strengthened the invariants.
Reviewed By: jberdine
Differential Revision: D16786900
fbshipit-source-id: 6fa8cb170
Summary:
Global variables need allocating and initialising before the machine can
start. The definition here shouldn't constrain how and where they are
allocated. For example, they don't all need to have separate
allocations. We also tag allocated blocks so that the allocation for a
global can never be deallocated.
Start working on a sanity checking invariant on states.
Reviewed By: jberdine
Differential Revision: D16735068
fbshipit-source-id: 0d5e60e7a
Summary:
Start working on a simple model of LLVM with the ultimate goal of
handling relevant and/or tricky aspects of LLVM and LLAIR and then
formalising the translation from LLVM to LLAIR.
This is a complete initial model of everything that we are interested in
except for exceptions, which should be tricky. Also no thought has gone
into the treatment of poison and the undefined value, so the treatment
is naive, which is at least partially justified because we are
interested in the semantics of LLVM IR after the optimisation passes
have run.
Include some sanity checking theorems.
Reviewed By: jberdine
Differential Revision: D16731885
fbshipit-source-id: fd53949fe
Summary:
Currently bitcode produced with `sledge buck link` can have missing
symbols that are clearly defined in the source. For example consider a
symbol `awesome_function` that is defined in the libraries linked in but
not in the produced binary (despite being reachable from main).
`llvm-nm` of the bitcode produced by `llvm-link` might look like:
```
U awesome_function
t awesome_function.1892
```
Some our `awesome_function` is undefined and its definition is called
`awsome_function.1892` for some reason and is local. I think this is because symbol get internalized too early and then they get renamed and somehow lost. Not sure why `llvm-link` behaves this way sometimes.
This patch removes internalization from `llvm-link` and puts it into `opt`, where it doesn't cause problems.
Reviewed By: jvillard
Differential Revision: D16494153
fbshipit-source-id: aad9053a4
Summary:
`__llair_alloc` is meant to be a drop-in non-failing replacement for
`mallco`. Currently `__llair_alloc(1)` allocates 8 bytes instead of 1 as
`malloc(1)` would. This is because handling of `__llair_alloc` was
merged with handling of `new`. This patch reverts changes to handling of
`new` in D15778817 and adds a new case for `__llair_alloc`.
Reviewed By: jvillard
Differential Revision: D16356865
fbshipit-source-id: 3878d87c3
Summary:
When using summaries we first garbage collect the precondition and then
ask the solver to infer the frame of the precondition with respect to
grabage-collected footprint.
Currently if the solver fails to show the frame, we just give it an
empty frame. This is bad, because if grabage collection removed some
segments, they don't get added back on.
This patch throws an exception instead to be very explicit when the
solver cannot show the frame in this case.
Reviewed By: ngorogiannis
Differential Revision: D16339587
fbshipit-source-id: b88d0689c
Summary:
The actual implementation of folly::usingJEMalloc() tests if malloc is
jemalloc using internal knowledge of the jemalloc implemenation of
malloc. This internal behavior is not reflected in the analyzer's
spec, so the detection fails.
Additionally, folly::usingJEMalloc is implemented using mallctl to
query internal state of jemalloc. Depending on the key string passed
to mallctl, it might return a pointer to jemalloc internal state, or a
scalar, which means that the spec needs to essentially allocate that
state in those cases.
Since the jemalloc detection fails, and the analyzer is not always
able to reason precisely about string equality, this diff models
folly::usingJEMalloc directly (as nondet).
Reviewed By: kren1
Differential Revision: D16059776
fbshipit-source-id: 7e7156d7d
Summary:
It seems that functions internalized by llvm no longer have valid
mangled names, and instead have a `.<int>` suffix. This diff removes
these unpredictable suffixes when checking if a called function is a
specified/modeled intrinsic.
Reviewed By: kren1
Differential Revision: D16059781
fbshipit-source-id: a4b9f6c73
Summary:
A frame inference query `Minuend ⊢ ∃xs. Subtrahend` returns a
`∃zs. Remainder` formula such that `Minuend ⊢ ∃xs. Subtrahend *
∃zs. Remainder` when successful. Currently if the subtrahend is itself
existentially quantified, its existentials are treated trivially: they
must witness themselves. This diff allows the solver to find witnesses
as the `xs`. They are still existentially quantified in the remainder,
so clients that need to constrain them should still name them before
calling the solver.
Reviewed By: kren1
Differential Revision: D16269630
fbshipit-source-id: 65136edd1
Summary:
Add a global merge pass that merges globals into a single big global. It
replaces the uses of globals merged, with offsets into the big global.
Function summarisationis a big benefactor of this as it greatly reduces
the number of implicit formals (ie. globals).
Reviewed By: jvillard
Differential Revision: D16260098
fbshipit-source-id: 1b936f02f
Summary:
Fix a bug where summaries would be created even if summarisation option
is disabled.
Reviewed By: jvillard
Differential Revision: D16259761
fbshipit-source-id: f7319ef03
Summary:
If function summaries are enabled calling a function first tries to
apply a summary, if succesful, it directly jumps to the return site of
the call. Otherwise it proceeds as before.
Reviewed By: jvillard
Differential Revision: D16201251
fbshipit-source-id: cec52e0e5
Summary:
Define a new function summary type and compute it on function return.
As an intermediary step also apply the just computed summary to function
pre so it can be compared to what was actually computed.
Reviewed By: jvillard
Differential Revision: D16149833
fbshipit-source-id: b826c17e8
Summary:
Fix a crash that occurs when subtrahend has an existential variable that
was renamed as in the test.
The crash is due to an assertion in `Sh.exists` that says only variables
in the vocabulary can be existentialy quantifed out.
The problem was `Sh.exists` call in Solver.ml:611. Where `ws`
(existentials of the subthrehend) are not present in the vocabulary of
the remainder. This is because remainder "inheirts" the vocabulary of
the minued.
This fix simply extends the vocabulary of minued with `ws`, which
means the remaainder has the correct vocabulary. This should have no
externally visible effect as `ws` are then existentialed out.
Another option would be to try to change all the `excise_seg` functions,
to keep the vocabulary, but that looked annoying to implement.
Reviewed By: jvillard
Differential Revision: D16201423
fbshipit-source-id: b88c3abc4
Summary:
Add a `-color` option to sledge, that prints variable that are
existentially bound as bold.
Reviewed By: ngorogiannis
Differential Revision: D16088750
fbshipit-source-id: bd21cb8a0
Summary:
This fixes two bugs:
* All local variables would get existentially quantified out, that means
the the local variables of the caller couldn't be restored properly
* Frame was added back on after the formals were killed. Which meant
that if frame talked about formals (in pure for example), those
formals would remain to be free variables.
Reviewed By: ngorogiannis
Differential Revision: D16091157
fbshipit-source-id: dfe12ed82
Summary:
This fixes two issues with function summarization when calling a
function multiple times.
* Previously on return, the actuals wouldn't get added back in, so
their name would be "lost" (that is existentially quantified out),
this patch adds the formals to actuals equalities back on return,
before quantifying the formals out.
* Previously the entry state of the function would be lost if there were
multiple calls to other functions.
Reviewed By: jberdine
Differential Revision: D16071656
fbshipit-source-id: 9df7b1d4b
Summary:
Currently alarms are reported to stdout while the debug trace is
written to stderr. This makes synchronizing the two difficult. With
this diff, the alarm reports can also be included in the debug trace,
and analysis can be stopped when an alarm is encountered by tracing
the `Stop` module, e.g.:
```
sledge -trace Report+Stop.on_invalid_access
```
Reviewed By: kren1
Differential Revision: D16072611
fbshipit-source-id: 32c3639a2
Summary:
There are many assumptions on the behavior of mutexes, condition
variables, etc. in the implementation of the cxxabi with threads
support. So compile with `_LIBCXXABI_HAS_NO_THREADS` defined to select
the much simpler code paths for the single-threaded case.
Reviewed By: kren1
Differential Revision: D16069454
fbshipit-source-id: 9f975e0e6
Summary:
Each call to __cxa_allocate_exception, in practice, is shortly
followed by raising an exception. With -skip-throw, execution will not
proceed past the throw. Since the concrete implementation of
__cxa_allocate_exception and the following initialization of the
exception object is very low-level code that plays tricks, the
analyzer has trouble with it. So model __cxa_allocate_exception as
unreachable to avoid (needlessly) executing that code and potentially
failing spuriously.
Reviewed By: kren1
Differential Revision: D16069451
fbshipit-source-id: bea1dae09
Summary:
Allow intrinsics to return an inconsistent state, to specify that they
do not return.
Reviewed By: kren1
Differential Revision: D16069453
fbshipit-source-id: deb5d2a22
Summary:
This adds an optimized debug build mode, which is compiled with
optimizations, and without assertions, but still has tracing enabled.
Reviewed By: kren1
Differential Revision: D16069452
fbshipit-source-id: 445cfa329
Summary:
The report output got disturbed by the change from predicate to
relational Domain, and the tricky control of printing simplified
states. After this diff by default states are printed in full, and in
simplified form with `-t State_domain.pp_simp`.
Also includes some minor output improvements.
Reviewed By: kren1
Differential Revision: D16059780
fbshipit-source-id: b33289887
Summary:
Trivial renamings to use the standard "libFuzzer" name instead of "lib
fuzzer".
Reviewed By: kren1
Differential Revision: D16067881
fbshipit-source-id: 3ff2a8f86
Summary:
On function return add the computed summary (pre/post) condition to a
hashtable.
Reviewed By: jberdine
Differential Revision: D16052136
fbshipit-source-id: 0c5c9bafb
Summary:
Outputting the list of bitcode inputs when no output flag is ok for
`sledge buck bitcode` but does not make sense when it is composed as
part of other commands. So only output to stdout if `-` is given as
the output file name.
Reviewed By: kren1
Differential Revision: D16059782
fbshipit-source-id: abac9c36f
Summary:
To easily monitor and track changes to the help generated by the
command line interface, generate it in full and add it to the repo.
Reviewed By: kren1
Differential Revision: D16059783
fbshipit-source-id: be15f9943
Summary:
This diff enhances `-function-summaries` to remember the frame computed by
the solver and actually execute the function using the summary. Upon
return the frame is added back on the computed post condition.
Reviewed By: ngorogiannis
Differential Revision: D15900318
fbshipit-source-id: 8bb56b771
Summary:
This diff is preparation for function summarization and focuses on
function calls and function summary precondition computation.
It introduces `-function-summaries` flag behind most of functionality is
hidden, when enabled on each call
* A function summary is computed by quantifying all the non-formal/global variables
and removing all the segments that are not reachable from them
* `pre` and `foot` are computed from function summary and the calling context
by replacing formals with actuals again.
* A solver is asked if `pre` entails `foot` and a frame is printed if it
does
Currently this only works for formulas without disjunctions, so when
function summaries are enabled, that state is first moved to dnf and then
the call is done for each disjunct.
Reviewed By: ngorogiannis
Differential Revision: D15898928
fbshipit-source-id: 49d32504c
Summary: For buck targets that contain at least one of the substrings in `buck-target-pattern` option in config, change the buck target to add `_sledge` suffix.
Reviewed By: jberdine
Differential Revision: D15920018
fbshipit-source-id: 44c242e99
Summary: And fix test Makefile to call the C++ compiler on .cpp files.
Reviewed By: kren1
Differential Revision: D15972426
fbshipit-source-id: 719de755f
Summary:
Simplify all conversions between castable types to the identity. The
backend treats castable types as equal, so distinguishing conversions
between them is incomplete.
Reviewed By: kren1
Differential Revision: D15972427
fbshipit-source-id: fa09859ac
Summary:
The entry block contains all locals of the entire function, as
required by the backend. This makes the manipulation of the locals of
each block redundant.
This diff moves the locals from the entry block to the function
itself, removes the Locals frames of the Control.Stack, and adds a
locals field to Return frames.
This is part cleanup and part preparation for removing the
Control.Stack.
Reviewed By: ngorogiannis
Differential Revision: D15963503
fbshipit-source-id: 523ebc260
Summary:
Adds `-mergefunc` and `-dce` passes to `Frontend.translate` to match
the `buck link` flow with `opt`
Reviewed By: ngorogiannis
Differential Revision: D15938641
fbshipit-source-id: 128cb89cd
Summary:
The current handling of the formal return variable scope is not
correct. Since it is passed as an actual argument to the return
continuation, it is manipulated as if it was a local variable of the
caller. However, its scope is not ended with the caller's locals,
leading to clashes.
This diff reworks the passing of return values to avoid this problem,
mainly by introducing a notion of temporary variables during parameter
passing. This essentially has the effect of taking a function spec
{ P } f(x) { λv. Q }
and generating a "temporary" variable v, applying the post λv. Q to it
to obtain the pre-state for the call to the return continuation
k(v). Being a temporary variable just means that it goes out of scope
just after parameter passing. This amounts to a long-winded way of
applying the post-state to the formal parameter of the return
continuation without violating scopes or SSA.
This diff also separates the manipulation of the symbolic states as they
proceed from:
1. the pre-state before the return instruction;
2. the exit-state after the return instruction (including the binding
of the returned value to the return formal variable);
3. the post-state, where the locals are existentially quantified; and
4. the return-state, which is expressed in terms of actual args
instead of formal parameters.
Also in support of summarization, formal return and throw parameters
are no longer tracked on the analyzer's stack.
Note that these changes involve changing the locals of blocks and
functions to no longer include the formal parameters.
Reviewed By: kren1
Differential Revision: D15912148
fbshipit-source-id: e41dd6e42
Summary:
The solver couldn't deal with `∃ a,b . a = b` , so this diff adds
a special case to deal with it.
Reviewed By: ngorogiannis
Differential Revision: D15897953
fbshipit-source-id: d841d3557
Summary:
:
This patch adds several passes that reduce the amount of bitcode
making sledge's job easier, more info:
https://llvm.org/docs/Passes.html
`-mergefunc`
This pass merges functions that do the same thing, this can be because of
templating or casts (ie. same functionality but on 32bit and 64bit ints,
which is the same in machine code). More details at
http://llvm.org/docs/MergeFunctions.html
Note that this pass is currently not available through C/OCaml API.
`-constmerge`
This merges constants that have the same value, this is possible to do
when the constants are internalized.
`-argpromotion`
```
This pass promotes “by reference” arguments to be “by value” arguments.
In practice, this means looking for internal functions that have pointer
arguments. If it can prove, through the use of alias analysis, that an
argument is only loaded, then it can pass the value into the function
instead of the address of the value. This can cause recursive
simplification of code and lead to the elimination of allocas
(especially in C++ template code like the STL).
```
`-ipsccp`
```
Sparse conditional constant propagation and merging, which can be
summarized as:
Assumes values are constant unless proven otherwise
Assumes BasicBlocks are dead unless proven otherwise
Proves values to be constant, and replaces them with constants
Proves conditional branches to be unconditional
```
`-deadargelim`
Removes dead arguments of internal functions, good to run after other inter-procedural
passes. Seems to crash llvm if run directly after `ipsccp`.
Note that while this might look like doing full link-time optimisation,
we are actually picking relatively cheap optimisations that mostly look
at globals and walk their use chains. The main reason link-time
optimisations are expensive is due to inlining and then running the full
optimisation again from there.
Reviewed By: jberdine
Differential Revision: D15851408
fbshipit-source-id: be7191683
Summary:
This diff introduces a `-lib-fuzz` flag to `buck link`, which links in a
simple main that calls the LLVMFuzzerTestOneInput function, which is the
entry point of libFuzzer fuzzer.
Reviewed By: jberdine, jvillard
Differential Revision: D15821512
fbshipit-source-id: cff731ed3
Summary:
Previous change to allow bitcasts in call instructions was too strict
and did not allow for indirect calls.
Reviewed By: jberdine
Differential Revision: D15803262
fbshipit-source-id: 40d828b59
Summary:
Currently printing symbolic heaps is unreadable, because there are too
many quantified variables, that are mostly just equal to other
variables.
This diff tries to replace all variables in an equivalence class with a
single variable and remove the unneccesary variables.
It also introduces two modes for printing state domains:
`-t +State_domain.pp_full` prints the state domain as is
`-t +State_domain.pp` uses the simplification before printing.
Reviewed By: jberdine
Differential Revision: D15738748
fbshipit-source-id: 7c85b580e