Summary:
We translated the expression `CXXStdInitializerListExpr` naively in D3058895 as a call to
a skip function, with the hope that it would be translated better in the future. However, the naive means that we lose access to the initialized list/array because we are simply skipping it. So, even if we want to model the initializer properly, we have to deal with the skip specially.
This diff tries to solve this problem by removing the skip call whenever
possible. Instead, we translate the underlying array/list as a Load, so
that when it is passed to the constructor, we can pick it up.
For the following initialization:
``` std::vector<int*> vec = {nullptr};
```
Before, we translated it as
```
*&0$?%__sil_tmpSIL_materialize_temp__n$7[0]:int* const =null
n$8=_fun___infer_skip_function(&0$?%__sil_tmpSIL_materialize_temp__n$7:int* const [1*8] const )
n$9=_fun_std::vector<int*,std::allocator<int*>>::vector(&vec:std::vector<int*,std::allocator<int*>>*,n$8:std::initializer_list<int*>)
```
However, this means, `n$8` would be result of something skipped which we can't reason about. Instead, we just pass the underlying initialized array now, so we get the following translation:
```
*&0$?%__sil_tmpSIL_materialize_temp__n$7[0]:int* const =null
n$8=*&0$?%__sil_tmpSIL_materialize_temp__n$7:int* const [1*8] const
n$9=_fun_std::vector<int*,std::allocator<int*>>::vector(&vec:std::vector<int*,std::allocator<int*>>*,n$8:std::initializer_list<int*>)
```
Reviewed By: jvillard
Differential Revision: D21155014
fbshipit-source-id: 75850b1e6
Summary:
It is true that `Info` issues are normally not intended for the end user
and in general should be hidden by default.
However, the current behavior - show them only if `--no-filtering` is
true - is super non-intuitive and complicates already complex reporting
logic.
Lets use the general "enable/disable" mechanism for controlling this.
Reviewed By: jvillard
Differential Revision: D21154140
fbshipit-source-id: 69e4c88e4
Summary:
Computing sledge's equality relation and normalising terms is costly. We
can avoid doing that most of the time by keeping the sledge path
condition lazily evaluated and only forcing it down to a value at two
critical points in the analysis:
1. Summary creation, to avoid storing unsatisfiable pre/posts that will have
to be needlessly executed by callers. This also saves us from having
to serialise the closures involved in the uncomputed form of lazy
values inside the pulse summaries.
2. Before reporting errors we check in the state is in fact satisfiable.
If not we just prune it away at that point.
This yields ~4x speedup on some targets.
Reviewed By: ezgicicek
Differential Revision: D21129759
fbshipit-source-id: a75fdd3bc
Summary:
This is mostly just a type change for now, more changes to come. This
doesn't make thing much faster yet because we force computations pretty
often to check for unsatisfiability (each function call and PRUNE node).
Next diff will build on that.
Reviewed By: skcho
Differential Revision: D21129758
fbshipit-source-id: 72200e2b1
Summary:
When encountering a constant, pulse creates an abstract value (a
variable) to represent it, and remembers that it's equal to it. The
problem is that pulse doesn't yet know how to deal with the fact that
some variables are going to be equal to each other.
This hacks around this issue in the case of constants, within the same
procedure, by remembering which constants have been assigned to which
place-holder variables, and serving those variables again when the same
constant is translated again.
Limitation: this doesn't work across procedure calls as the "constant
maps" are not saved in summaries.
Something to look out for: we don't want to make `if (p == NULL)` create
a path where `p` is invalid (we only make null invalid when we see an
assignment from 0, i.e. `p = NULL;`).
Reviewed By: ezgicicek
Differential Revision: D21089961
fbshipit-source-id: 5ebb85d0a
Summary:
1. Package will make the error too verbose.
2. We don't even need to say it is "class" because we say it in the error
description ("Class has 0 issues and can be marked Nullsafe").
Reviewed By: artempyanykh
Differential Revision: D21131998
fbshipit-source-id: 6ccca7615
Summary:
One source of false positives on container races is when the container member field is initialised to a concurrent version in a constructor, but the static type of the field doesn't reflect the thread safety of it.
This solution
- tracks flows from constructors of safe data structures to abstract addresses;
- initialises the initial attribute state when analysing a non-constructor method to that achieved by all constructors/class-initializers.
- checks for that attribute when recording container accesses.
Reviewed By: jvillard
Differential Revision: D21089428
fbshipit-source-id: 02a88f6e8
Summary:
As artempyanykh pointed out, `nullable` works much better for required fields
than `optional` in terms of x-plat.
Reviewed By: artempyanykh
Differential Revision: D21129614
fbshipit-source-id: b03c91b78
Summary: Modeling vector iterator with two internal fields: an internal array and an internal pointer. The internal array field points to the internal array field of a vector; the internal pointer field represents the current element of the array. For now `operator++` creates a fresh element inside the array.
Reviewed By: ezgicicek
Differential Revision: D21043304
fbshipit-source-id: db3be49ce
Summary:
The java source file parser should refuse to run on non
.java file. Also, as we expect some autogenerated source files to
break Java official syntax, we catch parsing error silently and
cancel location recording for them.
Reviewed By: jvillard
Differential Revision: D21089587
fbshipit-source-id: 35f1a1e28
Summary:
Now that only races rooted at formals or globals are reported, there is no point using the ownership domain to exclude accesses to locals, so remove the code that initialises the ownership of those.
Also, remove an accidentally quadratic use of `Map.bindings |> List.find` in `FormalMap` and simplify the analysis driver.
Reviewed By: skcho
Differential Revision: D21066855
fbshipit-source-id: 126080778
Summary:
Add a path condition to each symbolic state, represented in sledge's arithmetic domain. This gives a precise account of arithmetic constraints. In particular, it is relation and thus is more robust in the face of inter-procedural analysis.
This is gated behind a flag for now as there are performance issues with the new arithmetic.
Reviewed By: jberdine
Differential Revision: D20393947
fbshipit-source-id: b780de22a
Summary:
Show that the SSA restrictions imply that the blocks can be
topologically sorted so that any use of a variable follows its
definition in the list of blocks. This showed a slight flaw in my
definition of SSA form. It used to treat program counters up to
equality. However, PCs that point to a block header contain the location
that control came from (so that the Phi instructions can be executed),
but the SSA restrictions shouldn't pay attention to that, so now the
definition of SSA introduces a equivalence on PCs that ignores that
additional information.
Reviewed By: jberdine
Differential Revision: D21066873
fbshipit-source-id: 735575a1f
Summary: Move state to summary conversion into the domain file, move a model matcher into the models file and simplify.
Reviewed By: skcho
Differential Revision: D21064351
fbshipit-source-id: bb5b07b6b
Summary:
There are two types of anonymous classes (not user defined classes):
- classic anonymous classes (defined as $<int> suffixes)
- lambda classes (corresponding to lambda expressions). Experimentally,
they all have form `$Lambda$_<int>_<int>`, but the code just uses
`$Lambda$` as a heuristic so it is potentially more robust.
# Problem this diff solves
When generate meta-issues for nullsafe, we are interested only in
user-defined classes, so we merge all nested anonymous stuff to
corresponding user-defined classes and hence aggregate the issues.
Without this diff, for each lambda in the code, we would report this as
a separate meta-issue, which would both screw up stats and be confusing
for the user (when we start reporting mode promo suggestions!).
Reviewed By: artempyanykh
Differential Revision: D21042928
fbshipit-source-id: a7be266af
Summary: In `PowLoc`, we focus on making every constructors of `PowLoc.t` return nomalized value. Thus, it is unnecessary to nomalize any `PowLoc.t` inputs.
Reviewed By: ezgicicek
Differential Revision: D21064893
fbshipit-source-id: d00a7f06b
Summary:
Introduce an explicit assumption that all of the instructions are ones
that are implemented in the translation so far, rather than just
cheating proof cases.
Reviewed By: jberdine
Differential Revision: D20891352
fbshipit-source-id: 37598bd8f
Summary:
Separate into separate files the theorems that are just about the
translation (mostly about the structure of the variable->expression
mapping that the translation builds) from theorems about the translation
and the semantics.
Also move the stuff about dominator_ordered into the SSA Theory, since
it only makes sense for SSA programs, but doesn't have anything to do
with the translation.
Reviewed By: jberdine
Differential Revision: D20673124
fbshipit-source-id: 9d8b08164
Summary:
The LLVM->LLAIR translation keeps a mapping of variables to expressions.
Previously, the invariants related to that mapping were kept in the
state relation, and so the proof needed show that they were preserved
along execution traces. This wasn't obvious as the state changes in
non-SSA ways during evaluation, but the correctness of the mappings is
heavily based on the program being in SSA form. This change separates
out the invariants, and the proof uses the final mapping that the
compiler builds, which contains all of the relevant bindings that might
be needed during execution.
Reviewed By: jberdine
Differential Revision: D20625109
fbshipit-source-id: d4c2dfe19
Summary:
This diff revises how to handle the unknown location in inferbo in two ways:
* stop appending field to the `Unknown` location, e.g. `Unknown.x.a` is evaluated to `Unknown`
* redesign the abstract of multiple locations, like `Bottom` < `Unknown` < `Known` locations
I am doing them in one diff since applying only one of them showed bad results.
Background: `Unknown` was adopted for abstracting all unknown concrete locations, so we could avoid missing semantics of assignments to unknown locations. We tried to keep soundness. However, it brought some other problems related to precision and performance.
1. Sometimes especially when Inferbo failed to reason precise pointer values, `Unknown` may point to many other abstract locations.
2. At that time, value assignments to `*Unknown` makes the situation worse: many abstract locations are updated with imprecise values.
This problem harmed not only its precision, but also its performance since it introduced more location entries in the abstract memory.
Reviewed By: jvillard
Differential Revision: D21017789
fbshipit-source-id: 0bb6bd8b5
Summary:
Treat the remainder of dividing a rational by an integer as if the
rational was an integer division.
Reviewed By: jvillard
Differential Revision: D21042515
fbshipit-source-id: b5d42ddec
Summary:
The partial treatment of Mul and Div terms can simplify some cases,
but since it is only a partial treatment that is not producing a
normal form, in other cases the "simplification" results in large and
non-canonical terms. It is safer to leave them uninterpreted.
Reviewed By: jvillard
Differential Revision: D21042521
fbshipit-source-id: 04fc37f1a
Summary:
The heights of And and Or terms can grow high. This interacts poorly
with some unoptimized Equality operations such as normalization that
do some processing at every subterm.
Reviewed By: jvillard
Differential Revision: D21042518
fbshipit-source-id: 55e6acbb1
Summary:
These are map and folding map that perform a cycle-preserving
pre-order transformation.
Reviewed By: jvillard
Differential Revision: D20877974
fbshipit-source-id: 251288228
Summary:
The current notion of "simplified" function symbols, which are treated
as a hybrid between interpreted and uninterpreted, has no logical
basis. Normalization is now strong enough, due to stronger handling of
the changing carrier set, that the "simplified" classification can be
removed.
Reviewed By: jvillard
Differential Revision: D20726961
fbshipit-source-id: 9962ea323
Summary:
It is possible for the canonical form of a term in (the carrier set
of) a relation to be a term that is not in the relation. It is also
possible for this term to be equal to a term in the relation. It is
the job of the lookup subroutine of normalization to find these
equations.
The current implementation of entails_eq is incomplete in this case,
in particular, when an uninterpreted term has an interpreted subterm,
which itself has a subterm that is not a representative.
This diff strengthens normalization (and hence it's callers such as
entails_eq) to handle such cases. This makes lookup work slightly
harder. An alternative would be to extend the carrier with new terms
from normalization to the carrier and closing the relation, and then
re-normalizing. That would lead to much inflated representation sizes,
and is inefficient.
Reviewed By: jvillard
Differential Revision: D20612565
fbshipit-source-id: 3b7534a62
Summary:
This diff adds enough interpretation of Mul and Div terms to be able
to exclude them from the domain of solution substitutions. While
non-linear arithmetic is still treated very incompletely, this change
increases the propagation power of the equality constraints that are
deduced. Mainly, this appears to be enough to avoid operations that
are semantically equivalence-preserving such as solve_for_vars from
producing equalities that are unprovable from their inputs.
Reviewed By: jvillard
Differential Revision: D20863528
fbshipit-source-id: fca74cba3
Summary:
For non-linear polynomial equations, the solver can currently choose
to solve for a variable that occurs in a non-linear subterm, leading
to a non-idepotent solution substitution. For example, `x - y + (x ×
z) = 0` could be solved for `x`, leading to `x ↦ y - (x × z)` where
`x` is solved in terms of itself. This diff refines Term.solve_zero_eq
to avoid such cyclic solutions.
Reviewed By: jvillard
Differential Revision: D20637482
fbshipit-source-id: 6d7df85c3
Summary:
Strengthen the canonizer for division and rational constant terms. It
is important to avoid constructing `Q.t` values with 0 denominators,
as they do not represent real numbers, and the algebraic manipulation
done by the rest of the solver is incorrect in that case.
Additionally strengthen the Term representation invariants to check
that all coefficients and exponents are valid real rational numbers.
Also normalize division by an integer or rational to multiplication by
a rational.
Reviewed By: jvillard
Differential Revision: D20663964
fbshipit-source-id: 210962fe0
Summary:
Equations of the form `a = b` where `a` is a proper subterm of `b` are
possible when uninterpreted functions are involved. Internally,
Equality does not eagerly substitute `b` for `a`, but external clients
can repeatedly `Equality.normalize` terms and thereby incrementally
blow up the sizes of terms.
This diff uses the heights of uninterpreted terms to choose equality
class representatives to avoid such blow-ups, by orienting equations
so that tall terms are represented by short terms, so that repeated
normalization cannot increase term height indefinitely.
Reviewed By: jvillard
Differential Revision: D20785632
fbshipit-source-id: ff4c5bacd