Summary:
Split into:
- `PulseDiagnostic`, formerly `PulseDomain.Diagnostic`
- `PulseOperations`, formerly `PulseDomain.Operations`
This breaks down the now quite large and complex PulseDomain.ml into
more manageable pieces. More importantly, it will allow us to build a
bigger pulse domain later, where elements of the domain are pairs of the
base domain that include a biabductive "footprint".
What's not as nice is that more of the interface of `PulseDomain` is
exposed, in particular `PulseDomain.Memory` and `PulseDomain.Stack`.
We'll have to be careful not to break abstraction barriers and prefer
`PulseOperations` to `PulseDomain` outside of the domain implementation.
OCaml forces us to do that because of the multi-file approach. It could
be solved by introducing pulse domains as a library but who has time for
that...
Sending early because rebasing that diff is painful.
Reviewed By: ngorogiannis
Differential Revision: D13537602
fbshipit-source-id: d211d6e84
Summary:
Record per-location traces. Actually, that doesn't quite make sense as a
location can be accessed in many ways, so associate a trace to each
*edge* in the memory graph. For instance, when doing `x->f = *y`, we
want to take the history of the `<val of y> --*--> ..` edge, add "assigned
at location blah" to it and store this extended history to the edge
`<val of x> --f--> ..`.
Use this machinery to print nicer traces in `infer explore` and better
error messages too (include the last assignment, like biabduction
messages).
Reviewed By: da319
Differential Revision: D13518668
fbshipit-source-id: 0a62fb55f
Summary: In the `operator=` case that assigns from a temporary, we want to assign an object of a temporary not it's address (as a comment already says)
Reviewed By: jvillard
Differential Revision: D13518496
fbshipit-source-id: 72bd23623
Summary:
When a C++ temporary goes out of scope, tag its address in the heap with
a new attribute `AddressOfCppTemporary` so that we can later check that
we don't return it.
Reviewed By: da319
Differential Revision: D13466898
fbshipit-source-id: 8808338b4
Summary:
When assign to the special `return` variable, check that the result is
not the address of a local variable, otherwise report.
Reviewed By: ngorogiannis
Differential Revision: D13466896
fbshipit-source-id: 465da7f13
Summary:
It's ok to take an address of a field / array access of an invalid object.
This diff calculates the inner most dereference for an access expression starting with `&` and does not report on the dereference even if the address is invalid.
Reviewed By: jvillard
Differential Revision: D13450758
fbshipit-source-id: 18c038701
Summary:
When we create Dereference edge, we also create TakeAddress back edge. This causes false positives for stack variables. When we write to a stack variable and then take its address, the resulting address is the one from the back edge of the written value. See example `push_back_value_ok`. To solve this issue, this diff changes stack to denote a map from address of variables rather than from variables.
We still have issue for fields, see example, FP_push_back_value_field_ok. To solve this, we probably need to remove back edges.
Reviewed By: jvillard
Differential Revision: D13432415
fbshipit-source-id: 9254a1a6d
Summary: Mostly a revert of D13190876 once the disjunctive domain is in place.
Reviewed By: da319
Differential Revision: D13432488
fbshipit-source-id: f1e98ef0d
Summary:
Change join/widen policies to more interesting ones and play around to
find a good tradeoff.
Reviewed By: mbouaziz
Differential Revision: D13432492
fbshipit-source-id: 2c3e498dd
Summary:
Introduce machinery to do disjunctive HIL domains and use it for pulse,
but only in a mode that preserves the existing behaviour.
The disjunctive domain is a functor that turns any (HIL for now)
transfer function module into one operating on sets of elements of the
original domain. The behaviour of joins (and widenings, which are equal
to joins) can be chosen when instantiating the functor among 3
behaviours:
- `` `JoinAfter n`: when the set of disjuncts gets bigger than `n` the
underlying domain's join is called to collapse them into one state
- `` `UnderApproximateAfter n`: when the sest of disjuncts gets bigger
than `n` then just stop adding new states to it, drop any further states
on the floor. This corresponds to an under-approximation/bounded
approach.
- `` `NeverJoin`
The widening is always of the form ``
`UnderApproximateAfterNumIterations max_iter` for now since the only
user is pulse and I'm not sure what else would be useful.
Picking `` `JoinAfter 0` gives the same results as the non-disjunctive
domain since the underlying `join` will always be called. Make pulse use
this mode for now, and tune it in a next diff.
Reviewed By: mbouaziz
Differential Revision: D13431375
fbshipit-source-id: b93aa50e7
Summary:
This will be useful to make the analysis more precise. In particular, it
allows a disjunctive version of pulse to deal will deleting vector
elements in a loop: without this, deleting an array element in one
iteration will make the analysis think that the next array element is
invalid too since they are all the same. By keep track of the index, we
can detect when we are sure that two elements are the same and only
report in that case.
Reviewed By: ngorogiannis
Differential Revision: D13431374
fbshipit-source-id: dae82deeb
Summary:
A lot of functors that take a `Make{SIL,HIL}` can take a `{SIL,HIL}`
directly instead. This makes my head hurt a bit less.
Reviewed By: mbouaziz
Differential Revision: D13416967
fbshipit-source-id: eb0b33bc4
Summary:
When a lambda gets created, record the abstract addresses it captures, then
complain if we see some of them be invalidated before it is called.
Add a notion of "allocator" for reporting better messages. The messages are
still a bit sucky, will need to improve them more generally at some point.
```
jul lambda ~ infer 1 infer -g --pulse-only -- clang -std=c++11 -c infer/tests/codetoanalyze/cpp/pulse/closures.cpp
Logs in /home/jul/infer.fb/infer-out/logs
Capturing in make/cc mode...
Found 1 source file to analyze in /home/jul/infer.fb/infer-out
Found 2 issues
infer/tests/codetoanalyze/cpp/pulse/closures.cpp:21: error: USE_AFTER_DESTRUCTOR
`&(f)` accesses address `s` captured by `&(f)` as `s` invalidated by destructor call `S_~S(s)` at line 20, column 3 past its lifetime (debug: 5).
19. f = [&s] { return s.f; };
20. } // destructor for s called here
21. > return f(); // s used here
22. }
23.
infer/tests/codetoanalyze/cpp/pulse/closures.cpp:30: error: USE_AFTER_DESTRUCTOR
`&(f)` accesses address `s` captured by `&(f)` as `s` invalidated by destructor call `S_~S(s)` at line 29, column 3 past its lifetime (debug: 8).
28. f = [&] { return s.f; };
29. }
30. > return f();
31. }
32.
Summary of the reports
USE_AFTER_DESTRUCTOR: 2
```
Reviewed By: da319
Differential Revision: D13400074
fbshipit-source-id: 3c68ff4ea
Summary: Model more `std::vector` functions that can potentially invalidate references to vector's elements (https://en.cppreference.com/w/cpp/container/vector).
Reviewed By: mbouaziz
Differential Revision: D13399161
fbshipit-source-id: 95cf2cae6
Summary:
Some code calls `this->~Obj()` then proceeds to use fields in the current
object, which previously we would report as invalid uses. Assume people know
what they are doing and ignore destructor calls to `this`.
Reviewed By: mbouaziz
Differential Revision: D13401145
fbshipit-source-id: f6b0fb6ec
Summary:
`AccessExpression.t` represents array accesses as `ArrayOffset of t * Typ.t * t
list`, i.e. the index is represented by a list of access expressions. This is
not precise enough when indices cannot be represented as such. In fact, in
general any `HilExp.t` can be an array index but this type was an approximation
that was good enough for existing checkers based on HIL.
This diff changes the type of access expressions to be parametric in the type
of array offsets, and uses this to record `HilExp.t` into them when translating
from SIL to HIL.
To accomodate the option of not caring about array offsets
(`include_array_indexes=false`), the type of array offsets is an option type.
Reviewed By: mbouaziz
Differential Revision: D13360944
fbshipit-source-id: b01442459
Summary:
`AccessExpression.t` and `HilExp.t` are about to become mutually
recursive, this will help distinguish the actual changes from the moving
of code around.
This deletes the file left around in the previous commit to preserve
callers of `AccessExpression`.
Reviewed By: mbouaziz
Differential Revision: D13377645
fbshipit-source-id: 71338d1f3
Summary:
`AccessExpression.t` and `HilExp.t` are about to become mutually
recursive, this will help distinguish the actual changes from the moving
of code around.
Reviewed By: mbouaziz
Differential Revision: D13377644
fbshipit-source-id: 9d6f290b6
Summary:
This will be useful for proper support of array indexes in pulse and in
HIL in general.
Reviewed By: mbouaziz
Differential Revision: D13377642
fbshipit-source-id: e431121fb
Summary: Similarly as `std::vector::push_back`, `std::vector::reserve` can invalidate the references to elements if the new size is bigger than the existing one. More info on `std::vector::reserve`: https://en.cppreference.com/w/cpp/container/vector/reserve
Reviewed By: jvillard
Differential Revision: D13340324
fbshipit-source-id: bf99b6923
Summary: Moving all the files related to nullable type checking under the same directory. The goal is to merge everything into the same backend based on the AI framework and access expressions.
Reviewed By: ngorogiannis
Differential Revision: D13350880
fbshipit-source-id: 8ab3cf81b
Summary: Instead of variable having the value of a single location on stack, we now allow variables to have multiple locations. Consequently, we also allow a memory location to point to a set of locations in the heap. We enforce a limit on a maximum number of locations in a set (currently 5).
Reviewed By: jvillard
Differential Revision: D13190876
fbshipit-source-id: 5cb5ba9a6
Summary: It is not used yet and still manages to cause false positives.
Reviewed By: mbouaziz
Differential Revision: D13102948
fbshipit-source-id: 2122666c2
Summary:
It's useful for checkers to know when variables go out of scope to
perform garbage collection in their domains, especially for complex
domains with non-trivial joins. This makes the analyses more precise at
little cost.
This could have been added as a custom function call to a builtin, but I
decided against it because this instruction doesn't have the semantics
of any function call. It's better for each checker to explicitly not
deal with the custom instruction instead.
Reviewed By: jberdine
Differential Revision: D13102951
fbshipit-source-id: 33be22fab
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52