Summary:
This started as an attempt to understand how to modify the frontend to
inject destructors for C++ temporaries (see next diffs).
This diff rewrites the existing logic for computing the list of
variables that should be destroyed at the end of each statement, either
because it's the end of their syntactic scope or because control flow
branches outside of their syntactic scope.
The frontend translates a function from the last instructions to the
first, but scope computation needs to be done in the other direction, so
it's done in a separate pass *before* the main translation happens. That
first pass creates a map from statements in the AST to the list of
variables that should be destroyed at the end of these statements. This
is still the case now.
Before, that map would be computed in a bit of a weird way: scopes are
naturally a stack but instead of that the structure maintained was a
flat list + a counter to know where the current scope ended in that
list.
In this diff, redo the computation maintaining a stack of scopes
instead, which is a bit cleaner. Also treat more instructions as
introducing a new scope, eg if, for, ...
Reviewed By: mbouaziz
Differential Revision: D15674208
fbshipit-source-id: c92429e82
Summary:
Somewhat trivial: add a string to "Destruction" nodes to indicate why
they were created. Rename the main `instruction_aux` function into
`instruction_translate` (see next diff for why).
Reviewed By: mbouaziz
Differential Revision: D15674211
fbshipit-source-id: 8a7eda72c
Summary:
I rewrote the test so it doesn't need any C++ headers so that:
- it's easier to see what's going on
- it's easier to debug: the whole AST is now somewhat readable vs before
the headers made it impossibly long
Reviewed By: ezgicicek
Differential Revision: D15674213
fbshipit-source-id: d98941983
Summary:
The synthetic methods from `topl.Property` are now nonempty: they
simulate a nondeterministic automaton.
Reviewed By: jvillard
Differential Revision: D15668471
fbshipit-source-id: 050408283
Summary:
- take advantage more structured attributes in the exported AST
- circumvent new format of `if` and `switch`
- a few new features/nodes but nothing major there
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz, martintrojer
Differential Revision: D15453572
fbshipit-source-id: c0c24345f
Summary:
This messes with the deduplication heuristic when templated function
names show up in the error messages, since the heuristic demands that
the error messages are the same.
Reviewed By: mbouaziz
Differential Revision: D15374333
fbshipit-source-id: 70232d254
Summary:
Improve the error messages, change is more or less documented in the
code.
Reviewed By: mbouaziz
Differential Revision: D15374334
fbshipit-source-id: f1dd54180
Summary:
Before: the trace would explain how a value was invalidated and
accessed, but not how the value that was invalidated had been
constructed.
Now: `PulseTrace.t` records breadcrumbs of how the value was constructed
in addition to the interproc "action" trace leading to the invalidation
or access action.
Concretely:
```
void bad(X &x) {
X *y = x;
X *z = x;
delete y;
access(z);
}
```
will produce the trace:
Invalidation part:
y = x
delete y
Access part:
z = x
access(z)
access to z->f inside of access(z)
Before this diff the "Access part" would be missing the "z = x" part of
the trace, so it might be confusing why `z` has anything to do with `y`.
However, such "breadcrumbs" are not recorded in the inter-procedural
part, only the sequence of calls is. This is a trade-off for simplicity,
maybe it's enough for developers maybe it isn't, we'll find out later.
Reviewed By: jberdine
Differential Revision: D15354438
fbshipit-source-id: 8d0aed717
Summary:
Feedback from peterogithub:
- mention which access path is being invalidated and accessed in the message
- mention the line at which it was invalidated (the line at which it's accessed is already the line at which we report)
- traces for stack variable/C++ temporary address escapes
- delete double implementation of the same functionality in
`PulseTrace`: `location_of_action_start` is the same as
`outer_location_of_action`...
Reviewed By: jberdine
Differential Revision: D14800294
fbshipit-source-id: 3d9ab9b3d
Summary:
Similarly to function parameters (and the return value), we need to
apply the pre/post of a function call to the globals mentioned in its
summary.
- tigthen summaries further to remember only abducible variables in the
post (as well as in the pre)
- take globals into account when applying pre/post pairs
Reviewed By: jberdine
Differential Revision: D14780800
fbshipit-source-id: fc0d180bb
Summary:
The heuristic to detect variables going out of scope was to detect any
access expression passed as argument to an injected destructor call.
However destructor calls are also injected in destructor bodies to
destruct each field of an object, so the heuristic would detect fields
going out of scope, which, erm, doesn't make sense. Limit the heuristic
to local program variables.
Reviewed By: jberdine
Differential Revision: D14771454
fbshipit-source-id: ffa3c9fe3
Summary:
Only throw values to the pre if they can be followed from "abducible"
variables: formals of the current method and globals.
Because figuring out if a `Pvar.t` is a formal of the current procedure
is actually a giant pain, hack something not too bad instead:
pre-register all formals at the start of the analysis of the
procedure. Then the only other variables we care about in the
precondition are globals, which we can detect easily.
This is mostly an optimisation (summaries won't include irrelevant
"abduced" facts about the procedure's local variables anymore), but it
also fixes a bug where we would sometimes overwrite things in the pre. I
think that's why the tests improved.
Reviewed By: ngorogiannis
Differential Revision: D14753493
fbshipit-source-id: 08e73637f
Summary:
This mostly doesn't make sense. The only thing this would have been good
for was to give the most accurate result on access paths such as
`*(&(x.f))`, but these are normalised anyway (into `x.f`) so we actually
never see these. That said there might be some use to some similar logic
in the future, but in the meantime let's delete the current feature as
it wasn't thought through.
Reviewed By: ezgicicek
Differential Revision: D14753492
fbshipit-source-id: 597cec027
Summary:
The previous message formatting had regressed and produced non-sensical messages.
More importantly, remove template parameters from error messages to
trigger the heuristic in `InferPrint` that deduplicates errors that are
on the same line with the same error type and message. Without this we
get hundreds of reports that correspond to as many instantiations of the
same code.
Reviewed By: ngorogiannis
Differential Revision: D14747979
fbshipit-source-id: 3c4aad2b1
Summary:
We see the magic function `__variable_initialization` at the point where
the variable is declared, eg `int x = foo()`. It's safe to reset `&x` at
that point. This circumvents an issue that pops up in some rare cases
where the ternary conditional operator `?:` and variable initialization
conspire to produce weird frontend results.
Some test becomes a FN again, but I think it was being reported for the
wrong reasons; will investigate more later.
Reviewed By: ngorogiannis
Differential Revision: D14747980
fbshipit-source-id: e75d6e30f
Summary:
This ensures that each attribute type can only be present once per
address. Makes ~80x time improvement on pathological cases such as
Duff's device.
This introduces a new kind of Set in `PrettyPrintable`.
Reviewed By: mbouaziz
Differential Revision: D14645091
fbshipit-source-id: c7f9b760c
Summary:
Detect when a variable goes out of scope. When that's the case, mark its
address *and* its contents as invalid.
Give subsequent uses a USE_AFTER_LIFETIME error type instead of
USE_AFTER_DESTRUCTOR.
Reviewed By: jberdine
Differential Revision: D14387147
fbshipit-source-id: a2c530fda
Summary:
Instead of emitting an ad-hoc builtin on variable declaration emit a new
metadata instruction. This allows us to remove the code matching on that
ad-hoc builtin that had to be inserted in several checkers.
Inferbo & pulse used that information meaningfully and had to undergo
some minor changes to cope with the new metada instruction.
Reviewed By: ezgicicek
Differential Revision: D14833100
fbshipit-source-id: 9b3009d22
Summary:
Bundle all non-semantic-bearing instructions into a `Metadata _`
instruction in SIL.
- On a documentation level this makes clearer the distinction between
instructions that encode the semantics of the program and those that are
just hints for the various backend analysis.
- This makes it easier to add more of these auxiliary instructions in
the future. For example, the next diff introduces a new `Skip` auxiliary
instruction to replace the hacky `ExitScope([], Location.dummy)`.
- It also makes it easier to surface all current and future such
auxiliary instructions to HIL as the datatype for these syntactic hints
can be shared between SIL and HIL. This diff brings `Nullify` and
`Abstract` to HIL for free.
Reviewed By: ngorogiannis
Differential Revision: D14827674
fbshipit-source-id: f68fe2110
Summary:
This diff accumulates LatestPrune in sequential prunings. It should be sound since Inferbo invalidates some data of LatestPrune if they are updated.
Depends on D14321534
Reviewed By: mbouaziz
Differential Revision: D14321575
fbshipit-source-id: 233dbae32
Summary:
Some of these tests were wrong, eg `~lambda()` calls `lambda()` then...
takes the bitwise complement or something? The intent was to call the
destructor.
Add interprocedural tests for later.
Reviewed By: jberdine
Differential Revision: D14324762
fbshipit-source-id: 40d2c32f5
Summary:
Previously we would say that `lhs <= rhs` (or `lhs |- rhs`) when a
mapping existed between the abstract addresses of `lhs` and `rhs` such
that `mapping(lhs)` was a supergraph of `rhs`. In particular,
we had that `x |-> x' * x' |-> x'' |- x |-> x'`. This is not entirely
great, in particular once we get pairs of state representing footprint +
current state. I'm not sure I have an extremely compelling argument why
though, except that it's not the usual way we do implication in SL, but
there wasn't a compelling argument for the previous state of affairs
either.
This changes `|-` to be true only when `mapping(lhs) = rhs` (modulo only
considering the addresses reachable from the stack variables).
Reviewed By: jberdine
Differential Revision: D14568272
fbshipit-source-id: 1bb83950e
Summary: This helps convergence when `<=` is based on physical equality for example, and widening is implemented as `widen ~prev ~next = join prev next`.
Reviewed By: skcho
Differential Revision: D14568270
fbshipit-source-id: ded5ed296
Summary:
Re-declarations of global variables sometimes hide constant
initializations in the original declaration, which caused FN before.
In this diff, it translates global variables to point to original
declarations, rather than following re-declarations, if possible.
Reviewed By: mbouaziz, jvillard
Differential Revision: D14596301
fbshipit-source-id: 55c3b5f95
Summary: In SIL, sometimes a return value is assigned to `__return_param`.
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D14538590
fbshipit-source-id: dfbb74dc2
Summary:
To meet the pure parts of formulas, the process was to (a) call Rename.extend
with variables occuring in similar places and (b) extract substitutions out of
those. Two matching primed vars would both be replaced by some fresh primed var.
However, equivalence classes of primed variables would *not* be replaced by
one fresh (primed) variable. Now, that should work.
Reviewed By: mbouaziz
Differential Revision: D14150192
fbshipit-source-id: 90ca9216c
Summary:
This will be used in the future to determine what to do with destructors
in pulse.
Reviewed By: mbouaziz
Differential Revision: D14324759
fbshipit-source-id: bc3c34471
Summary:
When joining two lists of disjuncts we try to ensure there isn't a state
that under-approximates another already in the list. This helps reduce
the number of disjuncts that are generated by conditionals and loops.
Before we would always just add more disjuncts unless they were
physically equal but now we do a subgraph computation to assess
under-approximation.
We only do this half-heartedly for now however, only taking into
consideration the "new" disjuncts vs the "old" ones. It probably makes
sense to do a full quadratic search to minimise the number of disjuncts
from time to time but this isn't done here.
Reviewed By: mbouaziz
Differential Revision: D14258482
fbshipit-source-id: c2dad4889
Summary:
The disjunctive domain shouldn't really be a set in the first place as
comparing abstract states for equality is expensive to do naively
(walking the whole maps representing the abstract heap). Moreover in
practice these sets have a small max size (currently 50 for pulse, the
only client), so switching them to plain lists makes sense.
Reviewed By: mbouaziz
Differential Revision: D14258489
fbshipit-source-id: c512169eb
Summary:
It's useful to keep the size of states down, especially when humans are
trying to read it. It will also help keep the size of summaries down in
the inter-procedural pulse.
Reviewed By: mbouaziz
Differential Revision: D14258486
fbshipit-source-id: 45ebcac67
Summary: After a redeclaration of a global constant, it is not parsed as ICE(integral constant expression), which results in FN.
Reviewed By: ezgicicek
Differential Revision: D14299288
fbshipit-source-id: 394afd595
Summary:
Add an option to specify some classes that we really want to warn about
with the liveness checker, even when they appear used because of the
implicit destructor call inserted by the compiler.
Reviewed By: mbouaziz
Differential Revision: D13991129
fbshipit-source-id: 7fafdba84
Summary:
The purpose these serve is unclear to me. From the comment I *think*
they were used to hint to the biabduction backend that smart pointers
are just pointers. That said, The tests still mostly pass even without
that (just a few `weak_ptr` tests changed from `NULL_DEREFERENCE` to
`Bad_footprint`).
Moreover, this extra dereference was added unreliably. For instance,
this piece of code:
```
auto x = std::make_unique<X>(some_X);
```
would either get the extra dereference or not depending on which headers
were picked for the C++ stdlib.
The extra dereference was tripping up the liveness checker (see later in
the stack), and probably most checkers too.
Reviewed By: mbouaziz
Differential Revision: D13991130
fbshipit-source-id: 462923595
Summary: When a `VarDecl` has the attribute `unused` then do not assign its initialisation result to the corresponding variable.
Reviewed By: ngorogiannis
Differential Revision: D13974497
fbshipit-source-id: 28029f995
Summary:
This diff updates the reachability conditions of proof obligations at every function calls.
Depends on D13781124
Reviewed By: mbouaziz
Differential Revision: D13781147
fbshipit-source-id: 3c8768bd9
Summary: We do not want to export unnecessary information from clang plugin, hence, the solution to this false positive would be to annotate with `__unused__` attribute.
Reviewed By: jvillard
Differential Revision: D13861333
fbshipit-source-id: 774009f37
Summary:
This diff extends the abstract domain to keep binary conditions on
prunings, so Inferbo can suppress more proof obligations (i.e., false
positives) that are known to be unreachable according to the binary
conditions.
Depends on D13729600
Reviewed By: mbouaziz
Differential Revision: D13749914
fbshipit-source-id: 314f048f1
Summary:
It suppresses intended integer overflows that generate hash values or random numbers. For judging that the intention of integer overflow, it applies a heuristics: checking if traces of issues include a whitelisted words, e.g., "rand" or "seed".
While we would be able to suppress all integer overflows of unsigned integers since they have defined behaviors, we don't want to miss unintended integer overflows, e.g., that on unsigned index value.
Depends on D13595958
Reviewed By: mbouaziz
Differential Revision: D13595967
fbshipit-source-id: 8d3aca5a7
Summary:
Record per-location traces. Actually, that doesn't quite make sense as a
location can be accessed in many ways, so associate a trace to each
*edge* in the memory graph. For instance, when doing `x->f = *y`, we
want to take the history of the `<val of y> --*--> ..` edge, add "assigned
at location blah" to it and store this extended history to the edge
`<val of x> --f--> ..`.
Use this machinery to print nicer traces in `infer explore` and better
error messages too (include the last assignment, like biabduction
messages).
Reviewed By: da319
Differential Revision: D13518668
fbshipit-source-id: 0a62fb55f
Summary:
This diff substitutes the conditions of proof obligations strictly, so that the condition of "p!=Null" becomes bottom
when callee's p is Null.
In the non-strict substitution (which is used by default), if p's location is not found it returns the unknown location.
On the other hand, in the strict substitution (which is used only in the substitution of condition of proof obligation),
it returns bottom location.
Depends on D13415366, D13414636
Reviewed By: mbouaziz, jvillard
Differential Revision: D13415377
fbshipit-source-id: 5ae1e888e
Summary: Adding a test case for use after destructor for temporaries. At the moment pulse does not find it as frontend does not inject destructors for temporaries.
Reviewed By: jvillard
Differential Revision: D13506229
fbshipit-source-id: 31b9466f7
Summary:
When a C++ temporary goes out of scope, tag its address in the heap with
a new attribute `AddressOfCppTemporary` so that we can later check that
we don't return it.
Reviewed By: da319
Differential Revision: D13466898
fbshipit-source-id: 8808338b4
Summary:
When assign to the special `return` variable, check that the result is
not the address of a local variable, otherwise report.
Reviewed By: ngorogiannis
Differential Revision: D13466896
fbshipit-source-id: 465da7f13
Summary: Naming it `FP_` was a mistake in the original commit that copied the tests over as pulse has never reported on that method.
Reviewed By: da319
Differential Revision: D13465324
fbshipit-source-id: f8b24ebda
Summary:
In order to avoid FPs due to lack of relational info, we apply a heuristic: proof obligations has a latest pruned values,
then it is instantiated at Call statements. If there is a bottom value in the instantiated pruned values, we can say the
program point where the proof obligation is introduced is unreachable with the given parameters of the function.
Depends on D13414441
Reviewed By: mbouaziz
Differential Revision: D13414483
fbshipit-source-id: 61bd34ebb
Summary:
It's ok to take an address of a field / array access of an invalid object.
This diff calculates the inner most dereference for an access expression starting with `&` and does not report on the dereference even if the address is invalid.
Reviewed By: jvillard
Differential Revision: D13450758
fbshipit-source-id: 18c038701