Summary:
Change the logic of the annotation reachability checker in the following
ways:
1. Sanitizers take priority over sinks, i.e. a procedure that is both a
sink and a sanitizer is not a sink. This changes the existing tests
that seemed to assume the opposite. However I think that way is more
useful and goes better with the fact that sanitizers are specified as
"overrides".
2. When applying a summary, check again that we are not in a sanitizer
for the corresponding sink.
Without (2) this there was a subtle bug when several rules were
specified. For example, if `sink_wrapper()` wraps `sink()` for a rule
`R` then the summary of `sink_wrapper()` will be: `R-sink : call to sink()`.
Then, suppose `sanitizer()` calls `sink_wrapper()` and `sanitizer()` is
a sanitizer for `R` but not for another rule `R'`. The previous code
would add the call to `sink()` to the summary of `sanitizer()` because
it's not a sanitizer for `R'`, even though `sink()` is not a sink for
`R'`!
The current code will re-apply the rules correctly so that sinks are
matched only against the right sanitizers.
Reviewed By: skcho
Differential Revision: D16895577
fbshipit-source-id: 266cc4940
Summary:
- run the tests! they weren't hooked up to the main Makefile :/
- add some html debug messages
- formatting
Reviewed By: skcho
Differential Revision: D16895578
fbshipit-source-id: e96d737cc
Summary:
Since it is non-sense to get ranges of boolean values, this diff
ignores control values that only contain boolean symbols.
Depends on D16804802
Reviewed By: ezgicicek
Differential Revision: D16804808
fbshipit-source-id: ccb25db4d
Summary: Functions with empty body have unit cost, not zero. The unit cost comes from the start node.
Reviewed By: skcho
Differential Revision: D16855642
fbshipit-source-id: 6b5181faf
Summary:
Before this diff it returned `[0,size-1]`, but which became bottom
when size was given by 0. As a result, it made the both branches of
`if(iterator.hasNext())` unreachable. Similarly, if the size was 1,
it only visited the false branch of the if condition because the
condition value was `[0,0]` at that time.
This diff changes it to return `[0,size]`, so that
* the false branch is reachable when the size is 0
* the both branches are reachable when the size is 1
Reviewed By: ezgicicek
Differential Revision: D16803000
fbshipit-source-id: f8772be27
Summary: We want to keep big O notation as simple as possible in cost analysis reports (especially in diff time). Therefore, let's not show constants/min/max in big O notations even though the resulting asymptotic bound might be inaccurate. Developers can click on the trace and see the actual cost.
Reviewed By: skcho
Differential Revision: D16731351
fbshipit-source-id: 2e16f7eca
Summary: In order to test changes to bigO notation, let's record them in test results.
Reviewed By: skcho
Differential Revision: D16763972
fbshipit-source-id: c1376909b
Summary:
Correct the models of ArrayList initialization. Basically, there are two ways to initialize:
- by setting an initial capacity, which creates an empty list
- by passing another collection as an argument
Before, we had only modeled the second case which was resulting in wrong memory model for the first case. This diff fixes that.
Reviewed By: skcho
Differential Revision: D16711055
fbshipit-source-id: e82faf191
Summary:
It adds a vector model of `data` method.
Depends on D16687280
Reviewed By: ezgicicek
Differential Revision: D16689400
fbshipit-source-id: 156016b3c
Summary:
It adds a model of vector::push_back
Depends on D16687225
Reviewed By: ezgicicek
Differential Revision: D16687269
fbshipit-source-id: 9d2a73fca
Summary:
It enables pruning of vector's size when the return value of the function call of `vector::size` is pruned.
Depends on D16687167
Reviewed By: ezgicicek
Differential Revision: D16687225
fbshipit-source-id: 793a21b3a
Summary:
It generates vector value ondemand when it is given as a parameter.
Depends on D16645589
Reviewed By: ezgicicek
Differential Revision: D16645624
fbshipit-source-id: 7498c8ab2
Summary:
These have proved to be too fragile to maintain as they would often break
compilation of user code. They have been off by default for more than a year
now (D7350715).
Removing the include models shows a more accurate picture of what infer results
look like in production. As such, lots of tests have changed, mostly
biabduction but also in inferbo. SIOF was using include-based models too but
now libc++ is better and iostreams are implemented in a way that SIOF
understands (instead of being magical creatures) so nothing changed there.
Reviewed By: skcho
Differential Revision: D16602171
fbshipit-source-id: ce38f045b
Summary: Models of Java's Collection mistakenly assumed that there was an argument for empty set whereas `Collections.emptySet()` doesn't have any actuals. This diff fixes that an also removes the type argument from the corresponding model definition.
Reviewed By: skcho
Differential Revision: D16582314
fbshipit-source-id: d4304dc60
Summary: Sometimes programmers use integer underflow to get a maximum number of that type. This diff assumes that integer underflows from the syntactical form `(unsigned 0) - constant` is intended by the programmer, and suppresses the alarms of which.
Reviewed By: ezgicicek
Differential Revision: D16560639
fbshipit-source-id: 206f30dbc
Summary:
Before this diff, it gave up pruning of linear bound by minmax bound.
For example, `overapprox_min (x+c1, c2+min(d1,y))` was `x+c1`.
However, we can get a bit more preciser value as follows.
```
overapprox_min (x+c1, c2+min(d1,y))
<= min (x+c1, c2+d1)
= c1+min(c2+d1-c1, x)
```
Reviewed By: ezgicicek
Differential Revision: D16543837
fbshipit-source-id: 8fdbce097
Summary:
This diff prevents that the latest prune value is overwritten as top
from callees.
Reviewed By: jvillard
Differential Revision: D16540391
fbshipit-source-id: bdd5b42ed
Summary:
This diff improves the precision of the mod operator.
For example, result of x % c (when x>=0 and c>0) is
(before) [0, c-1]
(after) [0, min(c-1,x)]
Reviewed By: ezgicicek
Differential Revision: D16518578
fbshipit-source-id: a68660ee7
Summary: This diff tries to do weak update for the abstract locations by pointer arithmetic, e.g. `p[n]` or `p+n`, even if the type of `p` is declared as a simple pointer, not an array.
Reviewed By: ezgicicek
Differential Revision: D16458367
fbshipit-source-id: 3b4cdd7e4
Summary:
The `represents_multiple_values` flag was adopted to decide whether updating abstract value strongly or weakly. However, the flag was included in the `Val` domain, which is strange, because it is a property of abstract locations, rather than abstract values. This makes the behavior of memory update function depend on the abstract value to update, making its code complicated.
This diff detach the `represents_multiple_values` flag from the `Value` domain, thus the memory update does not depend on the abstract value. Since this is a refactoring, I believe the diff should not make many semantic changes.
Reviewed By: ezgicicek
Differential Revision: D16441734
fbshipit-source-id: 4c10779d7
Summary:
Pulse didn't treat local variables going out of scope as invalidating the corresponding address in memory. This diff fixes that by
- marking all local variables that exits the scope with the attribute `AddressOfStackVariable`
- before we write the summary for the proc, we make sure to invalidate all such addresses local to the procedure as `Invalid.` If such an address is read, then we would raise a use-after-lifetime issue.
Reviewed By: jvillard
Differential Revision: D16458355
fbshipit-source-id: 3686524cb
Summary:
It downgrades issues of void pointer to L5, because of its impreciseness. This is not
ideal but Inferbo cannot analyze arrays pointed by void pointers precisely at the moment.
Reviewed By: jvillard
Differential Revision: D16379911
fbshipit-source-id: f2c016aba
Summary: The method defined in the interface didn't match the implementation. Caught by ulyssesr.
Differential Revision: D16339179
fbshipit-source-id: 9cbb1dc74
Summary:
A common gotcha is the new test. Model the minimum amount of
`std::basic_string` to catch it.
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121090
fbshipit-source-id: 66f06cb43
Summary:
Be more flexible in what type of function calls are allowed in `ViaCall ...` actions to be able to include models.
Also get rid of `here here` in traces /o\
As a side-effect, get more precise (=qualified) procedure names in
traces (but not in messages so as not to be too verbose).
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121092
fbshipit-source-id: fb51b02f8
Summary:
The domain supported path sensitivity wrt to a specific boolean guard `Branch.unlikely`. This isn't used in actual code, so remove it.
Also
- add an .mli to the domain;
- unabbreviate domain name to match analyser name;
- use Payload.read instead of calling Ondemand directly;
- adjust tests.
Reviewed By: mbouaziz
Differential Revision: D16203953
fbshipit-source-id: 743aa4400
Summary:
Move annotation reachability tests to their own directory.
Clean up and complete the tests.
Reviewed By: jvillard
Differential Revision: D16201387
fbshipit-source-id: 8a87a25b7
Summary:
Treat `MainThread` and `WorkerThread` annotations.
Fix wrong test (`AnyThread` cannot call a UI-only method, because it can be called by ANY thread ;) ) See https://developer.android.com/reference/android/support/annotation/AnyThread
Clean up the code a bit.
Reviewed By: jvillard
Differential Revision: D16183798
fbshipit-source-id: 6b7e3b27e
Summary: There were FNs caused by only looking for the immediate predecessors when we were checking the pattern. This diff heuristically chases 4 more predecessors to reduce the number of FNs.
Reviewed By: ngorogiannis
Differential Revision: D16149983
fbshipit-source-id: f65c57a0a
Summary: Adding typechecks to prevent potential FPs like the added test
Reviewed By: ngorogiannis
Differential Revision: D16149511
fbshipit-source-id: 6d3fe0ad4
Summary:
Replaced by pulse. `--ownership` is now a deprecated form of `--pulse`.
The ownership checker is starting to give wrong answers due to changes in the
clang frontend, so it's better to remove it in favour of pulse.
there_goes_my_hero
Reviewed By: ngorogiannis
Differential Revision: D16107650
fbshipit-source-id: bb2446a19
Summary:
So it turns out we need to translate even more cases. Pulse had a FP
before that this fixes.
Reviewed By: ezgicicek
Differential Revision: D16073629
fbshipit-source-id: c03460b5a
Summary:
This is needed to test some functionality in the next diff. Only one
test changes (no longer a FN), which is now documented. Also, stop
including the "header models" meant for biabduction!
Maybe one day we'll need to have several test modes for different C++
versions. Seems overkill for now, so let's wait until we see some actual
issues (eg FPs) that manifest in one version but not the other.
Reviewed By: mbouaziz
Differential Revision: D16073630
fbshipit-source-id: 1cfdfc933
Summary:
Sometimes the post of a function call has attributes on addresses that
were mentioned in the pre but are no longer reachable in the post. We
don't want to forget these, see added test.
Reviewed By: mbouaziz
Differential Revision: D16050050
fbshipit-source-id: 1ce522b97
Summary:
The previous code would call the destructor for the C++ temporary
*before* the prune nodes, which then try to dereference it. Wrong.
Quick fix: don't destroy temporaries in conditionals.
Reviewed By: mbouaziz
Differential Revision: D16030735
fbshipit-source-id: e11abad58
Summary:
We were skipping some instructions before and that was a problem for
pulse. See added pulse test.
Reviewed By: mbouaziz
Differential Revision: D16030150
fbshipit-source-id: 9c62e6213
Summary: Not sure if anyone uses this but there, now it's modelled.
Reviewed By: mbouaziz
Differential Revision: D16008162
fbshipit-source-id: f4795dcba
Summary:
Prevent false positives about variables captured by value gone out of
scope.
Reviewed By: ezgicicek
Differential Revision: D16008165
fbshipit-source-id: d70e47db4
Summary: We know how to do interprocedural calls so let's use that!
Reviewed By: mbouaziz
Differential Revision: D16008164
fbshipit-source-id: 4c34bf704
Summary:
`function::operator=` is called whenever we assign a literal lambda to a
variable, so it's pretty useful to be able to report anything on
lambdas.
Reviewed By: mbouaziz
Differential Revision: D16008163
fbshipit-source-id: a9d07668d
Summary:
Printing `Exp.Const (Cfun proc_name)` adds `_fun_` in front of the
procedure name, eg `_fun_foo` instead of `foo`. This showed up in pulse
traces.
Reviewed By: mbouaziz
Differential Revision: D16004606
fbshipit-source-id: 72ac6866f
Summary:
Fixes a false positive where the address of a C++ temporary is bound to
a static const reference variable then returned. The fix doesn't try to
establish that the variable is a const reference so could lead to false
negatives but that can be addressed later.
Reviewed By: ezgicicek
Differential Revision: D16004538
fbshipit-source-id: e403dbefe
Summary:
[apologies for the unreviewable diff...]
Get rid of HIL expressions in pulse. This finishes the HIL -> SIL
migration. The first step made pulse start from SIL instructions but
would translate most accesses to HIL to re-use most of the existing
pulse code. This diff gets rid of the intermediate translation of SIL
expressions to HIL expressions.
Big changes:
1. `PulseOperations` mostly rewritten, driven by using `Exp.t` instead of `HilExp.AccessExpression.t` for everything.
2. Stop trying to reverse-engineer what addresses mean in terms of
access paths from program variables. Rely on the trace pointing at
the right places in the code to be enough. This is because it wasn't
that useful (and could even be misleading when wrong) but could be
prohibitively expensive in degenerate cases (eg nodes with tens of
thousands of successive array accesses...)
3. `PulseAbductiveDomain.apply_post` now returns the computed return
value instead of recording it itself.
4. Change of vocabulary: `materialize` -> `eval`, `crumb` -> `event`
5. Function calls arguments are now evaluated prior to doing anything
else, which saves everything else from having to (remember to) do
that. In particular, this changes how models look quite a bit.
Reviewed By: mbouaziz
Differential Revision: D15986373
fbshipit-source-id: 1d79935de
Summary: Inject destructor calls to destroy a temporary when its lifetime ends.
Reviewed By: mbouaziz
Differential Revision: D15674209
fbshipit-source-id: 0f783a906
Summary:
Now that HIL doesn't help us anymore we need to reconstruct its mapping
"SIL logical var -> program access path". We already have everything we
need in pulse: it suffices to walk the current memory graph starting
from program variables until we find the value of the temporary we are
interested in.
This diff also builds some type machinery to make sure all accesses are
explained.
Reviewed By: mbouaziz
Differential Revision: D15824959
fbshipit-source-id: 722c81b39
Summary:
It turns out HIL gets in the way of a precise heap analysis. For
instance, instead of:
```
n$0 = *&x.f
_ = delete(&x)
*&y = n$0
```
HIL tries hard to forget about intermediate variables and shows instead
```
_ = delete(&x)
*&y = *&x.f
```
Oops, that's a use-after-delete, whereas the original code was safe.
While it's easy to write SIL programs that are completely unsound for
HIL, they are not generated very often from the frontends. In fact, the
problem became apparent only when making the clang frontend translate
C++ temporaries destructors, which produces the situation above
routinely.
This diff makes the minimal amount of change to make Pulse build and
produce equivalent results (minus HIL bugs) starting from SIL instead of
HIL. The reporting sucks for now because we need to translate SIL
temporaries back into program access paths. This is done in the next
diff.
Reviewed By: mbouaziz
Differential Revision: D15824961
fbshipit-source-id: 8e4e2a3ed
Summary:
- Add allocation costs to `costs-report.json` and enable diffing over allocation costs.
- Also, let's be more consistent and modular in naming our cost issues.
- introduce a generic issue type `X_TIME_COMPLEXITY_INCREASE` where `X` can be one of the cost kinds. If the function is on the cold start, issue can have the `COLD_START` suffix. Similarly for infinite/zero/expensive calls.
- Change `PERFORMANCE_VARIATION` -> `EXECUTION_TIME_COMPLEXITY_INCREASE`
- Add new issue type for `ALLOCATION_COMPLEXITY_INCREASE_COLD_START` which will be enabled by default
- Refactor cost issues to be more modular and succinct. This also makes addition of a new cost kind very easy by adding the kind into the `enabled_cost_kinds` list in `CostKind.ml`
Reviewed By: mbouaziz
Differential Revision: D15822681
fbshipit-source-id: cf89ece59
Summary:
This one isn't caught because we don't destruct temporaries that are
bound to a const reference. According to the C++ standard these should
get destroyed when the const reference gets destroyed but instead we
just don't destroy them for now.
Reviewed By: mbouaziz
Differential Revision: D15760209
fbshipit-source-id: 32c935ec0
Summary:
In a next diff temporaries will get destructed at the end of their
lifetimes and that naive model would be causing false positives.
The flipside is that we lose all reports on closures for now, will need
to model them separately later.
Reviewed By: mbouaziz
Differential Revision: D15695943
fbshipit-source-id: c2c482c02
Summary:
This started as an attempt to understand how to modify the frontend to
inject destructors for C++ temporaries (see next diffs).
This diff rewrites the existing logic for computing the list of
variables that should be destroyed at the end of each statement, either
because it's the end of their syntactic scope or because control flow
branches outside of their syntactic scope.
The frontend translates a function from the last instructions to the
first, but scope computation needs to be done in the other direction, so
it's done in a separate pass *before* the main translation happens. That
first pass creates a map from statements in the AST to the list of
variables that should be destroyed at the end of these statements. This
is still the case now.
Before, that map would be computed in a bit of a weird way: scopes are
naturally a stack but instead of that the structure maintained was a
flat list + a counter to know where the current scope ended in that
list.
In this diff, redo the computation maintaining a stack of scopes
instead, which is a bit cleaner. Also treat more instructions as
introducing a new scope, eg if, for, ...
Reviewed By: mbouaziz
Differential Revision: D15674208
fbshipit-source-id: c92429e82
Summary:
Somewhat trivial: add a string to "Destruction" nodes to indicate why
they were created. Rename the main `instruction_aux` function into
`instruction_translate` (see next diff for why).
Reviewed By: mbouaziz
Differential Revision: D15674211
fbshipit-source-id: 8a7eda72c
Summary:
I rewrote the test so it doesn't need any C++ headers so that:
- it's easier to see what's going on
- it's easier to debug: the whole AST is now somewhat readable vs before
the headers made it impossibly long
Reviewed By: ezgicicek
Differential Revision: D15674213
fbshipit-source-id: d98941983
Summary:
This is a simple checker that identifies inefficient uses of `keySet` iterator where (not only the key but also) the value is accessed via `get(key)`. It is more efficient to use `entrySet` iterator which already returns both key-value pairs. This optimization would get rid of many extra lookups which can be expensive.
We simply traverse the CFG starting from the loop head upwards and pick up the map that is iterated over. Then, we check in the loop nodes if there is a call to `get(...)` over this map. If, so we report.
Reviewed By: ngorogiannis
Differential Revision: D15737779
fbshipit-source-id: 702465b4e
Summary:
The synthetic methods from `topl.Property` are now nonempty: they
simulate a nondeterministic automaton.
Reviewed By: jvillard
Differential Revision: D15668471
fbshipit-source-id: 050408283
Summary:
Instrument SIL according to TOPL properties. Roughly, the
instrumentation is a set of calls into procedures that simulate a
nondeterministic automaton. For now, those procedures are NOP dummies.
Reviewed By: jvillard
Differential Revision: D15063942
fbshipit-source-id: d22c2f6fa
Summary:
It is unsafe to call protocol methods defined optional. Before calling them we should check it
the implementation exists by calling
`if ([object respondsToSelector:selector(...)]) ...`
Without the above check we get run time crashes.
Reviewed By: jvillard
Differential Revision: D15554951
fbshipit-source-id: f0560971b
Summary: In its new form it actually tests that infer takes the correct branch.
Reviewed By: mbouaziz
Differential Revision: D15494297
fbshipit-source-id: 7b9bb8f75
Summary:
- take advantage more structured attributes in the exported AST
- circumvent new format of `if` and `switch`
- a few new features/nodes but nothing major there
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz, martintrojer
Differential Revision: D15453572
fbshipit-source-id: c0c24345f
Summary:
- Rename `invariantModels` to `purityModels`
- Track which arguments are modified in purity models. Before we were invalidating all arguments of impure modeled functions. Instead, now we only invalidate modified args given in the model. This should ideally result in more precision in the analysis.
- Add some more purity models for :`cast`, `new`, `new_array` and `Math.random`
Reviewed By: mbouaziz
Differential Revision: D15535332
fbshipit-source-id: 5395800d9
Summary:
That test wasn't hooked up to `make test` and so regressed at some
unknown time in the past. Just recording the new state of things for
now.
Reviewed By: ngorogiannis
Differential Revision: D15495234
fbshipit-source-id: 14fb112de
Summary:
Thanks to the newly added `StarField`, path length is better controlled before ondemand is used.
Hence there is no need to (unsoundly) canonicalize paths then anymore.
Reviewed By: ezgicicek
Differential Revision: D15409716
fbshipit-source-id: 9ea7b4717
Summary:
This messes with the deduplication heuristic when templated function
names show up in the error messages, since the heuristic demands that
the error messages are the same.
Reviewed By: mbouaziz
Differential Revision: D15374333
fbshipit-source-id: 70232d254
Summary:
Improve the error messages, change is more or less documented in the
code.
Reviewed By: mbouaziz
Differential Revision: D15374334
fbshipit-source-id: f1dd54180
Summary:
Some edge case involving casting field pointers to the structure type itself generated arbitrarily long paths when used in a loop.
Without changing the widening, this diff avoids repetitions of fields in paths by abstracting them with a star.
E.g. `x.a.b.c.b` will become `x.a.b.c*.b`, and so will `x.a.b.c.a.b`, `x.a.b.c.c.b`, or `x.a.b.c.b.b`.
Reviewed By: ngorogiannis
Differential Revision: D15352143
fbshipit-source-id: 5ea426c5e
Summary:
Before: the trace would explain how a value was invalidated and
accessed, but not how the value that was invalidated had been
constructed.
Now: `PulseTrace.t` records breadcrumbs of how the value was constructed
in addition to the interproc "action" trace leading to the invalidation
or access action.
Concretely:
```
void bad(X &x) {
X *y = x;
X *z = x;
delete y;
access(z);
}
```
will produce the trace:
Invalidation part:
y = x
delete y
Access part:
z = x
access(z)
access to z->f inside of access(z)
Before this diff the "Access part" would be missing the "z = x" part of
the trace, so it might be confusing why `z` has anything to do with `y`.
However, such "breadcrumbs" are not recorded in the inter-procedural
part, only the sequence of calls is. This is a trade-off for simplicity,
maybe it's enough for developers maybe it isn't, we'll find out later.
Reviewed By: jberdine
Differential Revision: D15354438
fbshipit-source-id: 8d0aed717
Summary:
update-submodule: facebook-clang-plugins
We used to translate `offsetof` by an unknown value.
This fixes it. It is now translated like an integer literal.
Reviewed By: ddino
Differential Revision: D15317799
fbshipit-source-id: ae89e0ec5
Summary:
Feedback from peterogithub:
- mention which access path is being invalidated and accessed in the message
- mention the line at which it was invalidated (the line at which it's accessed is already the line at which we report)
- traces for stack variable/C++ temporary address escapes
- delete double implementation of the same functionality in
`PulseTrace`: `location_of_action_start` is the same as
`outer_location_of_action`...
Reviewed By: jberdine
Differential Revision: D14800294
fbshipit-source-id: 3d9ab9b3d
Summary:
Similarly to function parameters (and the return value), we need to
apply the pre/post of a function call to the globals mentioned in its
summary.
- tigthen summaries further to remember only abducible variables in the
post (as well as in the pre)
- take globals into account when applying pre/post pairs
Reviewed By: jberdine
Differential Revision: D14780800
fbshipit-source-id: fc0d180bb
Summary:
The heuristic to detect variables going out of scope was to detect any
access expression passed as argument to an injected destructor call.
However destructor calls are also injected in destructor bodies to
destruct each field of an object, so the heuristic would detect fields
going out of scope, which, erm, doesn't make sense. Limit the heuristic
to local program variables.
Reviewed By: jberdine
Differential Revision: D14771454
fbshipit-source-id: ffa3c9fe3
Summary:
Only throw values to the pre if they can be followed from "abducible"
variables: formals of the current method and globals.
Because figuring out if a `Pvar.t` is a formal of the current procedure
is actually a giant pain, hack something not too bad instead:
pre-register all formals at the start of the analysis of the
procedure. Then the only other variables we care about in the
precondition are globals, which we can detect easily.
This is mostly an optimisation (summaries won't include irrelevant
"abduced" facts about the procedure's local variables anymore), but it
also fixes a bug where we would sometimes overwrite things in the pre. I
think that's why the tests improved.
Reviewed By: ngorogiannis
Differential Revision: D14753493
fbshipit-source-id: 08e73637f
Summary:
This mostly doesn't make sense. The only thing this would have been good
for was to give the most accurate result on access paths such as
`*(&(x.f))`, but these are normalised anyway (into `x.f`) so we actually
never see these. That said there might be some use to some similar logic
in the future, but in the meantime let's delete the current feature as
it wasn't thought through.
Reviewed By: ezgicicek
Differential Revision: D14753492
fbshipit-source-id: 597cec027
Summary:
The previous message formatting had regressed and produced non-sensical messages.
More importantly, remove template parameters from error messages to
trigger the heuristic in `InferPrint` that deduplicates errors that are
on the same line with the same error type and message. Without this we
get hundreds of reports that correspond to as many instantiations of the
same code.
Reviewed By: ngorogiannis
Differential Revision: D14747979
fbshipit-source-id: 3c4aad2b1
Summary:
We see the magic function `__variable_initialization` at the point where
the variable is declared, eg `int x = foo()`. It's safe to reset `&x` at
that point. This circumvents an issue that pops up in some rare cases
where the ternary conditional operator `?:` and variable initialization
conspire to produce weird frontend results.
Some test becomes a FN again, but I think it was being reported for the
wrong reasons; will investigate more later.
Reviewed By: ngorogiannis
Differential Revision: D14747980
fbshipit-source-id: e75d6e30f
Summary:
This ensures that each attribute type can only be present once per
address. Makes ~80x time improvement on pathological cases such as
Duff's device.
This introduces a new kind of Set in `PrettyPrintable`.
Reviewed By: mbouaziz
Differential Revision: D14645091
fbshipit-source-id: c7f9b760c
Summary:
Detect when a variable goes out of scope. When that's the case, mark its
address *and* its contents as invalid.
Give subsequent uses a USE_AFTER_LIFETIME error type instead of
USE_AFTER_DESTRUCTOR.
Reviewed By: jberdine
Differential Revision: D14387147
fbshipit-source-id: a2c530fda
Summary:
Increases precision a bit. I didn't observe speed problems on what I tested. (But, who knows?)
Closes https://github.com/facebook/infer/pull/799
Reviewed By: jvillard
Differential Revision: D6284206
Pulled By: rgrig
fbshipit-source-id: 6f1e8631f
Summary:
Instead of emitting an ad-hoc builtin on variable declaration emit a new
metadata instruction. This allows us to remove the code matching on that
ad-hoc builtin that had to be inserted in several checkers.
Inferbo & pulse used that information meaningfully and had to undergo
some minor changes to cope with the new metada instruction.
Reviewed By: ezgicicek
Differential Revision: D14833100
fbshipit-source-id: 9b3009d22
Summary:
Bundle all non-semantic-bearing instructions into a `Metadata _`
instruction in SIL.
- On a documentation level this makes clearer the distinction between
instructions that encode the semantics of the program and those that are
just hints for the various backend analysis.
- This makes it easier to add more of these auxiliary instructions in
the future. For example, the next diff introduces a new `Skip` auxiliary
instruction to replace the hacky `ExitScope([], Location.dummy)`.
- It also makes it easier to surface all current and future such
auxiliary instructions to HIL as the datatype for these syntactic hints
can be shared between SIL and HIL. This diff brings `Nullify` and
`Abstract` to HIL for free.
Reviewed By: ngorogiannis
Differential Revision: D14827674
fbshipit-source-id: f68fe2110
Summary:
This diff propagates LatestPrune on function calls.
Depends on D14321605
Reviewed By: mbouaziz
Differential Revision: D14321618
fbshipit-source-id: cb2e1b547
Summary:
Given a pointer-typed parameter, Inferbo assumes that it is an array
block. However, when a pointer is given as an actual parameter, it
failed the substitution of the array block value of the parameter, thus
which made some return values to bottom unexpectedly.
This diff revises the substitution of array block, so it can
substitute array block values with actual pointers correctly when it
is possible.
Reviewed By: mbouaziz
Differential Revision: D14663475
fbshipit-source-id: 0477de1ba
Summary:
This diff accumulates LatestPrune in sequential prunings. It should be sound since Inferbo invalidates some data of LatestPrune if they are updated.
Depends on D14321534
Reviewed By: mbouaziz
Differential Revision: D14321575
fbshipit-source-id: 233dbae32
Summary:
Some of these tests were wrong, eg `~lambda()` calls `lambda()` then...
takes the bitwise complement or something? The intent was to call the
destructor.
Add interprocedural tests for later.
Reviewed By: jberdine
Differential Revision: D14324762
fbshipit-source-id: 40d2c32f5
Summary:
Previously we would say that `lhs <= rhs` (or `lhs |- rhs`) when a
mapping existed between the abstract addresses of `lhs` and `rhs` such
that `mapping(lhs)` was a supergraph of `rhs`. In particular,
we had that `x |-> x' * x' |-> x'' |- x |-> x'`. This is not entirely
great, in particular once we get pairs of state representing footprint +
current state. I'm not sure I have an extremely compelling argument why
though, except that it's not the usual way we do implication in SL, but
there wasn't a compelling argument for the previous state of affairs
either.
This changes `|-` to be true only when `mapping(lhs) = rhs` (modulo only
considering the addresses reachable from the stack variables).
Reviewed By: jberdine
Differential Revision: D14568272
fbshipit-source-id: 1bb83950e
Summary: This helps convergence when `<=` is based on physical equality for example, and widening is implemented as `widen ~prev ~next = join prev next`.
Reviewed By: skcho
Differential Revision: D14568270
fbshipit-source-id: ded5ed296
Summary:
Re-declarations of global variables sometimes hide constant
initializations in the original declaration, which caused FN before.
In this diff, it translates global variables to point to original
declarations, rather than following re-declarations, if possible.
Reviewed By: mbouaziz, jvillard
Differential Revision: D14596301
fbshipit-source-id: 55c3b5f95
Summary: In SIL, sometimes a return value is assigned to `__return_param`.
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D14538590
fbshipit-source-id: dfbb74dc2
Summary: This diff substitutes symbolic values for unknown functions in proof obligations to top. The goal of the diff is to avoid generating too many number of proof obligations that cannot be concretized.
Reviewed By: ezgicicek
Differential Revision: D14537542
fbshipit-source-id: 7f8f3bb4b
Summary:
TOPL properties are essentially automata, which specify a bad pattern.
This commit is just a parser for them.
Reviewed By: jvillard
Differential Revision: D14477671
fbshipit-source-id: c38a8ef37
Summary:
Add support for GuardedBy: we deviate from the spec as follows:
- No warnings issued for any access within a private method, unless that method is called from a public method and the lock isn't held when the access occurs.
- Warnings are suppressed with the general RacerD mechanism, ie `ThreadSafe(enableChecks=false)`
- GuardedBy warnings override thread-safety violation warnings on the same access, because GuardedBy has a clearer and simpler contract.
Also, some simplifications, cleanups and perf improvements (eg avoid unreportable procs at the top level as opposed to on each of their accesses).
Reviewed By: jeremydubreil
Differential Revision: D14506161
fbshipit-source-id: b7d794051
Summary:
While adding a footprint frame during rearrangement, the footprint
variables should be fresh with respect to the current state too, not
only with respect to he footprint, because the frame is added to the
state.
Reviewed By: jberdine
Differential Revision: D14401026
fbshipit-source-id: 20ea4485a
Summary:
To meet the pure parts of formulas, the process was to (a) call Rename.extend
with variables occuring in similar places and (b) extract substitutions out of
those. Two matching primed vars would both be replaced by some fresh primed var.
However, equivalence classes of primed variables would *not* be replaced by
one fresh (primed) variable. Now, that should work.
Reviewed By: mbouaziz
Differential Revision: D14150192
fbshipit-source-id: 90ca9216c
Summary:
This will be used in the future to determine what to do with destructors
in pulse.
Reviewed By: mbouaziz
Differential Revision: D14324759
fbshipit-source-id: bc3c34471
Summary:
This seems generally useful. Force people to do it in the future even if
they want to avoid having to update the frontend tests.
Reviewed By: mbouaziz
Differential Revision: D14324758
fbshipit-source-id: cdef3f72a
Summary:
When joining two lists of disjuncts we try to ensure there isn't a state
that under-approximates another already in the list. This helps reduce
the number of disjuncts that are generated by conditionals and loops.
Before we would always just add more disjuncts unless they were
physically equal but now we do a subgraph computation to assess
under-approximation.
We only do this half-heartedly for now however, only taking into
consideration the "new" disjuncts vs the "old" ones. It probably makes
sense to do a full quadratic search to minimise the number of disjuncts
from time to time but this isn't done here.
Reviewed By: mbouaziz
Differential Revision: D14258482
fbshipit-source-id: c2dad4889
Summary: Unknown locations in the alias domain resulted in unexpected unreachable code.
Reviewed By: mbouaziz
Differential Revision: D14339412
fbshipit-source-id: a5dca6489
Summary:
The disjunctive domain shouldn't really be a set in the first place as
comparing abstract states for equality is expensive to do naively
(walking the whole maps representing the abstract heap). Moreover in
practice these sets have a small max size (currently 50 for pulse, the
only client), so switching them to plain lists makes sense.
Reviewed By: mbouaziz
Differential Revision: D14258489
fbshipit-source-id: c512169eb
Summary:
It's useful to keep the size of states down, especially when humans are
trying to read it. It will also help keep the size of summaries down in
the inter-procedural pulse.
Reviewed By: mbouaziz
Differential Revision: D14258486
fbshipit-source-id: 45ebcac67
Summary: After a redeclaration of a global constant, it is not parsed as ICE(integral constant expression), which results in FN.
Reviewed By: ezgicicek
Differential Revision: D14299288
fbshipit-source-id: 394afd595
Summary:
It assigns symbolic values for global variables in the load commands. However, it does not instantiate the symbols for the global variables yet, which will be addressed in another diff.
Depends on D14208643
Reviewed By: ezgicicek
Differential Revision: D14257619
fbshipit-source-id: f9113c8a3
Summary:
the predicate to check that a decl is const was not working for VarDecl.
This diff fixes this
Reviewed By: jvillard
Differential Revision: D14106798
fbshipit-source-id: 1f6c24113
Summary:
This diff differentiates proof obligations by allocsites. Sizes and
offsets of arrays were joined when making proof obligations.
Reviewed By: mbouaziz
Differential Revision: D14163149
fbshipit-source-id: cb6608c16
Summary:
The Eradicate backend is reporting nullable type errors, that are not always necessarily leading to null pointer exceptions.
For example, the analysis is designed to be consistent with the Java type system and report on the following code:
String foo(boolean test) {
Object object = test ? new Object() : null;
if (test) {
return object.toString(); // the analysis reports here
}
}
even though the code will not crash.
In order to make this aspect clear, this diff renames the warnings `Null Method Call` and `Null Field Access` into `Nullable Dereference`
Reviewed By: ngorogiannis
Differential Revision: D14001979
fbshipit-source-id: ff1285283
Summary:
This diff adds a constant to the set of widening thresholds if the
constant is compared to an abstract value in condition expressions.
Each abstract value has its own set of thresholds.
Reviewed By: mbouaziz
Differential Revision: D14147150
fbshipit-source-id: ca0db34d4
Summary:
In this diff, it avoids a precision-losing pruning, which was needed
to keep effects of assume commands.
```
unsigned int c = a + b; // (1)
if (c > 0) { // (2)
char result[c];
result[c - 1] = 0; // (4)
}
```
For example, in the example, `c` is assigned by `[a+b,a+b]` at (1),
then it tried to prune the lower bound of `c` to 1 at (2) while losing
precision, in order to say `c - 1` at (4) is safe in terms of integer
underflow. Instead, it could not say that `c - 1` is smaller than `c`
in the buffer access, because the former is analyzed to `[0,a+b-1]` and
the latter `[1,a+b]` at (4).
Now, the situation has changed. By adopting conditional proof
obligation (D13749914), the FP of integer overflow can be suppressed
without the precision-losing pruning.
Reviewed By: mbouaziz
Differential Revision: D14122770
fbshipit-source-id: 634744e99
Summary: Since Inferbo's current min/max domain can keep only a single symbol, e.g. min(constant, symbol) , it loses precision when trying to prune a value including multiple symbols.
Reviewed By: ezgicicek
Differential Revision: D14099399
fbshipit-source-id: 71d677b75
Summary: Record where each symbol in a polynomial is coming from: either a loop, function call or a modeled call.
Reviewed By: mbouaziz
Differential Revision: D14047420
fbshipit-source-id: 56d0bd926
Summary: It keeps alias of simple plus/minus arithmetic in order to pruning the value of "++i" expression.
Reviewed By: mbouaziz
Differential Revision: D14080230
fbshipit-source-id: d3af32a32
Summary: In SIL, Java's array member is a pointer to an array, while C++'s is the array itself. This diff differentiate them in evaluating abstract locations.
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D14021451
fbshipit-source-id: 00f14fe3b
Summary:
- There is no need to use AI to compute a dot product: let's just fold over all nodes, but still do it in order (using the WTO) to report at the right place
- The previous version was computing a dot product on nodes for each node, which was quadratic, the new version is linear
- Report only once, the first time the threshold is reached (if in a loop, report at the loop head)
Reviewed By: ddino
Differential Revision: D14028171
fbshipit-source-id: b4a840c6e
Summary:
Add an option to specify some classes that we really want to warn about
with the liveness checker, even when they appear used because of the
implicit destructor call inserted by the compiler.
Reviewed By: mbouaziz
Differential Revision: D13991129
fbshipit-source-id: 7fafdba84
Summary:
Get rid of false positive as in the test by modelling `Double`. Longer term we
should probably prevent biabduction from blocking the angelic analysis on
`Nullable` fields but that seems harder.
Reviewed By: jeremydubreil
Differential Revision: D14005228
fbshipit-source-id: 59ef2ed66
Summary:
The purpose these serve is unclear to me. From the comment I *think*
they were used to hint to the biabduction backend that smart pointers
are just pointers. That said, The tests still mostly pass even without
that (just a few `weak_ptr` tests changed from `NULL_DEREFERENCE` to
`Bad_footprint`).
Moreover, this extra dereference was added unreliably. For instance,
this piece of code:
```
auto x = std::make_unique<X>(some_X);
```
would either get the extra dereference or not depending on which headers
were picked for the C++ stdlib.
The extra dereference was tripping up the liveness checker (see later in
the stack), and probably most checkers too.
Reviewed By: mbouaziz
Differential Revision: D13991130
fbshipit-source-id: 462923595
Summary: When a `VarDecl` has the attribute `unused` then do not assign its initialisation result to the corresponding variable.
Reviewed By: ngorogiannis
Differential Revision: D13974497
fbshipit-source-id: 28029f995
Summary: The Makefile was missing that target and making `make test-replace` at the root of the repo fail.
Reviewed By: ngorogiannis
Differential Revision: D13990483
fbshipit-source-id: 805b5d2a9
Summary:
If the result of `fgets` is not-null, the length of `s` should be
bigger than or equal to 1.
Reviewed By: mbouaziz
Differential Revision: D13939994
fbshipit-source-id: 298fe33f4
Summary:
This diff updates the reachability conditions of proof obligations at every function calls.
Depends on D13781124
Reviewed By: mbouaziz
Differential Revision: D13781147
fbshipit-source-id: 3c8768bd9
Summary:
It adds a semantics for evaluation of abstract location of literal
string, which was missing.
Reviewed By: mbouaziz
Differential Revision: D13915265
fbshipit-source-id: 741df843b
Summary: We do not want to export unnecessary information from clang plugin, hence, the solution to this false positive would be to annotate with `__unused__` attribute.
Reviewed By: jvillard
Differential Revision: D13861333
fbshipit-source-id: 774009f37
Summary:
This will allow to get the numerical results for Cost, Hoisting, Purity without the Inferbo issues.
For now, I still forced Inferbo issues for Cost and Purity to avoid lots of changes in tests, that will go away soon.
Reviewed By: ezgicicek, skcho
Differential Revision: D13826741
fbshipit-source-id: 796d1a50d
Summary:
This diff extends the abstract domain to keep binary conditions on
prunings, so Inferbo can suppress more proof obligations (i.e., false
positives) that are known to be unreachable according to the binary
conditions.
Depends on D13729600
Reviewed By: mbouaziz
Differential Revision: D13749914
fbshipit-source-id: 314f048f1
Summary:
`memcpy` should copy the contents of the source to the destination.
Depends on D13634754
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D13668414
fbshipit-source-id: cb0ff2010
Summary: It extends the abstract location for C string length, i.e., the first index of the null character in character array.
Reviewed By: mbouaziz
Differential Revision: D13634241
fbshipit-source-id: d2727d5f5
Summary: Publish solutions to the lab, and a Docker file and image to get started more quickly with infer hacking.
Reviewed By: mbouaziz
Differential Revision: D13648847
fbshipit-source-id: daf48ad03
Summary:
This diff prevents deduplications of issues when they have different
conditions to reach.
Reviewed By: mbouaziz
Differential Revision: D13596220
fbshipit-source-id: 95f802edb
Summary:
In ObjC there are no access modifiers. The strongest alternative is to put methods in the implementation but omit them from the interface declaration.
Put exported ObjC methods in their own field in the class structure and use that in RacerD to decide whether to report on the method.
Reviewed By: mbouaziz
Differential Revision: D13597504
fbshipit-source-id: c4a3d2705
Summary:
It suppresses intended integer overflows that generate hash values or random numbers. For judging that the intention of integer overflow, it applies a heuristics: checking if traces of issues include a whitelisted words, e.g., "rand" or "seed".
While we would be able to suppress all integer overflows of unsigned integers since they have defined behaviors, we don't want to miss unintended integer overflows, e.g., that on unsigned index value.
Depends on D13595958
Reviewed By: mbouaziz
Differential Revision: D13595967
fbshipit-source-id: 8d3aca5a7
Summary:
Record per-location traces. Actually, that doesn't quite make sense as a
location can be accessed in many ways, so associate a trace to each
*edge* in the memory graph. For instance, when doing `x->f = *y`, we
want to take the history of the `<val of y> --*--> ..` edge, add "assigned
at location blah" to it and store this extended history to the edge
`<val of x> --f--> ..`.
Use this machinery to print nicer traces in `infer explore` and better
error messages too (include the last assignment, like biabduction
messages).
Reviewed By: da319
Differential Revision: D13518668
fbshipit-source-id: 0a62fb55f
Summary:
This diff substitutes the conditions of proof obligations strictly, so that the condition of "p!=Null" becomes bottom
when callee's p is Null.
In the non-strict substitution (which is used by default), if p's location is not found it returns the unknown location.
On the other hand, in the strict substitution (which is used only in the substitution of condition of proof obligation),
it returns bottom location.
Depends on D13415366, D13414636
Reviewed By: mbouaziz, jvillard
Differential Revision: D13415377
fbshipit-source-id: 5ae1e888e
Summary: This diff unset powloc and arrayblk values of p when assume(p==Null).
Reviewed By: mbouaziz, jvillard
Differential Revision: D13415366
fbshipit-source-id: a491a957f
Summary:
For abstract values representing one concrete value, create only one symbol instead of two.
Still create two symbols (lb, ub) for abstract values representing multiple concrete values (like array cells).
As a consequence, comparisons of symbolic values are more precise (we can even prove equality). I expect to remove a bunch of FPs.
Another consequence is the disappearance of `.lb` and `.ub` in many reports.
Reviewed By: skcho
Differential Revision: D13072084
fbshipit-source-id: 9bc0b9881
Summary: Adding a test case for use after destructor for temporaries. At the moment pulse does not find it as frontend does not inject destructors for temporaries.
Reviewed By: jvillard
Differential Revision: D13506229
fbshipit-source-id: 31b9466f7
Summary:
It weakens canonical path in order to avoid an explosion of locations when a struct type has pointers to struct.
For example:
```
struct Tree {
struct Tree *root;
struct Tree *left;
struct Tree *right;
};
```
It was able to generate lots of abstract locations before this diff:
```
t->root
t->left
t->right
t->root->left
t->root->right
t->left->root
t->left->right
t->right->root
t->right->left
t->root->left->right
t->root->right->left
t->left->root->right
t->left->right->root
t->right->root->left
t->right->left->root
```
By this diff, pointer fields that have the same type are (unsoundly) canonicalized to the same one. For example,
```
t->root
t->root->left
t->root->right
t->root->left->right
t->root->right->left
```
are canonicalized to `t->root`. This is definitely unsound but I believe it is better than the location explosion by which analyses do not terminate in a reasonable time or giving a fixed limit of depth to the field access.
Reviewed By: mbouaziz
Differential Revision: D13503808
fbshipit-source-id: 867018712
Summary:
When a C++ temporary goes out of scope, tag its address in the heap with
a new attribute `AddressOfCppTemporary` so that we can later check that
we don't return it.
Reviewed By: da319
Differential Revision: D13466898
fbshipit-source-id: 8808338b4
Summary:
When assign to the special `return` variable, check that the result is
not the address of a local variable, otherwise report.
Reviewed By: ngorogiannis
Differential Revision: D13466896
fbshipit-source-id: 465da7f13
Summary: Naming it `FP_` was a mistake in the original commit that copied the tests over as pulse has never reported on that method.
Reviewed By: da319
Differential Revision: D13465324
fbshipit-source-id: f8b24ebda
Summary:
In order to avoid FPs due to lack of relational info, we apply a heuristic: proof obligations has a latest pruned values,
then it is instantiated at Call statements. If there is a bottom value in the instantiated pruned values, we can say the
program point where the proof obligation is introduced is unreachable with the given parameters of the function.
Depends on D13414441
Reviewed By: mbouaziz
Differential Revision: D13414483
fbshipit-source-id: 61bd34ebb
Summary:
It's ok to take an address of a field / array access of an invalid object.
This diff calculates the inner most dereference for an access expression starting with `&` and does not report on the dereference even if the address is invalid.
Reviewed By: jvillard
Differential Revision: D13450758
fbshipit-source-id: 18c038701
Summary:
When we create Dereference edge, we also create TakeAddress back edge. This causes false positives for stack variables. When we write to a stack variable and then take its address, the resulting address is the one from the back edge of the written value. See example `push_back_value_ok`. To solve this issue, this diff changes stack to denote a map from address of variables rather than from variables.
We still have issue for fields, see example, FP_push_back_value_field_ok. To solve this, we probably need to remove back edges.
Reviewed By: jvillard
Differential Revision: D13432415
fbshipit-source-id: 9254a1a6d
Summary:
When a lambda gets created, record the abstract addresses it captures, then
complain if we see some of them be invalidated before it is called.
Add a notion of "allocator" for reporting better messages. The messages are
still a bit sucky, will need to improve them more generally at some point.
```
jul lambda ~ infer 1 infer -g --pulse-only -- clang -std=c++11 -c infer/tests/codetoanalyze/cpp/pulse/closures.cpp
Logs in /home/jul/infer.fb/infer-out/logs
Capturing in make/cc mode...
Found 1 source file to analyze in /home/jul/infer.fb/infer-out
Found 2 issues
infer/tests/codetoanalyze/cpp/pulse/closures.cpp:21: error: USE_AFTER_DESTRUCTOR
`&(f)` accesses address `s` captured by `&(f)` as `s` invalidated by destructor call `S_~S(s)` at line 20, column 3 past its lifetime (debug: 5).
19. f = [&s] { return s.f; };
20. } // destructor for s called here
21. > return f(); // s used here
22. }
23.
infer/tests/codetoanalyze/cpp/pulse/closures.cpp:30: error: USE_AFTER_DESTRUCTOR
`&(f)` accesses address `s` captured by `&(f)` as `s` invalidated by destructor call `S_~S(s)` at line 29, column 3 past its lifetime (debug: 8).
28. f = [&] { return s.f; };
29. }
30. > return f();
31. }
32.
Summary of the reports
USE_AFTER_DESTRUCTOR: 2
```
Reviewed By: da319
Differential Revision: D13400074
fbshipit-source-id: 3c68ff4ea
Summary: Model more `std::vector` functions that can potentially invalidate references to vector's elements (https://en.cppreference.com/w/cpp/container/vector).
Reviewed By: mbouaziz
Differential Revision: D13399161
fbshipit-source-id: 95cf2cae6
Summary:
Some code calls `this->~Obj()` then proceeds to use fields in the current
object, which previously we would report as invalid uses. Assume people know
what they are doing and ignore destructor calls to `this`.
Reviewed By: mbouaziz
Differential Revision: D13401145
fbshipit-source-id: f6b0fb6ec
Summary: It weakly updates array when there are more than two contents.
Reviewed By: mbouaziz
Differential Revision: D13318443
fbshipit-source-id: fa740d8b1
Summary:
It materializes symbolic values of function parameters on-demand. The on-demand materialization is triggered when finding a value from an abstract memory and joining/widening abstract memories.
Depends on D13294630
Main idea:
* Symbolic values are on-demand-ly generated by a symbol path and its type
* In order to avoid infinite generation of symbolic values, symbol paths are canonicalized by structure types and field names (which means they are abstracted to the same value). For example, in a linked list, a symbolic value `x->next->next` is canonicalized to `x->next` when the structures (`*x` and `*x->next`) have the same structure type and the same field name (`next`).
Changes from the previous code:
* `Symbol.t` does not include `id` and `pname` for distinguishing symbols. Now, all symbols are compared by `path:SymbolPath.partial` and `bound_end`.
* `SymbolTable` is no longer used, which was used for generating symbolic values with new `id`s.
Reviewed By: mbouaziz
Differential Revision: D13294635
fbshipit-source-id: fa422f084
Summary:
`eval_locs` is like `eval |> get_all_locs` but avoids computing things that aren't necessary.
The goal was not to be an optimization but is needed for Java where array blocks don't have offsets.
Reviewed By: skcho
Differential Revision: D13190939
fbshipit-source-id: 1cc0e6338
Summary: Similarly as `std::vector::push_back`, `std::vector::reserve` can invalidate the references to elements if the new size is bigger than the existing one. More info on `std::vector::reserve`: https://en.cppreference.com/w/cpp/container/vector/reserve
Reviewed By: jvillard
Differential Revision: D13340324
fbshipit-source-id: bf99b6923
Summary: Instead of variable having the value of a single location on stack, we now allow variables to have multiple locations. Consequently, we also allow a memory location to point to a set of locations in the heap. We enforce a limit on a maximum number of locations in a set (currently 5).
Reviewed By: jvillard
Differential Revision: D13190876
fbshipit-source-id: 5cb5ba9a6
Summary:
At function calls, it copies callee's values that are reachable from parameters.
Depends on D13231291
Reviewed By: mbouaziz
Differential Revision: D13231711
fbshipit-source-id: 1e8aed1c4
Summary: It instantiates not only symbols for bound but also symbols for locations at function calls.
Reviewed By: mbouaziz
Differential Revision: D13231291
fbshipit-source-id: ce23a943b
Summary: Recent improvements in join fixed `FP_allocate_in_branch_ok` because the variable was not read after the join.
Reviewed By: mbouaziz
Differential Revision: D13233441
fbshipit-source-id: 89b701e12
Summary: Delete function that would get a linter warning or not depending on the version of Xcode.
Reviewed By: martintrojer
Differential Revision: D13215750
fbshipit-source-id: 886ce397d
Summary:
It's useful for checkers to know when variables go out of scope to
perform garbage collection in their domains, especially for complex
domains with non-trivial joins. This makes the analyses more precise at
little cost.
This could have been added as a custom function call to a builtin, but I
decided against it because this instruction doesn't have the semantics
of any function call. It's better for each checker to explicitly not
deal with the custom instruction instead.
Reviewed By: jberdine
Differential Revision: D13102951
fbshipit-source-id: 33be22fab
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52
Summary:
It modifies sizes and offsets of array values on pointer castings.
Currently, it supports only simple castings of pointer-to-integers.
Reviewed By: mbouaziz
Differential Revision: D12920589
fbshipit-source-id: a5ba831b8
Summary:
It enables the translation of casting expression. As of now, it
translates only the castings of pointers to integer types, in order to
avoid too much of change, which may mess the checkers up.
Reviewed By: jvillard
Differential Revision: D12920568
fbshipit-source-id: a5489df24
Summary:
Useful to understand the changes in the pre-analysis, or to inspect the
CFG that checkers actually get.
This means that the pre-analysis always runs when we output the dotty,
but I don't really see a reason why not. In fact, we could probably
*always* store the CFGs as pre-analysed.
Reviewed By: mbouaziz
Differential Revision: D13102952
fbshipit-source-id: 89f3102ec
Summary: It fixes the conditions of the `check` function to address `is_collection_add` cases correctly.
Reviewed By: mbouaziz
Differential Revision: D13081281
fbshipit-source-id: 39ae5ef03
Summary:
Update clang plugin which now gives names to variables captured by lambdas that were empty before.
update-submodule: facebook-clang-plugins
Reviewed By: jvillard
Differential Revision: D12979015
fbshipit-source-id: 0b092fb24
Summary:
It turns out keeping attributes (such as invalidation facts) separate
from the memory is a bad idea and leads to loss of precision and false
positives, as seen in the new test (which previously generated a
report).
Allow me to illustrate on this example, which is a stylised version of
the issue in the added test: previously we'd have:
```
state1 = { x = 1; invalids={} }
state2 = { x = 2; invalids ={1} }
join(state1, state2) = { x = {1, 2}; invalids={{1, 2}} }
```
So even though none of the states said that `x` pointed to an invalid
location, the join state says it does because `1` and `2` have been
glommed together. The fact `x=1` from `state1` and the fact "1 is
invalid" from `state2` conspire together and `x` is now invalid even
though it shouldn't.
Instead, if we record attributes as part of the memory we get that `x`
is still valid after the join:
```
state1 = { x = (1, {}) }
state2 = { x = (2, {}) }
join(state1, state2) = { x = ({1, 2}, {}) }
```
Reviewed By: mbouaziz
Differential Revision: D12958130
fbshipit-source-id: 53dc81cc7
Summary:
I hear that this scheduler is better. I want the best scheduler
possible. Also pulse's join is a bit complex so it might matter one day.
whydididothis
Reviewed By: mbouaziz
Differential Revision: D12958131
fbshipit-source-id: 3bd77ccba
Summary: There is a bug on the instantiation of function parameters.
Reviewed By: mbouaziz
Differential Revision: D12973691
fbshipit-source-id: ca7fbc4e6
Summary: The aligned width of bool should be 1-byte, while the range of bool [0,1].
Reviewed By: jvillard
Differential Revision: D12932394
fbshipit-source-id: be1a5d6d1
Summary: For a general case of `operator=` we want to create a fresh location for the first parameter as `operator=` behaves as copy assignment.
Reviewed By: jvillard
Differential Revision: D12940635
fbshipit-source-id: 89c6e530d
Summary:
Whenever `vec.reserve(n)` is called, remember that the vector is
"reserved". When doing `vec.push_back(x)` on a reserved vector, assume
enough size has been reserved in advance and do not invalidate the
underlying array.
This gets rid of false positives.
Reviewed By: mbouaziz
Differential Revision: D12939837
fbshipit-source-id: ce6354fc5
Summary:
Instead of keeping at most one invalidation fact for each address, keep
a set of them and call them "attributes". Keeping a set of invalidation
facts is redundant since we always only want the smallest one, but
makes the implementation simpler, especially once we add more kinds of
attributes (used for modelling, see next diffs).
Reviewed By: mbouaziz
Differential Revision: D12939839
fbshipit-source-id: 4a54c2132
Summary:
Copied on the ownership checker logic: return the initial value of the
domain as return. This can probably be improved.
Reviewed By: mbouaziz
Differential Revision: D12888102
fbshipit-source-id: 9e2dac7fc
Summary:
When initialising a variable via semi-exotic means, the frontend loses
the information that the variable was initialised. For instance, it
translates:
```
struct Foo { int i; };
...
Foo s = {42};
```
as:
```
s.i := 42
```
This can be confusing for backends that need to know that `s` actually
got initialised, eg pulse.
The solution implemented here is to insert of dummy call to
`__variable_initiazition`:
```
__variable_initialization(&s);
s.i := 42;
```
Then checkers can recognise that this builtin function does what its
name says.
Reviewed By: mbouaziz
Differential Revision: D12887122
fbshipit-source-id: 6e7214438
Summary:
Now that arrays are dealt with separately (see previous diff), we can
turn the join back into an over-approximation as far as invalid
locations are concerned.
Reviewed By: skcho
Differential Revision: D12881989
fbshipit-source-id: fd85e49c0
Summary:
This prevents the join from wrongly assuming that we haven't seen a
variable on one side of the join.
Reviewed By: skcho
Differential Revision: D12881987
fbshipit-source-id: 42a776adb
Summary:
For more deduplications of issues, this diff loosens the condition of
similar bounds. The previous condition of similar bounds was too
strict, so [0,0] and [0,+oo] were not similar.
Depends on D10851762
Reviewed By: mbouaziz
Differential Revision: D10866127
fbshipit-source-id: 4ba912a88
Summary: For `operator=(lhs, rhs)` we want to model it as an assignment if rhs is materialized temporary created in the constructor.
Reviewed By: jvillard
Differential Revision: D10462510
fbshipit-source-id: 998341e69
Summary: Do not create a new location for placement new argument if it already exists.
Reviewed By: jvillard
Differential Revision: D12839942
fbshipit-source-id: 758b67a82
Summary:
In order to know whether a global variable is an integral constant
expression in C, this diff adds a field for the results of isInitICE.
The controller you requested could not be found.: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D12838521
fbshipit-source-id: 388bff1f3
Summary:
Get rid of `USE_AFTER_LIFETIME`. This could be useful to deploy pulse
alongside the ownership checker too.
Reviewed By: da319
Differential Revision: D12857477
fbshipit-source-id: 8e2a2a37c
Summary:
Keep `USE_AFTER_LIFETIME` for unclassified errors (for now it contains
vector invalidation too because I can't think of a good name for
them, and maybe it makes sense to wait until we have more types of them
to decide on a name).
Reviewed By: da319
Differential Revision: D12825060
fbshipit-source-id: bd75ef698
Summary:
Getting this right will be long and complex so for now the easiest is to
underreport and only consider as invalid the addresses we know to be invalid on
both sides of a join. In fact the condition for an address to be invalid after
a join is more complex than this: it is invalid only if *all* the addresses in
its equivalence class as discovered by the join are invalid.
Reviewed By: skcho
Differential Revision: D12823925
fbshipit-source-id: 2ca109356
Summary: Similarly as for destructors, we provide an address of an object as a first parameter to constructors. When constructor is called we want to create a fresh location for a new object.
Reviewed By: jvillard
Differential Revision: D10868433
fbshipit-source-id: b60f32953
Summary: We provide an address of an object as a parameter to destructor. When destructor is called the object itself is invalidated, but not the address.
Reviewed By: jvillard
Differential Revision: D12824032
fbshipit-source-id: 516eebcf8
Summary:
It terminates narrowing when new and old states are not comparable.
Since current narrowing does not use meet operations guaranteeing
termination of narrowing, it tries to terminate narrowing more
conservatively.
Reviewed By: mbouaziz
Differential Revision: D12815419
fbshipit-source-id: e8b45199e
Summary: It tries division on minmax value approximately, rather than just returning infinities. For example, `[0,2+min(6,s)] / 2` returns `[0,4]`.
Reviewed By: mbouaziz
Differential Revision: D10867091
fbshipit-source-id: d3f49987b
Summary:
This diff preserves values of offset and index separately, rather than
one value of their addition, because premature addition results in
imprecise FPs by the limited expressiveness of the domain.
Reviewed By: mbouaziz
Differential Revision: D10851393
fbshipit-source-id: 1685ead36
Summary:
The time has come to keep track of which tests pass and which are FP/FN
for pulse.
Reviewed By: mbouaziz
Differential Revision: D10854064
fbshipit-source-id: 60938e48f
Summary:
Turns out once a vector array became invalid it stayed that way, instead
of the vector getting a new valid internal array.
Reviewed By: skcho
Differential Revision: D10853532
fbshipit-source-id: f6f22407f
Summary:
Now the domain can reason about `&` and `*` too. When recording `&`
between two locations also record a back-edge `*`, and vice-versa.
Reviewed By: mbouaziz
Differential Revision: D10509335
fbshipit-source-id: 8091b6ec0
Summary: This is more flexible and allows us to give more details when reporting.
Reviewed By: mbouaziz
Differential Revision: D10509336
fbshipit-source-id: 79c3ac1c8
Summary: To avoid reporting on private methods, ignore those starting with underscore. Other cleanups.
Reviewed By: jvillard
Differential Revision: D10558970
fbshipit-source-id: 0572f1e70
Summary:
Invalidating addresses for destructors to catch use after destructor errors.
To pass ownership tests for use after destructor errors, we still need to:
(1) fix pointer arithmetic false positives
(2) add model for placement new to fix false positives
(3) add model for operator= to fix false positives
(4) support inter-procedural analysis for destructor_order_bad test
Reviewed By: jvillard
Differential Revision: D10450912
fbshipit-source-id: 2d9b1ee68
Summary:
It uses platform-dependent integer type widths information when
constructing Sizeof expressions which have a field(`nbytes`)
representing the static results of the evaluation of `sizeof(typ)`.
Reviewed By: mbouaziz
Differential Revision: D10504715
fbshipit-source-id: 0c79d37d8
Summary: Reports will now be issued for the class loads of the methods specified by the option `--class-loads-roots`.
Reviewed By: jvillard
Differential Revision: D10466492
fbshipit-source-id: 91456d723
Summary:
Instead of the non-sensical piecewise join we had until now write
a proper one. Hopefully the comments explain what it does. Main one:
```
(* high-level idea: maintain some union-find data structure to identify locations in one heap
with locations in the other heap. Build the initial join state as follows:
- equate all locations that correspond to identical variables in both stacks, eg joining
stacks {x=1} and {x=2} adds "1=2" to the unification.
- add all addresses reachable from stack variables to the join state heap
This gives us an abstract state that is the union of both abstract states, but more states
can still be made equal. For instance, if 1 points to 3 in the first heap and 2 points to 4
in the second heap and we deduced "1 = 2" from the stacks already (as in the example just
above) then we can deduce "3 = 4". Proceed in this fashion until no more equalities are
discovered, and return the abstract state where a canonical representative has been chosen
consistently for each equivalence class (this is what the union-find data structure gives
us). *)
```
Reviewed By: mbouaziz
Differential Revision: D10483978
fbshipit-source-id: f6ffd7528
Summary:
It avoids checking integer overflow when it definitely cannot happen.
For example, it does not check integer overflow of addition when one
of parameters is a negative number, or underflow of subtraction when
its first parameter is a positive number.
Reviewed By: mbouaziz
Differential Revision: D10446161
fbshipit-source-id: b8c86e1b2
Summary: We assume multiplication of 1 is safe. It happens sometimes by multiplying `sizeof(char)`.
Reviewed By: mbouaziz
Differential Revision: D10444680
fbshipit-source-id: 2f33be280
Summary: This diff changes pp of binary operation condition in order to avoid a `make test` failure. For the same `uint64_t` type, it is translated to `unsigned long long` in 64bit mac, but `unsigned long` in 64bit linux, which made a `make test` failure.
Reviewed By: mbouaziz
Differential Revision: D10459466
fbshipit-source-id: 449ab548e
Summary:
`Location` was clashing with the `Location` module, so use `Address`
instead.
When invalidating an address, remember the "actor" of its invalidation,
i.e. the access expression leading to the address and the source
location of the corresponding instruction.
When checking accesses, also pass the actor responsible for the access,
so that when we raise an error we know:
1. when and why a location was invalidated
2. when and why we tried to read it after that
Reviewed By: mbouaziz
Differential Revision: D10446282
fbshipit-source-id: 3ca4fb3d4
Summary:
Model `x[y]` and `x.push_back(i)` to catch the classic bug of "take
reference inside vector, invalidate, then use again".
Reviewed By: da319
Differential Revision: D10445824
fbshipit-source-id: 21ffd9677
Summary:
Do the intersection of the heap and stack domains, and the union of the
invalid location sets. This forgets invalid locations that appear only
in one heap, unfortunately. We can start with this and improve later.
Reviewed By: mbouaziz
Differential Revision: D10445825
fbshipit-source-id: cc24460af
Summary:
New analysis in foetal form to detect invalid use of C++ objects after their
lifetime has ended. For now it has:
- A domain consisting of a graph of abstract locations representing the heap, a map from program variables to abstract locations representing the stack, and a set of locations known to be invalid (their lifetime has ended)
- The heap graph is unfolded lazily when we resolve accesses to the heap down to an abstract location. When we traverse a memory location we check that it's not known to be invalid.
- A simple transfer function reads and updates the stack and heap in a rudimentary way for now
- C++ `delete` is modeled as adding the location that its argument resolves to to the set of invalid locations
- Also, the domain has a really crappy join and widening for now (see comments in the code)
With this we already pass most of the "use after delete" tests from the
Ownership checker. The ones we don't pass are only because we are missing
models.
Reviewed By: mbouaziz
Differential Revision: D10383249
fbshipit-source-id: f414664cb
Summary:
It avoids raising an exception when unexpected arguments are given to
placement new. We will revert this after fixing the frontend to parse
user defined `new` correctly in the future.
Reviewed By: mbouaziz
Differential Revision: D10378136
fbshipit-source-id: d494f781b
Summary:
Use same code for deciding whether two accesses conflict across java/clang, by adapting that of the clang version.
Eliminate/simplify some code.
Reviewed By: mbouaziz, jberdine
Differential Revision: D10217383
fbshipit-source-id: dc0986d05
Summary:
It unsets `var_exp_typ` of `trans_state` during the translations of
placement parameters, so they are translated independently against the
target variable and class of the `new` function.
Reviewed By: mbouaziz, jvillard
Differential Revision: D10161419
fbshipit-source-id: 7f588a91c
Summary: It enables placement_new to get three parameters, which happens when placement_new is overloaded (e.g. Boost).
Reviewed By: mbouaziz
Differential Revision: D10100324
fbshipit-source-id: 0ecb0a404
Summary:
Fix the logic for computing duplicate symbols. It was broken at some point and some duplicate symbols creeped into our tests. Fix these, and add a test to avoid duplicate symbols detection to regress again.
Also, this removes one use of `Cfg.load`, on the way to removing file-wide CFGs from the database.
Reviewed By: ngorogiannis
Differential Revision: D10173349
fbshipit-source-id: a0d2365b3
Summary: It uses big int, instead of 63bits int of OCaml, in the interval domain in order to get preciser numeric values in the future.
Reviewed By: jvillard
Differential Revision: D10123364
fbshipit-source-id: c217f4366
Summary:
Make distinct reports on strict mode violations.
For now, restrict to direct violations (UI threads calls transitively a violating method).
Will assess impact and enable indirect reports later (via locks).
Reviewed By: mbouaziz
Differential Revision: D10126780
fbshipit-source-id: 9c75930bc
Summary:
New clang in the plugin \o/
Changes that were needed:
- (minor) Some extra AST nodes
- defining a lambda and calling it in the same line (`[&x]() { x = 1; }()`) used to get translated as a call of the literal but now an intermediate variable gets created, which confuses uninit in one test. I added another test to showcase the limitation this is hitting: storing the lambda in a variable then calling it will not get caught by the checker.
The controller you requested could not be found.: facebook-clang-plugins
Reviewed By: jeremydubreil
Differential Revision: D10128626
fbshipit-source-id: 8ffd19f3c
Summary: Before this diff, the analysis would only lookup the attributes with the classname appearing in the instruction. However, it would fail to find those attributes for inherited and not overridden methods. With this diff, the attributes are now searched recursively in the super classes.
Reviewed By: mbouaziz
Differential Revision: D10007469
fbshipit-source-id: 77d721cba
Summary: This fixes some cases of false positives where the analysis will compare with the wrong overridden methods. This could later be improved with the possibility to do sub-typing comparison on the parameters.
Reviewed By: ngorogiannis
Differential Revision: D9985249
fbshipit-source-id: 7998d8619
Summary:
Keep `--analyzer` around for now for integrations that depend on it.
Also deprecate the `--infer-blacklist-path-regex`,
`--checkers-blacklist-path-regex`, etc. in favour of
`--report-blacklist-path-regex` which more accurately represents what these do
as of now.
Rely on the current subcommand instead of the analyzer where needed, as most of
the code already does.
Reviewed By: jeremydubreil
Differential Revision: D9942809
fbshipit-source-id: 9380e6036
Summary:
Goal of the stack: deprecate the `--analyzer` option in favour of turning
individual features on and off. This option is a mess: some of the options are
now subcommands (compile, capture), others are aliases (infer and checkers),
and they can all be replicated using some straightforward combination of other
options.
This diff: stop using `--analyzer` in tests. It's mostly `checkers` everywhere,
which is already the default. `linters` becomes `--no-capture --linters-only`.
`infer` is supposed to be `checkers` already. `crashcontext` is
`--crashcontext-only`.
Reviewed By: mbouaziz
Differential Revision: D9942689
fbshipit-source-id: 048281761
Summary: It is common on Android code to recycle the `View` object by nullifying them in the `onDestroy()` or `onDestroyView()` methods. In this case, the outer `Fragment` object structure is preserve while the inner `View` object are set to null for the garbage collect to release the memory. However, if the fields are only set to `null` in the `onDestroy*()` methods, those fields cannot be `null` during the active lifecycle of the `Fragment`, so it is not necessary to annotate those fields with `Nullable`.
Reviewed By: mbouaziz
Differential Revision: D10024458
fbshipit-source-id: b05e538d9
Summary:
The method matcher is now used sufficiently it warrants refactoring out into its own module.
Also, kill dev-android-strict-mode and leave starvation-strict-mode as the stronger option.
Reviewed By: jeremydubreil
Differential Revision: D9990753
fbshipit-source-id: 626a70a19
Summary:
The model for `getcwd` assumes the first argument should be non-null when in fact a NULL pointer is legitimate and results in allocation:
> As an extension to the POSIX.1-2001 standard, glibc's getcwd() allocates the buffer dynamically using mal‐
> loc(3) if buf is NULL. In this case, the allocated buffer has the length size unless size is zero, when buf
> is allocated as big as necessary. The caller should free(3) the returned buffer.
I suggest this glibc extension be used for the getcwd model to reduce false positives.
Pull Request resolved: https://github.com/facebook/infer/pull/925
Reviewed By: mbouaziz
Differential Revision: D9830450
Pulled By: jvillard
fbshipit-source-id: 95c4862b1
Summary: Always read the attributes from the attributes DB instead of trying to read the attributes from the analysis summaries
Reviewed By: mbouaziz
Differential Revision: D9845085
fbshipit-source-id: aef48e6bf
Summary: No longer report inconsistencies with the annotations with subtyping when the super class is in an external packages since those warnings are not necessarily accurate or actionable.
Reviewed By: ezgicicek
Differential Revision: D9845098
fbshipit-source-id: 1f2bcd739