Summary:
bigmacro_bender
There are 3 ways pulse tracks history. This is at least one too many. So
far, we have:
1. "histories": a humble list of "events" like "assigned here", "returned from call", ...
2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah"
3. "traces", which combine one history with one interproc action
This diff gets rid of interproc actions and makes histories include
"nested" callee histories too. This allows pulse to track and display
how a value got assigned across function calls.
Traces are now more powerful and interleave histories and interproc
actions. This allows pulse to track how a value is fed into an action,
for instance performed in callee, which itself creates some more
(potentially now interprocedural) history before going to the next step
of the action (either another call or the action itself).
This gives much better traces, and some examples are added to showcase
this.
There are a lot of changes when applying summaries to keep track of
histories more accurately than was done before, but also a few
simplifications that give additional evidence that this is the right
concept.
Reviewed By: skcho
Differential Revision: D17908942
fbshipit-source-id: 3b62eaf78
Summary:
- add the variable being declared so we can report it back in the trace in addition to its location
- distinguish between local vars and formals
Reviewed By: skcho
Differential Revision: D17930348
fbshipit-source-id: a5b863e64
Summary:
Instead of a string argument named `~str` pass `Formal | Global` and let
`add_to_errlog` figure out how to print it.
Reviewed By: ezgicicek
Differential Revision: D17907657
fbshipit-source-id: ed09aab72
Summary:
When we make the decision to go into a branch "v = N" where some
abstract value is compared to a constant, remember the corresponding
equality. This allows to prune simple infeasible paths
intra-procedurally.
Further work is needed to make this useful interprocedurally, for
instance either or both of these ideas could be explored:
- abduce v=N in the precondition and do not apply summaries when the
equalities in the pre are not satisfied
- prune post-conditions that lead to unsat states where a value has to
be equal to several different constants
Reviewed By: skcho
Differential Revision: D17906166
fbshipit-source-id: 5cc84abc2
Summary:
When we know "x = 3" and we have a condition "x != 3" we know we can
prune the corresponding path.
Reviewed By: skcho
Differential Revision: D17665472
fbshipit-source-id: 988958ea6
Summary: If we have no pulse summary (most likely caused by pulse finding a legit issue with the code), let's consider the function as impure.
Reviewed By: jvillard
Differential Revision: D17906016
fbshipit-source-id: 671d3e0ba
Summary:
Previously deduplication was always on which is not great for testing.
Also split tests so that we can still test deduplication separately.
Reviewed By: mityal
Differential Revision: D17686877
fbshipit-source-id: 280d91473
Summary:
Unfortunately it is very hard to predict when
`Typ.Procname.describe` will add `()` after the function name, so we
cannot make sure it is always there.
Right now we report clowny stuff like "error while calling `foo()()`",
which this change fixes.
Reviewed By: ezgicicek
Differential Revision: D17665470
fbshipit-source-id: ef290d9c0
Summary:
Introduce a new experimental checker (`--impurity`) that detects
impurity information, tracking which parameters and global variables
of a function are modified. The checker relies on Pulse to detect how
the state changes: it traverses the pre and post pairs starting from
the parameter/global variable and finds where the pre and post heaps
diverge. At diversion points, we expect to see WrittenTo/Invalid attributes
containing a trace of how the address was modified. We use these to
construct the trace of impurity.
This checker is a complement to the purity checker that exists mainly
for Java (and used for cost and loop-hoisting analyses). The aim of
this new experimental checker is to rely on Pulse's precise
memory treatment and come up with a more precise im(purity)
analysis. To distinguish the two checkers, we introduce a new issue
type `IMPURE_FUNCTION` that reports when a function is impure, rather
than when it is pure (as in the purity checker).
TODO:
- improve the analysis to rely on impurity information of external
library calls. Currently, all library calls are assumed to be nops,
hence pure.
- de-entangle Pulse reporting from analysis.
Reviewed By: skcho
Differential Revision: D17051567
fbshipit-source-id: 5e10afb4f
Summary: Use_after_free was used both for biabduction and pulse, and the biabduction version is blacklisted by default. As a result, the Pulse version was also disabled unintentionally. This changes the name of the old use_after_free so that now we can get use_after_free bugs whenever pulse is enabled.
Reviewed By: skcho
Differential Revision: D17182687
fbshipit-source-id: 539ca69de
Summary: With this predicate we are able to check for static global variables in AL.
Reviewed By: ddino
Differential Revision: D17164848
fbshipit-source-id: a3d10598c
Summary:
This diff uses the models of vector for modelling string in Cpp.
Depends on D16963153
Reviewed By: ezgicicek
Differential Revision: D16963166
fbshipit-source-id: 5effe2d72
Summary:
This is more powerful than `"symbols"` for more advanced use-cases. Keep
`"symbols"` unchanged to make migrating easier.
Differential Revision: D16985756
fbshipit-source-id: dfbb09393
Summary: Adding new predicate for checking whether a variable is defined as extern. May be useful in AL rules.
Reviewed By: jvillard
Differential Revision: D16961690
fbshipit-source-id: 0677077dc
Summary:
Use whatever information we can to decide whether to use C or Java
syntax when outputting an access expression, now that we store them as
such.
Also, make cluster callbacks explicitly set the language, as this was not done before and led to some confusion (Clang being set when analysing a Java file).
Reviewed By: skcho
Differential Revision: D16884160
fbshipit-source-id: 40adf9f35
Summary:
Change the logic of the annotation reachability checker in the following
ways:
1. Sanitizers take priority over sinks, i.e. a procedure that is both a
sink and a sanitizer is not a sink. This changes the existing tests
that seemed to assume the opposite. However I think that way is more
useful and goes better with the fact that sanitizers are specified as
"overrides".
2. When applying a summary, check again that we are not in a sanitizer
for the corresponding sink.
Without (2) this there was a subtle bug when several rules were
specified. For example, if `sink_wrapper()` wraps `sink()` for a rule
`R` then the summary of `sink_wrapper()` will be: `R-sink : call to sink()`.
Then, suppose `sanitizer()` calls `sink_wrapper()` and `sanitizer()` is
a sanitizer for `R` but not for another rule `R'`. The previous code
would add the call to `sink()` to the summary of `sanitizer()` because
it's not a sanitizer for `R'`, even though `sink()` is not a sink for
`R'`!
The current code will re-apply the rules correctly so that sinks are
matched only against the right sanitizers.
Reviewed By: skcho
Differential Revision: D16895577
fbshipit-source-id: 266cc4940
Summary:
- run the tests! they weren't hooked up to the main Makefile :/
- add some html debug messages
- formatting
Reviewed By: skcho
Differential Revision: D16895578
fbshipit-source-id: e96d737cc
Summary:
It adds a vector model of `data` method.
Depends on D16687280
Reviewed By: ezgicicek
Differential Revision: D16689400
fbshipit-source-id: 156016b3c
Summary:
It adds a model of vector::push_back
Depends on D16687225
Reviewed By: ezgicicek
Differential Revision: D16687269
fbshipit-source-id: 9d2a73fca
Summary:
It enables pruning of vector's size when the return value of the function call of `vector::size` is pruned.
Depends on D16687167
Reviewed By: ezgicicek
Differential Revision: D16687225
fbshipit-source-id: 793a21b3a
Summary:
It generates vector value ondemand when it is given as a parameter.
Depends on D16645589
Reviewed By: ezgicicek
Differential Revision: D16645624
fbshipit-source-id: 7498c8ab2
Summary:
These have proved to be too fragile to maintain as they would often break
compilation of user code. They have been off by default for more than a year
now (D7350715).
Removing the include models shows a more accurate picture of what infer results
look like in production. As such, lots of tests have changed, mostly
biabduction but also in inferbo. SIOF was using include-based models too but
now libc++ is better and iostreams are implemented in a way that SIOF
understands (instead of being magical creatures) so nothing changed there.
Reviewed By: skcho
Differential Revision: D16602171
fbshipit-source-id: ce38f045b
Summary:
This diff prevents that the latest prune value is overwritten as top
from callees.
Reviewed By: jvillard
Differential Revision: D16540391
fbshipit-source-id: bdd5b42ed
Summary:
This diff improves the precision of the mod operator.
For example, result of x % c (when x>=0 and c>0) is
(before) [0, c-1]
(after) [0, min(c-1,x)]
Reviewed By: ezgicicek
Differential Revision: D16518578
fbshipit-source-id: a68660ee7
Summary:
Pulse didn't treat local variables going out of scope as invalidating the corresponding address in memory. This diff fixes that by
- marking all local variables that exits the scope with the attribute `AddressOfStackVariable`
- before we write the summary for the proc, we make sure to invalidate all such addresses local to the procedure as `Invalid.` If such an address is read, then we would raise a use-after-lifetime issue.
Reviewed By: jvillard
Differential Revision: D16458355
fbshipit-source-id: 3686524cb
Summary:
It downgrades issues of void pointer to L5, because of its impreciseness. This is not
ideal but Inferbo cannot analyze arrays pointed by void pointers precisely at the moment.
Reviewed By: jvillard
Differential Revision: D16379911
fbshipit-source-id: f2c016aba
Summary:
A common gotcha is the new test. Model the minimum amount of
`std::basic_string` to catch it.
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121090
fbshipit-source-id: 66f06cb43
Summary:
Be more flexible in what type of function calls are allowed in `ViaCall ...` actions to be able to include models.
Also get rid of `here here` in traces /o\
As a side-effect, get more precise (=qualified) procedure names in
traces (but not in messages so as not to be too verbose).
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121092
fbshipit-source-id: fb51b02f8
Summary:
Replaced by pulse. `--ownership` is now a deprecated form of `--pulse`.
The ownership checker is starting to give wrong answers due to changes in the
clang frontend, so it's better to remove it in favour of pulse.
there_goes_my_hero
Reviewed By: ngorogiannis
Differential Revision: D16107650
fbshipit-source-id: bb2446a19
Summary:
So it turns out we need to translate even more cases. Pulse had a FP
before that this fixes.
Reviewed By: ezgicicek
Differential Revision: D16073629
fbshipit-source-id: c03460b5a
Summary:
This is needed to test some functionality in the next diff. Only one
test changes (no longer a FN), which is now documented. Also, stop
including the "header models" meant for biabduction!
Maybe one day we'll need to have several test modes for different C++
versions. Seems overkill for now, so let's wait until we see some actual
issues (eg FPs) that manifest in one version but not the other.
Reviewed By: mbouaziz
Differential Revision: D16073630
fbshipit-source-id: 1cfdfc933
Summary:
Sometimes the post of a function call has attributes on addresses that
were mentioned in the pre but are no longer reachable in the post. We
don't want to forget these, see added test.
Reviewed By: mbouaziz
Differential Revision: D16050050
fbshipit-source-id: 1ce522b97
Summary:
The previous code would call the destructor for the C++ temporary
*before* the prune nodes, which then try to dereference it. Wrong.
Quick fix: don't destroy temporaries in conditionals.
Reviewed By: mbouaziz
Differential Revision: D16030735
fbshipit-source-id: e11abad58
Summary:
We were skipping some instructions before and that was a problem for
pulse. See added pulse test.
Reviewed By: mbouaziz
Differential Revision: D16030150
fbshipit-source-id: 9c62e6213
Summary: Not sure if anyone uses this but there, now it's modelled.
Reviewed By: mbouaziz
Differential Revision: D16008162
fbshipit-source-id: f4795dcba
Summary:
Prevent false positives about variables captured by value gone out of
scope.
Reviewed By: ezgicicek
Differential Revision: D16008165
fbshipit-source-id: d70e47db4
Summary: We know how to do interprocedural calls so let's use that!
Reviewed By: mbouaziz
Differential Revision: D16008164
fbshipit-source-id: 4c34bf704
Summary:
`function::operator=` is called whenever we assign a literal lambda to a
variable, so it's pretty useful to be able to report anything on
lambdas.
Reviewed By: mbouaziz
Differential Revision: D16008163
fbshipit-source-id: a9d07668d
Summary:
Printing `Exp.Const (Cfun proc_name)` adds `_fun_` in front of the
procedure name, eg `_fun_foo` instead of `foo`. This showed up in pulse
traces.
Reviewed By: mbouaziz
Differential Revision: D16004606
fbshipit-source-id: 72ac6866f
Summary:
Fixes a false positive where the address of a C++ temporary is bound to
a static const reference variable then returned. The fix doesn't try to
establish that the variable is a const reference so could lead to false
negatives but that can be addressed later.
Reviewed By: ezgicicek
Differential Revision: D16004538
fbshipit-source-id: e403dbefe
Summary:
[apologies for the unreviewable diff...]
Get rid of HIL expressions in pulse. This finishes the HIL -> SIL
migration. The first step made pulse start from SIL instructions but
would translate most accesses to HIL to re-use most of the existing
pulse code. This diff gets rid of the intermediate translation of SIL
expressions to HIL expressions.
Big changes:
1. `PulseOperations` mostly rewritten, driven by using `Exp.t` instead of `HilExp.AccessExpression.t` for everything.
2. Stop trying to reverse-engineer what addresses mean in terms of
access paths from program variables. Rely on the trace pointing at
the right places in the code to be enough. This is because it wasn't
that useful (and could even be misleading when wrong) but could be
prohibitively expensive in degenerate cases (eg nodes with tens of
thousands of successive array accesses...)
3. `PulseAbductiveDomain.apply_post` now returns the computed return
value instead of recording it itself.
4. Change of vocabulary: `materialize` -> `eval`, `crumb` -> `event`
5. Function calls arguments are now evaluated prior to doing anything
else, which saves everything else from having to (remember to) do
that. In particular, this changes how models look quite a bit.
Reviewed By: mbouaziz
Differential Revision: D15986373
fbshipit-source-id: 1d79935de