Summary: Pulse evaluated (e1+e2) as a new value that is a sum of evaluation results of e1 and e2. However, when e2 is known as zero, we can have a simpler evaluation result without introducing the new value. This diff returns the evaluation result of e1 when e2 is known as zero.
Reviewed By: ngorogiannis
Differential Revision: D29736444
fbshipit-source-id: d5ce5a60e
Summary: Useful to write other tests, and also probably worth modelling.
Reviewed By: ezgicicek
Differential Revision: D29232545
fbshipit-source-id: ecb24f6f7
Summary:
Investigations into the number of disjuncts we get in summaries
revealed that we sometimes blow past the limit. Rework the code to make
sure that does not happen.
We can probably raise the disjunct limit as a result, more experiments
needed.
Reviewed By: ezgicicek
Differential Revision: D29229249
fbshipit-source-id: 4a06594fa
Summary:
Unknown functions may create false positives as well as false negatives
for Pulse. Let's consider that unknown functions behave "functionally",
or at least that a functional behaviour is a possible behaviour for
them: when called with the same parameter values, they should return the
same value.
This is implemented purely in the arithmetic domain by recording
`v_return = f_unknown(v1, v2, ..., vN)` for each call to unknown
functions `f_unknown` with values `v1`, `v2`, ..., `vN` (and return
`v_return`). The hope is that this will create more false negatives than
false positives, as several FPs have been observed on real code that
would be suppressed with this heuristic.
The other effect this has on reports is to record hypotheses made on the
return values of unknown functions into the "pruned" part of formulas,
which inhibits reporting on paths whose feasibility depends on the
return value of unknown functions (by making these issues latent
instead). This should allow us to control the amount of FPs until we
model more functions.
Reviewed By: skcho
Differential Revision: D27798275
fbshipit-source-id: d31cfb8b6
Summary:
Add a new `PathContext.t` component to the abstract state. For now it
tracks only the current "timestamp" of symbolic execution inside the
procedure, i.e. which step of symbolic execution we are in (bumped by 1
each time we've executed one instruction). In the future this will also
hold, eg, which conditionals we've been through on the path (for
reporting traces with that information).
Most of the diff is about propagating the path context through many of
the APIs.
We use timestamps only in `MustBeValid` attributes to report the first
incorrect access in a function call for now.
Reviewed By: skcho
Differential Revision: D28674726
fbshipit-source-id: 2cd825e73
Summary:
It's better to remember the first reason why an address must be valid,
etc.
Reviewed By: skcho
Differential Revision: D28674729
fbshipit-source-id: 3b69de7ef
Summary:
Spoiler alert: we don't. The next diffs fix that.
When there are several invalid accesses to report at a function call
instruction, we want to report the first one to occur within the
function. This is to avoid confusing reports where pulse reports, eg, a
null dereference for a pointer at a point where it's already been
dereferenced before in the same function.
Reviewed By: skcho
Differential Revision: D28674730
fbshipit-source-id: acb029e4b
Summary:
This is needed for the next diff. It was a bit annoying to report leaks
in two different places, now it's just in one.
Reviewed By: skcho
Differential Revision: D28576768
fbshipit-source-id: 4f23b43cb
Summary:
Add an option for realloc and fiddle with the other options' help for
consistency.
Moved the memory leak test to memory_leak.c and added more.
Moved the place where we take the options into account closer to their
corresponding models to defend a bit against modifying one without
modifying the other.
Reviewed By: da319
Differential Revision: D28543340
fbshipit-source-id: 75894d06d
Summary:
Let's model all the dynamic memory management functions as they all work
together and are important for a lot of C projects.
Reviewed By: ezgicicek
Differential Revision: D28543008
fbshipit-source-id: f130e1ab6
Summary:
This fixes a memory leak false positive. When collecting unreachable
values we should be careful to take the equality relation into account.
Equal values are normally canonicalised but only with respect to "known"
equalities. This makes sure variables that are live thanks to the
"pruned" equalities are not discarded from the state.
Reviewed By: skcho
Differential Revision: D28382642
fbshipit-source-id: 2b898d754
Summary:
This makes reports more readable: they were all at the end of functions,
currently.
This is actually quite tricky to do as it involves detecting which
locations are unreachable.
Some of this logic can/should probably be shared with
`AbductiveDomain.discard_unreachable` but at the moment that's not the
case.
Reviewed By: skcho
Differential Revision: D28382590
fbshipit-source-id: bd4239a0c
Summary:
Most/all of the time we expect the history of the value to faithfully
trace how it got allocated. That history was then added as a prefix of
the trace leading to the same place, leading to duplicate information in
the report trace.
We may need to do the same for other bug types.
Reviewed By: ezgicicek
Differential Revision: D28536891
fbshipit-source-id: a83a2d038
Summary: Showcase the trace duplication, fixed in a further diff.
Reviewed By: ezgicicek
Differential Revision: D28536889
fbshipit-source-id: f23636368
Summary:
Turns out the mistake was pretty simple: we just forgot to keep the
history of the return value in the callee and add it to the caller's.
Reviewed By: skcho
Differential Revision: D28385941
fbshipit-source-id: 40fe09c99
Summary:
- Changed "passed as argument to f" to "in call to f", as these do not
always correspond to passing an argument (eg could be a value returned
from f)
- Changed "assigned" to "returned" when appropriate
- Changed the model of malloc() to not say "allocated" in the null case
- Don't print "returned from f" when there was no event inside f: just
print "in call to f".
Reviewed By: da319
Differential Revision: D28413900
fbshipit-source-id: bc85625e3
Summary:
The order was reversed when printing the trace, leading to confusion.
Also make sure we indicate which part of the trace we are printing when
there is more than one part (either context + access or invalidation +
access, or all three).
Also start nesting at <calling context length> to better represent the
role of the calling context visually.
Reviewed By: da319
Differential Revision: D28329263
fbshipit-source-id: b691fb1f4
Summary:
There's been regressions in --pulse-isl. Without tests, everything is
temporary!
Note: the regressions are presumably still there, this just records the
current status of pulse.isl.
Also, no objective-C(++) at the moment. Should we add them too? (in
another diff)
Reviewed By: skcho
Differential Revision: D28256703
fbshipit-source-id: 700b2cc57
Summary:
There's already all the ingredients to treat function pointers pretty
well, even when stored inside (const) globals.
In OpenSSL they use something like the added tests but the globals are
not const... This may need tweaking via an option, eg to inline all
global initializers, or filtered by global names/file names. Or just
use the existing --pulse-model-{alloc,release}-pattern options.
Reviewed By: skcho
Differential Revision: D28221651
fbshipit-source-id: 5399f1141
Summary:
When garbage-collecting addresses we would also remove their attributes.
But even though the addresses are no longer allocated in the heap, they
might show up in the formula and so we need to remember facts about
them.
This forces us to detect leaks closer to the point where addresses are
deleted from the heap, in AbductiveDomain.ml. This is a nice refactoring
in itself: doing so fixes some other FNs where we sometimes missed leak
detection on dead addresses.
This also makes it unecessary to simplify InstanceOf eagerly when
variables get out of scope.
Some new {folly,std}::optionals false positives that either are similar to existing ones or involve unmodelled smart pointers.
Reviewed By: da319
Differential Revision: D28126103
fbshipit-source-id: e3a903282
Summary:
Building on the infra in the previous commits, "fix" all the call sites
that introduce invalidations to make sure they also update the
corresponding histories. This is only possible to do when the access
leading to the invalidation can be recorded. Right now the only place
that's untraceable is the model of `free`/`delete`, because it happens
to be the only place where we invalidate an address without knowing
where it comes from (`free(v)`: what was v's access path? we could track
this in the future).
Reviewed By: skcho
Differential Revision: D28118764
fbshipit-source-id: de67f449e
Summary:
Implements the translation of most clang atomic builtins to SIL, including those used in `stdatomic.h`. It does not attempt to model the atomicity of the operations, since I don't know of any way Infer can represent that. I didn't bother implementing the rarely used min/max builtins, so they're left as `BuiltinDecl` calls.
This is my first major OCaml project, so any feedback is appreciated!
Also, CONTRIBUTING.md says to update the [facebook-clang-plugins](https://github.com/facebook/facebook-clang-plugins) submodule, but it doesn't seem to be a submodule anymore, and the code has diverged from that repo. Should I still make a PR over there?
Pull Request resolved: https://github.com/facebook/infer/pull/1434
Reviewed By: skcho
Differential Revision: D28118300
Pulled By: jvillard
fbshipit-source-id: 121c4ad25
Summary:
In main() all latent issues are manifest issues as the only parameters
are user-controlled.
Reviewed By: skcho
Differential Revision: D28121535
fbshipit-source-id: eab54d5bc
Summary:
As explained in the previous diff: when the access trace goes through
the invalidation step there is no need to print the invalidation trace
at all.
Note: only a few sources of invalidation are handled at the moment. The
following diffs gradually fix the other sources of invalidation.
Reviewed By: skcho
Differential Revision: D28098335
fbshipit-source-id: 5a5e6481e
Summary:
The eventual goal is to stop having separate sections of the trace
("invalidation part" + "access part") when the "access part" already
goes through the invalidation step. For this, it needs to record when a
value is made invalid along the path.
This is also important for assignements to NULL/0/nullptr/nil: right now
the way we record that 0 is not a valid address is via an attribute
attached to the abstract value that corresponds to 0. This makes traces
inconsistent sometimes: 0 can appear in many places in the same function
and we won't necessarily pick the correct one. In other words, attaching
traces to *values* is fragile, as the same value can be produced in many
ways. On the other hand, histories are stored at the point of access, eg
x->f, so have a much better chance of being correct. See added test:
right now its traces is completely wrong and makes the 0 in `if
(utf16StringLen == 0)` the source of the NULL value instead of the
return of `malloc()`!
This diff makes the traces slightly more verbose for now but this is
fixed in a following diff as the traces that got longer are those that
don't actually need an "invalidation" trace.
Reviewed By: skcho
Differential Revision: D28098337
fbshipit-source-id: e17929259
Summary:
See added test: pulse sometimes insisted that an issue was latent even
though the condition that made it latent could not be influenced (hence
could the issue could never become manifest) by callers because it was
unrelated to the pre, i.e. it came from a mutation inside the function.
In these cases, we want to report the issue straight away instead of
keeping it latent.
Reviewed By: skcho
Differential Revision: D28002725
fbshipit-source-id: ce9e6f190
Summary:
Before returning a summary, restore formals to their initial values.
This gets rid of a false latent because the value in the path condition
is now garbage-collected.
Added a test for the tricky case of structs passed as values.
Reviewed By: skcho
Differential Revision: D28001229
fbshipit-source-id: 23dda5b43
Summary:
This diff removes additional inferbo options `--bufferoverrun` from cost tests, since printing
inferbo issues is not that useful to understand cost results.
Reviewed By: ngorogiannis
Differential Revision: D27592496
fbshipit-source-id: 6ab3e6528
Summary:
Whenever an equality "t = v" (t an arbitrary term, v a variable) is
added (or "v = t"), remember the "t -> v" mapping after canonicalising t
and v. Use this to detect when two variables are equal to the same term:
`t = v` and `t = v'` now yields `v = v'` to be added to the equality
relation of variables. This increases the precision of the arithmetic
engine.
Interestingly, the impact on most code I've tried is:
1. mostly same perfs as before, if a bit slower (could be within noise)
2. slightly more (latent) bugs reported in absolute numbers
I would have expected it to be more expensive and yield fewer bugs (as
fewer false positives), but there could be second-order effects at play
here where we get more coverage. We definitely get more latent issues
due to dereferencing pointers after testing nullness, as can be seen in
the unit tests as well, which may alone explain (2).
There's some complexity when adding term equalities where the term
is linear, as we also need to add it to `linear_eqs` but `term_eqs` and
`linear_eqs` are interested in slightly different normal forms.
Reviewed By: skcho
Differential Revision: D27331336
fbshipit-source-id: 7314e127a
Summary:
Adapting error messages in Pulse so that they become more intuitive for
developers.
Reviewed By: jvillard
Differential Revision: D26887140
fbshipit-source-id: 896970ba2
Summary: This diff finds a declared variable name or declared field names from trace, then constructs an error message including access paths.
Reviewed By: jvillard
Differential Revision: D26544275
fbshipit-source-id: 135c90a1b
Summary:
`add_edge_on_src` is to prepare a stack location for a local variable. Before this diff, it was
called several times for each fields.
Reviewed By: jvillard
Differential Revision: D26543715
fbshipit-source-id: 49ebf2b65
Summary:
Before this diff:
```
// Summary of const global
// { global -> v }
n$0 =* global
// n$0 -> {global}
x *= n$0
// x -> {global}
```
However, this is incorrect because we expect `x` have `v` instead of the abstract location of `global`.
To fix the issue, this diff lookups the initializer summary when `global` is evaluated as RHS of load statement.
After this diff:
```
// Summary of const global
// { global -> v }
n$0 =* global
// n$0 -> v
x *= n$0
// x -> v
```
Reviewed By: ezgicicek
Differential Revision: D26369645
fbshipit-source-id: 98b1ed085
Summary:
In practice, it is not easy to mark all of NOT initialized elements of array, so let's ignore the
array value at the moment.
Reviewed By: jvillard
Differential Revision: D25372449
fbshipit-source-id: 02b2e217c
Summary:
Having different behaviours inter-procedurally and intra-procedurally
sounds like a bad design in retrospect. The model of free() should not
depend on whether we currently know the value is not null as that means
some specs are missing from the summary.
Reviewed By: skcho
Differential Revision: D26019712
fbshipit-source-id: 1ac4316a5
Summary:
The test compiled with warnings, not sure how to prevent this in the
future as `infer` will suppress all warnings anyway (I wanted to add
`-Werror` to the test Makefile but that was defeated by infer itself).
Reviewed By: ezgicicek
Differential Revision: D26019682
fbshipit-source-id: d7f8fc2d8
Summary:
When there was an assignment of C struct, `x = y;`, it was translated to the statements of load and store.
```
n$0 = *y
*x = n$0
```
However, this is incorrect in Sil, because a struct is not a value that can be assigned to registers. This diff fixes the translation as assignments of each field values :
```
n$0 = *y.field1
*x.field1 = n$0
n$0 = *y.field2
*x.field2 = n$0
...
```
It copies field values of C structs on:
* assign statement
* return statement
* declarations.
It supports nested structs.
Reviewed By: jvillard
Differential Revision: D25952894
fbshipit-source-id: 355f8db9c
Summary:
This diff fixes a bug in the translation of an empty for-loop. When both initialization and
incrementation statements did not introduce a new node, the frontend generated an incorrect results
where the for-loop was unreachable from the entry node.
Fixes https://github.com/facebook/infer/issues/1374
Reviewed By: jvillard
Differential Revision: D25912142
fbshipit-source-id: 15b65cb84
Summary:
When the body of the loop doesn't created a node then they don't get
wired correctly to the rest of the loop and end up dangling. Force node
creation to fix that.
Fixes https://github.com/facebook/infer/issues/1373
Reviewed By: ezgicicek
Differential Revision: D25804185
fbshipit-source-id: 85108bdd9