Summary: This diff prints cost and config impact checkers' json reports only for the changed files.
Reviewed By: ezgicicek
Differential Revision: D28707192
fbshipit-source-id: d949771f2
Summary: Litho (https://fblitho.com/) does some operations in the background. Add RacerD messaging specific to Litho.
Reviewed By: ezgicicek
Differential Revision: D28675504
fbshipit-source-id: e76f9f538
Summary: Same as MustBeValid: we want to report the first error on the path.
Reviewed By: skcho
Differential Revision: D28674724
fbshipit-source-id: a2ac04b5b
Summary:
Add a new `PathContext.t` component to the abstract state. For now it
tracks only the current "timestamp" of symbolic execution inside the
procedure, i.e. which step of symbolic execution we are in (bumped by 1
each time we've executed one instruction). In the future this will also
hold, eg, which conditionals we've been through on the path (for
reporting traces with that information).
Most of the diff is about propagating the path context through many of
the APIs.
We use timestamps only in `MustBeValid` attributes to report the first
incorrect access in a function call for now.
Reviewed By: skcho
Differential Revision: D28674726
fbshipit-source-id: 2cd825e73
Summary:
It's better to remember the first reason why an address must be valid,
etc.
Reviewed By: skcho
Differential Revision: D28674729
fbshipit-source-id: 3b69de7ef
Summary:
Spoiler alert: we don't. The next diffs fix that.
When there are several invalid accesses to report at a function call
instruction, we want to report the first one to occur within the
function. This is to avoid confusing reports where pulse reports, eg, a
null dereference for a pointer at a point where it's already been
dereferenced before in the same function.
Reviewed By: skcho
Differential Revision: D28674730
fbshipit-source-id: acb029e4b
Summary:
That was just broken before, but apparently nothing cared. It's needed
for the next diffs.
Reviewed By: skcho
Differential Revision: D28674731
fbshipit-source-id: 2f080238b
Summary:
Each Erlang function now has a Procdesc in `results.db`. The
ProcAttributes record if a function is exported or not by using the
access Public or Private, respectively.
This adds also `ErlangTypeName`. We use a fixed set of "type names" for
the different types of values in Erlang (i.e., for Erlang's "dynamic types").
Reviewed By: jvillard
Differential Revision: D28385954
fbshipit-source-id: f8278505a
Summary:
Update the website to include nil related issues for objective-c
Also fixed a typo + syntax highlighting
Reviewed By: jvillard
Differential Revision: D28638621
fbshipit-source-id: 148f2dd3f
Summary:
This is needed for the next diff. It was a bit annoying to report leaks
in two different places, now it's just in one.
Reviewed By: skcho
Differential Revision: D28576768
fbshipit-source-id: 4f23b43cb
Summary:
Add an option for realloc and fiddle with the other options' help for
consistency.
Moved the memory leak test to memory_leak.c and added more.
Moved the place where we take the options into account closer to their
corresponding models to defend a bit against modifying one without
modifying the other.
Reviewed By: da319
Differential Revision: D28543340
fbshipit-source-id: 75894d06d
Summary:
Let's model all the dynamic memory management functions as they all work
together and are important for a lot of C projects.
Reviewed By: ezgicicek
Differential Revision: D28543008
fbshipit-source-id: f130e1ab6
Summary: We even have matchers in PulseModels that can do the same thing.
Reviewed By: skcho
Differential Revision: D28540278
fbshipit-source-id: 4bfd8a13e
Summary:
It's unclear whether this can happen but it doesn't cost much to do a
last check before reporting an error to the user.
Reviewed By: skcho
Differential Revision: D28382670
fbshipit-source-id: e23f07ebd
Summary:
This fixes a memory leak false positive. When collecting unreachable
values we should be careful to take the equality relation into account.
Equal values are normally canonicalised but only with respect to "known"
equalities. This makes sure variables that are live thanks to the
"pruned" equalities are not discarded from the state.
Reviewed By: skcho
Differential Revision: D28382642
fbshipit-source-id: 2b898d754
Summary:
This makes reports more readable: they were all at the end of functions,
currently.
This is actually quite tricky to do as it involves detecting which
locations are unreachable.
Some of this logic can/should probably be shared with
`AbductiveDomain.discard_unreachable` but at the moment that's not the
case.
Reviewed By: skcho
Differential Revision: D28382590
fbshipit-source-id: bd4239a0c
Summary:
Hi all,
This is just a small fix tries to resolve the leaked type lost issue. It was excluded from previous CIL race condition PR due to irrelevance.
Thanks!
Pull Request resolved: https://github.com/facebook/infer/pull/1446
Reviewed By: skcho
Differential Revision: D28566755
Pulled By: ngorogiannis
fbshipit-source-id: 1c9938d9c
Summary: For Remodel-generated class (https://github.com/facebook/remodel), their properties are stored/loaded at internal fields named "_<property name>". This diff prepends "_" to property names when writing field info to the type environment when the field is of Remodel-generated class.
Reviewed By: ezgicicek
Differential Revision: D28541495
fbshipit-source-id: d0a1e5a4f
Summary: An anonymous class name includes an index number, for example `AnonymousClass$2` represents that it is the 2nd anonymous class implemented in the `AnonymousClass` class. Problem is that when we insert a new anonymous class, all index of anonymous class names below increase by one, which introduces incorrect comparison on reportdiff.
Reviewed By: ezgicicek
Differential Revision: D28568753
fbshipit-source-id: 2a6c576eb
Summary: Objective-C dispatch methods are not specialized, but have a special case during symbolic execution in biabduction. Reuse the same approach for Pulse: retrieve the given block name and its arguments and call it.
Reviewed By: skcho
Differential Revision: D28550468
fbshipit-source-id: 5017bb71e
Summary: Move Objective-C dispatch models to IR to be able to reuse the same approach in Pulse.
Reviewed By: skcho
Differential Revision: D28550389
fbshipit-source-id: 163826647
Summary:
This diff adds fields for ObjC properties to the type environment. For example,
```
property type fieldname;
```
when the property name is "fieldname", this diff adds a struct field of the same name.
The missing type information were problematic in inferbo, since its semantics depend on types.
Reviewed By: ezgicicek
Differential Revision: D28421998
fbshipit-source-id: e24059846
Summary:
Most/all of the time we expect the history of the value to faithfully
trace how it got allocated. That history was then added as a prefix of
the trace leading to the same place, leading to duplicate information in
the report trace.
We may need to do the same for other bug types.
Reviewed By: ezgicicek
Differential Revision: D28536891
fbshipit-source-id: a83a2d038
Summary: Showcase the trace duplication, fixed in a further diff.
Reviewed By: ezgicicek
Differential Revision: D28536889
fbshipit-source-id: f23636368
Summary:
Make it more obvious why we don't add an Allocated attribute in these
models.
Reviewed By: ezgicicek
Differential Revision: D28536892
fbshipit-source-id: 643539ae6
Summary:
More straightforward (and better asymptotic complexity, not that it
matters) that way. Also log when a leak is found in the debug html.
Reviewed By: ezgicicek
Differential Revision: D28536443
fbshipit-source-id: 08c329100
Summary:
The returned options were never used or only used in cases when they can
only be `None`, as far as I can tell.
Reviewed By: da319
Differential Revision: D28536428
fbshipit-source-id: c16ed4698
Summary:
As explained in the code comment, these reports are generally
non-actionable at best and false positives at worst:
skip reporting for constant dereference (eg null dereference) if the source of the null value is
not on the path of the access, otherwise the report will probably be too confusing: the actual
source of the null value can be obscured as any value equal to 0 (or the constant) can be
selected as the candidate for the trace, even if it has nothing to do with the error besides
being equal to the value being dereferenced
Reviewed By: da319
Differential Revision: D28350193
fbshipit-source-id: 0cd76d252
Summary:
Turns out the mistake was pretty simple: we just forgot to keep the
history of the return value in the callee and add it to the caller's.
Reviewed By: skcho
Differential Revision: D28385941
fbshipit-source-id: 40fe09c99
Summary:
This PR adds race condition detection support on CIL backed languages, such as .NET platform languages.
We will add unit tests later since we are still fine tunning Infer# translation module.
Pull Request resolved: https://github.com/facebook/infer/pull/1443
Reviewed By: jvillard
Differential Revision: D28505195
Pulled By: ngorogiannis
fbshipit-source-id: f263f6ba6
Summary:
This diff fixes inefficient config impact data checking.
Problem: When writing `config-impact-report.json`, it checks if a procedure (`f`) is included in the config impact data set as follows. `cut_parameter` is a function that removes parameters from ObjC method names.
```
ConfigProcnameSet.exists (fun g -> cut_parameter f = cut_parameter g) config_data
```
However, this was very inefficient because it must have iterated all members in the set always. This diff changes it to call `Set.mem` by preparing revised config impact data set (`config_data'`) in which parameters were cut in advance:
```
ConfigProcnameSet.mem (cut_parameter f) config_data'
```
Reviewed By: ezgicicek
Differential Revision: D28506113
fbshipit-source-id: 434d1f083
Summary: Similar as for NSDictionary, nil issues for array literals are caught because of the additional load instruction in the frontend, and we leave modelling arrayWithObjects:count: for later.
Reviewed By: jvillard
Differential Revision: D28442767
fbshipit-source-id: a2f0d4dbf
Summary:
Follow similar approach as in the translation of dictionary literal to insert load instruction to catch nil insertion into collection issues. The missing load instruction was causing false negatives in biabduction. This will also help Pulse to catch nil insertion into collection issues for array literals.
Facebook
Reviewed By: skcho
Differential Revision: D28442642
fbshipit-source-id: b530ac21b
Summary: The counter that accumulates the number of modified source files was logged before it is computed, leading to always zero results.
Reviewed By: jvillard
Differential Revision: D28505378
fbshipit-source-id: 833fb6072
Summary: Similar as for other collections we leave modelling setWithObjects:count: and initWithObjects:count for later.
Reviewed By: skcho
Differential Revision: D28473361
fbshipit-source-id: 4bf57035a
Summary:
In Buck/Java the global type environments of each buck target captured need to be merged. So do the capture DBs. These two tasks can be done concurrently, as both have a computation and an I/O component, and interleaving them should improve perf.
Indeed, profiling the merge process with `offcputime.py` and `cpudist.py` (BPF tools) showed a significant amount of off-cpu time in tests (>40%) as well as a distribution of timings for off-cpu intervals that agrees with IO on a fast medium (ssd).
This diff forks a process to merge the type environments while doing the DB merge as normal. Initial results show an almost 2x improvement.
Reviewed By: skcho
Differential Revision: D28438808
fbshipit-source-id: 89c96f25b
Summary: This diff comments out a test that introduces non-deterministic analysis result.
Reviewed By: rgrig
Differential Revision: D28440794
fbshipit-source-id: 95e6fbe06
Summary:
Collect imports and exports in a data structure ("names environment")
that is easy to look up.
Background:
A function call f(a1,...,an) is shorthand for m:f(a1,...,an) if there is
a -import(m, [..., f/n, ...]); otherwise it is shorthand for c:f(a1,...,an)
where c is the current module. There is an implicit import of the
special "erlang" module. Any ambiguity (e.g., imported twice, or
imported and local) is an error. Also, if there is a -export([...,
f/n,...]) then f/n should be marked as public (ProcAttributes)
Reviewed By: jvillard
Differential Revision: D28290252
fbshipit-source-id: f6d777eb6
Summary: `dictionaryWithObjectsForKeysCount` is a bit more complicated as we need to know if an element of an array is nil. Leaving it for later.
Reviewed By: skcho
Differential Revision: D28413859
fbshipit-source-id: 7b5116de8
Summary:
- Changed "passed as argument to f" to "in call to f", as these do not
always correspond to passing an argument (eg could be a value returned
from f)
- Changed "assigned" to "returned" when appropriate
- Changed the model of malloc() to not say "allocated" in the null case
- Don't print "returned from f" when there was no event inside f: just
print "in call to f".
Reviewed By: da319
Differential Revision: D28413900
fbshipit-source-id: bc85625e3
Summary: This diff copies each field values inside setter/getter of ObjC++.
Reviewed By: da319
Differential Revision: D28413584
fbshipit-source-id: 4c663fc9e
Summary: There is no need to model anything, Pulse is able to catch nil insertion into NSDictionary literals because the frontend dereferences keys and values during the translation of NSDictionary literals
Reviewed By: jvillard
Differential Revision: D28383176
fbshipit-source-id: 01a064daf
Summary: Current traces are difficult to read since they keep mentioning the same leaf call at each step. This diff improves the traces by tracking the intermediate callers.
Reviewed By: skcho
Differential Revision: D28384762
fbshipit-source-id: 78c4cbf7f
Summary:
The order was reversed when printing the trace, leading to confusion.
Also make sure we indicate which part of the trace we are printing when
there is more than one part (either context + access or invalidation +
access, or all three).
Also start nesting at <calling context length> to better represent the
role of the calling context visually.
Reviewed By: da319
Differential Revision: D28329263
fbshipit-source-id: b691fb1f4
Summary:
This diff addresses `GenericArrayBackedCollection.field` and others as pointers. The modeled fields are used as non-pointer struct fields, but their actual semantics are pointers that may have side effects.
For example, `GenericArrayBackedCollection.field` is used for keeping an information that the previous vector's address could be invalid.
```
void foo(vector v) {
v.push_back(0); // v's previous address may be invalid after push_back
// PRE: {v -> {backing_array -> v1}}
// POST: {v -> {backing_array -> v2}}
// ATTR: {v1 may be invalidated}
}
```
However, if we revert the modeled field values, it will return incorrect summary as follows, by reverting non-pointer parameter values.
```
// PRE: {v -> {backing_array -> v1}}
// POST: {v -> {backing_array -> v1}}
// ATTR: {v1 may be invalidated}
```
Reviewed By: jvillard
Differential Revision: D28324161
fbshipit-source-id: 96451d4b0
Summary:
`mutableDictionary[key] = value`, crashes if key is nil, however, if value is nil, any object corresponding to a key will be removed from the dictionary.
Under the hood, `NSMutableDictionary.setObject:forKeyedSubscript:` is called by `mutableDictionary[key] = value`.
Reviewed By: ezgicicek
Differential Revision: D28288789
fbshipit-source-id: e4e1c4288
Summary:
Rebar3.capture now calls into ErlangTranslator to obtain Sil. For now,
ErlangTranslator does nothing interesting.
Reviewed By: skcho
Differential Revision: D28261799
fbshipit-source-id: 0603db671
Summary:
This diff compares ObjC method names loosely when checking whether it is in the config impact data
file or not. This is to cover the cases where method parameters changed.
Reviewed By: jvillard
Differential Revision: D28259169
fbshipit-source-id: e6070df9c
Summary:
The wrapper in `infer/lib/erlang/erlang.sh` dumps Erlang AST forms [1]
in a JSON format. The current commit parses that JSON to obtain an
internal representation (ErlangAst). The main parts of the commit are:
- data structures for Erlang AST
- parser (Erlang abstract forms in JSON format -> Eralng AST)
- Rebar3.ml now drives the parser
[1] https://erlang.org/doc/apps/erts/absform.html
Reviewed By: mmarescotti, jvillard
Differential Revision: D28096896
fbshipit-source-id: b21263817
Summary:
There's been regressions in --pulse-isl. Without tests, everything is
temporary!
Note: the regressions are presumably still there, this just records the
current status of pulse.isl.
Also, no objective-C(++) at the moment. Should we add them too? (in
another diff)
Reviewed By: skcho
Differential Revision: D28256703
fbshipit-source-id: 700b2cc57
Summary:
Added a simple Erlang project to be used as a test for Rebar3
integration, in the following commits. Also, updated the copyright
linter to understand Erlang.
Reviewed By: ngorogiannis, mmarescotti
Differential Revision: D28096899
fbshipit-source-id: 94f15c277
Summary:
A previous change made pulse look into value histories for causes of
invalidation in case the access trace of a value already contained the
reason why that value is invalid, in order to save printing the
invalidation trace in addition to the access trace. It also made
reporting more accurate for null dereference as the source of null was
often better identified (in cases where several values are null or
zero).
But, the history is also relevant to the bug type and the error message.
Make these take histories into account too.
Also fix a bug where we didn't look inside the sub-histories contained
within function calls when looking for an invalidation along the
history.
Reviewed By: da319
Differential Revision: D28254334
fbshipit-source-id: 5ca00ee54
Summary:
There's already all the ingredients to treat function pointers pretty
well, even when stored inside (const) globals.
In OpenSSL they use something like the added tests but the globals are
not const... This may need tweaking via an option, eg to inline all
global initializers, or filtered by global names/file names. Or just
use the existing --pulse-model-{alloc,release}-pattern options.
Reviewed By: skcho
Differential Revision: D28221651
fbshipit-source-id: 5399f1141
Summary:
When garbage-collecting addresses we would also remove their attributes.
But even though the addresses are no longer allocated in the heap, they
might show up in the formula and so we need to remember facts about
them.
This forces us to detect leaks closer to the point where addresses are
deleted from the heap, in AbductiveDomain.ml. This is a nice refactoring
in itself: doing so fixes some other FNs where we sometimes missed leak
detection on dead addresses.
This also makes it unecessary to simplify InstanceOf eagerly when
variables get out of scope.
Some new {folly,std}::optionals false positives that either are similar to existing ones or involve unmodelled smart pointers.
Reviewed By: da319
Differential Revision: D28126103
fbshipit-source-id: e3a903282
Summary: This is a warning not a critical issue/error. Let's downgrade it to Warning to reflect that.
Reviewed By: jvillard
Differential Revision: D28220415
fbshipit-source-id: b2d8f040c
Summary:
Building on the infra in the previous commits, "fix" all the call sites
that introduce invalidations to make sure they also update the
corresponding histories. This is only possible to do when the access
leading to the invalidation can be recorded. Right now the only place
that's untraceable is the model of `free`/`delete`, because it happens
to be the only place where we invalidate an address without knowing
where it comes from (`free(v)`: what was v's access path? we could track
this in the future).
Reviewed By: skcho
Differential Revision: D28118764
fbshipit-source-id: de67f449e
Summary: Not tracking values for global constants might cause nullptr_dereference false positives. In particular, if the code has multiple checks and uses a global constant by its name in one check and its value in another check (see added test case), we are not able prune infeasible paths. This diff addresses such false positives by inlining initializers of global constants when they are being used. An assumption is that most the time the initialization of global constants would not have side effects.
Reviewed By: jvillard
Differential Revision: D25994898
fbshipit-source-id: 26360c4de
Summary:
Implements the translation of most clang atomic builtins to SIL, including those used in `stdatomic.h`. It does not attempt to model the atomicity of the operations, since I don't know of any way Infer can represent that. I didn't bother implementing the rarely used min/max builtins, so they're left as `BuiltinDecl` calls.
This is my first major OCaml project, so any feedback is appreciated!
Also, CONTRIBUTING.md says to update the [facebook-clang-plugins](https://github.com/facebook/facebook-clang-plugins) submodule, but it doesn't seem to be a submodule anymore, and the code has diverged from that repo. Should I still make a PR over there?
Pull Request resolved: https://github.com/facebook/infer/pull/1434
Reviewed By: skcho
Differential Revision: D28118300
Pulled By: jvillard
fbshipit-source-id: 121c4ad25
Summary:
Warn if either an object or a key is nil for NSMutableDictionary setObject:forKey:.
Next steps: introduce new special issue type and model more collections
Reviewed By: ezgicicek
Differential Revision: D28189382
fbshipit-source-id: 1697829ee
Summary:
Function calls that accept blocks as arguments may have additional arguments for the captured variables of the block. Cost models for these functions only considered the case where the block arguments didn't capture variables. This diff extends the model so that we handle captured variable case.
This fixes some FNs in the analysis.
Reviewed By: skcho
Differential Revision: D28183071
fbshipit-source-id: 6a045e80e
Summary:
Losing attributes is a bug, not sure when it would matter. There's a
TODO there that's still valid.
Reviewed By: da319
Differential Revision: D28125940
fbshipit-source-id: 923ceedb8
Summary: Cleaner + makes it easier to change details of the implementation.
Reviewed By: da319
Differential Revision: D28125725
fbshipit-source-id: ac9258908
Summary:
In main() all latent issues are manifest issues as the only parameters
are user-controlled.
Reviewed By: skcho
Differential Revision: D28121535
fbshipit-source-id: eab54d5bc
Summary: I should have listened to skcho on D28002725 (7207e05682). Needed for the next diff.
Reviewed By: skcho
Differential Revision: D28121203
fbshipit-source-id: 96c01b141
Summary:
On big analyses, we routinely observe analyses nesting levels of the
order of ~5k (i.e. 5342> on the interactive taskar). This call depth
is meaningless given the size of the codebase we analyze.
Instead, I believe this is due to the fact that Ondemand does not
properly snapshot/restore the nesting level when starting an analysis,
and more importantly when abandoning parts of it. This commit
fixes that oversight.
Pull Request resolved: https://github.com/facebook/infer/pull/1431
Reviewed By: ngorogiannis
Differential Revision: D28118297
Pulled By: jvillard
fbshipit-source-id: 0faf28f84
Summary:
the mapping for computing 'leq' relation in isomorphic graphs was sometimes mixed with opposite mappings due to typos in the code.
Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for how to set up your development environment and run tests.
Pull Request resolved: https://github.com/facebook/infer/pull/1424
Reviewed By: ngorogiannis
Differential Revision: D28118324
Pulled By: jvillard
fbshipit-source-id: 56e813bd1
Summary: Some code generation mechanisms produce code under `buck-out/gen/<hash>`. This diff allows handling such sources by producing deterministic capture DBs via omitting the hash.
Reviewed By: skcho
Differential Revision: D27501193
fbshipit-source-id: 3c2ac92a9
Summary:
As explained in the previous diff: when the access trace goes through
the invalidation step there is no need to print the invalidation trace
at all.
Note: only a few sources of invalidation are handled at the moment. The
following diffs gradually fix the other sources of invalidation.
Reviewed By: skcho
Differential Revision: D28098335
fbshipit-source-id: 5a5e6481e
Summary:
The eventual goal is to stop having separate sections of the trace
("invalidation part" + "access part") when the "access part" already
goes through the invalidation step. For this, it needs to record when a
value is made invalid along the path.
This is also important for assignements to NULL/0/nullptr/nil: right now
the way we record that 0 is not a valid address is via an attribute
attached to the abstract value that corresponds to 0. This makes traces
inconsistent sometimes: 0 can appear in many places in the same function
and we won't necessarily pick the correct one. In other words, attaching
traces to *values* is fragile, as the same value can be produced in many
ways. On the other hand, histories are stored at the point of access, eg
x->f, so have a much better chance of being correct. See added test:
right now its traces is completely wrong and makes the 0 in `if
(utf16StringLen == 0)` the source of the NULL value instead of the
return of `malloc()`!
This diff makes the traces slightly more verbose for now but this is
fixed in a following diff as the traces that got longer are those that
don't actually need an "invalidation" trace.
Reviewed By: skcho
Differential Revision: D28098337
fbshipit-source-id: e17929259
Summary:
See added test: pulse sometimes insisted that an issue was latent even
though the condition that made it latent could not be influenced (hence
could the issue could never become manifest) by callers because it was
unrelated to the pre, i.e. it came from a mutation inside the function.
In these cases, we want to report the issue straight away instead of
keeping it latent.
Reviewed By: skcho
Differential Revision: D28002725
fbshipit-source-id: ce9e6f190
Summary:
This diff adds semantics for temporary boolean variables to keep config values.
* It extended value domain to have `TempBool` that is basically a pair of `ConfigChecks.t`; one is a
set of config values checked when the temporary variable is true, and the other is that when the
temporary variable is false.
* It assigns the `TempBool` value when `temp=1` or `temp=0`.
* It uses the `TempBool` value when pruning condition expression.
For example, when there is an `if` statement of
```
return (config && b);
```
it is translated in SIL,
```
if (config) {
if (b) {
temp = 1; // (1)
} else {
temp = 0; // (2)
}
} else {
temp = 0; // (3)
}
return temp;
```
then we can say
* When `temp` is true, i.e. at (1), it is gated by `config`
* When `temp` is false, i.e. at (2) and (3), we are not sure about the the gatedness; at (2) it is gated by `config` but at (3) it is gated by `!config`.
So, we record such information as a `TempBool.t` value.
Next, when we use the return value at its caller,
```
if (ret) {
// then branch
} else {
// else branch
}
```
We can say "then branch" part is gated by `config`, but we are not sure if "else branch" part is gated, by using the `TempBool.t` value of `ret`.
Reviewed By: ezgicicek
Differential Revision: D28056490
fbshipit-source-id: e90d8afd3