Summary:
This diff fixes inefficient config impact data checking.
Problem: When writing `config-impact-report.json`, it checks if a procedure (`f`) is included in the config impact data set as follows. `cut_parameter` is a function that removes parameters from ObjC method names.
```
ConfigProcnameSet.exists (fun g -> cut_parameter f = cut_parameter g) config_data
```
However, this was very inefficient because it may have to iterate over every member of the set on each check. This diff changes it to call `Set.mem` on a revised config impact data set (`config_data'`) in which parameters were cut in advance:
```
ConfigProcnameSet.mem (cut_parameter f) config_data'
```
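The difference can be sketched in Python (this is not Infer's OCaml code). `cut_parameter` here is a hypothetical stand-in that keeps only the text before the first `:` of a method name; the real function's exact behavior is not shown in this summary.

```python
def cut_parameter(name):
    """Hypothetical stand-in: drop everything from the first ':' onwards."""
    return name.split(":", 1)[0]


def slow_member(f, config_data):
    # Old approach: linear scan that normalizes every member on each query.
    return any(cut_parameter(f) == cut_parameter(g) for g in config_data)


def precompute(config_data):
    # Done once: the revised set (config_data') with parameters already cut.
    return {cut_parameter(g) for g in config_data}


def fast_member(f, config_data_cut):
    # New approach: a single set membership test per query.
    return cut_parameter(f) in config_data_cut
```

Both functions answer the same question; only the second avoids re-normalizing the whole set on every lookup.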
Reviewed By: ezgicicek
Differential Revision: D28506113
fbshipit-source-id: 434d1f083
Summary: As for NSDictionary, nil issues for array literals are caught because of the additional load instruction in the frontend, and we leave modelling arrayWithObjects:count: for later.
Reviewed By: jvillard
Differential Revision: D28442767
fbshipit-source-id: a2f0d4dbf
Summary:
Follow a similar approach as in the translation of dictionary literals and insert a load instruction to catch nil insertion into collection issues. The missing load instruction was causing false negatives in biabduction. This will also help Pulse to catch nil insertion into collection issues for array literals.
Reviewed By: skcho
Differential Revision: D28442642
fbshipit-source-id: b530ac21b
Summary: The counter that accumulates the number of modified source files was logged before it was computed, so the logged value was always zero.
Reviewed By: jvillard
Differential Revision: D28505378
fbshipit-source-id: 833fb6072
Summary: As for other collections, we leave modelling setWithObjects:count: and initWithObjects:count: for later.
Reviewed By: skcho
Differential Revision: D28473361
fbshipit-source-id: 4bf57035a
Summary:
In Buck/Java the global type environments of each buck target captured need to be merged. So do the capture DBs. These two tasks can be done concurrently, as both have a computation and an I/O component, and interleaving them should improve perf.
Indeed, profiling the merge process with `offcputime.py` and `cpudist.py` (BPF tools) showed a significant amount of off-cpu time in tests (>40%) as well as a distribution of timings for off-cpu intervals that agrees with IO on a fast medium (ssd).
This diff forks a process to merge the type environments while doing the DB merge as normal. Initial results show an almost 2x improvement.
Reviewed By: skcho
Differential Revision: D28438808
fbshipit-source-id: 89c96f25b
Summary: This diff comments out a test that introduces non-deterministic analysis results.
Reviewed By: rgrig
Differential Revision: D28440794
fbshipit-source-id: 95e6fbe06
Summary:
Collect imports and exports in a data structure ("names environment")
that is easy to look up.
Background:
A function call f(a1,...,an) is shorthand for m:f(a1,...,an) if there is
a -import(m, [..., f/n, ...]); otherwise it is shorthand for c:f(a1,...,an)
where c is the current module. There is an implicit import of the
special "erlang" module. Any ambiguity (e.g., imported twice, or
imported and local) is an error. Also, if there is a -export([...,
f/n,...]) then f/n should be marked as public (ProcAttributes)
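The lookup rules above can be sketched in Python (not the OCaml code of this commit). The class `NamesEnv` and its method names are hypothetical, and the implicit import of the "erlang" module is left out because it requires knowing which functions that module provides.

```python
class NamesEnv:
    """Hypothetical names environment for one Erlang module."""

    def __init__(self, current_module):
        self.current_module = current_module
        self.imports = {}     # (name, arity) -> module it is imported from
        self.locals = set()   # (name, arity) defined in the current module
        self.exports = set()  # (name, arity) listed in -export([...])

    def add_import(self, module, name, arity):
        key = (name, arity)
        # Imported twice, or imported and local: ambiguous, hence an error.
        if key in self.imports or key in self.locals:
            raise ValueError("ambiguous function %s/%d" % key)
        self.imports[key] = module

    def add_local(self, name, arity):
        key = (name, arity)
        if key in self.imports:
            raise ValueError("ambiguous function %s/%d" % key)
        self.locals.add(key)

    def add_export(self, name, arity):
        self.exports.add((name, arity))

    def resolve(self, name, arity):
        # f(a1,...,an) means m:f(...) if imported from m, else c:f(...).
        return self.imports.get((name, arity), self.current_module)

    def is_public(self, name, arity):
        return (name, arity) in self.exports
```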
Reviewed By: jvillard
Differential Revision: D28290252
fbshipit-source-id: f6d777eb6
Summary: `dictionaryWithObjectsForKeysCount` is a bit more complicated as we need to know if an element of an array is nil. Leaving it for later.
Reviewed By: skcho
Differential Revision: D28413859
fbshipit-source-id: 7b5116de8
Summary:
- Changed "passed as argument to f" to "in call to f", as these do not
always correspond to passing an argument (eg could be a value returned
from f)
- Changed "assigned" to "returned" when appropriate
- Changed the model of malloc() to not say "allocated" in the null case
- Don't print "returned from f" when there was no event inside f: just
print "in call to f".
Reviewed By: da319
Differential Revision: D28413900
fbshipit-source-id: bc85625e3
Summary: This diff copies each field value inside the setters/getters of ObjC++.
Reviewed By: da319
Differential Revision: D28413584
fbshipit-source-id: 4c663fc9e
Summary: There is no need to model anything: Pulse is able to catch nil insertion into NSDictionary literals because the frontend dereferences keys and values during the translation of NSDictionary literals.
Reviewed By: jvillard
Differential Revision: D28383176
fbshipit-source-id: 01a064daf
Summary: Current traces are difficult to read since they keep mentioning the same leaf call at each step. This diff improves the traces by tracking the intermediate callers.
Reviewed By: skcho
Differential Revision: D28384762
fbshipit-source-id: 78c4cbf7f
Summary:
The order was reversed when printing the trace, leading to confusion.
Also make sure we indicate which part of the trace we are printing when
there is more than one part (either context + access or invalidation +
access, or all three).
Also start nesting at <calling context length> to better represent the
role of the calling context visually.
Reviewed By: da319
Differential Revision: D28329263
fbshipit-source-id: b691fb1f4
Summary:
This diff treats `GenericArrayBackedCollection.field` and other modeled fields as pointers. The modeled fields are used as non-pointer struct fields, but their actual semantics are those of pointers that may have side effects.
For example, `GenericArrayBackedCollection.field` is used to record that the vector's previous address could be invalid.
```
void foo(vector v) {
v.push_back(0); // v's previous address may be invalid after push_back
// PRE: {v -> {backing_array -> v1}}
// POST: {v -> {backing_array -> v2}}
// ATTR: {v1 may be invalidated}
}
```
However, if we revert the modeled field values, as is done for non-pointer parameter values, we get an incorrect summary:
```
// PRE: {v -> {backing_array -> v1}}
// POST: {v -> {backing_array -> v1}}
// ATTR: {v1 may be invalidated}
```
Reviewed By: jvillard
Differential Revision: D28324161
fbshipit-source-id: 96451d4b0
Summary:
`mutableDictionary[key] = value` crashes if key is nil. However, if value is nil, any object corresponding to the key is removed from the dictionary.
Under the hood, `NSMutableDictionary.setObject:forKeyedSubscript:` is called by `mutableDictionary[key] = value`.
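As a rough illustration of these semantics, here is a Python model in which `None` stands in for nil and a `ValueError` stands in for the crash. The function name is a hypothetical transliteration of the selector, not an Infer API.

```python
def set_object_for_keyed_subscript(dictionary, key, value):
    """Model of `mutableDictionary[key] = value` with None playing nil."""
    if key is None:
        # A nil key is the crashing case.
        raise ValueError("nil key")
    if value is None:
        # A nil value removes whatever is stored under the key.
        dictionary.pop(key, None)
    else:
        dictionary[key] = value
```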
Reviewed By: ezgicicek
Differential Revision: D28288789
fbshipit-source-id: e4e1c4288
Summary:
Rebar3.capture now calls into ErlangTranslator to obtain Sil. For now,
ErlangTranslator does nothing interesting.
Reviewed By: skcho
Differential Revision: D28261799
fbshipit-source-id: 0603db671
Summary:
This diff compares ObjC method names loosely when checking whether it is in the config impact data
file or not. This is to cover the cases where method parameters changed.
Reviewed By: jvillard
Differential Revision: D28259169
fbshipit-source-id: e6070df9c
Summary:
The wrapper in `infer/lib/erlang/erlang.sh` dumps Erlang AST forms [1]
in a JSON format. The current commit parses that JSON to obtain an
internal representation (ErlangAst). The main parts of the commit are:
- data structures for Erlang AST
- parser (Erlang abstract forms in JSON format -> Erlang AST)
- Rebar3.ml now drives the parser
[1] https://erlang.org/doc/apps/erts/absform.html
Reviewed By: mmarescotti, jvillard
Differential Revision: D28096896
fbshipit-source-id: b21263817
Summary:
There have been regressions in --pulse-isl. Without tests, everything is
temporary!
Note: the regressions are presumably still there, this just records the
current status of pulse.isl.
Also, no objective-C(++) at the moment. Should we add them too? (in
another diff)
Reviewed By: skcho
Differential Revision: D28256703
fbshipit-source-id: 700b2cc57
Summary:
Added a simple Erlang project to be used as a test for Rebar3
integration, in the following commits. Also, updated the copyright
linter to understand Erlang.
Reviewed By: ngorogiannis, mmarescotti
Differential Revision: D28096899
fbshipit-source-id: 94f15c277
Summary:
A previous change made pulse look into value histories for causes of
invalidation in case the access trace of a value already contained the
reason why that value is invalid, in order to save printing the
invalidation trace in addition to the access trace. It also made
reporting more accurate for null dereference as the source of null was
often better identified (in cases where several values are null or
zero).
But, the history is also relevant to the bug type and the error message.
Make these take histories into account too.
Also fix a bug where we didn't look inside the sub-histories contained
within function calls when looking for an invalidation along the
history.
Reviewed By: da319
Differential Revision: D28254334
fbshipit-source-id: 5ca00ee54
Summary:
There's already all the ingredients to treat function pointers pretty
well, even when stored inside (const) globals.
In OpenSSL they use something like the added tests but the globals are
not const... This may need tweaking via an option, eg to inline all
global initializers, or filtered by global names/file names. Or just
use the existing --pulse-model-{alloc,release}-pattern options.
Reviewed By: skcho
Differential Revision: D28221651
fbshipit-source-id: 5399f1141
Summary:
When garbage-collecting addresses we would also remove their attributes.
But even though the addresses are no longer allocated in the heap, they
might show up in the formula and so we need to remember facts about
them.
This forces us to detect leaks closer to the point where addresses are
deleted from the heap, in AbductiveDomain.ml. This is a nice refactoring
in itself: doing so fixes some other FNs where we sometimes missed leak
detection on dead addresses.
This also makes it unnecessary to simplify InstanceOf eagerly when
variables get out of scope.
Some new {folly,std}::optionals false positives that either are similar to existing ones or involve unmodelled smart pointers.
Reviewed By: da319
Differential Revision: D28126103
fbshipit-source-id: e3a903282
Summary: This is a warning not a critical issue/error. Let's downgrade it to Warning to reflect that.
Reviewed By: jvillard
Differential Revision: D28220415
fbshipit-source-id: b2d8f040c
Summary:
Building on the infra in the previous commits, "fix" all the call sites
that introduce invalidations to make sure they also update the
corresponding histories. This is only possible to do when the access
leading to the invalidation can be recorded. Right now the only place
that's untraceable is the model of `free`/`delete`, because it happens
to be the only place where we invalidate an address without knowing
where it comes from (`free(v)`: what was v's access path? we could track
this in the future).
Reviewed By: skcho
Differential Revision: D28118764
fbshipit-source-id: de67f449e
Summary: Not tracking values for global constants might cause nullptr_dereference false positives. In particular, if the code has multiple checks and uses a global constant by its name in one check and its value in another check (see added test case), we are not able to prune infeasible paths. This diff addresses such false positives by inlining initializers of global constants when they are being used. An assumption is that most of the time the initialization of global constants does not have side effects.
Reviewed By: jvillard
Differential Revision: D25994898
fbshipit-source-id: 26360c4de
Summary:
Implements the translation of most clang atomic builtins to SIL, including those used in `stdatomic.h`. It does not attempt to model the atomicity of the operations, since I don't know of any way Infer can represent that. I didn't bother implementing the rarely used min/max builtins, so they're left as `BuiltinDecl` calls.
This is my first major OCaml project, so any feedback is appreciated!
Also, CONTRIBUTING.md says to update the [facebook-clang-plugins](https://github.com/facebook/facebook-clang-plugins) submodule, but it doesn't seem to be a submodule anymore, and the code has diverged from that repo. Should I still make a PR over there?
Pull Request resolved: https://github.com/facebook/infer/pull/1434
Reviewed By: skcho
Differential Revision: D28118300
Pulled By: jvillard
fbshipit-source-id: 121c4ad25
Summary:
Warn if either an object or a key is nil for NSMutableDictionary setObject:forKey:.
Next steps: introduce new special issue type and model more collections
Reviewed By: ezgicicek
Differential Revision: D28189382
fbshipit-source-id: 1697829ee
Summary:
Function calls that accept blocks as arguments may have additional arguments for the captured variables of the block. Cost models for these functions only considered the case where the block arguments didn't capture variables. This diff extends the model so that we handle captured variable case.
This fixes some FNs in the analysis.
Reviewed By: skcho
Differential Revision: D28183071
fbshipit-source-id: 6a045e80e
Summary:
Losing attributes is a bug, not sure when it would matter. There's a
TODO there that's still valid.
Reviewed By: da319
Differential Revision: D28125940
fbshipit-source-id: 923ceedb8
Summary: Cleaner + makes it easier to change details of the implementation.
Reviewed By: da319
Differential Revision: D28125725
fbshipit-source-id: ac9258908
Summary:
In main() all latent issues are manifest issues as the only parameters
are user-controlled.
Reviewed By: skcho
Differential Revision: D28121535
fbshipit-source-id: eab54d5bc
Summary: I should have listened to skcho on D28002725 (7207e05682). Needed for the next diff.
Reviewed By: skcho
Differential Revision: D28121203
fbshipit-source-id: 96c01b141
Summary:
On big analyses, we routinely observe analyses nesting levels of the
order of ~5k (i.e. 5342> on the interactive taskar). This call depth
is meaningless given the size of the codebase we analyze.
Instead, I believe this is due to the fact that Ondemand does not
properly snapshot/restore the nesting level when starting an analysis,
and more importantly when abandoning parts of it. This commit
fixes that oversight.
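The snapshot/restore idea can be sketched in Python with a context manager. The `Analysis` class and its `nesting` field are hypothetical and only illustrate the pattern; they are not taken from Ondemand.

```python
from contextlib import contextmanager


class Analysis:
    """Hypothetical holder of the current analysis nesting level."""

    def __init__(self):
        self.nesting = 0

    @contextmanager
    def nested(self):
        saved = self.nesting  # snapshot before starting the nested analysis
        self.nesting += 1
        try:
            yield
        finally:
            # Restore even when the nested analysis is abandoned (exception).
            self.nesting = saved
```

Without the `finally` branch, an abandoned nested analysis would leave the counter permanently incremented, which is the kind of drift described above.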
Pull Request resolved: https://github.com/facebook/infer/pull/1431
Reviewed By: ngorogiannis
Differential Revision: D28118297
Pulled By: jvillard
fbshipit-source-id: 0faf28f84
Summary:
The mapping for computing the 'leq' relation in isomorphic graphs was sometimes mixed up with the opposite mapping due to typos in the code.
Pull Request resolved: https://github.com/facebook/infer/pull/1424
Reviewed By: ngorogiannis
Differential Revision: D28118324
Pulled By: jvillard
fbshipit-source-id: 56e813bd1
Summary: Some code generation mechanisms produce code under `buck-out/gen/<hash>`. This diff allows handling such sources by producing deterministic capture DBs via omitting the hash.
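A minimal Python sketch of such a normalization, assuming that the hash is a single lowercase hexadecimal path component directly under `buck-out/gen/`. That format is an assumption made for illustration; the summary does not specify it, and a non-hash directory made only of hex characters would also be dropped by this pattern.

```python
import re

# Assumed shape: buck-out/gen/<hex hash>/rest/of/path
_GEN_HASH = re.compile(r"(buck-out/gen/)[0-9a-f]+/")


def omit_gen_hash(path):
    """Drop the assumed hash component so the recorded path is stable."""
    return _GEN_HASH.sub(r"\1", path, count=1)
```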
Reviewed By: skcho
Differential Revision: D27501193
fbshipit-source-id: 3c2ac92a9
Summary:
As explained in the previous diff: when the access trace goes through
the invalidation step there is no need to print the invalidation trace
at all.
Note: only a few sources of invalidation are handled at the moment. The
following diffs gradually fix the other sources of invalidation.
Reviewed By: skcho
Differential Revision: D28098335
fbshipit-source-id: 5a5e6481e
Summary:
The eventual goal is to stop having separate sections of the trace
("invalidation part" + "access part") when the "access part" already
goes through the invalidation step. For this, it needs to record when a
value is made invalid along the path.
This is also important for assignments to NULL/0/nullptr/nil: right now
the way we record that 0 is not a valid address is via an attribute
attached to the abstract value that corresponds to 0. This makes traces
inconsistent sometimes: 0 can appear in many places in the same function
and we won't necessarily pick the correct one. In other words, attaching
traces to *values* is fragile, as the same value can be produced in many
ways. On the other hand, histories are stored at the point of access, eg
x->f, so have a much better chance of being correct. See added test:
right now its trace is completely wrong and makes the 0 in `if
(utf16StringLen == 0)` the source of the NULL value instead of the
return of `malloc()`!
This diff makes the traces slightly more verbose for now but this is
fixed in a following diff as the traces that got longer are those that
don't actually need an "invalidation" trace.
Reviewed By: skcho
Differential Revision: D28098337
fbshipit-source-id: e17929259
Summary:
See added test: pulse sometimes insisted that an issue was latent even
though the condition that made it latent could not be influenced by
callers (hence the issue could never become manifest) because it was
unrelated to the pre, i.e. it came from a mutation inside the function.
In these cases, we want to report the issue straight away instead of
keeping it latent.
Reviewed By: skcho
Differential Revision: D28002725
fbshipit-source-id: ce9e6f190
Summary:
This diff adds semantics for temporary boolean variables to keep config values.
* It extends the value domain with `TempBool`, which is basically a pair of `ConfigChecks.t`: one is the
set of config values checked when the temporary variable is true, and the other is the set checked when the
temporary variable is false.
* It assigns the `TempBool` value when `temp=1` or `temp=0`.
* It uses the `TempBool` value when pruning condition expression.
For example, when there is a statement such as
```
return (config && b);
```
it is translated to SIL as
```
if (config) {
if (b) {
temp = 1; // (1)
} else {
temp = 0; // (2)
}
} else {
temp = 0; // (3)
}
return temp;
```
then we can say
* When `temp` is true, i.e. at (1), it is gated by `config`
* When `temp` is false, i.e. at (2) and (3), we are not sure about the gatedness; at (2) it is gated by `config` but at (3) it is gated by `!config`.
So, we record such information as a `TempBool.t` value.
Next, when we use the return value at its caller,
```
if (ret) {
// then branch
} else {
// else branch
}
```
We can say "then branch" part is gated by `config`, but we are not sure if "else branch" part is gated, by using the `TempBool.t` value of `ret`.
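The example can be replayed with a small Python sketch. It assumes that config checks are represented as a set of strings and that joining two paths keeps only the checks common to both (set intersection); both choices are illustrative assumptions, not the actual `ConfigChecks.t` domain.

```python
def join(a, b):
    # None means "no assignment seen yet" on that side.
    if a is None:
        return b
    if b is None:
        return a
    # Assumption: a check gates the value only if it holds on every path.
    return a & b


class TempBool:
    """Hypothetical pair: checks known when the temp is true / false."""

    def __init__(self):
        self.when_true = None
        self.when_false = None

    def assign(self, value, checks):
        checks = frozenset(checks)
        if value:
            self.when_true = join(self.when_true, checks)
        else:
            self.when_false = join(self.when_false, checks)

    def prune(self, branch):
        # Checks that gate the "then" (True) or "else" (False) branch.
        result = self.when_true if branch else self.when_false
        return result if result is not None else frozenset()


temp = TempBool()
temp.assign(1, {"config"})   # (1)
temp.assign(0, {"config"})   # (2)
temp.assign(0, {"!config"})  # (3)
```

Under these assumptions, `temp.prune(True)` contains `config`, while `temp.prune(False)` is empty, matching the "not sure" case for the else branch.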
Reviewed By: ezgicicek
Differential Revision: D28056490
fbshipit-source-id: e90d8afd3
Summary: Small refactor as this function belongs more in Pvar than Var.
Reviewed By: skcho
Differential Revision: D28091618
fbshipit-source-id: 259bd82d5
Summary: To be fair that doesn't seem to matter at all, with no test affected.
Reviewed By: ngorogiannis
Differential Revision: D28091608
fbshipit-source-id: 172bd2ff1
Summary:
Just moving stuff around.
This is possibly useful for making Pvar depend on ProcAttributes for
other things, eg checking if a pvar is captured by a procedure (which
would be awkward to have in the API of ProcAttributes and not Pvar).
Overall it forced me to move a few other things around in a way that I
feel makes more sense anyway.
Reviewed By: skcho
Differential Revision: D28091497
fbshipit-source-id: 367a1f17c
Summary:
Before returning a summary, restore formals to their initial values.
This gets rid of a false latent because the value in the path condition
is now garbage-collected.
Added a test for the tricky case of structs passed as values.
Reviewed By: skcho
Differential Revision: D28001229
fbshipit-source-id: 23dda5b43
Summary:
This diff adds semantics for long-typed config values.
* It extended branch types to keep condition expressions passed,
* then used it in the prune semantics.
Reviewed By: ezgicicek
Differential Revision: D28055936
fbshipit-source-id: 0d12930cf
Summary:
This diff introduces [ISys.file_exists], which is similar to [Sys.file_exists_exn] but returns
[false] when the result is unknown, instead of raising an exception.
Reviewed By: jvillard
Differential Revision: D28059863
fbshipit-source-id: d54851cfb
Summary: This diff adds an abstract semantics for returning config values at function calls.
Reviewed By: ezgicicek
Differential Revision: D28055544
fbshipit-source-id: 5fe51c538
Summary:
This looks a bit better as it makes it easier to ignore parts of the
arguments in models, which happens all the time. Also easier to add more
to the record in the future, which is the real reason.
Reviewed By: skcho
Differential Revision: D27997695
fbshipit-source-id: a7c680025
Summary:
This diff avoids dereferencing C structs, both in the frontend and in the Pulse semantics. In SIL, a C
struct is not a first-class value, so dereferencing it does not make sense.
Reviewed By: ezgicicek
Differential Revision: D27953258
fbshipit-source-id: 348d56338
Summary: This diff copies each field value inside the setters/getters of ObjC.
Reviewed By: ezgicicek
Differential Revision: D27940521
fbshipit-source-id: 9977cae75
Summary:
This diff does refactoring for the following diff.
* Define Mangled.return_param and Mangled.is_return_param and use it instead of
Ident.name_return_param.
* Share common code from objc_setter and objc_getter
* Move struct_copy to CStructUtils.ml
Reviewed By: da319
Differential Revision: D27940125
fbshipit-source-id: 84eb3109b
Summary:
Looking at the recent silent analysis results, it seems that we report many direct unknown library calls (often cheap)... However, if these were called inside some other callee we wouldn't report them because their costs would be assumed to be constant by the cost analysis.
This is a bit awkward. We should either report all unknown calls or suppress them altogether.
Since we have too many reports per day and not a good way to determine whether an unknown library call is cheap/expensive, let's take option 2.
Then, we would be only relying on two things to determine whether to report/not:
- instantiated cost's degree > 1
- explicitly known to be expensive (i.e. modeled in ConfigImpact, like string append)
Reviewed By: skcho
Differential Revision: D27909003
fbshipit-source-id: 0391d226d
Summary:
This is mostly useful to avoid duplicating error states, which are
propagated unchanged through both branches of, say, conditionals, and
can end up duplicated if the join is not careful:
```
{[Abort(Error 1), Abort(Error 2), Continue σ']}
if (..) { .. } else { .. }
{JOIN([Abort(Error 1), Abort(Error 2), Continue σ_then],
[Abort(Error 1), Abort(Error 2), Continue σ_else])}
{[Abort(Error 1), Abort(Error 2), Continue σ_then, Continue σ_else]}
```
Whereas before this diff we got
```
{[Abort(Error 1), Abort(Error 2), Continue σ_then, Abort(Error 1), Abort(Error 2), Continue σ_else]}
```
Detect states that do not change simply using `phys_equal` as they
should literally not change. Refactor the code to be able to re-use the
same logic in the stronger join used in widening, that compares states
using the domain's `leq` relation to establish implication.
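A Python sketch of the deduplicating join, where `is` plays the role of `phys_equal`. The function and the `same` parameter are hypothetical names; the parameter only illustrates how the same loop could be reused with a different comparison.

```python
def join(lhs, rhs, same=lambda a, b: a is b):
    """Keep all of lhs, then add the states of rhs not already covered."""
    result = list(lhs)
    for state in rhs:
        # Skip a state when an already kept state is "the same" as it.
        if not any(same(state, kept) for kept in result):
            result.append(state)
    return result


# Error states are shared objects; the Continue states differ per branch.
error1, error2 = ["Abort", "Error 1"], ["Abort", "Error 2"]
s_then, s_else = ["Continue", "then"], ["Continue", "else"]
joined = join([error1, error2, s_then], [error1, error2, s_else])
```

Here `joined` has four elements instead of six. Passing a comparison based on an implication check as `same` gives the shape of the stronger join mentioned for widening.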
Reviewed By: ezgicicek
Differential Revision: D27908529
fbshipit-source-id: b461165da
Summary:
When a block value is passed through more than one level of function calls, it is not analyzed correctly
because the current inlining mechanism of the frontend (specializing ObjC block parameters) works for
only one level of block passing. This diff gives up analyzing the initialized-ness of captured
variables in ObjC to avoid FPs.
Reviewed By: da319
Differential Revision: D27885395
fbshipit-source-id: fc6b4663c
Summary: We have some FPs due to unknown init methods that are added dynamically.
Reviewed By: ezgicicek
Differential Revision: D27856371
fbshipit-source-id: b6fb46df3
Summary:
This diff filters known expensive callees when cost is constant.
previous:
```
foo() {
known_expensive_call();
}
```
current:
```
foo() {
known_expensive_call();
goo();
}
// cost is constant
goo() {
known_expensive_call();
unknown_call();
}
```
When a callee's cost is constant and its summary includes known expensive callees, the checker treated it as a non-constant-cost callee, i.e., it copied all ungated callees from the callee's summary. However, sometimes this full copying introduces unexpected issues. For example, suppose a callee `goo` is added and `goo`'s cost is constant as above. Since it includes `known_expensive_call`, all ungated callees of its summary are copied to the caller `foo`'s summary:
* `foo`'s ungated callees (before): {`known_expensive_call`}
* `foo`'s ungated callees (after): {`known_expensive_call`, `unknown_call`}
As a result, it would report that `unknown_call` was added. However, this is not what we intended: in the example, `unknown_call` is reported because it is called in the same function as `known_expensive_call`, not because it is expensive.
To fix that issue, this diff filters known expensive callees from `goo`'s summary in that case.
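The intended behavior can be sketched in Python with plain sets of callee names. The function and parameter names are hypothetical and only mirror the example above.

```python
def callees_to_copy(callee_ungated, callee_cost_is_constant, known_expensive):
    """Ungated callees of a callee's summary that the caller should inherit."""
    if callee_cost_is_constant:
        # Constant cost: keep only the callees known to be expensive.
        return {c for c in callee_ungated if c in known_expensive}
    # Non-constant cost: copy everything, as before.
    return set(callee_ungated)


known = {"known_expensive_call"}
goo_summary = {"known_expensive_call", "unknown_call"}
foo_after = {"known_expensive_call"} | callees_to_copy(goo_summary, True, known)
```

With the filter, `foo_after` is still `{"known_expensive_call"}`, so adding the call to `goo` no longer reports `unknown_call`.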
Reviewed By: ezgicicek
Differential Revision: D27852552
fbshipit-source-id: d207eef1c
Summary:
It introduced a FP due to reporting addition of `__cast`.
* This diff added known cheap model for `__cast`.
* In addition, moved `match_builtin` to `BuiltinDecl`.
Reviewed By: ezgicicek
Differential Revision: D27791495
fbshipit-source-id: 55aec1728
Summary:
Previously we were only taking constexpr into account on constructors.
Add this info to ProcAttributes.t instead by exporting it from the
plugin for all functions.
This allows SIOF to take constexpr into account in more cases as it's not
always good at capturing which functions *can* be constexpr-evaluated,
which caused false positives.
Delete now-useless is_constexpr in constructor types. This generated the
changes in frontend tests.
Some minor renamings of variants of is_const_expr -> is_constexpr.
Reviewed By: da319
Differential Revision: D27503433
fbshipit-source-id: 3d1972900
Summary:
When a field is assigned by a value,
```
_.field = exp;
```
it should collect the field when the abstract config value of `exp` is non-bottom, rather than
non-top.
Reviewed By: ezgicicek
Differential Revision: D27766188
fbshipit-source-id: a0b1f2c28
Summary: Added a new issue type for sending a message to nil when its return type is non-POD. To distinguish these issues from other nullptr dereference issues, we extend the `MustBeValid` attribute to contain the reason why an address must be valid. For now a reason can only have `SelfOfNonPODReturnMethod` as its value, but in the future we will use it for other nullability issue types, such as nil insertion into collections.
Reviewed By: jvillard
Differential Revision: D27762333
fbshipit-source-id: 689e5a431
Summary:
In order to use Inferbo's analysis result, a checker should know the current instruction index.
However, for the checkers using the `ProcCfg.Normal` CFG, it was impossible to get the instruction
index. To solve the issue, this diff changes the AbsInt framework to pass the index to
`exec_instr` as well.
Reviewed By: ezgicicek
Differential Revision: D27680894
fbshipit-source-id: 1dc8ff0fb
Summary:
The problem is that `Models.is_field_nonnullable` didn't differentiate
between
- having a nullable model (in which case we want the field to be NULLABLE),
- not having a model at all (in which case we want the field to be THIRDPARTY_NONNULL).
The problem was noticed only now because previously we didn't have any
NULLABLE field models.
Reviewed By: ngorogiannis
Differential Revision: D27709508
fbshipit-source-id: b98c8f86f
Summary:
This diff evaluates a cpp vector given as a parameter symbolically. Especially, it addresses it as an array, so the cost checker can use its symbolic length correctly.
**About handling `cpp.vector_elem` field:**
The field is a virtual field of vector object that points to the array of vector elements. It was introduced in Inferbo to model semantics of vector operations.
Since many semantics of Inferbo depends on type information, it had collected type information of vector elements, whenever `cpp.vector_elem` field was introduced, as a *side-effect*. A problem is that it has *side-effect*, which means it may introduce non-deterministic analysis results depending on the type information of the virtual field.
This diff changes it not to collect the type information on `cpp.vector_elem` as a side-effect. Instead, it tries to write the information to the abstract states (abstract location) when possible.
Reviewed By: ezgicicek
Differential Revision: D27674935
fbshipit-source-id: f3d52cae7
Summary:
In D27430485 (a6ab4d38cf), we used the static cost of the callee to determine whether it was cheap/expensive. This diff improves on that by taking the whole instantiated cost of the function call (not just the callee's cost).
Also, if the callee is an unmodeled call, we consider it to be expensive as before.
Note: cost instantiation was used by hoisting. I refactored a bunch of code there to reuse as much code as possible.
Reviewed By: skcho
Differential Revision: D27649302
fbshipit-source-id: 07d11f3dd
Summary: When instantiating the callee's cost, we have picked up the InferBo memory at the node corresponding to the last instruction. Instead, we should pick up right at the call instruction. Picking it up later might cause arguments to go out of scope.
Reviewed By: skcho
Differential Revision: D27652474
fbshipit-source-id: 5ab35cabb
Summary:
The output differs on Java 11 compared to Java 8: one prints an
interface, the other resolves to a class name.
Reviewed By: ezgicicek
Differential Revision: D27678552
fbshipit-source-id: c5a5d0c39
Summary: We have been referring to the arguments of a function call as "params". This has been bothering me. Let's fix it!
Reviewed By: ngorogiannis
Differential Revision: D27649158
fbshipit-source-id: 10e0b28cb
Summary:
To avoid too big abstract states due to instantiated templates in C++,
this diff loosens the compare functions of field names and ungated
callees.
Reviewed By: ezgicicek
Differential Revision: D27625775
fbshipit-source-id: e33e9d34c
Summary:
Nullsafe/biabduction tests were sensitive to Java version: they were recorded for Java 8 but if the machine that is used to run the tests had Java 11, tests would fail. This diff aims to resolve this issue by
- making our tests produce java8-compatible bytecode so that tests don't fail on Java 11 machines
- removing nullsafe tests that exercise obscure Java 8 behavior that cannot be alleviated with backward compatible bytecode on Java 11
- changing lambda argument printing to be Java 11 compatible
Reviewed By: martintrojer
Differential Revision: D27500731
fbshipit-source-id: 77fe302ea
Summary:
Reporting all ungated (un configed?) function calls causes many FPs. Instead, we rely on complexity analysis to determine whether a function is cheap/expensive: if the callee's complexity is not symbolic (e.g. constant), we consider it as cheap and don't keep track of it.
Note that we don't take the instantiated/modeled cost into account yet. So, if we have `foo(int n)` with complexity `O(n)`, and call it as `foo(3)`, we would still keep track of it. Similarly, if `foo` is a modeled function with constant time complexity, we would have no summary for it hence would keep track of it.
These will be improved later.
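The decision rule and its two stated limitations can be sketched in Python. Representing a callee's complexity by its polynomial degree, and the names `should_track` and `summaries`, are assumptions made for illustration.

```python
def should_track(callee, summaries):
    """summaries maps a callee name to the degree of its (static) cost."""
    degree = summaries.get(callee)
    if degree is None:
        # No summary, e.g. a modeled function: still tracked for now.
        return True
    # Degree 0 is a constant cost: considered cheap, so not tracked.
    return degree > 0
```

Note that the rule looks only at the callee's static cost, so a linear `foo` stays tracked even at a call site like `foo(3)`.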
Reviewed By: skcho
Differential Revision: D27430485
fbshipit-source-id: d5f66320d
Summary:
This diff removes the additional Inferbo option `--bufferoverrun` from cost tests, since printing
Inferbo issues is not that useful for understanding cost results.
Reviewed By: ngorogiannis
Differential Revision: D27592496
fbshipit-source-id: 6ab3e6528
Summary:
Whenever an equality "t = v" (t an arbitrary term, v a variable) is
added (or "v = t"), remember the "t -> v" mapping after canonicalising t
and v. Use this to detect when two variables are equal to the same term:
`t = v` and `t = v'` now yields `v = v'` to be added to the equality
relation of variables. This increases the precision of the arithmetic
engine.
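A simplified Python sketch of the idea, assuming terms are tuples such as `("+", "x", 1)` whose string arguments are variables. The class `TermEqs` is hypothetical, and unlike a real implementation it does not re-canonicalise previously stored terms after two variables are merged.

```python
class TermEqs:
    """Hypothetical term -> variable map on top of a tiny union-find."""

    def __init__(self):
        self.parent = {}       # variable -> parent variable
        self.term_to_var = {}  # canonical term -> variable

    def find(self, v):
        # Follow parent links to the representative of v.
        while self.parent.get(v, v) != v:
            v = self.parent[v]
        return v

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb

    def canon(self, term):
        # Replace each variable argument by its representative.
        head, args = term[0], term[1:]
        return (head,) + tuple(
            self.find(a) if isinstance(a, str) else a for a in args
        )

    def add_eq(self, term, v):
        t, v = self.canon(term), self.find(v)
        if t in self.term_to_var:
            # t = v and t = v' are both known: deduce v = v'.
            self.union(self.term_to_var[t], v)
        else:
            self.term_to_var[t] = v
```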
Interestingly, the impact on most code I've tried is:
1. mostly same perfs as before, if a bit slower (could be within noise)
2. slightly more (latent) bugs reported in absolute numbers
I would have expected it to be more expensive and yield fewer bugs (as
fewer false positives), but there could be second-order effects at play
here where we get more coverage. We definitely get more latent issues
due to dereferencing pointers after testing nullness, as can be seen in
the unit tests as well, which may alone explain (2).
There's some complexity when adding term equalities where the term
is linear, as we also need to add it to `linear_eqs` but `term_eqs` and
`linear_eqs` are interested in slightly different normal forms.
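The "t -> v" bookkeeping can be sketched roughly as follows, in Python for illustration. The `Equalities` class, the tuple encoding of terms, and the union-find are assumptions of this sketch, not the actual Pulse data structures; in particular it ignores the `linear_eqs` interaction mentioned above and does not re-normalise stored terms after a merge.

```python
class Equalities:
    """Toy model: variables are strings, terms are tuples (op, arg, ...)."""

    def __init__(self):
        self.parent = {}       # union-find over variables
        self.term_to_var = {}  # canonicalised term -> variable

    def find(self, v):
        """Return the representative of variable v."""
        self.parent.setdefault(v, v)
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path halving
            v = self.parent[v]
        return v

    def union(self, v1, v2):
        r1, r2 = self.find(v1), self.find(v2)
        if r1 != r2:
            self.parent[r2] = r1

    def canon(self, term):
        """Replace every variable argument by its representative."""
        op, *args = term
        return (op, *[self.find(a) if isinstance(a, str) else a for a in args])

    def add_term_eq(self, term, v):
        """Record `term = v`; if term is already bound to v', add v = v'."""
        t = self.canon(term)
        known = self.term_to_var.get(t)
        if known is None:
            self.term_to_var[t] = self.find(v)
        else:
            self.union(known, v)


eqs = Equalities()
eqs.add_term_eq(("mul", "x", "y"), "v1")
eqs.add_term_eq(("mul", "x", "y"), "v2")
same = eqs.find("v1") == eqs.find("v2")
```

After the two calls, `v1` and `v2` share a representative, which is the `v = v'` fact described above.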
Reviewed By: skcho
Differential Revision: D27331336
fbshipit-source-id: 7314e127a
Summary:
It's better (=possibly more efficient) to take the opportunity to
normalize linear terms when we can instead of possibly having to apply
the same normalization over and over on individual terms until the next
round of proper normalization.
Reviewed By: skcho
Differential Revision: D27464885
fbshipit-source-id: 0dc01a089
Summary:
When we don't know the value being shifted it may help to translate
bit-shifting into multiplication by a constant as it might surface
linear terms, eg `x<<1` is `2*x`.
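As an illustration only (Python, with a hypothetical tuple encoding of terms), the rewrite could look like this. It assumes the shift amount is a known non-negative integer and leaves every other term untouched.

```python
def shift_left_as_mul(term):
    """Rewrite ("shl", x, c) into ("mul", 2**c, x) when c is a known constant."""
    if (
        isinstance(term, tuple)
        and len(term) == 3
        and term[0] == "shl"
        and isinstance(term[2], int)
        and term[2] >= 0
    ):
        _, x, c = term
        return ("mul", 2 ** c, x)
    # Unknown shift amount or a different operator: keep the term as is.
    return term


rewritten = shift_left_as_mul(("shl", "x", 1))
unchanged = shift_left_as_mul(("shl", "x", "y"))
```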
Reviewed By: skcho
Differential Revision: D27464847
fbshipit-source-id: 9b3b5f0d0
Summary:
The simplifications done by `simplify_shallow` are all taken care of by
`eval_const_shallow` as well, they just also happen to help when not
*all* of the term is a constant. However, they might be less
precise/efficient than in the constant case, in particular in the next
diff that translates `x << c` into `x * 2^c` when `c` is constant.
Reviewed By: skcho
Differential Revision: D27464805
fbshipit-source-id: 452bc6ab1
Summary:
On some pathological examples of crypto primitives like libsodium, later
diffs make pulse grind to a halt due to an explosion in the size of
literals. This is at least partly due to the fact the arithmetic doesn't
operate modulo 2^64.
Since the arithmetic is confused in any case when we reach
such large numbers, cap them, currently at 2^128. This removes pathological
cases for now: on libsodium, Pulse is now ~5 times faster than before!
Take this opportunity to put the modified Q/Z modules in their own files.
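A rough Python sketch of the capping idea. The description does not say what a value beyond the cap turns into; this sketch assumes it becomes "unknown" (`None`), and `capped`/`capped_mul` are hypothetical names.

```python
from typing import Optional

CAP = 2 ** 128  # bound mentioned in the description


def capped(n: int) -> Optional[int]:
    """Keep n if its magnitude is within the cap, otherwise forget it."""
    return n if -CAP <= n <= CAP else None


def capped_mul(a: Optional[int], b: Optional[int]) -> Optional[int]:
    """Multiply two possibly-unknown literals without letting them blow up."""
    if a is None or b is None:
        return None
    return capped(a * b)


within = capped_mul(2 ** 64, 2 ** 63)
beyond = capped_mul(2 ** 100, 2 ** 100)
```

With this rule literals can never grow past a fixed size, so repeated multiplications cannot make them explode.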
Reviewed By: jberdine
Differential Revision: D27463933
fbshipit-source-id: 342d941e2
Summary: Just some scaffolding to save a bit of churn from the next diff.
Reviewed By: skcho
Differential Revision: D27328348
fbshipit-source-id: 4f5bfcc65
Summary:
This was added in C++14. Was investigating how SIOF dealt with this but
it turns out it already does the right thing as the translation unit of
global variable templates shows up as the place they are instantiated
(not the one where they are declared), which works well for SIOF
checking.
Reviewed By: da319
Differential Revision: D27500998
fbshipit-source-id: b8b9b9c48
Summary:
This is better suited than the generic "cGeneral_utils", and saves
exporting one of them too.
Reviewed By: da319
Differential Revision: D27500933
fbshipit-source-id: f4224f63b
Summary: One source of non-deterministic diff results is when there are multiple overloaded methods whose sets of unchecked callees have the same cardinality. This diff tries to select one of them in a more deterministic manner.
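One way to make such a choice deterministic is a total ordering with an explicit tie-breaker. The Python sketch below is an assumption about the general technique, not the actual selection rule: it prefers the larger callee set (the direction is a guess) and breaks ties on the signature string.

```python
def pick_representative(overloads):
    """overloads: list of (signature, unchecked_callees) pairs."""
    # Sort key: cardinality first, then the signature as a tie-breaker, so
    # the result does not depend on the order in which overloads were seen.
    return min(overloads, key=lambda m: (-len(m[1]), m[0]))


overloads = [("foo(int)", {"x", "y"}), ("foo(String)", {"p", "q"})]
chosen = pick_representative(overloads)
chosen_reversed = pick_representative(list(reversed(overloads)))
```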
Reviewed By: ezgicicek, ngorogiannis
Differential Revision: D27430757
fbshipit-source-id: 38ba5d8dc
Summary: The error message was accidentally changed to a specific nullptr error message (D26887140 (cba144b779)) for any invalidation (use after delete, etc). This diff restores the general error message for other invalidations and keeps the special case for nullptr dereference. It also fixes the spacing in the nullptr dereference error message.
Reviewed By: jvillard
Differential Revision: D27407628
fbshipit-source-id: 2649f3032
Summary:
The title
Also notice that there is a duplication of an error.
Reviewed By: skcho
Differential Revision: D27426933
fbshipit-source-id: dbd2f861a
Summary: Autogenerated methods sometimes lead to false positives. Also, clean up the models file a little.
Reviewed By: da319
Differential Revision: D27393933
fbshipit-source-id: f79b1a6eb
Summary: To support objc nil messaging for unknown function calls, we prune `self` to be positive in the `normal` specification and add an additional specification to handle the nil case.
Reviewed By: skcho
Differential Revision: D27360757
fbshipit-source-id: 119999b30
Summary:
This addresses a test difference between java versions. Infer's java tests are recorded with Java8 where string concat with a constant string uses `toString`. However, if tests are run on a machine where Java 11 is used, string concat is done via `makeConcatWithConstants` which causes tests to fail.
As a workaround, we replace the test so that Java version dependent string concat is not used.
Reviewed By: ngorogiannis
Differential Revision: D27394621
fbshipit-source-id: dfe1af2ac
Summary:
Fixing `IsInstanceOf` term simplification for the null case. Before, this
was only being done if the value was known to be null at the moment of the
call to `instanceof`. Otherwise, the `IsInstanceOf` term would remain in
the formula unnecessarily.
Reviewed By: jvillard
Differential Revision: D27361025
fbshipit-source-id: 2d958a757
Summary:
Models for Java Map interface.
This consists of `Map.init()`, `Map.put(key, value)`, `Map.get(key)`,
`Map.containsKey(key)` and
`Map.isEmpty()`. With the exception of `Map.get(key)` and `Map.containsKey(key)`, these functions were modelled using the respective similar ones provided by the Java Collection interface.
Reviewed By: jvillard
Differential Revision: D27326716
fbshipit-source-id: e07f0c952
Summary:
This diff adds semantics for collecting all object fields that may have config values. The collected information is used to instantiate conditional unchecked callees introduced in the previous diff.
How it works:
* The summary is extended to have `config_fields:Fields.t`. It has all fields that may have config values intra-procedurally.
* Before reporting to `config-impact-report.json`, it unions all `config_fields` from all specs.
* Using `all_config_fields`, it instantiates each summary and writes the results to `config-impact-report.json`.
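The steps above can be sketched in Python as follows. The `Summary` dataclass and `build_report` function are simplified, hypothetical stand-ins: the real summary and report contain more than shown here, and the report is written as JSON rather than returned as a dict.

```python
from dataclasses import dataclass, field


@dataclass
class Summary:
    config_fields: set = field(default_factory=set)
    unchecked_callees: set = field(default_factory=set)
    # frozenset of fields -> callees that are gated only if a field is a config
    unchecked_callees_conditional: dict = field(default_factory=dict)


def build_report(summaries):
    """Union config fields over all summaries, then instantiate each summary."""
    all_config_fields = set().union(*(s.config_fields for s in summaries.values()))
    report = {}
    for proc, s in summaries.items():
        callees = set(s.unchecked_callees)
        for fields, conditional in s.unchecked_callees_conditional.items():
            if not (fields & all_config_fields):
                callees |= conditional  # no gating field is a config: ungated
        report[proc] = sorted(callees)
    return report


summaries = {
    "init": Summary(config_fields={"mFlag"}),
    "run": Summary(
        unchecked_callees={"log"},
        unchecked_callees_conditional={
            frozenset({"mFlag"}): {"expensive"},
            frozenset({"mOther"}): {"render"},
        },
    ),
}
report = build_report(summaries)
```

Because `mFlag` is found to be a config field in `init`, `expensive` counts as gated in `run`, while `render` stays ungated.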
Reviewed By: ezgicicek
Differential Revision: D27326306
fbshipit-source-id: 42f16ca45
Summary:
This diff extends domain and semantics to understand object fields that may have config values.
Now, `Summary.t` has one more field `unchecked_callees_conditional`, which is a map from a set of object fields to a set of callees. The meaning is that the callees are called depending on the fields, ie
* if one of the fields is known to be an actual config value, the callees are safely gated,
* otherwise, the callees are ungated.
For example,
```
void foo() {
  if (mField1) {
    if (mField2) {
      callee1();
    }
    callee2();
  }
}
```
`foo` will have `unchecked_callees_conditional` value of
```
{ {mField1,mField2} -> {callee1},
{mField1} -> {callee2} }
```
Later, if we know that `mField2` has a config value, we can say `callee1` is gated, or if we know that `mField1` has a config value, we can say `callee1` and `callee2` are gated.
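The gating rule on the `foo` example can be checked with a small Python sketch. The `ungated_callees` helper and the encoding of the map as a dict keyed by `frozenset` are assumptions made for illustration.

```python
def ungated_callees(conditional, config_fields):
    """Return the callees that remain ungated given the known config fields."""
    ungated = set()
    for fields, callees in conditional.items():
        # Gated as soon as one of the guarding fields holds a config value.
        if not (fields & config_fields):
            ungated |= callees
    return ungated


foo = {
    frozenset({"mField1", "mField2"}): {"callee1"},
    frozenset({"mField1"}): {"callee2"},
}

only_field2 = ungated_callees(foo, {"mField2"})
only_field1 = ungated_callees(foo, {"mField1"})
no_config = ungated_callees(foo, set())
```

Knowing only `mField2` leaves `callee2` ungated, knowing `mField1` gates both callees, and knowing neither leaves both ungated.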
The following diff will add an analysis that collects object fields that may have config values.
Reviewed By: ezgicicek
Differential Revision: D27325522
fbshipit-source-id: d4aff58cb
Summary:
Copied the documentation from a document created by rgrig
(thanks!!).
Reviewed By: rgrig
Differential Revision: D27325829
fbshipit-source-id: 118e1a2be
Summary:
The explicit marker for nondeterministic states was used to speed up the
shallow implementations of Topl, which are now removed.
Reviewed By: jvillard
Differential Revision: D27297019
fbshipit-source-id: 0fce93817
Summary:
Refactoring the Java Integer model so that it uses the new
API designed for manipulating fields in Java.
Reviewed By: jvillard
Differential Revision: D27231810
fbshipit-source-id: 0d9e3c951
Summary:
## Issue:
On `master`, it seems that there is a missing newline when Infer prints the `tenv` for a structure type:
```bash
avj@platypus /tmp/infer_bug$ cat test.c
typedef struct {
int a;
} st1;
typedef struct {
int b;
} st2;
avj@platypus /tmp/infer_bug$ infer --version
Infer version v1.0.0-55871dd28
Copyright 2009 - present Facebook. All Rights Reserved.
avj@platypus /tmp/infer_bug$ rm -rf infer-out && infer --debug run -P -- gcc -c test.c
Logs in /tmp/infer_bug/infer-out/logs
Capturing in make/cc mode...
Found 1 source file to analyze in /tmp/infer_bug/infer-out
No issues found
avj@platypus /tmp/infer_bug$ grep -A1 "dummy" infer-out/captured/*/*.tenv.debug
dummy: falsestruct st1
fields: {
--
dummy: falsestruct st2
fields: {
--
dummy: falsestruct objc_class
fields: {}
```
(notice that `dummy: false` and `struct objc_class` are on the same line, with no spacing)
## Resolution
This PR adds an explicit newline at the end of pretty-printing a structured value, such that it is formatted correctly in the `tenv`:
```bash
avj@platypus /tmp/infer_bug$ infer --version
Infer version v1.1.0-bb5a33506
Copyright 2009 - present Facebook. All Rights Reserved.
avj@platypus /tmp/infer_bug$ rm -rf infer-out && infer --debug run -P -- gcc -c test.c
Logs in /tmp/infer_bug/infer-out/logs
Capturing in make/cc mode...
Found 1 source file to analyze in /tmp/infer_bug/infer-out
No issues found
avj@platypus /tmp/infer_bug$ grep -A1 "dummy" infer-out/captured/*/*.tenv.debug
dummy: false
struct st1
--
dummy: false
struct st2
--
dummy: false
struct objc_class
--
dummy: false
```
(*edit*: I forgot to build after committing; now with updated hash)
Signed-off-by: Andrew V. Jones <andrewvaughanj@gmail.com>
Pull Request resolved: https://github.com/facebook/infer/pull/1416
Reviewed By: skcho
Differential Revision: D27264518
Pulled By: jvillard
fbshipit-source-id: 3b86b4c22
Summary:
Before this diff, TOPL had 3 implementations:
1. a post-processing of biabduction summaries
2. a post-processing of pulse summaries
3. a deep embedding in pulse
1 and 2 additionally require instrumenting SIL to generate monitors for
the TOPL properties. 3 is faster than both 1 and 2, by a good lot, and
doesn't require instrumenting the SIL code. Thus, delete 1 and 2!
Also harmonise the CLI so that TOPL is activated by --topl, which
activates it as a checker, like other analyses.
Reviewed By: rgrig
Differential Revision: D27270178
fbshipit-source-id: e86cf972b
Summary:
Changing model for Java `Collection` interface. Every collection has now two internal fields, initially set to `null`. We also keep an extra field to compute emptiness. This model was implemented based on the [preexisting model for HashMap](https://github.com/facebook/infer/blob/master/infer/models/java/src/java/util/HashMap.java).
Existing models (`add`, `remove`, `set` and `is_empty`) have been updated accordingly and new models are provided: `init` and `clear`.
This model is not yet compatible with the `Map` interface but this change will happen very soon.
Reviewed By: ezgicicek
Differential Revision: D27126815
fbshipit-source-id: 79a5fe306
Summary: This diff ignores java.lang.Math method calls since they are all cheap.
Reviewed By: ezgicicek
Differential Revision: D27267282
fbshipit-source-id: ad0a4ef4f
Summary:
There could still be divisions by zero, eg in the "mod" case: consider
"x mod (1/2)" (doesn't matter what x is). Then we'd check "1/2 =? 0" and
since it's false conclude that it's safe to take the modulo... oops!
To make things safer, harden `Z` to not throw anymore.
Also add a layer of defense in depth by wrapping the functions that do
Z/Q operations in another layer of exception catching because we really
don't want to crash the entire analysis due to that.
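The failure mode and the hardening can be illustrated in Python with `fractions.Fraction`. This is a sketch under the assumption that modulo is computed on integer truncations of rationals; `naive_mod` and `safe_mod` are hypothetical names, and `None` stands for "result unknown".

```python
from fractions import Fraction
from typing import Optional


def naive_mod(x: Fraction, y: Fraction) -> Optional[Fraction]:
    """Checks the rational divisor against zero, then works on integers."""
    if y == 0:
        return None
    # For y = 1/2 the check above passes, but int(y) is 0: this raises.
    return Fraction(int(x) % int(y))


def safe_mod(x: Fraction, y: Fraction) -> Optional[Fraction]:
    """Never raises: checks the actual integer divisor and catches the rest."""
    try:
        divisor = int(y)
        if divisor == 0:
            return None
        return Fraction(int(x) % divisor)
    except ArithmeticError:
        # Second layer of defense: give up on the value, not on the analysis.
        return None


try:
    naive_mod(Fraction(7), Fraction(1, 2))
    naive_raised = False
except ZeroDivisionError:
    naive_raised = True

hardened = safe_mod(Fraction(7), Fraction(1, 2))
```

The naive version raises on `x mod (1/2)`, while the hardened one reports an unknown result and lets the caller continue.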
Reviewed By: martintrojer
Differential Revision: D27262569
fbshipit-source-id: e22187ca0