Summary:
From the user perspective, the current behavior is confusing.
The users intutitively expect the inner class to inherit Nullsafe mode
from the outer one.
Having a class that is Nullsafe but the inner is not is hence dangerous
and misleading.
For the sake of completeness and to support gradual strictification, we
allow the nested class to improse additional strictness. Particularly,
the inner class can be Nullsafe but the outer can be not.
A follow up to this diff will include warnings telling about redundant and wrong usages of nested annotations.
Reviewed By: artempyanykh
Differential Revision: D21228055
fbshipit-source-id: 75755ad1d
Summary: Iterator invalidation traces were based on vector rather than iterator itself.
Reviewed By: ezgicicek
Differential Revision: D21202047
fbshipit-source-id: 62ce8a488
Summary:
We model Enum.name as returing a constant name, rather than getting real field names. We did this
because we couldn't think of any big gains, in terms of analysis precision/performance, from getting
the real names.
Reviewed By: ezgicicek
Differential Revision: D21201730
fbshipit-source-id: a2dc01a44
Summary: This diff revises the models of Collection.set and get to handle its elements.
Reviewed By: ezgicicek
Differential Revision: D21201242
fbshipit-source-id: 9c248453d
Summary:
We ignored allocator models for vectors, and were not able to initialize vectors properly. This diff fixes this issue.
It also adds a test which was a FN before.
Reviewed By: skcho, jvillard
Differential Revision: D21089492
fbshipit-source-id: 6906cd1d1
Summary: D21155014 replaced `skip` call with a Load but this was not right. Instead, let's add a new builtin function (rather than skip) so that other analyses can freely model it as they want.
Reviewed By: jvillard
Differential Revision: D21178286
fbshipit-source-id: c214ccfb0
Summary: Java has this pattern of wrapping non-thread-safe containers in factory methods producing identically-typed results, but wrapped in a synchronised shell. This diff teaches RacerD about some common factory methods and uses the attribute domain to track the dynamic type of their results.
Reviewed By: ezgicicek
Differential Revision: D21155538
fbshipit-source-id: 42ebe6251
Summary: Complete the set of models for java containers that Infer should not report thread safety violations.
Reviewed By: ezgicicek
Differential Revision: D21138280
fbshipit-source-id: 01e1944b6
Summary: Models were partial and/or simply missing (`Map` writes!). Now the modelled containers use inheritance for conciseness (`List` reads are only those not caught by the `Collection` matcher, etc). Also, add URLs to documentation sources.
Reviewed By: ezgicicek
Differential Revision: D21132069
fbshipit-source-id: fefb360f0
Summary: `CFBridgingRelease` and `__bridge_transfer` which I'll model later, transfer the memory model from manual memory ref count to ARC (automatic ref count), so to avoid false positives this needs to be modelled. We can simply remove the Allocated attribute from the state, which means we won't try to track that memory anymore.
Reviewed By: skcho
Differential Revision: D21088218
fbshipit-source-id: 3520a0d59
Summary: This diff suppresses cost issues on lambda and auto-generated procedures, since they were too noisy.
Reviewed By: ezgicicek
Differential Revision: D21153619
fbshipit-source-id: 65ad6dcc3
Summary:
Replace horrible hack with ok hack.
The main difficulty in implementing the disjunctive domain is to avoid
the quadratic time complexity of executing the same disjuncts over and
over again when going around loops:
First time around a loop, assuming for example a single disjunct `d`:
```
[d]
loop body
[d1' \/ d2']
```
Second time around the same loop: the new pre will be the join of the
posts of predecessor nodes, so `old_pre \/ post(loop,old_pre)`, i.e.
`d \/ d1' \/ d2'`. Now we need to execute `loop body` again
*without running the symbolic execution of `d` again* (and the time after
that we'll want to not execute `d`, `d1'`, or `d2'`).
Horrible hack (before): Disjuncts have a boolean "visited" attached
that does its best to keep track of whether a given disjunct is old or
new. When executing a single *instruction* look at the flag and skip the
state if it's old. Of course we have no way to know for sure so it turns
out it was often wrongly re-executing old disjuncts. This was also
producing the wrong results over even simple loops: only the last
iteration would make it outside the loop for some reason. Overall, the
semantics were pretty untractable and shady at best.
New hack (this diff): only run instructions of a given *node* on
disjuncts that are not physically equal to the "pre" ones already in the
invariant map for the current node.
This gives the correct result over simple loops and a nice performance
improvement in general (probably the old heuristic was hitting the
quadratic bad case more often).
Reviewed By: skcho
Differential Revision: D21154063
fbshipit-source-id: 5ee38c68c
Summary:
We translated the expression `CXXStdInitializerListExpr` naively in D3058895 as a call to
a skip function, with the hope that it would be translated better in the future. However, the naive means that we lose access to the initialized list/array because we are simply skipping it. So, even if we want to model the initializer properly, we have to deal with the skip specially.
This diff tries to solve this problem by removing the skip call whenever
possible. Instead, we translate the underlying array/list as a Load, so
that when it is passed to the constructor, we can pick it up.
For the following initialization:
``` std::vector<int*> vec = {nullptr};
```
Before, we translated it as
```
*&0$?%__sil_tmpSIL_materialize_temp__n$7[0]:int* const =null
n$8=_fun___infer_skip_function(&0$?%__sil_tmpSIL_materialize_temp__n$7:int* const [1*8] const )
n$9=_fun_std::vector<int*,std::allocator<int*>>::vector(&vec:std::vector<int*,std::allocator<int*>>*,n$8:std::initializer_list<int*>)
```
However, this means, `n$8` would be result of something skipped which we can't reason about. Instead, we just pass the underlying initialized array now, so we get the following translation:
```
*&0$?%__sil_tmpSIL_materialize_temp__n$7[0]:int* const =null
n$8=*&0$?%__sil_tmpSIL_materialize_temp__n$7:int* const [1*8] const
n$9=_fun_std::vector<int*,std::allocator<int*>>::vector(&vec:std::vector<int*,std::allocator<int*>>*,n$8:std::initializer_list<int*>)
```
Reviewed By: jvillard
Differential Revision: D21155014
fbshipit-source-id: 75850b1e6
Summary:
When encountering a constant, pulse creates an abstract value (a
variable) to represent it, and remembers that it's equal to it. The
problem is that pulse doesn't yet know how to deal with the fact that
some variables are going to be equal to each other.
This hacks around this issue in the case of constants, within the same
procedure, by remembering which constants have been assigned to which
place-holder variables, and serving those variables again when the same
constant is translated again.
Limitation: this doesn't work across procedure calls as the "constant
maps" are not saved in summaries.
Something to look out for: we don't want to make `if (p == NULL)` create
a path where `p` is invalid (we only make null invalid when we see an
assignment from 0, i.e. `p = NULL;`).
Reviewed By: ezgicicek
Differential Revision: D21089961
fbshipit-source-id: 5ebb85d0a
Summary:
1. Package will make the error too verbose.
2. We don't even need to say it is "class" because we say it in the error
description ("Class has 0 issues and can be marked Nullsafe").
Reviewed By: artempyanykh
Differential Revision: D21131998
fbshipit-source-id: 6ccca7615
Summary:
One source of false positives on container races is when the container member field is initialised to a concurrent version in a constructor, but the static type of the field doesn't reflect the thread safety of it.
This solution
- tracks flows from constructors of safe data structures to abstract addresses;
- initialises the initial attribute state when analysing a non-constructor method to that achieved by all constructors/class-initializers.
- checks for that attribute when recording container accesses.
Reviewed By: jvillard
Differential Revision: D21089428
fbshipit-source-id: 02a88f6e8
Summary: Modeling vector iterator with two internal fields: an internal array and an internal pointer. The internal array field points to the internal array field of a vector; the internal pointer field represents the current element of the array. For now `operator++` creates a fresh element inside the array.
Reviewed By: ezgicicek
Differential Revision: D21043304
fbshipit-source-id: db3be49ce
Summary:
Add a path condition to each symbolic state, represented in sledge's arithmetic domain. This gives a precise account of arithmetic constraints. In particular, it is relation and thus is more robust in the face of inter-procedural analysis.
This is gated behind a flag for now as there are performance issues with the new arithmetic.
Reviewed By: jberdine
Differential Revision: D20393947
fbshipit-source-id: b780de22a
Summary:
There are two types of anonymous classes (not user defined classes):
- classic anonymous classes (defined as $<int> suffixes)
- lambda classes (corresponding to lambda expressions). Experimentally,
they all have form `$Lambda$_<int>_<int>`, but the code just uses
`$Lambda$` as a heuristic so it is potentially more robust.
# Problem this diff solves
When generate meta-issues for nullsafe, we are interested only in
user-defined classes, so we merge all nested anonymous stuff to
corresponding user-defined classes and hence aggregate the issues.
Without this diff, for each lambda in the code, we would report this as
a separate meta-issue, which would both screw up stats and be confusing
for the user (when we start reporting mode promo suggestions!).
Reviewed By: artempyanykh
Differential Revision: D21042928
fbshipit-source-id: a7be266af
Summary:
This diff revises how to handle the unknown location in inferbo in two ways:
* stop appending field to the `Unknown` location, e.g. `Unknown.x.a` is evaluated to `Unknown`
* redesign the abstract of multiple locations, like `Bottom` < `Unknown` < `Known` locations
I am doing them in one diff since applying only one of them showed bad results.
Background: `Unknown` was adopted for abstracting all unknown concrete locations, so we could avoid missing semantics of assignments to unknown locations. We tried to keep soundness. However, it brought some other problems related to precision and performance.
1. Sometimes especially when Inferbo failed to reason precise pointer values, `Unknown` may point to many other abstract locations.
2. At that time, value assignments to `*Unknown` makes the situation worse: many abstract locations are updated with imprecise values.
This problem harmed not only its precision, but also its performance since it introduced more location entries in the abstract memory.
Reviewed By: jvillard
Differential Revision: D21017789
fbshipit-source-id: 0bb6bd8b5
Summary: The flags `--biabduction-fallback-model-alloc-pattern` and `--biabduction-fallback-model-free-pattern` were unused because we removed the models from .inferconfig a while ago because of too many false positives. We are implementing a better memory leak check based on Pulse, and are adding the similar flags `--pulse-model-alloc-pattern` and `--pulse-model-free-pattern`.
Reviewed By: jvillard
Differential Revision: D21061511
fbshipit-source-id: 1b3476c22
Summary:
See the code comment re: why don't we also recommend "strict" at this
stage. We can always change it later when we think users are happy with
strict.
Reviewed By: artempyanykh
Differential Revision: D21039553
fbshipit-source-id: 758ccf32c
Summary:
This diff is a step forward to the state when the list of type violations is
independent of the mode (and we use mode solely to decide re: whether to
report or not).
This fixes a case when we incorrectly defined possible promo mode (see
the test payload)
Reviewed By: artempyanykh
Differential Revision: D20948897
fbshipit-source-id: 616b96f96
Summary:
See the comments in the code why it makes logical sense.
This diff is a step forward the state when list of type violations is
independent of the mode (and we use mode solely to decide re: whether to
report or not).
This fixes majority of cases in ModePromotions.java
Reviewed By: artempyanykh
Differential Revision: D20948656
fbshipit-source-id: 82c0d530b
Summary:
Currently we exlude only if the method is based on deprecated config
packages.
Lets use the proper method, which covers both cases (config +
user-defined third party repo).
Reviewed By: artempyanykh
Differential Revision: D20946506
fbshipit-source-id: c3332667f
Summary:
Previously, we learned to detect if Default mode class can be made
Nullsafe(LOCAL).
Lets generalize it and calculate the precise mode.
NOTE 1: We don't distinct shades of "Trust some". We also don't
recommend trust some and recommend "Trust all" instead.
NOTE 2: As you can see from the test payload (see ModePromotions.java),
the precise calculation is not working as expected. This is due to a bug
in nullsafe implementation/design. See follow up diffs that will fix
this test.
Reviewed By: artempyanykh
Differential Revision: D20941345
fbshipit-source-id: 2255359ba
Summary: Unify the models of malloc and for the Create and Copy functions for Core Graphics. This add the null case from the malloc model to the Core Graphics models.
Reviewed By: jvillard
Differential Revision: D20890956
fbshipit-source-id: 278ac9d2f
Summary:
As soon as pulse detects an error, it completely stops the analysis and loses the state where the error occurred. This makes it difficult to debug and understand the state the program failed. Moreover, other analyses that might build on pulse (e.g. impurity), cannot access the error state.
This diff aims to restore and display the state at the time of the error in `PulseExecutionState` along with the diagnostic by extending it as follows:
```
type exec_state =
| represents the state at the program point that caused an error *)
```
As a result, since we don't immediately stop the analysis as soon as we find an error, we detect both errors in conditional branches simultaneously (see test result changes for examples).
NOTE: We need to extend `PulseOperations.access_result` to keep track of the failed state as follows:
```
type 'a access_result = ('a, Diagnostic.t * t [denoting the exit state] ) result
```
Reviewed By: jvillard
Differential Revision: D20918920
fbshipit-source-id: 432ac68d6
Summary: Consider functions that simply exit as impure by extending the impurity domain with `AbstractDomain.BooleanOr` that signifies whether the program exited.
Reviewed By: skcho
Differential Revision: D20941628
fbshipit-source-id: 19bc90e66
Summary:
This information can be useful for tooling responsible for further
processing (e.g. metric calculation and logging)
Reviewed By: artempyanykh
Differential Revision: D20914583
fbshipit-source-id: 61804d88f
Summary: The heuristics is to find a method in non-abstract sub-classes. See D20647101.
Reviewed By: jvillard
Differential Revision: D20491461
fbshipit-source-id: 759713ef4
Summary:
This diff distinguishes array declaration and size-setting in trace. For example, when there is an
assume statement on an array size, the array size can be pruned to another value. In which case, we
want to see "Set array size" in the trace, instead of "Array declaration".
Reviewed By: jvillard
Differential Revision: D20914930
fbshipit-source-id: 0253fb69e
Summary:
This diff lifts the `PulseAbductiveDomain.t` in `PulseExecutionState` by tracking whether the program continues the analysis normally or exits unusually (e.g. by calling `exit` or `throw`):
```
type exec_state =
| ContinueProgram of PulseAbductiveDomain.t (** represents the state at the program point *)
| ExitProgram of PulseAbductiveDomain.t
(** represents the state originating at exit/divergence. *)
```
Now, Pulse's actual domain is tracked by `PulseExecutionState` and as soon as we try to analyze an instruction at `ExitProgram`, we simply return its state.
The aim is to recover the state at the time of the exit, rather than simply ignoring them (i.e. returning empty disjuncts). This allows us to get rid of some FNs that we were not able to detect before. Moreover, it also allows the impurity analysis to be more precise since we will know how the state changed up to exit.
TODO:
- Impurity analysis needs to be improved to consider functions that simply exit as impure.
- The next goal is to handle error state similarly so that when pulse finds an error, we recover the state at the error location (and potentially continue to analyze?).
Disclaimer: currently, we handle throw statements like exit (as was the case before). However, this is not correct. Ideally, control flow from throw nodes follows catch nodes rather than exiting the program entirely.
Reviewed By: jvillard
Differential Revision: D20791747
fbshipit-source-id: df9e5445a
Summary:
Malloc returns either an allocated object or a null pointer if there is no memory available. Modelling that.
This has always been a bit contentious because this leads to NPEs that people often ignores because they don't care. But if we don't model this, then we have FPs when people do take this into account when freeing the memory.
Reviewed By: jvillard
Differential Revision: D20791692
fbshipit-source-id: 6fd259f12
Summary:
This diff limits the depth of abstract location by a constant.
problem: Inferbo generated too many of abstract locations, especially when struct types had many pointer fields and Inferbo was not able to analyze the objects precisely. Since the number of generated abstract locations were exponential to the number of fields, it resulted in OOM in the end.
(reported by zyh1121 in https://github.com/facebook/infer/issues/1246)
Reviewed By: jvillard
Differential Revision: D20818471
fbshipit-source-id: f8af27e5c
Summary:
Currenlty the cost issue is printed at the first node of a function, which is usually the first
statment of the function. This may give a wrong impression that the cost of the statement is
changed.
This diff re-locate where to print issues with heuristics. Going backward from the first node
lines, it looks up a line satisfying,
1. A line should start with <fname> or should include " <fname>".
2. The <fname> found in 1 should be followed by a space, '<', '(', or end of line.
Reviewed By: jvillard
Differential Revision: D20766876
fbshipit-source-id: b4fee3180
Summary:
It's easy to create large arrays in code, eg `int x[1UL << 16];`, but
these can generate huge nodes in SIL because zero-initialization is
translated by zero-ing structures element by element. Introduce a
builtin to use instead. Keep the naive method for small structures (with
a configurable limit on "small").
Reviewed By: dulmarod
Differential Revision: D20836836
fbshipit-source-id: 6bf5410f8
Summary: Modelling `CG.*Release ` and `CFRelease` as `free`. This is what we were doing in biabduction.
Reviewed By: skcho
Differential Revision: D20767174
fbshipit-source-id: c77c1cdc6
Summary:
This models all the Create and Copy functions from CoreGraphics, examples in the tests.
These functions all allocate memory that needs to be manually released.
The modelling of the release functions will happen in a following diff. Until then, we have some false positives in the tests.
This check is currently in biabduction, and we aim to move it to Pulse.
Reviewed By: jvillard
Differential Revision: D20626395
fbshipit-source-id: b39eae2d9
Summary:
- Add `no_return` models for Java's `exit(...)` methods (can be extended further later on)
- handle throw-catch better by short-cutting throw nodes to not exit node but to all **catch nodes** that are reachable by the node. If there is no catch node, we short-cut to the exit node as before.
This removes a FP from deadstore tests because before we simply were not able to handle CF from throw-> catch nodes at all.
Reviewed By: skcho
Differential Revision: D20769039
fbshipit-source-id: e978f6cdb
Summary:
To find a method in non-abstract sub-classes, this diff applies the
same heuristics of inferbo.
* If the class is an interface: Find its unique sub-class and apply the heuristics recursively.
* If the class is an abstract class: Find/use its own summary if possible. If not found, find
one (arbitrary but deterministic) summary from its sub-classes.
* Otherwise: Find its own summary.
Reviewed By: ezgicicek
Differential Revision: D20647101
fbshipit-source-id: 2f8f3ff81
Summary: When looking at some reports I realised that adding the place where the memory becomes unreachable to the trace makes it more readable.
Reviewed By: skcho
Differential Revision: D20790277
fbshipit-source-id: d5df69e68
Summary:
The attribute `[no_return]` signifies that a function doesn't return. Previously, pre-analysis had cut the links to successor nodes of such no-return function nodes. This was intended to help with suppressing reporting on unreachable paths for some analyses. However, this results in having these nodes as dangling, with no connection to exit nodes.
This diff additionally shortcuts these no-return function nodes to exit node. This would allow us to enhance inter-procedural analyses like pulse to kepp track of paths that do not return since we will be keeping their connections at exit node rather than completely cutting them of as before. It would also allow us to assume that all paths start at the one start node and end at the one exit node (at least syntactically in the CFG).
Reviewed By: skcho
Differential Revision: D20736043
fbshipit-source-id: 0eace1bdb