Summary:
When garbage-collecting addresses we would also remove their attributes.
But even though the addresses are no longer allocated in the heap, they
might show up in the formula and so we need to remember facts about
them.
This forces us to detect leaks closer to the point where addresses are
deleted from the heap, in AbductiveDomain.ml. This is a nice refactoring
in itself: doing so fixes some other FNs where we sometimes missed leak
detection on dead addresses.
This also makes it unecessary to simplify InstanceOf eagerly when
variables get out of scope.
Some new {folly,std}::optionals false positives that either are similar to existing ones or involve unmodelled smart pointers.
Reviewed By: da319
Differential Revision: D28126103
fbshipit-source-id: e3a903282
Summary:
Function calls that accept blocks as arguments may have additional arguments for the captured variables of the block. Cost models for these functions only considered the case where the block arguments didn't capture variables. This diff extends the model so that we handle captured variable case.
This fixes some FNs in the analysis.
Reviewed By: skcho
Differential Revision: D28183071
fbshipit-source-id: 6a045e80e
Summary:
As explained in the previous diff: when the access trace goes through
the invalidation step there is no need to print the invalidation trace
at all.
Note: only a few sources of invalidation are handled at the moment. The
following diffs gradually fix the other sources of invalidation.
Reviewed By: skcho
Differential Revision: D28098335
fbshipit-source-id: 5a5e6481e
Summary:
The eventual goal is to stop having separate sections of the trace
("invalidation part" + "access part") when the "access part" already
goes through the invalidation step. For this, it needs to record when a
value is made invalid along the path.
This is also important for assignements to NULL/0/nullptr/nil: right now
the way we record that 0 is not a valid address is via an attribute
attached to the abstract value that corresponds to 0. This makes traces
inconsistent sometimes: 0 can appear in many places in the same function
and we won't necessarily pick the correct one. In other words, attaching
traces to *values* is fragile, as the same value can be produced in many
ways. On the other hand, histories are stored at the point of access, eg
x->f, so have a much better chance of being correct. See added test:
right now its traces is completely wrong and makes the 0 in `if
(utf16StringLen == 0)` the source of the NULL value instead of the
return of `malloc()`!
This diff makes the traces slightly more verbose for now but this is
fixed in a following diff as the traces that got longer are those that
don't actually need an "invalidation" trace.
Reviewed By: skcho
Differential Revision: D28098337
fbshipit-source-id: e17929259
Summary:
This diff avoids dereference of C struct, in its frontend and its semantics of Pulse. In SIL, C
struct is not first-class value, thus dereferencing on it does not make sense.
Reviewed By: ezgicicek
Differential Revision: D27953258
fbshipit-source-id: 348d56338
Summary: This diff copies each field values inside setter/getter of ObjC.
Reviewed By: ezgicicek
Differential Revision: D27940521
fbshipit-source-id: 9977cae75
Summary:
When a block value is passed via more than one-depth of function calls, it is not analyzed correctly
because current inlining mechanism (specializing objc block parameters) of the frontend works for
only one-depth of block passing. This diff gives up analyzing initialized-ness of captured
variables in ObjC to avoid FPs.
Reviewed By: da319
Differential Revision: D27885395
fbshipit-source-id: fc6b4663c
Summary:
In D27430485 (a6ab4d38cf), we used the static cost of the callee to determine whether it was cheap/expensive. This diff improves on that by taking the whole instantiated cost of the function call (not just the callee's cost).
Also, if the callee is an unmodeled call, we consider it to be expensive as before.
Note: cost instantiation was used by hoisting. I refactored bunch of code there to reuse as much as code possible.
Reviewed By: skcho
Differential Revision: D27649302
fbshipit-source-id: 07d11f3dd
Summary:
This diff removes additional inferbo options `--bufferoverrun` from cost tests, since printing
inferbo issues is not that useful to understand cost results.
Reviewed By: ngorogiannis
Differential Revision: D27592496
fbshipit-source-id: 6ab3e6528
Summary: The translation of captured by reference variables has been fixed for ObjC blocks (D26945575 (778c629401)), so we do not need to ignore them in uninit analysis anymore.
Reviewed By: skcho
Differential Revision: D27063663
fbshipit-source-id: 447084d37
Summary:
We use `procedure_name` which is coming from `Procname.get_method` in explanation of cost issues. For blocks, procedure name includes a prefix `objc_block` and a suffix with `_x` where x is the block counter. However, displaying this name to the user is not pretty. Especially when we have nested blocks, procedure name looks like `objc_blockobjc_blockdirectUIMessageFromContentAndMetadata_10_23`.
This diff drops the block index suffix and replaces `objc_block` with a prettier version `^`(signifying block).
so instead, in the cost report, we will have `^^blockdirectUIMessageFromContentAndMetadata`.
Reviewed By: skcho
Differential Revision: D26945333
fbshipit-source-id: 9d135423c
Summary: Variables captured by reference do not have correct type in objc blocks. They are missing one reference. This diff sets the correct type of captured reference variables inside procdesc, similarly as we already have for cpp lambdas. The translation of block's body will then take into account the type of captured variable from procdesc.
Reviewed By: ezgicicek
Differential Revision: D26945575
fbshipit-source-id: 06a9d9cc6
Summary:
Adapting error messages in Pulse so that they become more intuitive for
developers.
Reviewed By: jvillard
Differential Revision: D26887140
fbshipit-source-id: 896970ba2
Summary:
This diff uses config-impact-issues.exp instead of issues.exp, like in
the cost checker.
Reviewed By: ezgicicek
Differential Revision: D26723761
fbshipit-source-id: 9c6779479
Summary:
Currently, we report on all functions that are not config checked. However, the aim of the analysis is to only report on these for specific functions. Moreover, this has performance implications in practice.
This diff instead reports on functions that occur on a json file that is passed by the command line option `config-data-file`.
Reviewed By: skcho
Differential Revision: D26666336
fbshipit-source-id: 290cd3ada
Summary: In Objective-C, `static const int var = ..` is not recognized as ICE (integral constant expression) unlike C++. To handle such loads better, this diff adds a check for `constant_global_array` as a workaround.
Reviewed By: skcho
Differential Revision: D26369461
fbshipit-source-id: e2dae11f1
Summary: D25952894 (1bce54aaf3) changes translation of struct assignments. This diff adopts to this change for loads from global struct arrays.
Reviewed By: skcho
Differential Revision: D26398627
fbshipit-source-id: cc1fb47ab
Summary:
Before this diff:
```
// Summary of const global
// { global -> v }
n$0 =* global
// n$0 -> {global}
x *= n$0
// x -> {global}
```
However, this is incorrect because we expect `x` have `v` instead of the abstract location of `global`.
To fix the issue, this diff lookups the initializer summary when `global` is evaluated as RHS of load statement.
After this diff:
```
// Summary of const global
// { global -> v }
n$0 =* global
// n$0 -> v
x *= n$0
// x -> v
```
Reviewed By: ezgicicek
Differential Revision: D26369645
fbshipit-source-id: 98b1ed085
Summary:
This diff resets the id generator before generating ObjC getter/setter, so parsed results are the
same without regard to the generation order. Note that the order may change when we change the type
of Procname.t since their hash values are used for the hash set of procnames.
Reviewed By: ngorogiannis
Differential Revision: D26277348
fbshipit-source-id: a66d77845
Summary:
Dear Infer team,
To contribute to Infer community, I would like to integrate infer#'s language agnostic layer into Infer.
Please help to review, discuss and consider to merge this feature.
Thanks,
Xiaoyu
Pull Request resolved: https://github.com/facebook/infer/pull/1361
Reviewed By: skcho
Differential Revision: D25928458
Pulled By: jvillard
fbshipit-source-id: 7726150b8
Summary:
**Existing heuristic**: If we have a call `foo(n)` that has no model and summary for `foo`, we underestimate its cost as constant[1].
However, if we have a model for `foo` (e.g.with modeled cost O(n)) but applying the model to arguments causes the cost to be Top (e.g if `n` has Top size), then we could have Top-poisoning where all the callers up the call chain will have Top costs [2].
To prevent these unintended Top-poisioning when adding models, this diff applies *the same heuristic* to modeled calls with Top cost and gives them constant cost. This way, when adding models, we wouldn't be introducing more Tops than if we were to have no models in the first place.
[1] This is problematic in itself and causes many FPs at diff time, but otherwise we would be getting Tops everywhere and would not be able to give any meaningful cost. E.g. for fblite, if we were to give unknown calls Top cost, #procedures with Top cost increases form 5% to 38% and #procedures with linear cost reduces by 99.75%.
[2] This was observed for `containsValue` for Instagram where %Tops increased by 88% :(
Reviewed By: skcho
Differential Revision: D26174644
fbshipit-source-id: 232354923
Summary: This diff adds an additional parameter of struct return type in ObjC's methods. The additional parameter had been supported only in C/C++ functions/methods for 5 years (D2865091 (ec80d40bdd)). If there is no specific reason not to do that, let's do it and fix the incorrect frontend translations.
Reviewed By: jvillard
Differential Revision: D26049748
fbshipit-source-id: 414b3011f
Summary:
When there was an assignment of C struct, `x = y;`, it was translated to the statements of load and store.
```
n$0 = *y
*x = n$0
```
However, this is incorrect in Sil, because a struct is not a value that can be assigned to registers. This diff fixes the translation as assignments of each field values :
```
n$0 = *y.field1
*x.field1 = n$0
n$0 = *y.field2
*x.field2 = n$0
...
```
It copies field values of C structs on:
* assign statement
* return statement
* declarations.
It supports nested structs.
Reviewed By: jvillard
Differential Revision: D25952894
fbshipit-source-id: 355f8db9c
Summary: This diff revises the trace generation of the uninitialized value checker, by introducing a new diagnostics for it.
Reviewed By: jvillard
Differential Revision: D25433775
fbshipit-source-id: 1279c0de4
Summary: This diff gives semantics of dispatch_sync to call the closure parameter.
Reviewed By: jvillard
Differential Revision: D25423175
fbshipit-source-id: a45309073
Summary:
This diff supports inter-procedural uninit analysis in pulse.
* Added `MustBeInitialized` attribute to pre state when an address is read
* Remove `Uninitialized` attribute when callee has `WrittenTo` for the
same address
Reviewed By: jvillard
Differential Revision: D25368492
fbshipit-source-id: cbc74d4dc
Summary:
This diff adds uninitialized value check in pulse. For now, it supports only simple cases,
- declared variables with a type of integer, float, void, and pointer
- malloced pointer variables that points to integer, float, void, and pointer
TODOs: I will add more cases in the following diffs.
- declared/malloced array
- declared/malloced struct
- inter-procedural checking
Reviewed By: jvillard
Differential Revision: D25269073
fbshipit-source-id: 317df9a85
Summary:
The frontend of ObjC regarding to captured variable was incorrect: it set capturing mode as
by-reference always, but it actually translaged as if all captured variables were passed with
by-value. This diff fixes this based on the document.
https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/Blocks/Articles/bxVariables.html
* global variable: by reference
* local variable: by value
* `static` local variable: by reference
* `__block` local variable: by reference
* parameter: by value
Reviewed By: jvillard
Differential Revision: D25306122
fbshipit-source-id: ec499d705
Summary:
This diff adds trace of closure in autoreleasepool checker. We introduce a symbolic trace value for closure variable.
* It is added to the trace when closure variable is called
* and is substituted to concrete one when actual closure is given later.
Reviewed By: ezgicicek
Differential Revision: D25025883
fbshipit-source-id: a6e246be7
Summary:
Existing closure substitution only supported direct block calls to formals.
The following didn't work since the domain was only keeping track of loads/calls from formals, but didn't support stores.
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
This diff adds support for assigning a block to a local variable so that we can specialize the above example.
We now have a pair domain
- existing mapping from ids to block vars
- a new mapping from mangled to block specializations
the latter allows us to update the mapping in local block assignment (via store).
Reviewed By: skcho
Differential Revision: D25030234
fbshipit-source-id: 3f172341c
Summary:
Specialized closure substitution was broken for conditionals:
```
void foo(dispatch_block_t block1){
if (x){
block1(); // not replaced with specialized implementation
}
}
```
The problem was that when substituting function calls, it only used memory state at the exit node, rather than at each program point.
We could solve this by
- reverting the domain change in D24418560 (c47911359a), i.e. collecting all possible mappings conservatively (e.g. switch the domain back to `Map`)
- pass the `invariant_map` for substitutions at each program point.
We go with the second option here.
The closure substitution is still somewhat broken as exemplified by the following example:
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
Reviewed By: skcho
Differential Revision: D24993962
fbshipit-source-id: ebadddb58
Summary:
This was left as a TODO before: where to place calls to destructors for
C++ temporaries that are only conditionally creating when evaluating an
expression. This can happen inside the branches of a conditional
operation `b?e:f` or in potentially-short-circuited conditions on the
righ-hand side of `&&` and `||` operators.
Following the compilation scheme of clang (observed by looking at the
generated LLVM bitcode), we instrument the program with "marker"
variables, so that for instance `X x = true?X():y;` becomes (following
the execution on the true branch):
```
marker1 = 0; // initialize all markers to 0
PRUNE(true) // entering true branch
X::X(&temporary); // create temporary...
marker1 = 1; // ...triggers setting its marker to 1
X::X(&x, &temporary); // finish expression
if (marker1) {
X::~X(&temporary); // conditionally destroy the temporary
}
```
In this diff, you'll find code for:
- associating markers to temporaries that need them
- code to initialize markers to 0 before full-expressions
- code to conditionally destroy temporaries based on the values of the
markers once the full-expression has finished evaluating
Reviewed By: da319
Differential Revision: D24954070
fbshipit-source-id: cf15df7f7
Summary:
Split the translation of return more aggressively between:
1. the instruction that has to happen before the translation of the sub-expr
2. the sub-expr
3. the instruction that has to happen after the sub-expr
This is needed for the next diff which creates potentially large CFGs in
(2).
Reviewed By: da319
Differential Revision: D24954071
fbshipit-source-id: a7e7e2527
Summary:
This is several inter-connected changes together to keep the tests
happy.
The ConditionalOperator `b?t:e` is translated by first creating a
placeholder variable to temporarily store the result of the evaluation
in each branch, then the real thing we want to assign to reads that
variable. But, there are situations where that changes the semantics of
the expression, namely when the value created is a struct on the stack
(eg, a C++ temporary). This is because in SIL we cannot assign the
*address* of a program variable, only its contents, so by the time we're
out of the conditional operator we cannot set the struct value
correctly anymore: we can only set its content, which we did, but that
results in a "shifted" struct value that is one dereference away from
where it should be.
So a batch of changes concern `conditionalOperator_trans`:
- instead of systematically creating a temporary for the conditional,
use the `trans_state.var_exp_typ` provided from above if available
when translating `ConditionalOperator`
- don't even set anything if that variable was already initialized by
merely translating the branch expression, eg when it's a constructor
- fix long-standing TODO to propagate these initialization facts
accurately for ConditionalOperator (used by `init_expr_trans` to also
figure out if it should insert a store to the variable being
initialised or not)
The rest of the changes adapt some relevant other constructs to deal
with conditionalOperator properly now that it can set the current
variable itself, instead of storing stuff inside a temp variable. This
change was a problem because some constructs, eg a variable declaration,
will insert nodes that set up the variable before calling its
initialization, and now the initialization happens *before* that setup,
in the translation of the inner conditional operator, which naturally
creates nodes above the current one.
- add a generic helper to force a sequential order between two
translation results, forcing node creation if necessary
- use that in `init_expr_trans` and `cxxNewExpr_trans`
- adjust many places where `var_exp_typ` was incorrectly not reset when translating sub-expressions
The sequentiality business creates more nodes when used, and the
conditionalOperator business uses fewer temporary variables, so the
frontend results change quite a bit.
Note that biabduction tests were invaluable in debugging this. There
could be other constructs to adjust similarly to cxxNewExpr that were
not covered by the tests though.
Added tests in pulse that exercises the previous bug.
Reviewed By: da319
Differential Revision: D24796282
fbshipit-source-id: 0790c8d17
Summary:
Namely `enumerateObjectsUsingBlock` which takes a collection and a block as an arg and the iterates over the collection and applies the block to each element of the collection.
This is a common way to iterate over Objc collections, so let's add a model for it.
Reviewed By: skcho
Differential Revision: D24604590
fbshipit-source-id: 1ceeb4b40
Summary:
This diff revises the translation of message expression's arguments in ObjC frontend. In the
frontend, it massages the arguments when calling a static method, so the class or object value is
not given to the static method as the first parameter.
The problem is that it used a raise-exception-and-catch way to detect where we remove the first
parameter. This way of using an exception is not only hard to understand, but also incorrectly
removed the first parameter, with breaking abstract semantics sometimes. (See the added test.) This diff
avoids using the exception.
Reviewed By: jvillard
Differential Revision: D24565513
fbshipit-source-id: 0a84ca394
Summary:
The `make clean` did not remove objects and dot files, so
```
infer/tests/codetoanalyze/objc/frontend$ make test
infer/tests/codetoanalyze/objc/frontend$ make clean
infer/tests/codetoanalyze/objc/frontend$ make test
```
the second `make test` did nothing. This diff adds additional regular
expressions to clean all objects and dot files generated.
Reviewed By: ngorogiannis
Differential Revision: D24566169
fbshipit-source-id: b8c50c922
Summary: This diff fixes on-demand symbolic value generation of a class that inherits NSEnumerator.
Reviewed By: ngorogiannis
Differential Revision: D24504955
fbshipit-source-id: bcb20e8aa