Summary:
When C and C++ code handle a common struct typed value, the struct
type is handled as a `CStruct` in the C code, but as a `CppClass` in
the C++ code. On the other hand, `Fieldname.t` contains a string of
field and **the struct type**. As a result, even if a same field is
accessed in C and C++ code, the accessed fieldnames are different.
```
void callee_in_c(struct s* x) {
x->a = 3;
}
void caller_in_cpp() {
struct s x;
x.a = 5;
callee_in_c(&x);
// HERE
}
```
For example, in the above code, `caller_in_cpp` sets the field `a` as
5, then calls `callee_in_c`, which sets the field `a` as 3. However,
at `HERE`, the value of `x` in Pulse is `{a -> 5, a -> 3}`, because the two
fieldnames are addressed as different ones.
To avoid the issue, this diff loosens the fieldname comparison in
Pulse.
Reviewed By: jvillard
Differential Revision: D26000812
fbshipit-source-id: 77142ebda
Summary:
Previously, only names containing '$' were considered synthetic. We need
to extend the logic and look for "_UL_" in the name as well.
Also I deduped 4 different impls of "is_synthetic/generated/autogen".
Reviewed By: ngorogiannis
Differential Revision: D25899232
fbshipit-source-id: 9463eca6b
Summary:
When the body of the loop doesn't created a node then they don't get
wired correctly to the rest of the loop and end up dangling. Force node
creation to fix that.
Fixes https://github.com/facebook/infer/issues/1373
Reviewed By: ezgicicek
Differential Revision: D25804185
fbshipit-source-id: 85108bdd9
Summary:
In `Config`, the lists generated by `mk_string_list`, `mk_path_list`, `mk_rest_actions` are reversed implicitly, which made it hard for developers to use them correctly. What the previous and this diff will do is to change the list variables of the `Config` to not-reversed one.
* diff1: First diff adds `RevList` to distinguish reversed lists explicitly. All usages of the reversed list should be changed to use `RevList`'s lib calls.
* diff2: Then this diff will change types of `Config` variables to not-reversed, normal list.
Reviewed By: ngorogiannis
Differential Revision: D25562303
fbshipit-source-id: 4cbc6d234
Summary:
In `Config`, the lists generated by `mk_string_list`, `mk_path_list`, `mk_rest_actions` are reversed implicitly, which made it hard for developers to use them correctly. What this and the next diff will do is to change the list variables of the `Config` to not-reversed one.
* diff1: First this diff adds `RevList` to distinguish reversed lists explicitly. All usages of the reversed list should be changed to use `RevList`'s lib calls.
* diff2: Then the next diff will change types of `Config` variables to not-reversed, normal list.
Reviewed By: ngorogiannis
Differential Revision: D25562297
fbshipit-source-id: b96622336
Summary: The Ondemand entry point `analyze_proc_desc` exists purely to support specialisation under biabduction. After fixing the storing of specialised `proc_desc`s for java it suffices to use `analyze_proc_name` which will work just fine in its place.
Reviewed By: jvillard
Differential Revision: D25421763
fbshipit-source-id: b162feec3
Summary:
Since D20736043 (d84fea52ae) is adding edges from the noreturn function node to exit node, analyzers should
handle the state differently to normal states.
Reviewed By: ezgicicek
Differential Revision: D25402576
fbshipit-source-id: a98e41b0c
Summary:
This diff adds the ability to skip translation with `... && neg ( pattern)` logic so that we can skip translation of some files if the source does not contain a pattern.
Note that `skip-translation` expects a list of patterns as disjunctions:
https://www.internalfb.com/intern/diffusion/INFER/browse/master/infer/src/IR/inferconfig.ml?commit=76ae5fa0d3376573f6d04814e47ff6b5a9dd9746&lines=74
whereas we want the ability to have conjuctions inside.
## Context
Immutability analysis requires analyzing generated code which might have `Immutable` annotations. When analysing fbandroid, we skip all generated code:
```
"skip-translation": [
{
"source_contains": "generated",
"language": "Java"
}
],
```
However, rather than analyzing all generated code (which might be expensive across all targets) by removing the above, with this diff, we only analyze generated code that doesn't contain e.g. `Immutable` and skip all other generated code as before:
```
"skip-translation": [
{
"source_contains": "generated",
"source_not_contains": "Immutable",
"language": "Java"
}
],
```
Reviewed By: ngorogiannis
Differential Revision: D25328931
fbshipit-source-id: 3ae6ae92a
Summary:
This diff uses Pvar.t in CapturedVar.t, so that
* it can include additional info in Pvar.t
* it can avoid some `Pvar.mk` calls when using the captured variables
Reviewed By: jvillard
Differential Revision: D25331763
fbshipit-source-id: 4e0c2ab4a
Summary:
Models are currently not specialised*. Remove the ability entirely, to allow further changes.
*this is proved by the fact that Infer doesn't crash, which it would if it tried to specialise models, because their procdesc is the already pre-analysed one, and pre-analysing twice a procdesc will crash.
Reviewed By: jvillard
Differential Revision: D25027698
fbshipit-source-id: 76ae5fa0d
Summary:
A Topl "small step" is a call to a method that is of interest to the
automaton. When such a call of interest is made, the topl component of
PulseAbductiveDomain.t is updated. This means that intra-procedural
Topl should now work entirely inside Pulse, without instrumenting Sil.
Main TODOs:
- add error extraction
- implement inter-procedural (PulseTopl.large_step)
Reviewed By: jvillard
Differential Revision: D25028286
fbshipit-source-id: e31a96d13
Summary:
Specialized closure substitution was broken for conditionals:
```
void foo(dispatch_block_t block1){
if (x){
block1(); // not replaced with specialized implementation
}
}
```
The problem was that when substituting function calls, it only used memory state at the exit node, rather than at each program point.
We could solve this by
- reverting the domain change in D24418560 (c47911359a), i.e. collecting all possible mappings conservatively (e.g. switch the domain back to `Map`)
- pass the `invariant_map` for substitutions at each program point.
We go with the second option here.
The closure substitution is still somewhat broken as exemplified by the following example:
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
Reviewed By: skcho
Differential Revision: D24993962
fbshipit-source-id: ebadddb58
Summary:
This was left as a TODO before: where to place calls to destructors for
C++ temporaries that are only conditionally creating when evaluating an
expression. This can happen inside the branches of a conditional
operation `b?e:f` or in potentially-short-circuited conditions on the
righ-hand side of `&&` and `||` operators.
Following the compilation scheme of clang (observed by looking at the
generated LLVM bitcode), we instrument the program with "marker"
variables, so that for instance `X x = true?X():y;` becomes (following
the execution on the true branch):
```
marker1 = 0; // initialize all markers to 0
PRUNE(true) // entering true branch
X::X(&temporary); // create temporary...
marker1 = 1; // ...triggers setting its marker to 1
X::X(&x, &temporary); // finish expression
if (marker1) {
X::~X(&temporary); // conditionally destroy the temporary
}
```
In this diff, you'll find code for:
- associating markers to temporaries that need them
- code to initialize markers to 0 before full-expressions
- code to conditionally destroy temporaries based on the values of the
markers once the full-expression has finished evaluating
Reviewed By: da319
Differential Revision: D24954070
fbshipit-source-id: cf15df7f7
Summary: Cleanup `Typ` by moving all constant types to `StdTyp`. Also remove `Typ.typ` as it's just `Typ.t` now.
Reviewed By: jberdine
Differential Revision: D24620397
fbshipit-source-id: 4764f87ef
Summary:
The global type environment is loaded in all analysis workers, so it is best to minimise its memory footprint. `MaxSharing` does a great job at this but
- it is very slow
- it may be involved in observed segfaults, since it uses potentially unsafe operations
This diff replaces `MaxSharing` with a post-facto poor-man's hashcons pass through the parts involved in a java type environment. It is not as efficient as `MaxSharing` at reducing footprint, but it's much faster.
Reviewed By: jberdine
Differential Revision: D24590400
fbshipit-source-id: c37100325
Summary:
Emit the crucial parts of Pulse summaries as json to enable
post-processing by external tools. Stop somewhat arbitrarily at some
datatypes that are just emitted as "opaque" values.
For example:
```
$ infer debug --procedures --procedures-summary-json --select 0
[[["pulse",[["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v7",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v7","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}],["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v8",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v8","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}]]]]]
```
Reviewed By: ezgicicek
Differential Revision: D24503387
fbshipit-source-id: 9bd08e93b
Summary:
This diff revises memory model of enumerator in ObjC to enable passing it as a parameter.
The cost checker was not able to analyze a function precisely when it gets an enumerator as a
parameter because the offset of an enumerator was available only when the analyzer knew the correct
relation between the enumerator and an array.
This diff simplifies the enumerator to have a similar value with `array->elements`, so its offset can
be taken without the relation between enumerator and array to get them.
Reviewed By: ezgicicek
Differential Revision: D24446574
fbshipit-source-id: 27cdc051e
Summary:
In ObjC, `NSObject.copy` returns the object returned by `copyWithZone:` on the given class. This method must be implemented if the class complies with `NSCopying` protocol. Since we don't have access to `NSObject`'s code, to follow calls into `copyWithZone:`, we replace such `copy` calls with calls to `copyWithZone:` when a) such a method exists in the class and b) the class conforms to `NSCopying` protocol.
This is done in the preanalysis because
- we need to know if there is a `copyWithZone:` method in the class.
- so that other analyses also benefit (as opposed to doing this in cost and inferbo models).
Note that `NSObject` doesn't itself conform to `NSCopying` but all its subclasses must confrom to the protocol and support the same behavior as above.
https://developer.apple.com/documentation/objectivec/nsobject/1418807-copy
Similarly for `mutableCopy` -> `mutableCopyWithZone:` for classes implementing `NSMutableCopying` protocol.
Reviewed By: skcho
Differential Revision: D24218102
fbshipit-source-id: 42900760e
Summary: This will be needed in the next diff so that we can find all classes that conform to `NSCopying` protocol.
Reviewed By: skcho
Differential Revision: D24216549
fbshipit-source-id: 297b527a6
Summary:
This diff extends inferbo's domain to include closure values. The goal of this extension is to
follow missing semantics where closures are handled as values, for example, a closure is assigned to
an object field, then it is got later to call.
Due to the bottom-up nature of the analyzer, sometimes we don't know which values are written in a
field, which is the same for the other non-closure values.
Reviewed By: ezgicicek
Differential Revision: D23932186
fbshipit-source-id: 4a575d0de
Summary:
Take another page from the Incorrectness Logic book and refrain from reporting issues on paths unless we know for sure that this path will be taken.
Previously, we would report on paths that are merely *not impossible*. This goes very far in the other direction, so it's possible we'll want to go back to some sort of middle ground. Or maybe not. See the changes in the tests to get a sense of what we're missing.
Reviewed By: ezgicicek
Differential Revision: D24014719
fbshipit-source-id: d451faf02
Summary:
This diff substitutes closure parameter when it is given via variable. For example,
```
x = ^{ ... }
foo(x)
```
this diff substitutes the closure variable `x`,
```
x = ^{ ... }
foo(^{ ... })
```
so that the specialization of `foo` can be done by `CCallSpecializaedWithClosures.process`.
Reviewed By: jvillard
Differential Revision: D23814595
fbshipit-source-id: a89f1530f
Summary:
`WithBlockParameters` is generated by a pre-analysis to express concrete block parameters. However,
before this diff, the block parameters only have names, which is insufficient to find their
summaries. This diff change `WithBlockParameters` to have `Block.t` that includes the parameters of
the block, so we can find blocks' summaries.
Reviewed By: ezgicicek
Differential Revision: D23785148
fbshipit-source-id: 9034f4f8d
Summary:
This diff increases autoreleasepool size when
* caller is non-ARC-compiled
* callee is ARC-compiled
* return type is a pointer to objc object
To distinguish non-ARC-/ARC-compiled:
* extended `translation_unit_decl_info` to have a boolean field `is_objc_arc_on`
* then copied it to `ProcAttributes.t` for each procedures.
Reviewed By: ezgicicek
Differential Revision: D23565003
fbshipit-source-id: dee22ea82
Summary:
Following the previous diff, the reason we mistakingly introduced new block as complexity increases is because when we are computing the hash of block, we return the string `block` to compute the hash for all block.
This diff fixes this case by returning the non-verbose name of the block when returning hashable name.
Reviewed By: ezgicicek
Differential Revision: D23729710
fbshipit-source-id: a345d3045
Summary:
The `attr_kind` column has now exactly two possible values: defined and undefined (since D22187238 (61ae2d1e1b)). This is also what should be stored in the field `is_defined` in the procedure attributes. Finally, the cfg can be `NULL` or not. This diff makes all three of these agree, in order to allow for the removal of the `attr_kind` column up the stack, since it will be equivalent to `cfg is NOT NULL`.
The changes in the tests indicate improved correctness: previously, specialising a modelled callee would result in a dummy entry in the procedures table, because all clang procedures were given a procdesc, so the specialisation would specialise the empty procdesc. Now, the model is not shadowed by the specialised dummy procdesc, and is specialised itself (and that's why it also ends up in the capture DB, which is slightly counter-intuitive).
Reviewed By: jvillard
Differential Revision: D23536931
fbshipit-source-id: 983216cb6
Summary:
This diff adds a new experimental checker for detecting size of objects in autorelease pool in ObjC. The basic mechanism is almost the same with the previous cost calculation:
* Autorelease pool size is increased at explicit `autorelease` call
* Autorelease pool size is set as zero by the `autoreleasepool` block.
While it only supports the explicit calls as of now, we will extend the checker to handle more cases in the following diffs.
Reviewed By: ezgicicek
Differential Revision: D23473145
fbshipit-source-id: 416488176
Summary: Most of the time, when the procdesc of a callee is requested, all that is really required is the procedure attributes. However, requesting the procdesc may return `None` when the procedure is undefined (in Java, and soon for Clang too). So, change all callsites to using attributes instead, where possible.
Reviewed By: jvillard
Differential Revision: D23539422
fbshipit-source-id: 3b1a52d48
Summary: Given the brittleness of DB-indexing over marshalled values (one example exposed in D23191601), the serializer "for comparison" must be removed. Serialized procnames are no longer used as keys, so their serializer can be converted to the one which preserves sharing (saving some space in the process).
Reviewed By: jberdine
Differential Revision: D23425123
fbshipit-source-id: 9c82da778
Summary:
Store model summaries in the `model_specs` database table instead of in spec files.
This table is populated when a new database is created by loading a dump of the `specs` table in the models database. This avoids the perf and reliability implications of ATTACHing the same, non-read-only models-DB by many processes.
- `BiabductionModels` is moved into `IR` so that `JsonReports` can access it.
- The binary `sqlite3` is now required on the host compiling infer.
Reviewed By: skcho
Differential Revision: D23191601
fbshipit-source-id: 1532481ee
Summary: Since the unique string ID used as a key is also human-readable, remove the now-redundant `proc_name_hum` field, which is only used for debug purposes anyway.
Reviewed By: jberdine
Differential Revision: D23446223
fbshipit-source-id: 5027066ee
Summary: There is still a bug in using marshalled values as keys (exposed up the diff stack) where structurally equal procnames get serialised to distinct keys. This diff addresses that bug by using the existing functionality for string unique IDs.
Reviewed By: ezgicicek
Differential Revision: D23133761
fbshipit-source-id: 3cbafb51b
Summary:
Move Sqlite registered statements inside the functions that use them, since these are only ever used in one place.
Also, reformat some SQL.
Also, make `mark_all_source_files_stale` use an un-prepared statement, as it is used at most once per run, wasting resources if registered.
Reviewed By: skcho
Differential Revision: D23423844
fbshipit-source-id: 07362d19c
Summary:
This diff finishes the migration from the specialization of methods that take blocks as arguments. Here we delete all the old code and change the way we model dispatch functions so that the tests pass.
- Remove the code for specializing the methods in biabduction.
- Remove the call flags `cf_with_block_parameters` that was only used in this algorithm.
- Removes models for dispatch functions.
- Adds models for dispatch functions as program transformation only in biabduction. To be added in other checkers in the future.
Reviewed By: ngorogiannis
Differential Revision: D23345342
fbshipit-source-id: b5e8542ce
Summary:
This preanalysis in general aims to create specialized clones of methods that have blocks as arguments and that are called with concrete closures, and then call these clone methods instead of the original ones.
One complication is what happens with the captured variables in the closure. What we do is we add them to the formals of the cloned method and passed them through to the concrete blocks.
We do this transformation in two steps:
1. Go through all the callers of methods with blocks as parameters, and create the clone methods. In this preanalysis we only create the attributes for the new method, not the code. We also update the call instructions in the callers to represent a call to the cloned method with updated arguments: we don't need to pass closures arguments anymore, we instead pass the captured variables as new arguments.
2. We add the corresponding code to the newly created clones: this means swapping the call to the block variable with a call to the corresponding block. Moreover, we add some of the new formals (that correspond to the captured variables) to the arguments of the call.
This diff implements step 1 of the analysis. The next diff D23216021 implements step 2.
Reviewed By: ngorogiannis
Differential Revision: D23109204
fbshipit-source-id: 91a5eb16b
Summary:
`Obj.reachable_words` can be very slow on large values, so only call it in debug mode.
Also, measure the time we spend for compressing/storing the global type environment.
Reviewed By: jvillard
Differential Revision: D23264532
fbshipit-source-id: 4a9456ab7
Summary:
These are the only ones we need, it turns out the other types (string,
proc names, ...) were dead code. The changes the integer constants to
rational constants, to match the domain of the linear arithmetic engine.
Reviewed By: skcho
Differential Revision: D23164136
fbshipit-source-id: 755c3f526
Summary:
Instead of alternating between a normal form and a tree structure,
always keep a normal form. Except the normal form is not always fully
normalized. Overall, it's a bit faster than the previous iteration,
while being more precise! In particular, linear arithmetic aims at being
much more complete.
Reviewed By: skcho
Differential Revision: D23134209
fbshipit-source-id: 5f9ec6ece