Summary:
A Topl "small step" is a call to a method that is of interest to the
automaton. When such a call of interest is made, the topl component of
PulseAbductiveDomain.t is updated. This means that intra-procedural
Topl should now work entirely inside Pulse, without instrumenting Sil.
Main TODOs:
- add error extraction
- implement inter-procedural (PulseTopl.large_step)
Reviewed By: jvillard
Differential Revision: D25028286
fbshipit-source-id: e31a96d13
Summary:
Specialized closure substitution was broken for conditionals:
```
void foo(dispatch_block_t block1){
if (x){
block1(); // not replaced with specialized implementation
}
}
```
The problem was that when substituting function calls, it only used memory state at the exit node, rather than at each program point.
We could solve this by
- reverting the domain change in D24418560 (c47911359a), i.e. collecting all possible mappings conservatively (e.g. switch the domain back to `Map`)
- pass the `invariant_map` for substitutions at each program point.
We go with the second option here.
The closure substitution is still somewhat broken as exemplified by the following example:
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
Reviewed By: skcho
Differential Revision: D24993962
fbshipit-source-id: ebadddb58
Summary:
This was left as a TODO before: where to place calls to destructors for
C++ temporaries that are only conditionally creating when evaluating an
expression. This can happen inside the branches of a conditional
operation `b?e:f` or in potentially-short-circuited conditions on the
righ-hand side of `&&` and `||` operators.
Following the compilation scheme of clang (observed by looking at the
generated LLVM bitcode), we instrument the program with "marker"
variables, so that for instance `X x = true?X():y;` becomes (following
the execution on the true branch):
```
marker1 = 0; // initialize all markers to 0
PRUNE(true) // entering true branch
X::X(&temporary); // create temporary...
marker1 = 1; // ...triggers setting its marker to 1
X::X(&x, &temporary); // finish expression
if (marker1) {
X::~X(&temporary); // conditionally destroy the temporary
}
```
In this diff, you'll find code for:
- associating markers to temporaries that need them
- code to initialize markers to 0 before full-expressions
- code to conditionally destroy temporaries based on the values of the
markers once the full-expression has finished evaluating
Reviewed By: da319
Differential Revision: D24954070
fbshipit-source-id: cf15df7f7
Summary: Cleanup `Typ` by moving all constant types to `StdTyp`. Also remove `Typ.typ` as it's just `Typ.t` now.
Reviewed By: jberdine
Differential Revision: D24620397
fbshipit-source-id: 4764f87ef
Summary:
The global type environment is loaded in all analysis workers, so it is best to minimise its memory footprint. `MaxSharing` does a great job at this but
- it is very slow
- it may be involved in observed segfaults, since it uses potentially unsafe operations
This diff replaces `MaxSharing` with a post-facto poor-man's hashcons pass through the parts involved in a java type environment. It is not as efficient as `MaxSharing` at reducing footprint, but it's much faster.
Reviewed By: jberdine
Differential Revision: D24590400
fbshipit-source-id: c37100325
Summary:
Emit the crucial parts of Pulse summaries as json to enable
post-processing by external tools. Stop somewhat arbitrarily at some
datatypes that are just emitted as "opaque" values.
For example:
```
$ infer debug --procedures --procedures-summary-json --select 0
[[["pulse",[["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v7",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v7","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}],["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v8",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v8","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}]]]]]
```
Reviewed By: ezgicicek
Differential Revision: D24503387
fbshipit-source-id: 9bd08e93b
Summary:
This diff revises memory model of enumerator in ObjC to enable passing it as a parameter.
The cost checker was not able to analyze a function precisely when it gets an enumerator as a
parameter because the offset of an enumerator was available only when the analyzer knew the correct
relation between the enumerator and an array.
This diff simplifies the enumerator to have a similar value with `array->elements`, so its offset can
be taken without the relation between enumerator and array to get them.
Reviewed By: ezgicicek
Differential Revision: D24446574
fbshipit-source-id: 27cdc051e
Summary:
In ObjC, `NSObject.copy` returns the object returned by `copyWithZone:` on the given class. This method must be implemented if the class complies with `NSCopying` protocol. Since we don't have access to `NSObject`'s code, to follow calls into `copyWithZone:`, we replace such `copy` calls with calls to `copyWithZone:` when a) such a method exists in the class and b) the class conforms to `NSCopying` protocol.
This is done in the preanalysis because
- we need to know if there is a `copyWithZone:` method in the class.
- so that other analyses also benefit (as opposed to doing this in cost and inferbo models).
Note that `NSObject` doesn't itself conform to `NSCopying` but all its subclasses must confrom to the protocol and support the same behavior as above.
https://developer.apple.com/documentation/objectivec/nsobject/1418807-copy
Similarly for `mutableCopy` -> `mutableCopyWithZone:` for classes implementing `NSMutableCopying` protocol.
Reviewed By: skcho
Differential Revision: D24218102
fbshipit-source-id: 42900760e
Summary: This will be needed in the next diff so that we can find all classes that conform to `NSCopying` protocol.
Reviewed By: skcho
Differential Revision: D24216549
fbshipit-source-id: 297b527a6
Summary:
This diff extends inferbo's domain to include closure values. The goal of this extension is to
follow missing semantics where closures are handled as values, for example, a closure is assigned to
an object field, then it is got later to call.
Due to the bottom-up nature of the analyzer, sometimes we don't know which values are written in a
field, which is the same for the other non-closure values.
Reviewed By: ezgicicek
Differential Revision: D23932186
fbshipit-source-id: 4a575d0de
Summary:
Take another page from the Incorrectness Logic book and refrain from reporting issues on paths unless we know for sure that this path will be taken.
Previously, we would report on paths that are merely *not impossible*. This goes very far in the other direction, so it's possible we'll want to go back to some sort of middle ground. Or maybe not. See the changes in the tests to get a sense of what we're missing.
Reviewed By: ezgicicek
Differential Revision: D24014719
fbshipit-source-id: d451faf02
Summary:
This diff substitutes closure parameter when it is given via variable. For example,
```
x = ^{ ... }
foo(x)
```
this diff substitutes the closure variable `x`,
```
x = ^{ ... }
foo(^{ ... })
```
so that the specialization of `foo` can be done by `CCallSpecializaedWithClosures.process`.
Reviewed By: jvillard
Differential Revision: D23814595
fbshipit-source-id: a89f1530f
Summary:
`WithBlockParameters` is generated by a pre-analysis to express concrete block parameters. However,
before this diff, the block parameters only have names, which is insufficient to find their
summaries. This diff change `WithBlockParameters` to have `Block.t` that includes the parameters of
the block, so we can find blocks' summaries.
Reviewed By: ezgicicek
Differential Revision: D23785148
fbshipit-source-id: 9034f4f8d
Summary:
This diff increases autoreleasepool size when
* caller is non-ARC-compiled
* callee is ARC-compiled
* return type is a pointer to objc object
To distinguish non-ARC-/ARC-compiled:
* extended `translation_unit_decl_info` to have a boolean field `is_objc_arc_on`
* then copied it to `ProcAttributes.t` for each procedures.
Reviewed By: ezgicicek
Differential Revision: D23565003
fbshipit-source-id: dee22ea82
Summary:
Following the previous diff, the reason we mistakingly introduced new block as complexity increases is because when we are computing the hash of block, we return the string `block` to compute the hash for all block.
This diff fixes this case by returning the non-verbose name of the block when returning hashable name.
Reviewed By: ezgicicek
Differential Revision: D23729710
fbshipit-source-id: a345d3045
Summary:
The `attr_kind` column has now exactly two possible values: defined and undefined (since D22187238 (61ae2d1e1b)). This is also what should be stored in the field `is_defined` in the procedure attributes. Finally, the cfg can be `NULL` or not. This diff makes all three of these agree, in order to allow for the removal of the `attr_kind` column up the stack, since it will be equivalent to `cfg is NOT NULL`.
The changes in the tests indicate improved correctness: previously, specialising a modelled callee would result in a dummy entry in the procedures table, because all clang procedures were given a procdesc, so the specialisation would specialise the empty procdesc. Now, the model is not shadowed by the specialised dummy procdesc, and is specialised itself (and that's why it also ends up in the capture DB, which is slightly counter-intuitive).
Reviewed By: jvillard
Differential Revision: D23536931
fbshipit-source-id: 983216cb6
Summary:
This diff adds a new experimental checker for detecting size of objects in autorelease pool in ObjC. The basic mechanism is almost the same with the previous cost calculation:
* Autorelease pool size is increased at explicit `autorelease` call
* Autorelease pool size is set as zero by the `autoreleasepool` block.
While it only supports the explicit calls as of now, we will extend the checker to handle more cases in the following diffs.
Reviewed By: ezgicicek
Differential Revision: D23473145
fbshipit-source-id: 416488176
Summary: Most of the time, when the procdesc of a callee is requested, all that is really required is the procedure attributes. However, requesting the procdesc may return `None` when the procedure is undefined (in Java, and soon for Clang too). So, change all callsites to using attributes instead, where possible.
Reviewed By: jvillard
Differential Revision: D23539422
fbshipit-source-id: 3b1a52d48
Summary: Given the brittleness of DB-indexing over marshalled values (one example exposed in D23191601), the serializer "for comparison" must be removed. Serialized procnames are no longer used as keys, so their serializer can be converted to the one which preserves sharing (saving some space in the process).
Reviewed By: jberdine
Differential Revision: D23425123
fbshipit-source-id: 9c82da778
Summary:
Store model summaries in the `model_specs` database table instead of in spec files.
This table is populated when a new database is created by loading a dump of the `specs` table in the models database. This avoids the perf and reliability implications of ATTACHing the same, non-read-only models-DB by many processes.
- `BiabductionModels` is moved into `IR` so that `JsonReports` can access it.
- The binary `sqlite3` is now required on the host compiling infer.
Reviewed By: skcho
Differential Revision: D23191601
fbshipit-source-id: 1532481ee
Summary: Since the unique string ID used as a key is also human-readable, remove the now-redundant `proc_name_hum` field, which is only used for debug purposes anyway.
Reviewed By: jberdine
Differential Revision: D23446223
fbshipit-source-id: 5027066ee
Summary: There is still a bug in using marshalled values as keys (exposed up the diff stack) where structurally equal procnames get serialised to distinct keys. This diff addresses that bug by using the existing functionality for string unique IDs.
Reviewed By: ezgicicek
Differential Revision: D23133761
fbshipit-source-id: 3cbafb51b
Summary:
Move Sqlite registered statements inside the functions that use them, since these are only ever used in one place.
Also, reformat some SQL.
Also, make `mark_all_source_files_stale` use an un-prepared statement, as it is used at most once per run, wasting resources if registered.
Reviewed By: skcho
Differential Revision: D23423844
fbshipit-source-id: 07362d19c
Summary:
This diff finishes the migration from the specialization of methods that take blocks as arguments. Here we delete all the old code and change the way we model dispatch functions so that the tests pass.
- Remove the code for specializing the methods in biabduction.
- Remove the call flags `cf_with_block_parameters` that was only used in this algorithm.
- Removes models for dispatch functions.
- Adds models for dispatch functions as program transformation only in biabduction. To be added in other checkers in the future.
Reviewed By: ngorogiannis
Differential Revision: D23345342
fbshipit-source-id: b5e8542ce
Summary:
This preanalysis in general aims to create specialized clones of methods that have blocks as arguments and that are called with concrete closures, and then call these clone methods instead of the original ones.
One complication is what happens with the captured variables in the closure. What we do is we add them to the formals of the cloned method and passed them through to the concrete blocks.
We do this transformation in two steps:
1. Go through all the callers of methods with blocks as parameters, and create the clone methods. In this preanalysis we only create the attributes for the new method, not the code. We also update the call instructions in the callers to represent a call to the cloned method with updated arguments: we don't need to pass closures arguments anymore, we instead pass the captured variables as new arguments.
2. We add the corresponding code to the newly created clones: this means swapping the call to the block variable with a call to the corresponding block. Moreover, we add some of the new formals (that correspond to the captured variables) to the arguments of the call.
This diff implements step 1 of the analysis. The next diff D23216021 implements step 2.
Reviewed By: ngorogiannis
Differential Revision: D23109204
fbshipit-source-id: 91a5eb16b
Summary:
`Obj.reachable_words` can be very slow on large values, so only call it in debug mode.
Also, measure the time we spend for compressing/storing the global type environment.
Reviewed By: jvillard
Differential Revision: D23264532
fbshipit-source-id: 4a9456ab7
Summary:
These are the only ones we need, it turns out the other types (string,
proc names, ...) were dead code. The changes the integer constants to
rational constants, to match the domain of the linear arithmetic engine.
Reviewed By: skcho
Differential Revision: D23164136
fbshipit-source-id: 755c3f526
Summary:
Instead of alternating between a normal form and a tree structure,
always keep a normal form. Except the normal form is not always fully
normalized. Overall, it's a bit faster than the previous iteration,
while being more precise! In particular, linear arithmetic aims at being
much more complete.
Reviewed By: skcho
Differential Revision: D23134209
fbshipit-source-id: 5f9ec6ece
Summary:
This is needed to make dune auto-updating of unit tests introduced in
the next diff cohabit peacefully with our tests to make sure code stays
correctly formatted wrt ocamlformat.
Also, more auto-formatting = better.
Reviewed By: da319
Differential Revision: D22865004
fbshipit-source-id: 91c47ab08
Summary: Since there is no discernible downside in using the write daemon unless in single-thread mode or in buck, make it only depend on these circumstances, not a command line flag.
Reviewed By: skcho
Differential Revision: D23004451
fbshipit-source-id: 5c1d06ed1
Summary:
This is part of work aimed to reduce usage of language-agnostics modules
in Java-specific parts of nullsafe.
This diff:
1/ migrates AssignmentRule
2/ passes Procname.Java.t around from the start of analysis.
For now we have places where we have both procname and java procname.
This is fine as we can get rid of it in follow up diffs.
Reviewed By: artempyanykh
Differential Revision: D23052226
fbshipit-source-id: 0125de297
Summary:
This is part of work aimed to reduce number of language-agnostic methods
used in Nullsafe codebase.
Reviewed By: artempyanykh
Differential Revision: D23052328
fbshipit-source-id: 2b69f5f7a
Summary: To avoid dead store false positives we skip initialization of a variable that has an `unused` attribute. However, this causes uninitialized value false positives when the variable is later used in macros. To fix this, instead of skipping initialization we record the information about `unused` attribute in local variable data that we can later use for filtering out dead store issues.
Reviewed By: jvillard
Differential Revision: D22868050
fbshipit-source-id: 4a2d0e680
Summary:
Synthetic/autogenerated methods/fields usually contain `$` in their names.
Reporting nullability violations on such code doesn't make much sense
since the violations are not actionable for users and likely need to be
resolved on another level.
This diff contributes:
1. Several test cases that involve synthetic code of different
complexity.
2. Code that handles some particular types of errors (but not all!).
Reviewed By: mityal
Differential Revision: D22984578
fbshipit-source-id: d25806209
Summary:
Use `Typ.t` for parameter types in procnames, instead of `JavaSplitName.t`. This allows more precise and usable types stored in procnames, as well as not using the string-based representation of `JavaSplitName` which meant parsing strings whenever one needed a `Typ.t`.
This also makes `JavaSplitName` dead code, so it is removed.
Reviewed By: skcho
Differential Revision: D20500423
fbshipit-source-id: a72728e3f
Summary: Since types from the java frontend are a subtype of `Typ.t` provide the means to check and enforce that for return types in procnames.
Reviewed By: ezgicicek
Differential Revision: D20495527
fbshipit-source-id: b99c784af
Summary:
`java_type` aka `JavaSplitName.t` is a pair of strings. It's used in a procname to store the return type. Replace that with `Typ.t`.
This is step one of deleting `JavaSplitName`. Step two will be to replace parameters with `Typ.t`.
Reviewed By: skcho
Differential Revision: D20323748
fbshipit-source-id: f0029c3ca
Summary:
This diff removes a dead field `Struct.subs`, which was used in
heuristics finding methods from sub-classes.
Reviewed By: ezgicicek
Differential Revision: D22945346
fbshipit-source-id: 4b3bf0093
Summary:
In Cost/Inferbo checkers, it tried to find a subclass with heuristics when a method of interface or
abstract class is called. While it makes preciser analysis results in general, sometimes introduced
tricky FPs since the behavior of the heuristics depends on targets. This diff revert the heuristics
to suppress the FPs.
Reviewed By: ngorogiannis
Differential Revision: D22924485
fbshipit-source-id: 9f151231f
Summary: This diff adds an option to shard spec files in `infer-out/specs`. For some big analysis targets, there can be too many of spec files in the one directory, which slows down IO speed for reading the spec files.
Reviewed By: jvillard
Differential Revision: D20002128
fbshipit-source-id: bd7722883
Summary:
Changing the order of the superclasses of a struct exposes a bug in both biabduction and the devirtualiser where a method would be resolved into a still virtual method (an interface method).
The reason is that we don't check whether a super class is an interface before exploring it, and seemingly we assume that there is only one (first) superclass worth exploring. This also ignores multiple inheritance in C++.
To fix this, refactor the resolution to a complete search (not just the first super class!) which ignores Java interface methods. Also moved it to `Tenv` so that both biabduction and the devirtualiser can use it.
Reviewed By: jvillard
Differential Revision: D22357488
fbshipit-source-id: 54b96c1f4