Summary: When using the restart scheduler, incrementing the analyzed count before the analysis itself gives wrong results.
Reviewed By: jvillard
Differential Revision: D25367787
fbshipit-source-id: aed22cc68
Summary:
Put hooks into Pulse for a faster Topl:
- done: PulseAbductiveDomain now tracks a Topl state
- todo: PulseTopl needs real transfer functions (for now they are dummies)
Reviewed By: jvillard
Differential Revision: D23815497
fbshipit-source-id: f3f0cf9ef
Summary:
Existing closure substitution only supported direct block calls to formals.
The following didn't work since the domain was only keeping track of loads/calls from formals, but didn't support stores.
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
This diff adds support for assigning a block to a local variable so that we can specialize the above example.
We now have a pair domain:
- the existing mapping from ids to block vars
- a new mapping from mangled names to block specializations
The latter allows us to update the mapping on local block assignment (via store).
Reviewed By: skcho
Differential Revision: D25030234
fbshipit-source-id: 3f172341c
Summary:
Specialized closure substitution was broken for conditionals:
```
void foo(dispatch_block_t block1){
if (x){
block1(); // not replaced with specialized implementation
}
}
```
The problem was that when substituting function calls, it only used memory state at the exit node, rather than at each program point.
We could solve this by either
- reverting the domain change in D24418560 (c47911359a), i.e. collecting all possible mappings conservatively (e.g. switching the domain back to `Map`), or
- passing the `invariant_map` for substitutions at each program point.
We go with the second option here.
Closure substitution is still somewhat broken, as the following example shows:
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
Reviewed By: skcho
Differential Revision: D24993962
fbshipit-source-id: ebadddb58
Summary: Cleanup `Typ` by moving all constant types to `StdTyp`. Also remove `Typ.typ` as it's just `Typ.t` now.
Reviewed By: jberdine
Differential Revision: D24620397
fbshipit-source-id: 4764f87ef
Summary:
This diff replaces calls to overridden methods in ObjC when possible, i.e. when the first parameter of the
method has a type that is a sub-class of the method's class. For example,
when `MyEnumerator` is a sub-class of `NSEnumerator` and overrides `nextObject`,
```
[my_enumerator nextObject]
```
was translated in Sil to something like
```
NSEnumerator.nextObject(my_enumerator : MyEnumerator*)
```
and the analyzer missed the overridden method. This diff replaces the function call with
```
MyEnumerator.nextObject(my_enumerator : MyEnumerator*)
```
Reviewed By: ezgicicek
Differential Revision: D24477290
fbshipit-source-id: 6842a76f8
Summary: Can be useful, especially to dump all the summaries as json.
Reviewed By: skcho
Differential Revision: D24504253
fbshipit-source-id: 845e7d657
Summary:
Emit the crucial parts of Pulse summaries as json to enable
post-processing by external tools. Stop somewhat arbitrarily at some
datatypes that are just emitted as "opaque" values.
For example:
```
$ infer debug --procedures --procedures-summary-json --select 0
[[["pulse",[["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v7",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v7","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}],["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v8",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v8","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}]]]]]
```
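As an illustration of the kind of post-processing this enables, here is a minimal Python sketch of an external consumer. It assumes the nesting seen in the example output (one list per procedure, each holding `[payload_name, value]` pairs, with the `pulse` value being a list of `[kind, state]` disjuncts); that reading is inferred from the sample, not from a documented schema. `pulse_disjuncts` is a hypothetical helper and `sample` is a shortened stand-in for the real output.

```python
import json

# Shortened stand-in for the output shown above, with a single disjunct.
sample = [
    [
        [
            "pulse",
            [
                [
                    "ContinueProgram",
                    {
                        "post": {
                            "heap": [["v3", [[["Dereference"], ["v4", "_"]]]]],
                            "stack": [
                                [
                                    ["ProgramVar", {"plain": "return", "mangled": None}],
                                    ["v7", "_"],
                                ]
                            ],
                            "attrs": "_",
                        },
                        "pre": {"heap": [], "stack": [], "attrs": "_"},
                        "skipped_calls": "_",
                        "path_condition": "_",
                    },
                ]
            ],
        ]
    ]
]
text = json.dumps(sample)


def pulse_disjuncts(summaries_json):
    """Collect every [kind, state] pair found under a "pulse" payload."""
    found = []
    for payloads in json.loads(summaries_json):
        for name, value in payloads:
            if name == "pulse":
                found.extend(value)
    return found


disjuncts = pulse_disjuncts(text)
kinds = [kind for kind, _state in disjuncts]
post_heap_addresses = [
    addr for _kind, state in disjuncts for addr, _edges in state["post"]["heap"]
]
```

Fields emitted as `"_"` are the opaque values mentioned above, so a consumer should treat them as "not available" rather than as data.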
Reviewed By: ezgicicek
Differential Revision: D24503387
fbshipit-source-id: 9bd08e93b
Summary:
Output summaries in json format, so that other tools can exploit the
results of infer without having to be written inside infer itself.
For now the json for a summary is just one line saying "opaque" :)
Set up the infra to generate (yo)json automatically using
ppx_yojson_conv. See it in action in the next diff.
Reviewed By: ezgicicek
Differential Revision: D24503343
fbshipit-source-id: e24a2fff3
Summary:
- output the "menu" of the interactive mode on stderr instead of stdout
so that we can pipe the results, eg
`infer debug --procedures --procedures-summary | cat`
This will be more useful when we add an option to output json, as
otherwise the menu pollutes the json.
- Allow "--select" to work for infer-debug too:
`infer debug --procedures --procedures-summary --select 0`
Reviewed By: da319
Differential Revision: D24503301
fbshipit-source-id: d7fb4b713
Summary:
Don't wait until pre-analysis has completed before updating the task
bar with the current procedure being analysed.
Reviewed By: skcho
Differential Revision: D24418609
fbshipit-source-id: afedaf687
Summary:
Move from Map to SafeInvertedMap:
- joining two branches where only one branch had the variable set to a
given closure or type should *not* keep that information around: now
we correctly get Top instead
- the "Safe" part is an optimisation that doesn't store Top values in
the map, which is important as most values are not closures so we
don't care about storing the fact that we don't know anything about
them
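The join behaviour described above can be sketched as follows. This is a minimal Python illustration, not the OCaml implementation: `TOP`, `join_value` and `join` are hypothetical names, and the value domain is assumed to be flat (two different values join to Top).

```python
TOP = object()  # hypothetical marker for "no information"


def join_value(a, b):
    """Join in a flat domain: equal values are kept, anything else is Top."""
    return a if a == b else TOP


def join(left, right):
    """Join two inverted maps in which a missing key implicitly means Top.

    Only keys bound on both sides can survive, and Top results are dropped
    rather than stored (the "safe" optimisation).
    """
    result = {}
    for key in left.keys() & right.keys():
        value = join_value(left[key], right[key])
        if value is not TOP:
            result[key] = value
    return result


then_branch = {"x": "closure_1", "y": "closure_2"}
else_branch = {"y": "closure_2"}
joined = join(then_branch, else_branch)  # "x" is set on one side only
```

With a plain `Map` and a union-style join, `x` would still be bound to `closure_1` after the join, which is the unsound behaviour this change removes.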
Reviewed By: ngorogiannis
Differential Revision: D24418560
fbshipit-source-id: 0ac701502
Summary:
Another step in the refactoring of the starvation domain:
- Main purpose is to mediate access to the set of critical pairs in a summary through a fold function (`fold_critical_pairs_of_summary`) and not through direct field access to that set. This will allow eliding storage of critical pairs entirely and dynamically generating those when folding.
- Remove optional arguments as much as possible, as this led to unused arguments not being caught.
- Helper functions distributed more logically among modules.
Reviewed By: skcho
Differential Revision: D24275399
fbshipit-source-id: d23123a48
Summary:
In ObjC, `NSObject.copy` returns the object returned by `copyWithZone:` on the given class. This method must be implemented if the class conforms to the `NSCopying` protocol. Since we don't have access to `NSObject`'s code, to follow calls into `copyWithZone:` we replace such `copy` calls with calls to `copyWithZone:` when a) such a method exists in the class and b) the class conforms to the `NSCopying` protocol.
This is done in the preanalysis because
- we need to know if there is a `copyWithZone:` method in the class.
- so that other analyses also benefit (as opposed to doing this in cost and inferbo models).
Note that `NSObject` doesn't itself conform to `NSCopying` but all its subclasses must conform to the protocol and support the same behavior as above.
https://developer.apple.com/documentation/objectivec/nsobject/1418807-copy
Similarly for `mutableCopy` -> `mutableCopyWithZone:` for classes implementing `NSMutableCopying` protocol.
Reviewed By: skcho
Differential Revision: D24218102
fbshipit-source-id: 42900760e
Summary: A subsequent diff will push information down into `Event.t`, so in preparation, turn all variant values into records.
Reviewed By: jvillard
Differential Revision: D24115201
fbshipit-source-id: d2126dd49
Summary:
`std::lock(x,y,z)` simultaneously acquires locks `x,y,z` in a deadlock free manner (essentially an unspecified fixed order).
Starvation currently deals with it by exploiting properties of the state domain. It's a map from locks to number of times the lock is held, so the count of many locks can be increased at the same time without recording any particular lock order.
In upcoming diffs the domain will be refactored into a tree of nested lock acquisitions (for other reasons) and that domain necessarily records lock order. The obvious way of doing this correctly is to allow `std::lock` as an atomic event (ie, without trying to break it into multiple acquisitions).
This diff does exactly that, by changing the `Event.LockAcquire` variant to take a list of locks.
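To illustrate why a list-valued acquisition event avoids inventing an order, here is a minimal Python sketch. It is not the starvation domain itself: `acquire`, `held` and `edges` are hypothetical names, and lock order is modelled simply as a set of (outer, inner) pairs.

```python
def acquire(held, order_edges, locks):
    """Record one acquisition event that takes `locks` atomically.

    Edges go from every lock already held to every newly acquired lock;
    no edge is added between locks belonging to the same event.
    """
    for outer in held:
        for inner in locks:
            order_edges.add((outer, inner))
    return held + list(locks)


edges = set()
held = acquire([], edges, ["a"])              # models a plain a.lock()
held = acquire(held, edges, ["x", "y", "z"])  # models std::lock(x, y, z)
```

Breaking the second call into three single-lock events would instead add pairs such as `("x", "y")`, recording an order that `std::lock` does not promise.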
Reviewed By: ezgicicek
Differential Revision: D24052304
fbshipit-source-id: 410c812d7
Summary: Gradle produces a number of compilation units which are currently captured sequentially. This diff parallelizes this step.
Reviewed By: jvillard
Differential Revision: D23930978
fbshipit-source-id: d71c22ba3
Summary: Subtle false positives and negatives in Hil make Sil preferable. This diff gets rid of the CFG-emulation of Hil, while still using Hil expressions.
Reviewed By: da319
Differential Revision: D23815026
fbshipit-source-id: 731a6d299
Summary:
This diff substitutes a closure parameter when it is passed via a variable. For example, in
```
x = ^{ ... }
foo(x)
```
this diff substitutes the closure variable `x`,
```
x = ^{ ... }
foo(^{ ... })
```
so that the specialization of `foo` can be done by `CCallSpecializaedWithClosures.process`.
Reviewed By: jvillard
Differential Revision: D23814595
fbshipit-source-id: a89f1530f
Summary:
Nullifying these leads to observable side-effects, like in the added
test.
Reviewed By: da319
Differential Revision: D23759756
fbshipit-source-id: 559a6486b
Summary: So that it can be meaningfully used for diffing across infer versions, say.
Reviewed By: martintrojer
Differential Revision: D23648560
fbshipit-source-id: 98d634e37
Summary:
Eliminate the need to serialise procnames when sending work from the restart scheduler to the workers, by sending the proc_uid instead. This is (much) shorter than the byte representation of the proc_name and it's the primary DB key of the procedures table, so it can be used by the worker to obtain the full procname.
Also, reduce GC churn by using folds in the scheduler startup instead of copying lists over and over.
Reviewed By: jberdine
Differential Revision: D23566131
fbshipit-source-id: 1472aa990
Summary:
- freshen up /docs/next/absint-framework to give sensible advice, and
delete outdated bits that are now in the API docs so they remain fresh
- delete SimpleChecker.ml as it's just a source of bitrot
- delete the "adding checkers" page as it's completely outdated and
subsumed by the "AI framework" page + the labs.
Reviewed By: jberdine
Differential Revision: D23597271
fbshipit-source-id: 78b541746
Summary:
Limit communication bandwidth and serialisation burden by sending procedure filename strings (which are bounded at ~100 bytes) instead of serialising procnames through the socket to the scheduler (which are unbounded and have been seen to reach ~30kB in the worst case for templated procedures).
Context:
Under the restart scheduler, a worker working on a procedure X that discovers a race on a dependency Y it needs fails the computation of X and sends to the scheduler the procname Y. The next time X is about to be rescheduled, the scheduler checks whether Y is still being analysed, by checking if the lock for Y still exists. This check uses the procedure filename already, so we can send that instead.
Reviewed By: jvillard
Differential Revision: D23554995
fbshipit-source-id: 9828e71a2
Summary: Most of the time, when the procdesc of a callee is requested, all that is really required is the procedure attributes. However, requesting the procdesc may return `None` when the procedure is undefined (in Java, and soon for Clang too). So, change all callsites to use attributes instead, where possible.
Reviewed By: jvillard
Differential Revision: D23539422
fbshipit-source-id: 3b1a52d48
Summary:
Store model summaries in the `model_specs` database table instead of in spec files.
This table is populated when a new database is created by loading a dump of the `specs` table in the models database. This avoids the perf and reliability implications of ATTACHing the same, non-read-only models-DB by many processes.
- `BiabductionModels` is moved into `IR` so that `JsonReports` can access it.
- The binary `sqlite3` is now required on the host compiling infer.
Reviewed By: skcho
Differential Revision: D23191601
fbshipit-source-id: 1532481ee
Summary: Since the unique string ID used as a key is also human-readable, remove the now-redundant `proc_name_hum` field, which is only used for debug purposes anyway.
Reviewed By: jberdine
Differential Revision: D23446223
fbshipit-source-id: 5027066ee
Summary: There is still a bug in using marshalled values as keys (exposed up the diff stack) where structurally equal procnames get serialised to distinct keys. This diff addresses that bug by using the existing functionality for string unique IDs.
Reviewed By: ezgicicek
Differential Revision: D23133761
fbshipit-source-id: 3cbafb51b
Summary: There is no reason to write back to the database here, as the resulting procdesc is stored in the summary anyway. The write-back is also a source of non-determinism.
Reviewed By: skcho
Differential Revision: D23500220
fbshipit-source-id: 7434e6239
Summary:
This diff fixes the `--reanalyze` option, which reanalyzes specific procedures by removing their
summaries. It was broken because it tried to store an empty summary with `Status.Analyzed`.
Reviewed By: ezgicicek
Differential Revision: D23344633
fbshipit-source-id: 1c4eca6c0
Summary:
`Obj.reachable_words` can be very slow on large values, so only call it in debug mode.
Also, measure the time we spend for compressing/storing the global type environment.
Reviewed By: jvillard
Differential Revision: D23264532
fbshipit-source-id: 4a9456ab7
Summary: The move of specs to the DB omitted clearing out all specs when reanalysing. This is the fix.
Reviewed By: jvillard
Differential Revision: D23188958
fbshipit-source-id: 5b50fdda8
Summary: In the frontend, captured variables for blocks are added as formal parameters in procdesc at the beginning.
Reviewed By: dulmarod
Differential Revision: D23163619
fbshipit-source-id: 2bcbe9b9c
Summary:
Constructing the report is done by reading all the summaries, and using certain parts thereof. However, the payloads, which typically account for the greatest size, are not used (with the exception of costs).
This diff splits the storage of summaries into analysis and report summaries, and only reads and deserialises the latter for the report phase. This makes a big difference for runs with a large number of procedures.
Reviewed By: jvillard
Differential Revision: D23105072
fbshipit-source-id: 359067a0f
Summary:
As preparation for splitting summaries into some of their components, and then iterating over only those when reporting (thus gaining performance) we need procnames in the table.
Also, this fixes the now-broken use case of `infer report` with spec files, by using `--procedures-filter` to restrict printing of summaries accordingly.
Reviewed By: skcho
Differential Revision: D23101853
fbshipit-source-id: 1ae878d8e
Summary: Implement specs storage in DB, apart from biabduction models which are still left in specs files.
Reviewed By: skcho
Differential Revision: D22795638
fbshipit-source-id: 140801d3f
Summary:
This is needed to make dune auto-updating of unit tests introduced in
the next diff cohabit peacefully with our tests to make sure code stays
correctly formatted wrt ocamlformat.
Also, more auto-formatting = better.
Reviewed By: da319
Differential Revision: D22865004
fbshipit-source-id: 91c47ab08
Summary: This diff separates purity analysis and its reporting, since sometimes we want to use the purity analysis results in other checkers, but don't want to report purity issues.
Reviewed By: ezgicicek, jvillard
Differential Revision: D23054913
fbshipit-source-id: 12cc1fc42
Summary: As per title. Eases next diffs by making Summary the only source of truth for how spec files are accessed/stored.
Reviewed By: ezgicicek
Differential Revision: D22794742
fbshipit-source-id: 0ee20ec1c
Summary: This diff adds an option to shard spec files in `infer-out/specs`. For some big analysis targets, there can be too many spec files in a single directory, which slows down reading them.
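As a rough illustration of sharding, the sketch below places each spec file in a subdirectory derived from a hash of its name. The hashing scheme, the `depth` parameter, the `sharded_spec_path` helper and the example filename are all assumptions made for this sketch; the message does not describe the layout actually used.

```python
import hashlib
from pathlib import PurePosixPath


def sharded_spec_path(specs_dir, spec_filename, depth=1):
    """Place a spec file under `depth` two-character shard directories.

    The shard names come from the hex digest of the filename, so the same
    file always lands in the same place and files spread across directories.
    """
    digest = hashlib.md5(spec_filename.encode("utf-8")).hexdigest()
    shards = [digest[2 * i : 2 * i + 2] for i in range(depth)]
    return PurePosixPath(specs_dir, *shards, spec_filename)


# "Foo.bar.specs" is a made-up filename used only for illustration.
path = sharded_spec_path("infer-out/specs", "Foo.bar.specs")
```

With two hex characters per level, one level of sharding gives at most 256 subdirectories, which bounds the size of each directory listing.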
Reviewed By: jvillard
Differential Revision: D20002128
fbshipit-source-id: bd7722883
Summary:
The old --topl-only is now --topl-biabd-only, and there's also
--topl-pulse-only. This is WIP: the latter runs pulse, but it doesn't yet
extract Topl errors from pulse summaries. (The citv part of pulse path
conditions appears to have the necessary information.)
Reviewed By: jvillard
Differential Revision: D22815250
fbshipit-source-id: a01792945
Summary:
Changing the order of the superclasses of a struct exposes a bug in both biabduction and the devirtualiser where a method would be resolved into a still virtual method (an interface method).
The reason is that we don't check whether a super class is an interface before exploring it, and seemingly we assume that there is only one (first) superclass worth exploring. This also ignores multiple inheritance in C++.
To fix this, refactor the resolution to a complete search (not just the first super class!) which ignores Java interface methods. Also moved it to `Tenv` so that both biabduction and the devirtualiser can use it.
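The shape of the new resolution can be sketched in Python as below. This is only an illustration under assumptions: the `tenv` dictionary layout, the `resolve_method` name and the breadth-first visiting order are hypothetical and do not mirror the `Tenv` API.

```python
from collections import deque


def resolve_method(tenv, class_name, method):
    """Search all superclasses for a non-interface definition of `method`.

    Returns the name of the defining class, or None if none is found.
    Interface definitions are skipped, but their supers are still explored.
    """
    seen = set()
    queue = deque([class_name])
    while queue:
        current = queue.popleft()
        if current in seen or current not in tenv:
            continue
        seen.add(current)
        info = tenv[current]
        if method in info["methods"] and not info["is_interface"]:
            return current
        queue.extend(info["supers"])
    return None


tenv = {
    "Impl": {"supers": ["Iface", "Base"], "methods": [], "is_interface": False},
    "Iface": {"supers": [], "methods": ["run"], "is_interface": True},
    "Base": {"supers": [], "methods": ["run"], "is_interface": False},
}
resolved = resolve_method(tenv, "Impl", "run")
```

A search that only follows the first superclass would stop at `Iface.run` here, which is the still-virtual result described above.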
Reviewed By: jvillard
Differential Revision: D22357488
fbshipit-source-id: 54b96c1f4
Summary:
This will allow all the analyses to call closures without any special treatment: we transform calls to variables that point to closures into normal function calls. We treat only ObjC blocks at the moment, with C++ lambdas to be done as a next step.
We aimed to achieve certain results in Pulse (see tests: avoid memory leaks and NPEs FPs) while also keeping the biabduction analysis working as before.
We also checked that, on the examples analyzed, Pulse's behaviour matches the semantics of ObjC programs with blocks.
Reviewed By: jvillard
Differential Revision: D22547333
fbshipit-source-id: efe56ed51
Summary:
Messed up the aggregation of GC stats in the previous commit.
It's cleaner to have GC stats (and analysis time) outside of
BackendStats as the rules for computing them are different from the rest,
eg notice how "analysis time" needed to be corrected at the end of the
run, and similarly for GC stats. Thus, refactor this part.
Also output different aggregations of GC stats: sum/max/average.
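For illustration, the three aggregations can be computed as in the Python sketch below, reading "+" as a sum over workers. The `aggregate` helper and the per-worker numbers are made up for this example and are not taken from the change.

```python
def aggregate(samples):
    """Return the sum, maximum and average of per-worker measurements."""
    total = sum(samples)
    return {"sum": total, "max": max(samples), "avg": total / len(samples)}


# Made-up per-worker values of one GC counter.
minor_words_per_worker = [100.0, 300.0, 200.0]
stats = aggregate(minor_words_per_worker)
```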
Reviewed By: skcho
Differential Revision: D22332496
fbshipit-source-id: eefd9dd72