Summary:
This diff adds a new issue type for reporting modifications to immutable fields (when `report-immutable-modifications` is enabled).
The underlying analysis depends on impurity analysis which itself is based on post-processing of pulse's summaries.
Reviewed By: skcho
Differential Revision: D25216637
fbshipit-source-id: 42e843793
Summary:
Previously, impurity analysis only collected one access for a single modification but not all other modifying accesses. This diff
- changes the impurity domain to collect all modifying accesses
- tracks and prints all the accesses seen to reach the modification, improving readability&debugging
Recording all accesses are needed in the next diff to determine if a method modifies any immutable fields. To determine that, we need to know all modifications, not just a single one.
Reviewed By: skcho
Differential Revision: D25186516
fbshipit-source-id: 43ceb3cd8
Summary: There are no users of this, and it stands in the way of refactoring.
Reviewed By: ezgicicek
Differential Revision: D25241940
fbshipit-source-id: 5e653341a
Summary:
The starvation domain keeps a domain element per distinct pair of lock object and source location. This was used to counteract the imprecision of implicit Quandary-style traces. Starvation has used explicit traces for a long time now, so keeping all these elements is expensive (in fact, in some cases exponential) and of no value. Now, lock object identity is the only distinguishing feature of a domain element.
Also, fix some pretty printing for debugging purposes.
Reviewed By: jvillard
Differential Revision: D24829306
fbshipit-source-id: 22e12f9c1
Summary:
Since we report issues types without a prefix (e.g. ERADICATE_) and
with spaces we should also allow for prefix-less issue types in
SuppressLint, so both should work
- SuppressLint("eradicate-field-not-initialized")
- SuppressLint("Field Not Initialized")
Reviewed By: jvillard
Differential Revision: D24760341
fbshipit-source-id: 1590cf6d0
Summary:
The problem in Reporting.ml:log_issue_from_summary is that it merely
checks the presence of `SuppressLint` annotation on method's body to
decide whether to log or not the issue. This means that regardless of
issue types specified in `SuppressLint`, all issues on such method will
get blocked.
Here we fix that.
Reviewed By: ngorogiannis, mityal
Differential Revision: D24726604
fbshipit-source-id: c9cae3833
Summary:
This diff glues the previous work together.
The ClassLevelAnalysis finds list of provisional violation, builds the
graph based on them, and outputs this graph as a separate issue.
Reviewed By: artempyanykh
Differential Revision: D24682802
fbshipit-source-id: 8174da91a
Summary: Cleanup `Typ` by moving all constant types to `StdTyp`. Also remove `Typ.typ` as it's just `Typ.t` now.
Reviewed By: jberdine
Differential Revision: D24620397
fbshipit-source-id: 4764f87ef
Summary:
Namely `enumerateObjectsUsingBlock` which takes a collection and a block as an arg and the iterates over the collection and applies the block to each element of the collection.
This is a common way to iterate over Objc collections, so let's add a model for it.
Reviewed By: skcho
Differential Revision: D24604590
fbshipit-source-id: 1ceeb4b40
Summary:
Emit the crucial parts of Pulse summaries as json to enable
post-processing by external tools. Stop somewhat arbitrarily at some
datatypes that are just emitted as "opaque" values.
For example:
```
$ infer debug --procedures --procedures-summary-json --select 0
[[["pulse",[["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v7",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v7","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}],["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v8",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v8","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}]]]]]
```
Reviewed By: ezgicicek
Differential Revision: D24503387
fbshipit-source-id: 9bd08e93b
Summary:
In ObjC, `NSObject.copy` returns the object returned by `copyWithZone:` on the given class. This method must be implemented if the class complies with `NSCopying` protocol. Since we don't have access to `NSObject`'s code, to follow calls into `copyWithZone:`, we replace such `copy` calls with calls to `copyWithZone:` when a) such a method exists in the class and b) the class conforms to `NSCopying` protocol.
This is done in the preanalysis because
- we need to know if there is a `copyWithZone:` method in the class.
- so that other analyses also benefit (as opposed to doing this in cost and inferbo models).
Note that `NSObject` doesn't itself conform to `NSCopying` but all its subclasses must confrom to the protocol and support the same behavior as above.
https://developer.apple.com/documentation/objectivec/nsobject/1418807-copy
Similarly for `mutableCopy` -> `mutableCopyWithZone:` for classes implementing `NSMutableCopying` protocol.
Reviewed By: skcho
Differential Revision: D24218102
fbshipit-source-id: 42900760e
Summary: We add a naive model for `forEach` idiom for Java's Iterable and Maps. The model is naive because it doesn't take the cost of the lambda into account. This will be fixed later.
Reviewed By: da319
Differential Revision: D23868203
fbshipit-source-id: 37d169c6f
Summary: Subtle false positives and negatives in Hil make Sil preferable. This diff gets rid of the CFG-emulation of Hil, while still using Hil expressions.
Reviewed By: da319
Differential Revision: D23815026
fbshipit-source-id: 731a6d299
Summary:
This diff add a model of `NSArray.index_of_object_passing_test` in
autoreleasepool size model.
Worst case autorelease cost is "array size x cost of given test function (lambda)".
Reviewed By: ezgicicek
Differential Revision: D23785165
fbshipit-source-id: 95ec9700a
Summary:
This can be useful to make pulse forget about tricky parts of the code.
Treat "skipped" procedures as unknown so heuristics for mutating the
return value and parameters passed by reference are applied.
Reviewed By: ezgicicek
Differential Revision: D23729410
fbshipit-source-id: d7a4924a8
Summary:
Eliminate the need to serialise procnames when sending work from the restart scheduler to the workers, by sending the proc_uid instead. This is (much) shorter than the byte representation of the proc_name and it's the primary DB key of the procedures table, so it can be used by the worker to obtain the full procname.
Also, reduce GC churn by using folds in the scheduler startup instead of copying lists over and over.
Reviewed By: jberdine
Differential Revision: D23566131
fbshipit-source-id: 1472aa990
Summary:
Limit communication bandwidth and serialisation burden by sending procedure filename strings (which are bounded at ~100 bytes) instead of serialising procnames through the socket to the scheduler (which are unbounded and have been seen to reach ~30kB in the worst case for templated procedures).
Context:
Under the restart scheduler, a worker working on a procedure X that discovers a race on a dependency Y it needs fails the computation of X and sends to the scheduler the procname Y. The next time X is about to be rescheduled, the scheduler checks whether Y is still being analysed, by checking if the lock for Y still exists. This check uses the procedure filename already, so we can send that instead.
Reviewed By: jvillard
Differential Revision: D23554995
fbshipit-source-id: 9828e71a2
Summary: Most of the time, when the procdesc of a callee is requested, all that is really required is the procedure attributes. However, requesting the procdesc may return `None` when the procedure is undefined (in Java, and soon for Clang too). So, change all callsites to using attributes instead, where possible.
Reviewed By: jvillard
Differential Revision: D23539422
fbshipit-source-id: 3b1a52d48
Summary:
This is needed to make dune auto-updating of unit tests introduced in
the next diff cohabit peacefully with our tests to make sure code stays
correctly formatted wrt ocamlformat.
Also, more auto-formatting = better.
Reviewed By: da319
Differential Revision: D22865004
fbshipit-source-id: 91c47ab08
Summary:
Note that the current implementation lists classes and methods (not
<class:method> pairs.
This is bit clowny, but works for now (given the list of classes is small).
So I'll leave it out of the scope of this diff.
Reviewed By: ngorogiannis
Differential Revision: D23076671
fbshipit-source-id: 71c94ebd9
Summary:
Use `Typ.t` for parameter types in procnames, instead of `JavaSplitName.t`. This allows more precise and usable types stored in procnames, as well as not using the string-based representation of `JavaSplitName` which meant parsing strings whenever one needed a `Typ.t`.
This also makes `JavaSplitName` dead code, so it is removed.
Reviewed By: skcho
Differential Revision: D20500423
fbshipit-source-id: a72728e3f
Summary:
We check for supertypes in Java. Why not ObjC?
Would be good to get dulmarod's input here.
Reviewed By: roro47
Differential Revision: D22817126
fbshipit-source-id: 52c1c3f3c
Summary: There used to be `JoinAfter n` mode where we would try to join `n` states instead of always making disjunctions. It got deleted in D14258485 and Pulse's underlying (pre-disjuncts) domain doesn't even have a join operation. `NeverJoin` mode is not useful in Pulse anymore: pulse will diverge or OOM if we don't limit the number of disjuncts. It is also not used by any other analyzer. Let's remove it.
Reviewed By: jvillard
Differential Revision: D22817425
fbshipit-source-id: 1e658f11d
Summary: This diff refactors Java specific `PatternMatch` functions into its own module. When `PatternMatch.ml` was originally created, it was mainly for Java but now it also supports ObjC. Let's refactor it to reflect the Java/ObjC separation: move all functions that operate on Java procnames into Java submodule.
Reviewed By: jvillard
Differential Revision: D22816504
fbshipit-source-id: ff6b64b29
Summary:
This avoids wasting potentially large amount of work in some
pathological situations. Suppose `foo()` has D specs/disjuncts in its
Pulse summary, and consider a node that calls `foo()` N times, starting
with just one disjunct:
```
[x]
foo()
[x1, ..., xD]
foo()
[y1, ..., yD^2]
foo()
...
```
At the end we get `D^N` disjuncts. Except, currently (before this diff),
we prune the number of disjuncts to the limit L at each step, so really
what happens is that we (very) quickly reach the L limit, then perform
`L*D` work at each step (where we take "apply one pre/post pair to the
current state" to be one unit of work), thus doing `O(L*D*n)` amount of
work.
Instead, this diff counts how many disjuncts we get after each
instruction is executed, and we already reched the limit L then doesn't
bother accumulating more disjuncts (as they'd get discarded any way),
and crucially also doesn't bother executing the current instruction on
these extra disjuncts. This means we only do `O(L*n)` work now, which is
exactly how we expect pulse to scale: execute each instruction (in
loop-free code) at most L times.
This helps with new arithmetic domains that are more expensive and
exhibit huge slowdowns without this diff but no slowdown with this diff
(at least on a few examples).
Reviewed By: skcho
Differential Revision: D22815241
fbshipit-source-id: ce9928e7c
Summary:
We update the type of captured variables to include information about capture mode (`ByReference` or `ByValue`) both for procdesc attributes and the closure expression.
For lambda: closure expression now contains correct capture mode for capture variables. Procdesc still does not contain information about captured variables which we will address in the next diff.
For objc blocks: at the moment all captured variables have mode `ByReference`. Added TODOs to fix this.
Reviewed By: jvillard
Differential Revision: D22572054
fbshipit-source-id: 4c88678ee
Summary:
This diff adds a prototype of a new checker that collects config checkings between markers.
Basically, what the checker is doing is a taint analysis.
* Taint source: function calls of "marker start"
* Taint sink: function calls of config checking
* Taint remove: function calls of "marker end"
By the taint analysis, the analysis results will say that which taints can reach to the sink. In other words, which marker ID that has been started can reach to the config checks, before marker's ending.
I am separating the diff into three steps:
(1/3) Add basic abstract semantics
(2/3) Add trace information
(3/3) Add reporting with test examples
Reviewed By: jvillard
Differential Revision: D21935546
fbshipit-source-id: 092abb92c
Summary: Pulse has now a better version of this check, so let's delete it.
Reviewed By: ngorogiannis
Differential Revision: D22019247
fbshipit-source-id: 344678225
Summary:
This issue type was not giving good results and can be replaced by
Pulse's version.
Reviewed By: ngorogiannis
Differential Revision: D22019551
fbshipit-source-id: 5cf3db46d
Summary:
The checker names are only used in debug information but I need them to
be more useful so users can do queries about each checker. Turn them
into an "id" instead of a "name", with some constraints to avoid crazy
IDs. In the next diff these IDs will be used on the command lin.
Reviewed By: dulmarod
Differential Revision: D21934373
fbshipit-source-id: 847a4958d
Summary:
The past issue with ppx_compare on nonrec types has (at some point) been fixed.
Greped for `let compare = compare` and removed the workaround for `nonrec`.
Reviewed By: jberdine
Differential Revision: D21973087
fbshipit-source-id: 5e2043e20
Summary:
Before: to report an issue type, wrap it in an exception type then pass
that around and translate it back to an issue type at some point.
After: no. Except biabduction, which was built like this and uses
raising of exceptions to stop the analysis too.
Reviewed By: dulmarod
Differential Revision: D21904589
fbshipit-source-id: cb8446f38
Summary:
See previous diff: issues are always reported with the same severity so
recognise that and just use their default severity in "modern" checkers.
Reviewed By: ngorogiannis
Differential Revision: D21904591
fbshipit-source-id: fb5387e35
Summary:
By and large issues of a given type (eg NULL_DEREFERENCE) are always
reported with the same severity (eg ERROR), but that severity is hard to
tease out from the code. This makes it more explicit by favouring the
default in `IssueType.t` in many cases without changing infer's
behaviour. The checkers typically use `Reporting.log_{error,warning}`;
these are taken care of in the next diff.
Reviewed By: skcho
Differential Revision: D21904590
fbshipit-source-id: 47c76cd4c
Summary:
- "visibility" (whether an issue to report is something to show the user
or something that is only used for debugging or for detecting other
issues) is an intrinsic property of an issue type and thus belongs in
`IssueType.t`.
- "severity" (warning/error/...) is something that each issue should
have a default for ("a memory leak is by default an ERROR", a
"condition always true is by default a warning"), but can also be
overriden at run-time. Right now only nullsafe uses that capability:
when in "strict mode", some warnings become errors. We can imagine
extending this to other issue types, or even providing config flags to
turn warnings into errors, like compilers often have.
To guess the default severity (since it's dynamic it can be hard to know
for sure!), I tried to find places where it was reported as the source
of truth, but also later diffs test these defaults against our tests (by
removing most of the dynamic changes in severity).
With this diff there are 3 places where severity is set:
1. The default severity in IssueType.t: this is unused for now.
2. The severity from `Exceptions.recognize_exception`: this is
semi-statically determined and, if set, takes precedence over number 3 (which looks wrong to me!)
3. The severity passed to `Errlog.log_issue` when we actually add an
issue to the error log: this is used only when (2) is unset.
The next diffs will make 1 the default, delete 2, and make 3 optional
but override 1 when passed.
Reviewed By: skcho
Differential Revision: D21904538
fbshipit-source-id: a674b42d8
Summary:
IR/ should contain modules pertaining to the core IR of infer, i.e. how
CFGs are represented (including SIL).
These categories of modules were moved:
- Access paths and HIL are an abstraction on top of SIL used by certain
analyses. Moving the corresponding modules to IR/ makes this clearer
as they are not really part of the IR (they are less fundamental than
SIL).
- Error reporting is also something for other analyses, not part of IR.
Moved a bunch of modules related to that to absint/.
- Same for ProcnameDispatcher
- biabduction-speficic modules: Objc_models, BiabductionModels
- test-determinator-specific modules: JProcname
Reviewed By: ezgicicek
Differential Revision: D21722368
fbshipit-source-id: b28e9bdac
Summary:
Android OS calls certain overridden class methods always on the UI thread. The function changed here attempted to build a list of all these methods, one by one. It's much more complete to simply consider a method as callable on the UI thread if it's an override of an Android library method, and it starts with "on". Only a single instance is known not to obey this pattern, so it's easier to blacklist than to whitelist.
Also clarify the name to `is_android_lifecycle_method`.
Reviewed By: ezgicicek
Differential Revision: D21703365
fbshipit-source-id: 41ca3e998