Summary:
This diff adds a prototype of a new checker that collects config checkings between markers.
Basically, what the checker is doing is a taint analysis.
* Taint source: function calls of "marker start"
* Taint sink: function calls of config checking
* Taint remove: function calls of "marker end"
By the taint analysis, the analysis results will say that which taints can reach to the sink. In other words, which marker ID that has been started can reach to the config checks, before marker's ending.
I am separating the diff into three steps:
(1/3) Add basic abstract semantics
(2/3) Add trace information
(3/3) Add reporting with test examples
Reviewed By: jvillard
Differential Revision: D21998170
fbshipit-source-id: e7759f62f
Summary: We don't rely on `external-java-packages` in the inferconfig anymore. Let's remove it altogether.
Reviewed By: jvillard
Differential Revision: D21997962
fbshipit-source-id: 7a2e13cfe
Summary:
See previous diff: issues are always reported with the same severity so
recognise that and just use their default severity in "modern" checkers.
Reviewed By: ngorogiannis
Differential Revision: D21904591
fbshipit-source-id: fb5387e35
Summary:
Since Java8, interfaces mays contain implementations
(default methods). We modify the resolve algorith in the Java frontend
to take care of that.
Reviewed By: jvillard
Differential Revision: D21785182
fbshipit-source-id: ffab8124c
Summary: We do not use an arbitrary threshold to test cost results anymore but instead rely on `cost-issues` which do not have any trace attached. This diff adds traces to `costs-report.json` so that we can test cost issues with traces.
Reviewed By: skcho
Differential Revision: D21858846
fbshipit-source-id: e73321a92
Summary:
Now that we have a way to write cost issues, let's not rely on some arbitrary threshold (and also get rid of `EXPENSIVE_EXECUTION_TIME` issues in tests).
One consequence of this is that we will loose the cost traces in tests since `costs-report.json` doesn't have any traces. Next diff fixes that.
Reviewed By: skcho
Differential Revision: D21837574
fbshipit-source-id: 86b4d028d
Summary:
The model returns an array the length of which is the same to that of enum entries.
It takes the length of enum entries from the summary of `Enum.values` because it is not written in `tenv`. In order to do that, the model semantics should be able to request the summary of the function with `get_summary`, so I extended `model_env` to include the functionality.
Reviewed By: ezgicicek
Differential Revision: D21843319
fbshipit-source-id: d6f10eb91
Summary:
The model returns an array the length of which is the number of known
fields in `tenv`.
Reviewed By: ezgicicek
Differential Revision: D21840375
fbshipit-source-id: 891517c6e
Summary: D21816312 forgot to add the new cost testing mechanism to `fb-performance` and `performance-exlusive` directories. This diff fixes that.
Reviewed By: skcho
Differential Revision: D21837912
fbshipit-source-id: 407dafcd3
Summary: The models were too naive before since they invalidated the underlying array completely (copying C++'s push_back model), causing spurious vector invalidation issues in Java. This diff adds more reasonable models.
Reviewed By: skcho
Differential Revision: D21787543
fbshipit-source-id: a5a59ff69
Summary:
In order to test cost analysis results, currently we rely on having an arbitrary cost threshold (200) and report issues that exceed this cost. For instance, a cost of 201 is considered expensive and reported as `EXPENSIVE_EXECUTION_TIME` issue in cost tests.
This means, if we change the cost analysis in a slight way that results in some constant cost increase under 200, we wouldn't able to detect it. I find this unsatisfactory and somewhat hacky.
This diff adds the ability to write the result of `costs-report.json` into a separate `cost-issues.exp` and then compare the actual costs (not only than relying on this arbitrary threshold reporting mechanism).
Reviewed By: skcho
Differential Revision: D21816312
fbshipit-source-id: 93b531928
Summary: Add models for `View` methods that schedule on the UI thread.
Reviewed By: skcho
Differential Revision: D21767954
fbshipit-source-id: 015441ea7
Summary:
Android OS calls certain overridden class methods always on the UI thread. The function changed here attempted to build a list of all these methods, one by one. It's much more complete to simply consider a method as callable on the UI thread if it's an override of an Android library method, and it starts with "on". Only a single instance is known not to obey this pattern, so it's easier to blacklist than to whitelist.
Also clarify the name to `is_android_lifecycle_method`.
Reviewed By: ezgicicek
Differential Revision: D21703365
fbshipit-source-id: 41ca3e998
Summary:
This seems to make sense as it's a separate analysis (that depends on
biabduction). This introduces unpleasant `|| is_checker_enabled TOPL`
whenever we try to figure out if biabduction will run. I think this is a
more general problem that deserves a more general solution to express
the fact that checkers can depend on others, so that, eg,
`is_checker_enabled Purity` is true when we pass `--loop-hoisting`. Will
address in another diff.
Reviewed By: ngorogiannis
Differential Revision: D21618460
fbshipit-source-id: 8b0c9a015
Summary: Later on, this can be changed again or made customizable.
Reviewed By: artempyanykh
Differential Revision: D21618730
fbshipit-source-id: fe517c766
Summary:
Start with tests about dynamic dispatch to test the upcoming
pre-analysis.
Reviewed By: ezgicicek
Differential Revision: D21594496
fbshipit-source-id: 1771ea968
Summary: We mistakenly invalidated the set element which causes spurious vector invalidation errors. Instead, we should modify it without any invalidation.
Reviewed By: jvillard
Differential Revision: D21521943
fbshipit-source-id: 67963967e
Summary:
This diff adds semantics of assume null, heuristics. When `assume(x == null)`, it removes the
methods called on the builder `*x` from the abstract state.
```
x -> {p}
p -> {method1 called}
assume(x == null)
x -> {p}
```
This heuristics is unsound: Even though `x` (a pointer to builder object) points-to an builder
object, which cannot imply that the object `p` does not exist in the concrete semantics. The
unsoundness may appear when there is an alias (see the FP test added).
Reviewed By: ezgicicek
Differential Revision: D21502923
fbshipit-source-id: 2e392bd89
Summary: Java's iterator models were wrong. This causes `VECTOR_INVALIDATION` errors in fbandroid projects. This diff aims to fix it by modeling Java iterators with a current pointer and an underlying collection array.
Reviewed By: skcho
Differential Revision: D21448322
fbshipit-source-id: 7d44354b5
Summary:
These 2 methods are automatically supplied for all enums, with
predefined behavior and nullability: https://www.geeksforgeeks.org/enum-in-java/
(Note that they are not part of java.lang.Enum class).
This will allow using them in unvetted third part and under strict mode.
Reviewed By: artempyanykh
Differential Revision: D21501716
fbshipit-source-id: 104082d15
Summary:
This diff changes way we treat classes w.r.t. to Nullsafe modes when
issuing meta-issues.
Previously, we considered nested class independently of the outer one.
This was leading to a tricky case: when the class is clean but nested
class needs fixing, meta-info told that class can be Nullsafe.
This is counter-intutive and lead to problems when users tried to follow
wrong nullsafe suggestions for this case.
After this diff we:
1. Start calculating meta-issues only on the outermost level. This will simplify
reasoning about nullsafe stats.
2. Aggregate all nested issues counters to corresponding outer-level class.
Among others, CLASS_CAN_BE_NULLSAFE Advice will finally become
actionable in all known cases.
Reviewed By: artempyanykh
Differential Revision: D21353607
fbshipit-source-id: a17c6958a
Summary:
We have a common entry point where we skip analysis in nullsafe.
This logic is copied from `Reporting.log_issue_from_summary`.
I believe this should not exist in Reporting: it is not the right place
to decide whether to suppress issues: we should not try to report it in
first place.
Because of that we falsely report "needs improvement" meta-issue while
we don't issue any (they were suppressed but participated in needs
improvement count calculations).
Now this change will make meta-issue to be synced with what the user
actually sees.
Down the line we should have a more reliable fix for that.
So far I reviewed suppressing code and looks like we should not suppress
anything else (unless explicitly SuppressLint-ed, which is fine).
Reviewed By: artempyanykh
Differential Revision: D21328634
fbshipit-source-id: 120ce06d1
Summary:
The directory was created to have several sets of nullsafe tests but
there is only one in the end. Remove the redundant "-default".
Reviewed By: mityal
Differential Revision: D21300205
fbshipit-source-id: 46ed8b032
Summary:
The directory names had some interesting variety due to historical
reasons.
- {c,cpp,objc,objcpp}/errors/ date from the time when infer was only
biabduction
- java/infer/ dates from the time when we had an "--analyzer" option and
"infer" was one of them (sic), and eg another was "eradicate".
- c/biabduction/ dates from the time when the biabduction analysis was
being migrated to the "checkers" (AI) framework. For some reasons the
tests there are not a subset of c/infer/ but seem to be entirely new
tests.
The convention now dictates that we should name all of these
*/biabduction/. This diff moves the existing tests from c/biabduction/
into c/biabduction/misc/.
Reviewed By: mityal
Differential Revision: D21300147
fbshipit-source-id: 516d1cb15
Summary:
In the previous diffs, nullsafe behavior was changed to the following:
nested class mode is inherited from the outer class mode.
Though it is possible to e.g. make nested class Nullsafe and outer not,
or make nested class STRICT and outer just LOCAL, this is an edge case
and we don't want to recommend annotating nested classes by default. The
right way is to make outer class Nullsafe instead.
In this decision, we take into account user experience and codebase
readability.
Reviewed By: ngorogiannis
Differential Revision: D21255806
fbshipit-source-id: 0200cb555
Summary:
With this change, set of possibilities will be more actionable. Most
importantly, this will also educate users and make them realize how
Nullsafe trust works.
NOTE: yes, parenthesis are bit clumsy, but it was the easiest way to
make this change and let the phrase remain grammatically correct.
Reviewed By: artempyanykh
Differential Revision: D21231468
fbshipit-source-id: 4b5349fb5
Summary:
In the previous diff we changed the semantics of nested classes w.r.t.
to Nullsafe.
Let's make it clear if users will attempt to misuse it.
Reviewed By: artempyanykh
Differential Revision: D21230717
fbshipit-source-id: 0ecc0dd06
Summary: Similarly to Enum.name, we model Class.getCanonicalName as returning an arbitrary non-empty string.
Reviewed By: ngorogiannis
Differential Revision: D21207120
fbshipit-source-id: 1e2dbd1fd
Summary:
From the user perspective, the current behavior is confusing.
The users intutitively expect the inner class to inherit Nullsafe mode
from the outer one.
Having a class that is Nullsafe but the inner is not is hence dangerous
and misleading.
For the sake of completeness and to support gradual strictification, we
allow the nested class to improse additional strictness. Particularly,
the inner class can be Nullsafe but the outer can be not.
A follow up to this diff will include warnings telling about redundant and wrong usages of nested annotations.
Reviewed By: artempyanykh
Differential Revision: D21228055
fbshipit-source-id: 75755ad1d
Summary:
We model Enum.name as returing a constant name, rather than getting real field names. We did this
because we couldn't think of any big gains, in terms of analysis precision/performance, from getting
the real names.
Reviewed By: ezgicicek
Differential Revision: D21201730
fbshipit-source-id: a2dc01a44
Summary: This diff revises the models of Collection.set and get to handle its elements.
Reviewed By: ezgicicek
Differential Revision: D21201242
fbshipit-source-id: 9c248453d
Summary: Java has this pattern of wrapping non-thread-safe containers in factory methods producing identically-typed results, but wrapped in a synchronised shell. This diff teaches RacerD about some common factory methods and uses the attribute domain to track the dynamic type of their results.
Reviewed By: ezgicicek
Differential Revision: D21155538
fbshipit-source-id: 42ebe6251
Summary: Complete the set of models for java containers that Infer should not report thread safety violations.
Reviewed By: ezgicicek
Differential Revision: D21138280
fbshipit-source-id: 01e1944b6
Summary: Models were partial and/or simply missing (`Map` writes!). Now the modelled containers use inheritance for conciseness (`List` reads are only those not caught by the `Collection` matcher, etc). Also, add URLs to documentation sources.
Reviewed By: ezgicicek
Differential Revision: D21132069
fbshipit-source-id: fefb360f0
Summary: This diff suppresses cost issues on lambda and auto-generated procedures, since they were too noisy.
Reviewed By: ezgicicek
Differential Revision: D21153619
fbshipit-source-id: 65ad6dcc3
Summary:
Replace horrible hack with ok hack.
The main difficulty in implementing the disjunctive domain is to avoid
the quadratic time complexity of executing the same disjuncts over and
over again when going around loops:
First time around a loop, assuming for example a single disjunct `d`:
```
[d]
loop body
[d1' \/ d2']
```
Second time around the same loop: the new pre will be the join of the
posts of predecessor nodes, so `old_pre \/ post(loop,old_pre)`, i.e.
`d \/ d1' \/ d2'`. Now we need to execute `loop body` again
*without running the symbolic execution of `d` again* (and the time after
that we'll want to not execute `d`, `d1'`, or `d2'`).
Horrible hack (before): Disjuncts have a boolean "visited" attached
that does its best to keep track of whether a given disjunct is old or
new. When executing a single *instruction* look at the flag and skip the
state if it's old. Of course we have no way to know for sure so it turns
out it was often wrongly re-executing old disjuncts. This was also
producing the wrong results over even simple loops: only the last
iteration would make it outside the loop for some reason. Overall, the
semantics were pretty untractable and shady at best.
New hack (this diff): only run instructions of a given *node* on
disjuncts that are not physically equal to the "pre" ones already in the
invariant map for the current node.
This gives the correct result over simple loops and a nice performance
improvement in general (probably the old heuristic was hitting the
quadratic bad case more often).
Reviewed By: skcho
Differential Revision: D21154063
fbshipit-source-id: 5ee38c68c
Summary:
1. Package will make the error too verbose.
2. We don't even need to say it is "class" because we say it in the error
description ("Class has 0 issues and can be marked Nullsafe").
Reviewed By: artempyanykh
Differential Revision: D21131998
fbshipit-source-id: 6ccca7615
Summary:
One source of false positives on container races is when the container member field is initialised to a concurrent version in a constructor, but the static type of the field doesn't reflect the thread safety of it.
This solution
- tracks flows from constructors of safe data structures to abstract addresses;
- initialises the initial attribute state when analysing a non-constructor method to that achieved by all constructors/class-initializers.
- checks for that attribute when recording container accesses.
Reviewed By: jvillard
Differential Revision: D21089428
fbshipit-source-id: 02a88f6e8
Summary:
There are two types of anonymous classes (not user defined classes):
- classic anonymous classes (defined as $<int> suffixes)
- lambda classes (corresponding to lambda expressions). Experimentally,
they all have form `$Lambda$_<int>_<int>`, but the code just uses
`$Lambda$` as a heuristic so it is potentially more robust.
# Problem this diff solves
When generate meta-issues for nullsafe, we are interested only in
user-defined classes, so we merge all nested anonymous stuff to
corresponding user-defined classes and hence aggregate the issues.
Without this diff, for each lambda in the code, we would report this as
a separate meta-issue, which would both screw up stats and be confusing
for the user (when we start reporting mode promo suggestions!).
Reviewed By: artempyanykh
Differential Revision: D21042928
fbshipit-source-id: a7be266af
Summary:
This diff revises how to handle the unknown location in inferbo in two ways:
* stop appending field to the `Unknown` location, e.g. `Unknown.x.a` is evaluated to `Unknown`
* redesign the abstract of multiple locations, like `Bottom` < `Unknown` < `Known` locations
I am doing them in one diff since applying only one of them showed bad results.
Background: `Unknown` was adopted for abstracting all unknown concrete locations, so we could avoid missing semantics of assignments to unknown locations. We tried to keep soundness. However, it brought some other problems related to precision and performance.
1. Sometimes especially when Inferbo failed to reason precise pointer values, `Unknown` may point to many other abstract locations.
2. At that time, value assignments to `*Unknown` makes the situation worse: many abstract locations are updated with imprecise values.
This problem harmed not only its precision, but also its performance since it introduced more location entries in the abstract memory.
Reviewed By: jvillard
Differential Revision: D21017789
fbshipit-source-id: 0bb6bd8b5