Summary: This diff extends the bound domain to express multiplication of bounds in some simple cases.
Reviewed By: ezgicicek
Differential Revision: D18745246
fbshipit-source-id: 4f2dcb42c
Summary:
One standard way to schedule work is by starting a thread. We treat this by
- Treating invocations of `start` on a receiver with the `Runnable _` attribute as scheduling that runnable for parallel execution in the background (as opposed to on the UI thread).
- If `start` is used on an object of a subclass of `Thread` everything already works thanks to the `get_exp_attributes` function which will implicitly ascribe to an expression the attribute `Runnable _` if the expression points to an object with a known `run` method. This will even take care of some degree of dynamic dispatch, yay!
- If `start` is used on a `Thread` object which has been created with a constructor call provided with a `Runnable` argument, we have to appropriately model that constructor call, which is what is done in `do_call`.
Reviewed By: jvillard
Differential Revision: D18726676
fbshipit-source-id: 0bd83c28e
Summary:
A current blind spot is when object construction stores specific executors / runnables to object fields, which are then never mutated and accessed from normal methods. IOW the attributes established in the constructor are necessary to report properly inside a normal method (assuming these attributes are not invalidated by method code).
To achieve this, first we retain a subset of the final state attributes in the summary (only those that affect instance variables, in constructor methods). Then, when we analyse a non-constructor method:
- we analyse all constructors
- remove all attributes from the attribute map whose key is not an expression of the form `this.x. ...`
- re-localise all remaining keys so that they appear as rooted in the `this` local variable of the current procedure
- join (intersect) all such attribute maps
- use the result in place of initial state as far as the attribute map is concerned for the analysis of the current procedure, which can now start.
This means we can catch idioms that use side-effectful initialisation for configuring certain object fields like executors or runnables.
Reviewed By: jvillard
Differential Revision: D18707890
fbshipit-source-id: 42ac6108f
Summary: Another way to schedule work in android is by posting it to a `Handler`. A handler can be constructed out of the main looper, which forces it to schedule work on the UI thread. To model all this, we add syntactic models for getting the main looper and for creating handlers, and dataflow attributes for tracking that an expression is a looper/the main looper, or a handler constructed out of a looper.
Reviewed By: skcho
Differential Revision: D18706768
fbshipit-source-id: 7c46e6913
Summary:
New syntax f1 AND-WITH-WITNESSES f2 : predicate_on_both_witnesses()
This is needed for a linter to check that a macro is present, see the test.
Reviewed By: skcho
Differential Revision: D18735988
fbshipit-source-id: a3be75c5e
Summary:
This gets rid of false positives when something invalid (eg null) is
passed by reference to an initialisation function. Havoc'ing what the
contents of the pointer to results in being optimistic about said
contents in the future.
Also surprisingly gets rid of some FNs (which means it can also
introduce FPs) in the `std::atomic` tests because a path condition
becomes feasible with havoc'ing.
There's a slight refinement possible where we don't havoc pointers to
const but that's more involved and left as future work.
Reviewed By: skcho
Differential Revision: D18726203
fbshipit-source-id: 264b5daeb
Summary:
It's a well-known fact that pulse should know too. To avoid splitting
the abstract state systematically, only act if we know the pointer is
exactly 0 to avoid reporting a nullptr dereference on `free(x)`.
Reviewed By: ezgicicek
Differential Revision: D18708575
fbshipit-source-id: 1cc3f6908
Summary:
Turns out code uses atomics in important places, modelling it removes
FPs.
The tests are copied from biabduction and adapted and extended a bit. I
didn't implement compare_exchange primitives for now (plus, giving them
a sequential semantics like in biabduction is probably a bit cheeky).
Reviewed By: skcho
Differential Revision: D18708576
fbshipit-source-id: a3581b8a4
Summary:
This diff adds inferbo's interval values to pulse's attributes. The added values will be used to
filter out infeasible passes in the following diffs.
Reviewed By: jvillard
Differential Revision: D18726667
fbshipit-source-id: c1125ac6e
Summary:
This diff fixes the model of substring.
Problem: The cost model of the substring function was to return `size of string - start index` as a
cost. However, sometimes this was a negative number, because of state abstractions on paths, array
elements, call contexts, etc, which caused an exception inadvertently.
This diff changes the model to return just `size of string`, when it cannot say `size of string` is
bigger than `start index`.
Reviewed By: ezgicicek
Differential Revision: D18707954
fbshipit-source-id: 63f27e461
Summary:
Instead of trying to figure out what runnable is directly passed to an executor,
use attributes to track the flow of a runnable. This has several advantages:
- Can track runnables across function return values.
- Can somewhat overcome the information loss under dynamic dispatch.
- Unifies handling with other attributes.
Reviewed By: skcho
Differential Revision: D18672676
fbshipit-source-id: a06a0e6ff
Summary:
- Unify treatment of modelled and annotated executors by making things go through attributes.
- Add a return attribute to summaries, so that we can track flows of thread guards/executors/future stuff through returned values.
- Dispatch modeled functions to model summaries.
This will help in following diffs where runnables will also go through attributes.
Reviewed By: skcho
Differential Revision: D18660185
fbshipit-source-id: e26b1083e
Summary:
A plugin update allows infer to know when a function doesn't return
according to its attributes. This propagates this info all the way to
the attributes of each function, and then use this information in a new
pre-analysis that cuts the links to successor nodes of each `Call`
instruction to a function that does not return.
NOTE: The "no_return" `CallFlag.t` was dead code, following diffs deal
with that (by removing it).
Reviewed By: dulmarod
Differential Revision: D18573922
fbshipit-source-id: 85ec64eca
Summary:
This also prints the CFGs *after* pre-analysis for individual procedures
in infer-out/captured/<filename>/<proc>.dot. One can also look up the
CFGs before pre-analysis in infer-out/captured/proc_cfgs_frontend.dot.
Context: I want to add a pre-analysis that needs to look at proc
attributes inter-procedurally. For this to make sense it has to happen
*after* all of capture, and before analysis.
Thus, this diff brings back the lazy running of the pre-analysis like in
D15803492, except that we still make sure to run the pre-analyses
systematically regardless of the checkers being run by running the
pre-analysis from ondemand.ml. Also we don't need to re-introduce the
"did_preanalysis" proc attribute for the same reason that the
pre-analysis is now run once and for all by ondemand.ml (instead of each
individual checker back in the days).
This has the benefit of running the pre-analysis only when needed, and
the drawback that several concurrent processes analysing the same proc
descs will duplicate work. Since pre-analyses are supposed to be very
fast I assume that neither is a big deal. If they become more expensive
then the benefit gets bigger and the drawback is just the same as with
regular analyses.
Reviewed By: skcho
Differential Revision: D18573920
fbshipit-source-id: de350eaef
Summary: This diff get static value with `EMPTY` field from class initializer.
Reviewed By: ngorogiannis
Differential Revision: D18616588
fbshipit-source-id: 26414c9b2
Summary: When we see a call to schedule some work on an executor and we don't have evidence that it is on some specific thread (UI/BG), instead of dropping the work, assign it `UnknownThread` and treat it as running on the background by default.
Reviewed By: jvillard
Differential Revision: D18615649
fbshipit-source-id: e8bad64b6
Summary: Following D18351867, this diff adds more size alias: when initial array size is one.
Reviewed By: ezgicicek
Differential Revision: D18530598
fbshipit-source-id: 26d57fe30
Summary:
- more flexible API
- less error-prone thanks to named parameters
- also takes care of adjusting predecessors of the previous successors!
This fixes some (probably harmless) bugs in the frontends.
Reviewed By: dulmarod
Differential Revision: D18573923
fbshipit-source-id: ad97b3607
Summary: The transition HOLDS-NEXT WITH-TRANSITION Parameters was only implemented for ObjC methods, now it also works on function calls.
Reviewed By: jvillard
Differential Revision: D18572760
fbshipit-source-id: 06e615cbe
Summary:
Now we point to the root cause of the problem, and also provide
actionable way to solve the issue
Reviewed By: artempyanykh
Differential Revision: D18575650
fbshipit-source-id: ba4884fe1
Summary:
Two goals:
1. Be less assertive when speaking about third party code (it might be
written with different conventions).
2. Point to third party signatures folder so the users know how to
proceed
Reviewed By: artempyanykh
Differential Revision: D18571514
fbshipit-source-id: 854d6e746
Summary:
1/ We now support messaging for third-party: show file name and line
number
2/ We did not show information about internal models in case of param
calls
3/ Small change: we don't specify "modelTables.ml" anymore: no need to
expose implementation details
Reviewed By: artempyanykh
Differential Revision: D18569790
fbshipit-source-id: 28586c8ff
Summary:
Whole bunch of changes aimed to make error messages more clear and
concise.
1/ Wording and language is unified. We make errors sound more like a
type system violations, rather than linter reccomendations.
Particularly, we refrain from saying things like "may be null" - this is
a linter-style statement that may provoke discussions (what if the
developer knows it can not be null in this particular case).
Instead, we refer to declared nullability and nullability of actual values. This way, it is more clear that this is not a heuristic, this is how rules of a type-system work.
2/ Additionally, we drop things like field class in places when the
context should be clear by who looks at the error. We expect the user
sees the code and the error caption. So e.g. we don't repeat the word "field"
twice.
3/ In cases when we are able to retrieve formal param name, we include it for
usability.
4/ For Field not initialized error, we refer to Initializer methods:
this is a non-obvious but important nullsafe feature.
Reviewed By: artempyanykh
Differential Revision: D18569762
fbshipit-source-id: 9221d7102
Summary:
It make the message bit less heavy, and also it is kind of obvious that
it is origin.
In follow up diffs we will change the text so it is hopefully even more
obvious.
Reviewed By: artempyanykh
Differential Revision: D18527695
fbshipit-source-id: a305d547b
Summary:
1. We don't want to teach the users to ignore noise origin because
sometimes we are going to render something useful for them.
2. It just looks not cool.
Reviewed By: artempyanykh
Differential Revision: D18527694
fbshipit-source-id: 0ea248122
Summary: Android may spontaneously call these methods on the UI thread, so recognize the fact.
Reviewed By: ezgicicek
Differential Revision: D18530477
fbshipit-source-id: a8a798779
Summary:
First step towards a global analysis. A new command line flag activates the step in `Driver`.
The whole-program analysis is a simple, quadratic (inefficient-as-yet), iteration over all domain elements. However, it is restricted to those elements that are explicitly scheduled to run.
Reviewed By: skcho
Differential Revision: D17787441
fbshipit-source-id: 9fecd766c
Summary:
Note: Disabled by default.
Having some support for values, we can report when a null or constant
value is being dereferenced. The particularity here is that we don't
report when 0 is a possible value for the address, or even if we know
that the value of the address can only be 0 in that branch! Instead, we
allow ourselves to report only when we the address has been *set* to
NULL (or any constant).
This is in line with how pulse deals with other issues: only report when
1. we see an address become invalid, and
2. we see the same address be used later on
Reviewed By: skcho
Differential Revision: D17665468
fbshipit-source-id: f1ccf94cf
Summary:
This diff avoids unqualified variables by `ItvUpdatedBy` are qualified later. For example,
```
z = x & y;
z = z + 1;
```
While `z` should not be selected as a control variable, it wasn't, because it was qualified by the addition. This pattern introduces FPs in many cases.
Reviewed By: ngorogiannis
Differential Revision: D18505894
fbshipit-source-id: 13aec3008
Summary:
This diff excludes integer variables from control variables when their values are calculated by
binary operators.
Reviewed By: ezgicicek
Differential Revision: D18505826
fbshipit-source-id: 710533d4c
Summary: Adding a trace that includes when strongSelf was assigned to, and when it's used without a check.
Reviewed By: skcho
Differential Revision: D18506113
fbshipit-source-id: 778c4b086
Summary:
The variable strongSelf, because it is equal to a weak captured pointer, needs to be checked for null before being used, otherwise it could be null by the time the block is executed.
Added this check to the SelfInBlock checker, and removed it from biabduction.
We want to migrate all the objc checks from biabduction, so it will be easier to change and faster and more reliable. Moreover, this check is more general, it will flag any use of unchecked strongSelf, not just a dereference.
Reviewed By: skcho
Differential Revision: D18403849
fbshipit-source-id: a9cf5d80b
Summary:
There was a precision loss during the substitution of array block. For example:
Callee's abstract memory includes an array block as follows, where `a` is a parameter.
```
a.elements -> { a.elements[*] with a.elements.size }
```
Callers' abstract memory includes a pointer that may point to multiple array blocks.
```
c -> { x, y }
x.elements -> { x.elements[*] with x.elements.size }
y.elements -> { y.elements[*] with y.elements.size }
```
When the callee is called with the parameter `c`, the callees memory is substituted to:
```
x.elements -> { x.elements[*] with top , y.elements[*] with top }
y.elements -> { x.elements[*] with top , y.elements[*] with top }
```
because `a.elements[*]` was substituted to `{ x.elements[*] , y.elements[*] }`
and `a.elements.size` was substituted to `top ( = x.elements.size join y.elements.size )`.
This diff tries to keep the precision in the specific case, not to join the sizes of array blocks.
So now the same callee's abstract memory is substituted to:
```
x.elements -> { x.elements[*] with x.elements.size }
y.elements -> { y.elements[*] with y.elements.size }
```
Reviewed By: ngorogiannis
Differential Revision: D18480585
fbshipit-source-id: b70e63c22
Summary: Due to the weakness of the analysis which can't detect side-effecting prop setting (e.g. as in `builder.prop1(..)`), we currently have many broken chains that do do have any `create` method in their prefixes. Let's not report on these broken chains.
Reviewed By: skcho
Differential Revision: D18503523
fbshipit-source-id: 7506e34b7
Summary:
In some apps executors are obtained by calling standard framework methods (and not by using DI with annotations).
To treat this style, we need to
- Detect calls that return such executors (`do_executor_effect`) and tag the return result with an `Attribute` indicating that it is now an executor, plus what thread it uses.
- Use that information when calling `execute`, to resolve the executor, if any, and its thread (in `do_work_scheduling` via `AttributeDomain.get_executor_constraint`).
- All this requires a new domain component, mapping variables to attributes. This extends the component previously used for remembering whether a variable is the result of a check on whether we run on the UI thread.
At the same time, I un-nested some functions from the transfer function for readability.
Reviewed By: skcho
Differential Revision: D18476122
fbshipit-source-id: bc39b5c2f
Summary:
We consider Java collections to be like c++ std::vectors and add models for
- `Collections.get(..)`
- `__cast`
Reviewed By: skcho
Differential Revision: D18449607
fbshipit-source-id: 448206c84
Summary: `equals1` and `equals2` in `SafeInvertedMap.join` are references that indicate whether given parameters and the result is physically equal or not. This diff fixes a missing update of them.
Reviewed By: ezgicicek
Differential Revision: D18450680
fbshipit-source-id: bae19cbe9
Summary:
It returns non-top value when one of the parameters of band is positive, i.e., `x & 255` returns
`[0, 255]` instead of top.
Reviewed By: ezgicicek
Differential Revision: D18448614
fbshipit-source-id: aaa298a66
Summary:
Let's introduce a set of new cost analysis issue types that are raised when the function is statically determined to run on the UI thread. For this, we rely on the existing `runs_on_ui_thread` check that is developed for RacerD. We also update the cost summary and `jsonbug.cost_item` to include whether a method is on the ui thread so that we don't repeatedly compute this at diff time for complexity increase issues.
Note that `*_UI_THREAD` cost issues are assumed to be more strict than `*_COLD_START` reports at the moment. Next, we can also consider adding a new issue type that combines both such as `*_UI_THREAD_AND_COLD_START` (i.e. for methods that are both on cold start and run on ui thread).
Reviewed By: ngorogiannis
Differential Revision: D18428408
fbshipit-source-id: f18805716
Summary:
This diff tries more narrowing during analysis in order to get preciser results on nested loops.
In the widening phase, it does narrowing a loop right after its widening, for each loops. In general, this may make the widening phase non-terminating because it keeps the abstract state from monotonely increasing to the fixed point in a finite number of iterations. To avoid that situation, this diff applies the narrowing only when the first visit of the loop in the widening phase.
Reviewed By: ngorogiannis
Differential Revision: D18400631
fbshipit-source-id: cc76f7e85
Summary: Sometimes there is a code like `for(int i = 1; i < x; i++){ l.add(); }`, where the first element in a list is addressed specifically. This case was not analyzed precisely, because the alias value is added only when `i` is initialized by 0 by heuristic. This diff extends the heuristic, so it adds a size alias between `i` and `l.size()` when `i` is initialized by 0 or 1.
Reviewed By: ezgicicek
Differential Revision: D18351867
fbshipit-source-id: e7d19a4ec
Summary:
This diff adds semantics of Java function calls of enum `values` inside class initializers.
* Java class initializer function initializes a specific field `$VALUES`, which points to the list
of enum values.
* The `values` function of enum class returns the value of `$VALUES`.
The problem is when the `values` function is called inside the class initializer, for example:
```
enum Color {
RED,
GREEN,
BLUE;
static {
for (Color c : Color.values()) {}
}
}
```
This introduces a recursive dependency: the class initializer calls `Color.values` and the function
returns `Color.$VALUES` the value of which should be initialized in the class initializer.
To address the problem, this diff finds the value of `$VALUES` in its abstract memory when
`values` is called inside the class initializer.
Reviewed By: ezgicicek
Differential Revision: D18349281
fbshipit-source-id: 21766c20f
Summary:
This diff extends bound domain to express Min/Max of another bounds, so it can keep some more
precision in `Math.min/max`.
limitation: `MinMaxB`, the constructor of the bound, can contain only linear expressions or
previous min/max expressions.
Reviewed By: ezgicicek
Differential Revision: D18395365
fbshipit-source-id: fc90d27fd
Summary: Capture locations where work is scheduled to run in parallel (here, just Executors). Also add a test file with cases the upcoming whole-program analysis for starvation should catch.
Reviewed By: dulmarod
Differential Revision: D18346880
fbshipit-source-id: 57411b052
Summary: Follow ups will include error messaging that makes the choice clear
Reviewed By: artempyanykh
Differential Revision: D18347664
fbshipit-source-id: b6f005726
Summary:
In this diff, we just load the info from the storage. Next diff will be
actually using this information to infer nullability.
`ThirdPartyAnnotationGlobalRepo.get_repo` will be used in the next diff,
hence #skipdeadcode
Reviewed By: artempyanykh
Differential Revision: D18347647
fbshipit-source-id: 82a9223c6
Summary:
This diff extends the alias domain, so each variable can have multiple aliases.
It changed `KeyLhs` can be mapped to multiple alias targets in the `AliasMap` domain:
```
before : KeyLhs.t -> KeyRhs.t * AliasTarget.t
after : KeyLhs.t -> KeyRhs.t -> AliasTarget.t
```
Reviewed By: ezgicicek
Differential Revision: D18062178
fbshipit-source-id: b325a6055
Summary:
This diff tries to make less imprecise division by constants results.
For example, the results of the division `[l, u] / c`, where `c` is a positive constant, are:
1. If `l/c` or `u/c` is representable in the bound domain, it uses the precise bounds, i.e., `[l/c, u/c]`.
2. If it is not representable, it tries to make conservative results:
if `0<=l<=u`, it returns `[0, u]` because `0 <= [l/c, u/c] <= u`
if `l<=u<=0`, it returns `[l, 0]` because `l <= [l/c, u/c] <= 0`
if `l<=0<=u`, it returns `[l, u]` because `l <= [l/c, u/c] <= u`
3. otherwise, it returns top, `[-oo, +oo]`
Reviewed By: ezgicicek
Differential Revision: D18270380
fbshipit-source-id: 8fb14c0e4
Summary:
Add precision to analysis by elaborating the thread-status domain. This is done by having unknown (bottom), UI, BG or Any (both/top) elements in the lattice. This way, when we branch on thread-identity (if I am on UI thread do this, otherwise do that), we know that in one branch we are on UI thread and on the other we are *not* on the UI thread (BG thread), where previously the other branch would just go to top.
With this knowledge we can throw away pairs that come from callees which run on a thread that is impossible, given the current caller thread identity. This can happen when annotations are used incorrectly, and since this is the purview of annot-reachability, we just drop those pairs entirely.
Reviewed By: skcho
Differential Revision: D18202175
fbshipit-source-id: be604054e
Summary:
Steal a page from RacerD (and improve interface of) on using certain calls to assert
execution on a particular thread. Reduces FPs and FNs too.
Reviewed By: dulmarod
Differential Revision: D18199843
fbshipit-source-id: 5bdff0dfe
Summary:
The zero cost of node does not make sense especially when the abstract memory is non-bottom. This
resulted in unreasonable zero cost results sometimes, e.g. when the checker could not find
appropriate control varaibles having interval values of iteration. This diff fixes this, so sets
the minimum basic cost as 1, if the abstract memory at the node is non-bottom.
Reviewed By: ezgicicek
Differential Revision: D18199291
fbshipit-source-id: b215d10e5
Summary:
This adds a more interesting value domain to pulse: concrete intervals.
There are still two main limitations:
1. arithmetic operations are all over-approximated: any assignment involving arithmetic operations is replaced by non-determinism
2. abstract values that are discovered to be equal are not merged into one
Reviewed By: skcho
Differential Revision: D18058972
fbshipit-source-id: 0492a590f
Summary:
This does several things because it was hard to split it more:
1. Split most of the arithmetic reasoning to PulseArithmetic.ml. This
doesn't need to be reviewed thoroughly because an upcoming diff
changes the domain from just `EqualTo of Const.t` to an interval domain!
2. When going through a prune node intra-procedurally, abduce arithmetic
facts to the pre (instead of just propagating them). This is the "assume
as assert" trick used by biabduction 1.0 too and allows to propagate
arithmetic constraints to callers.
3. Use 2 when applying summaries by pruning specs whose preconditions
have un-satisfiable arithmetic constraints.
This changes one of the tests! Pulse now does a bit more work to find
the false positive, as can be seen in the longer trace.
Reviewed By: skcho
Differential Revision: D18117160
fbshipit-source-id: af3b2c8c0
Summary:
Primitive types are not annotated. Because of that, we used to implicitly derive
`DeclaredNonnull` type for them. This worked fine, but this leads to errors in Strict mode, which does
not believe DeclaredNonnull type.
Now lets offifically teach nullsafe that primitive types are
non-nullable.
Reviewed By: jvillard
Differential Revision: D18114623
fbshipit-source-id: 227217931
Summary: It is now possible to push the thread status into each critical pair. This leads to higher precision, because when code branches on whether it is on the UI thread, the final abstract state of the procedure will be `AnyThread`, but pairs created in the UI thread branch should know that their status is `UIThread`, not `AnyThread`.
Reviewed By: jvillard
Differential Revision: D18114273
fbshipit-source-id: cbb99b46f
Summary:
This diff avoids making top values on unknown non-static function,
such as abstract function, calls. This is necessary because the
generated top values ruin the precision of the cost checker.
Reviewed By: ezgicicek
Differential Revision: D17418611
fbshipit-source-id: aeb759bdd
Summary:
The wrong function was used when we tried to see if the class is
annotated with NullsafeStrict. This made it work only for non-static
methods.
Now we use the proper way.
Reviewed By: ngorogiannis
Differential Revision: D18113848
fbshipit-source-id: 02b7555be
Summary:
Previously, we considered a function which modifies its parameters to be impure even though it might not be modifying the underlying value. This resulted in FPs like the following program in Java:
```
void fresh_pure(int[] a) {
a = new int[1];
}
```
Similarly, in C++, we considered the following program as impure because it was writing to `s`:
```
Simple* reassign_pure(Simple* s) {
s = new Simple{2};
return s;
}
```
This diff fixes that issue by starting the check for address equivalnce in pre-post not directly from the addresses of the stack variables, but from the addresses pointed to by these stack variables. That means, we only consider things to be impure if the actual values pointed by the parameters change.
Reviewed By: skcho
Differential Revision: D18113846
fbshipit-source-id: 3d7c712f3
Summary: We stop tracking at builder boundaries. Let's tract create methods as well so that trace is more informative.
Reviewed By: skcho
Differential Revision: D18038637
fbshipit-source-id: a99b6431f
Summary: Adding support to matching block names. We match mangled block names. We also needed to extend the function for extracting the range for each method, to also traverse the stmts to be able to find the block declarations.
Reviewed By: skcho
Differential Revision: D17956931
fbshipit-source-id: 707908812
Summary:
This is the first take on strict mode semantics.
The main invariant of strict mode is the following:
If the function passes `NullsafeStrict` check and its return value is
NOT annotated as Nullable, then the function does not indeed return
nulls, subject to unsoundness issues (which should either be fixed, or
should rarely happen in practice).
This invariant helps the caller in two ways:
1. Dangerous usages of non strict functions are visible, so the caller is enforced to check them (via assertions or conditions), or strictify them.
2. When the function is strict, the caller does not need to worry about
being defensive.
Biggest known issues so far:
1. Condition redundant and over-annotated warnings don't fully
respect strict mode, and this leads to stupid false positives. (There is
so much more stupid false positives in condition redundant anyway, so
not particularly a big deal for now).
2. Error reporting is not specific to mode. (E.g. we don't distinct real nullables and non-trusted non-nulls, which can be misleading). To be
improved as a follow up.
Reviewed By: artempyanykh
Differential Revision: D17978166
fbshipit-source-id: d6146ad71
Summary:
This is an intermediate nullability type powering future Strict mode.
See the next diff.
Reviewed By: artempyanykh
Differential Revision: D17977909
fbshipit-source-id: 2d5ab66d4
Summary: In preparation for improvements to the arithmetic reasoning.
Reviewed By: dulmarod
Differential Revision: D17977207
fbshipit-source-id: ee98e0772
Summary:
Domain for thread-type. The main goals are
- Track code paths that are explicitly on UI thread (via annotations, or assertions).
- Maintain UI-thread-ness through the call stack (if a callee is on UI thread then the
trace any call site must be on the UI thread too).
- If we are not on the UI thread we assume we are on a background thread.
- Traces with "UI-thread" status cannot interleave but all other combinations can.
- We do not track other annotations (eg WorkerThread or AnyThread) as they can be
erroneously applied -- other checkers should catch those errors (annotation reachability).
- Top is AnyThread, and is used as the initial state for analysis.
Interestingly, by choosing the right strategy for choosing initial state and applying callee summaries gets rid of some false negatives in the tests even though we have not introduced any path sensitivity yet.
Reviewed By: jvillard
Differential Revision: D17929390
fbshipit-source-id: d72871034
Summary: For Objective-C methods we match the mangled names (the field is name in the profiler samples).
Reviewed By: skcho
Differential Revision: D17952552
fbshipit-source-id: 308d415f6
Summary:
bigmacro_bender
There are 3 ways pulse tracks history. This is at least one too many. So
far, we have:
1. "histories": a humble list of "events" like "assigned here", "returned from call", ...
2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah"
3. "traces", which combine one history with one interproc action
This diff gets rid of interproc actions and makes histories include
"nested" callee histories too. This allows pulse to track and display
how a value got assigned across function calls.
Traces are now more powerful and interleave histories and interproc
actions. This allows pulse to track how a value is fed into an action,
for instance performed in callee, which itself creates some more
(potentially now interprocedural) history before going to the next step
of the action (either another call or the action itself).
This gives much better traces, and some examples are added to showcase
this.
There are a lot of changes when applying summaries to keep track of
histories more accurately than was done before, but also a few
simplifications that give additional evidence that this is the right
concept.
Reviewed By: skcho
Differential Revision: D17908942
fbshipit-source-id: 3b62eaf78
Summary:
Java method annotations are ambiguous in that there is no difference between
annotating the return value of a method, and annotating the method itself.
The disambiguation is done entirely based on the meaning of the annotation.
Here, while `UiThread`/`MainThread` are genuine method/class annotations
and not return annotations, the reverse is true for `ForUiThread`/`ForNonUiThread`.
This means that these latter annotations do not determine the thread status of
the method they are attached to.
Here we fix that misunderstanding.
Reviewed By: jvillard
Differential Revision: D17960994
fbshipit-source-id: 5aecfb124
Summary: As per title. These test pass already because the previous thread domain was sufficient to express them. This won't necessarily be true when the whole-program analysis version comes around, because we may decide to not report on the `Threaded` elements (see domain).
Reviewed By: dulmarod
Differential Revision: D17930653
fbshipit-source-id: 2174f6b22
Summary:
- add the variable being declared so we can report it back in the trace in addition to its location
- distinguish between local vars and formals
Reviewed By: skcho
Differential Revision: D17930348
fbshipit-source-id: a5b863e64
Summary:
Eventually thread status will be stored inside every critical pair so as to allow path sensitivity. That means that the status can no longer be a whole trace, as this will quickly become intractable, because each domain element would have to maintain its own trace as well as its own thread-status trace.
This is not great, as we lose information here, but I don't see any other way around it that is not super complicated/costly (sharing will be limited when moving from callee to caller).
Other diffs up the stack will clean up infrastructure no longer used meaningfully (ie models and domains).
Reviewed By: mityal
Differential Revision: D17908908
fbshipit-source-id: 3bf353e33
Summary:
Starvation is currently path insensitive. Two special cases of sensitivity cover a large range of useful cases:
- sensitivity on whether the current thread is a UI/background thread;
- sensitivity on whether a lock can be acquired (without blocking) or not.
We add a few tests capturing some of the false positives and negatives of the current analysis.
Reviewed By: mityal
Differential Revision: D17907492
fbshipit-source-id: fbce896ac
Summary:
This diff adopts an array length evaluation function that is conservative. It is useful when our
domain cannot express length result precisely.
For example, suppose there is an array pointer `arr_locs` that may point to two arrays `a` and `b`,
and their lengths are `a.length` and `b.length` (symbols), respectively. Using the usual
evaluation, our current domain cannot express `a.length join b.length` (join of two symbolic
values), so it returns top.
In this case, we can use the conservative function intead. It evaluates the length as `[0,
a.length.ub + b.length.ub]`, since we know every array length is positive. The result is not
precise, but better than top.
Reviewed By: ezgicicek
Differential Revision: D17908859
fbshipit-source-id: 7c0b1591b
Summary:
Instead of a string argument named `~str` pass `Formal | Global` and let
`add_to_errlog` figure out how to print it.
Reviewed By: ezgicicek
Differential Revision: D17907657
fbshipit-source-id: ed09aab72
Summary:
When we make the decision to go into a branch "v = N" where some
abstract value is compared to a constant, remember the corresponding
equality. This allows to prune simple infeasible paths
intra-procedurally.
Further work is needed to make this useful interprocedurally, for
instance either or both of these ideas could be explored:
- abduce v=N in the precondition and do not apply summaries when the
equalities in the pre are not satisfied
- prune post-conditions that lead to unsat states where a value has to
be equal to several different constants
Reviewed By: skcho
Differential Revision: D17906166
fbshipit-source-id: 5cc84abc2
Summary:
When we know "x = 3" and we have a condition "x != 3" we know we can
prune the corresponding path.
Reviewed By: skcho
Differential Revision: D17665472
fbshipit-source-id: 988958ea6
Summary:
Let's add basic Java support to impurity checker. Since impurity checker relies on pulse, we need to add Java with Pulse callback as well. Pulse doesn't officially support Java yet, but we can enable it for impurity checker for now.
Many Java primitives/operations are not yet modeled (such as creation of new objects, support for collections etc.). Still, it is good to run impurity checker on the existing tests of the purity checker. Also, it is nice to see that we can identify most of the impure functions correctly in the purity dir. There are a lot of FNs though.
Reviewed By: skcho
Differential Revision: D17906237
fbshipit-source-id: 15308d285
Summary:
This diff introduces inequality for the iterator alias target, as we
did for the size target before.
Reviewed By: ezgicicek
Differential Revision: D17879208
fbshipit-source-id: cc2f6a723
Summary: If we have no pulse summary (most likely caused by pulse finding a legit issue with the code), let's consider the function as impure.
Reviewed By: jvillard
Differential Revision: D17906016
fbshipit-source-id: 671d3e0ba
Summary:
This diff revises the semantics of hasNext model to add the lengths of
arrays, rather than join them to top.
Reviewed By: ezgicicek
Differential Revision: D17882388
fbshipit-source-id: f5edaedb3
Summary:
[androidx.collection.SimpleArrayMap](https://developer.android.com/reference/androidx/collection/SimpleArrayMap.html) also has `keySet` and `entrySet` methods which make them eligible for inefficient keyset checker. Let's add it.
Title
Reviewed By: skcho
Differential Revision: D17831594
fbshipit-source-id: 32e831e18
Summary:
The current usage has several issues reducing code maintainability and
readability:
1. Null_field_access was misleading: it was used for checking accesing
to arrays as well!
2. But actually, when checking access to array via `length`, we sometimes
pretended it is a field access (hence very tricky code in rendering the
error).
3. "Call receiver consistency" is unclear name, was not obvious that it is all about
calling a method in an object.
Let's also consolidate code.
Reviewed By: artempyanykh
Differential Revision: D17789618
fbshipit-source-id: 9b0f58c9c
Summary: Before, we didn't track litho framework callees on client code which was wrong. Now, we replace this with the following: If the callee is `build()` itself or doesn't contain a `build()` in its summary, then we want to track it in the domain. The former makes sense since we always want to track `build()` methods. The latter also makes sense since such a method could be a setter for a prop (as in the case of `prop1` in `buildPropLithoOK` which we were missing before due to the imprecise heuristic that prevented picking up callees in litho).
Reviewed By: ngorogiannis
Differential Revision: D17810704
fbshipit-source-id: 87d88e921
Summary: As a heuristic, litho library calls on non-litho callers are not tracked. This is very imprecise and results in FPs and FNs as exemplified by newly added tests. Instead, we should check to see if the summary contains a `build()` method as will be done in the next diff. This diff adds these tests and refactors the test code.
Reviewed By: skcho
Differential Revision: D17809536
fbshipit-source-id: 6dff1868c
Summary:
Improve the trace by incorporating the callees and their locations in the call chain (i.e. chain of methods starting from `build()` call)
- extend the domain to contain the callee location
- replace the test results with the new traces
This makes our job much easier to debug FPs in a big codebase.
Reviewed By: skcho
Differential Revision: D17788996
fbshipit-source-id: 31938b5fe
Summary: `litho` checker contained two checkers: required-props and graphQL field accesses. Although they use the same domain, their reporting conditions and analysis details are different. However, they were bundled into the same analysis by adding disjunctions to `exec_instr` to handle both cases. Let's separate them into two different checkers, keeping a modular transfer function and analyzer that is reused by these two checkers.
Reviewed By: skcho
Differential Revision: D17788834
fbshipit-source-id: 47d77063b
Summary:
At some point it was thought that we can assume that any annotation starting with "On" means the method is on the UI thread.
That's too imprecise and has led to false positives and negatives. Restrict to a well-known safe set.
Reviewed By: ezgicicek
Differential Revision: D17769376
fbshipit-source-id: 0f8fee059
Summary:
This diff tries to narrowing the fixpoint of outermost loops, so that over-approximated widened values do not flow to the following code.
Problem: There are two phases for finding a fixpoint, widening and narrowing. First, it finds a fixpoint with widening, in function level. After that, it finds a fixpoint with narrowing. A problem is that sometimes an overly-approximated, imprecise, values by widening are flowed to the following loops. They are hard to narrow in the narrowing phase because there is a cycle preventing it.
To mitigate the problem, it tries to do narrowing, in loop level, right after it found a fixpoint of a loop. Thus, it narrows before the widened values are flowed to the following loops. In order to guarantee the termination of the analysis, this eager narrowing is applied only to the outermost loops.
Reviewed By: ezgicicek
Differential Revision: D17740265
fbshipit-source-id: e2d454036
Summary:
This diff extends the alias domain to analyze loop with list comprehensions form in Java precisely.
```
list2 = new List();
for (Element e : list1) {
list2.add(e);
}
```
1. `IteratorOffset` is a relation between a iterator offset and a length of another array. For example, in the above example, after n-times of iterations, the offset of the iterator (if it exists) and the length of `list2` are the same as `n`.
2. `IteratorHasNext` is a relation between iterator and its `hasNext` result.
3. At the conditional nodes, it prunes the alias list length of `list2` by that of `list1`.
* if `hasNext(list1's iterator)` is true, `list2`'s length is pruned by `< list1's length`
* if `hasNext(list1's iterator)` is false, `list2`'s length is pruned by `= list1's length`
Reviewed By: ezgicicek
Differential Revision: D17667128
fbshipit-source-id: 41fb23a45
Summary:
The old domain keeps two sets:
- `events` are things (including lock acquisitions) which eventually happen during the execution of a procedure.
- `order` are pairs of `(lock, event)` such that there is a trace through the procedure which at some point acquires `lock` and before releasing it performs `event`.
A deadlock would be reported if for two procedures, `(lock1,lock2)` is in `order` of procedure 1 and `(lock2,lock1)` is in `order` of procedure 2. This condition/domain allowed for the false positive fixed in the tests, as well as was unwieldy, because it required translating between the two sets.
The new domain has only one set of "critical pairs" `(locks, event)` such that there is a trace where `event` occurs, and *right before it occurs* the locks held are exactly `locks` (no over/under approximation). This allows keeping all information in one set, simplifies the procedure call handling and eliminates the known false positive.
Reviewed By: mityal
Differential Revision: D17686944
fbshipit-source-id: 3c68bb957
Summary: Holding a master lock and then acquiring two other locks inside can generate a false positive as shown.
Reviewed By: mityal
Differential Revision: D17710076
fbshipit-source-id: 5bc910ba2
Summary:
Previously deduplication was always on which is not great for testing.
Also split tests so that we can still test deduplication separately.
Reviewed By: mityal
Differential Revision: D17686877
fbshipit-source-id: 280d91473
Summary:
Ideally the analyser should equate locks `this.x.f` and `a.x.f` in different methods if they can alias.
The heuristic removed here was rarely used and is in the way of a re-write of the analysis.
It was also badly implemented, as this should ideally be the comparison relation of `Lock`.
Reviewed By: mityal
Differential Revision: D17602827
fbshipit-source-id: 4f4576c1a
Summary:
This diff generates a symbolic value when a function returns only
exceptions. Previously, the exception expression is evaluated to top,
thus it was propagated to other functions, which made those costs as
top. For preventing that situation, this diff changed:
* exception expressions are evaluated to bottom, and
* if callee's return value is bottom, it generates a symbolic value
for it.
Reviewed By: ezgicicek
Differential Revision: D17500386
fbshipit-source-id: 0fdcc710d
Summary: This diff introduces an inequality for the size alias targets, in order to get preciser array lengths after loops. The alias domain in inferbo was able to express strict equality between alias source and its targets, e.g. x=size(array). Now, for the size alias target, it can express less than or equal relations, e.g. x>=size(array).
Reviewed By: ezgicicek
Differential Revision: D17606222
fbshipit-source-id: 2557d3bd0
Summary: Component.Builder has its own non-required props that are inherited by the MyComponent.Builder. Add tests where these common props are set in the chain of calls.
Reviewed By: Katalune
Differential Revision: D17710294
fbshipit-source-id: f3c5ef28c
Summary:
Unfortunately it is very hard to predict when
`Typ.Procname.describe` will add `()` after the function name, so we
cannot make sure it is always there.
Right now we report clowny stuff like "error while calling `foo()()`",
which this change fixes.
Reviewed By: ezgicicek
Differential Revision: D17665470
fbshipit-source-id: ef290d9c0
Summary: `Prop(varArg = myProp) List <?> myPropList` can also be set via `myPropList()` or `myProp()`. Add support for picking up the `varArg` and checking this form of required props.
Reviewed By: ngorogiannis
Differential Revision: D17571997
fbshipit-source-id: 7956cb972
Summary:
Turns out, we did not have such a test in place.
Known issue: we report over-annotated warnings for each fields N times,
one per constructor, which is wrong.
Reviewed By: artempyanykh
Differential Revision: D17574791
fbshipit-source-id: def992691
Summary:
This diff avoids giving the top value to unknown globals in Java,
because they harm precision of the cost checker. Instead, it doesn't
subst the global symbols at function calls.
Reviewed By: ezgicicek
Differential Revision: D17498714
fbshipit-source-id: d1215b3aa
Summary:
This diff adds an eval mode for the substitutions of the cost results, in order to avoid precision
loss by joining two symbols.
The usual join of two different symbolic values, `s1` and `s2`, becomes top due to the limitation of
our domain. On the other hand, in the new eval mode, it returns an upperbound `s1+s2`, because the
cost values only care about the upperbounds.
Reviewed By: ezgicicek
Differential Revision: D17573400
fbshipit-source-id: 2c84743d5
Summary:
This was causing a crash, because when trying to create a procname from a block at that point we don't have the block return type, which is needed for the name. I don't understand why BlockDecl doesn't contain the type, but I looked again and it doesn't (also in clang). So in general we need to pass it from the context, but that's not possible in this case.
Also, one could argue that such a block is not a method from the struct, since it's just a block that is assigned to a field as initialization.
Reviewed By: skcho
Differential Revision: D17575197
fbshipit-source-id: 3974ead3f
Summary:
This continues work for eliminating Annot.Item.t from Nullsafe low-level
code.
The introduced function `from_nullsafe_type` is called when we infer
initial type of the equation based on the function or field formal signature.
Before that, we did it via reading the annotation directly, which
complicates the logic and making introducing Unknown nullability tricky.
## Clarifying the semantics of PropagatesNullable
This diff also clarifies (and changes) the behavior of PropagatesNullable params.
Previously, if the return value of a function that has PropagatesNullable params was
annotated as Nullable, nullsafe was effectively ignoring PropagatesNullable effect.
This is especially bad because one can add Nullable annotation based on the logic "if the function can return `null`, it should be annotated with Nullable`.
In the new design, there is no possibility for such a misuse: the code that
applies the rule "any param is PropagatesNullable hence the return
value is nullable even if not explicitly annotated" lives in NullsafeType.ml, so
this will be automatically taken into account.
Meaning that now we implicitly deduce Nullable annotation for the return value, and providing it explicitly as an alternative that does not change the effect.
In the future, we might consider annotating the return value with `Nullable` explicit.
Reviewed By: jvillard
Differential Revision: D17479157
fbshipit-source-id: 66c2c8777
Summary:
In the cost checker, the range of selected control variables are used to estimate the number of loop iteration. However, sometimes the ranges of control variables are not related to how many times the loop iteration. This diff strengthens the condition for them as:
1. integers from `size` models
2. integers constructed from `+` or `-`
3. integers constructed from `*`
For the last one, the loop iteration is likely to be log scale of the range of the control variable:
```
while (i < c) {
i *= 2;
}
```
We will address this in the future.
Reviewed By: ezgicicek
Differential Revision: D17365796
fbshipit-source-id: c1e709ae8
Summary: Our annotation parameter parsing is too primitive to identify `resType` and before we only assumed that all Prop's can be set by any of the two suffixes: `Attr` and `Res`. After talking to Litho team, there is 3 more additions to these suffixes: `Dip`, `Sip`, and `Px`.
Reviewed By: ngorogiannis
Differential Revision: D17528482
fbshipit-source-id: 8d7f49130
Summary: Before, we were mistakenly checking any annotation that ends with Prop such as TreeProp. This was wrong. Instead, we should only check Prop as adviced by the Litho team.
Reviewed By: ngorogiannis
Differential Revision: D17527769
fbshipit-source-id: b753dd87a
Summary:
Introduce a new experimental checker (`--impurity`) that detects
impurity information, tracking which parameters and global variables
of a function are modified. The checker relies on Pulse to detect how
the state changes: it traverses the pre and post pairs starting from
the parameter/global variable and finds where the pre and post heaps
diverge. At diversion points, we expect to see WrittenTo/Invalid attributes
containing a trace of how the address was modified. We use these to
construct the trace of impurity.
This checker is a complement to the purity checker that exists mainly
for Java (and used for cost and loop-hoisting analyses). The aim of
this new experimental checker is to rely on Pulse's precise
memory treatment and come up with a more precise im(purity)
analysis. To distinguish the two checkers, we introduce a new issue
type `IMPURE_FUNCTION` that reports when a function is impure, rather
than when it is pure (as in the purity checker).
TODO:
- improve the analysis to rely on impurity information of external
library calls. Currently, all library calls are assumed to be nops,
hence pure.
- de-entangle Pulse reporting from analysis.
Reviewed By: skcho
Differential Revision: D17051567
fbshipit-source-id: 5e10afb4f
Summary:
`ModeledRange` represents how many times the interval value can be updated by modeled functions. This
domain is to support the case where there are mismatches between value of a control variable and
actual number of loop iterations. For example,
```
while((c = file_channel.read(buf)) != -1) { ... }
```
the loop will iterates as the file size, but the control variable `c` does not have that value. In
these cases, it assigns a symbolic value of the file size to the modeled range of `c`, then which
is used when calculating the overall cost.
Reviewed By: jvillard
Differential Revision: D17476621
fbshipit-source-id: 9a81376e8
Summary:
This diff extends the `Simple` alias domain to address Java's
temporary variables better. It now has an additional field to denote
an alias temporary variable.
Reviewed By: jvillard
Differential Revision: D17421907
fbshipit-source-id: 8b8b47461
Summary:
D17397144 adds dedicated tests for condition redundant.
We also have tests for overannotated methods.
This makes these test cases redundant. Let's not pollute the results.
Reviewed By: jberdine
Differential Revision: D17398757
fbshipit-source-id: 10f6beeca
Summary:
This will simplify modifying functionality around this type of error.
Also rename the file for clarity.
Reviewed By: jvillard
Differential Revision: D17397144
fbshipit-source-id: 552215243
Summary:
This diff revises widening functions of bounds that have a linear form and a min/max form.
For example, for lower bounds,
* 3 ▽ (1+min(2, x)) = (1+min(2, x))
* 3+x ▽ (3+min(2, x)) = (3+min(2, x))
Reviewed By: jvillard
Differential Revision: D17420786
fbshipit-source-id: ff9eebed3
Summary:
This diff addresses collection adds in loop. For example,
```
ArrayList<...> a = new ArrayList<>();
for (int i = 0; i < size; i++) {
a.add(...);
}
// we want to know the size of `a` here!
```
This is a common pattern on initializing a collection in Java.
How we did: Instead of adopting general (but complicated) solutions such as relational domain, we
extended the current alias domain of inferbo, to be able to handle this specific case:
* An array `a` should have size 0, at the entry of the loop.
* The iterating variable `i` should start with 0.
* `add` should be called once inside the loop.
Reviewed By: jvillard
Differential Revision: D17319350
fbshipit-source-id: 99b6acae1
Summary:
In D17156724, we forked nullsafe tests, which was a strategy to
introduce nullsafe-gradual mode back then.
The reason was "gradual" mode is a pretty big change in a way Infer
handles annotations, so we wanted to tests both scenarios: gradual and
non-gradual mode.
The plan was to deprecate "non-gradual" tests at some point, hence we
decided to go with duplication.
Now we have a better approach to ensure "gradual" features are well
covered. The approach is the following.
1. [Mostly finished] Improve existings tests so that they cover negative and positive
cases. With this, we can safely add something like
--non-annotated-default UNKNOWN_NULLABILITY to the test config and be sure tests still make
sense (i.e. don't pass simply because annotations don't make sense
anymore)
2. [In progress]. Refactor nullsafe code so that instead of using of Annot.ml everywhere we use a special abstraction telling if the class is annotated with Nullable, Nonnull, or not annotated. With this change, we essenstially have a single place we need to test, which removes the need to have 2 pair of tests for each feature.
3. [To be done]. Introduce Uknown nullability and add small number of tests specifically
for that feature (together with existing tests).
NOTE: I did not rename `nullsafe-default` back to `nullsafe` to not
pollute blame without need.
Reviewed By: artempyanykh
Differential Revision: D17395743
fbshipit-source-id: 3d3e062f6