Summary:
There was a precision loss during the substitution of array block. For example:
Callee's abstract memory includes an array block as follows, where `a` is a parameter.
```
a.elements -> { a.elements[*] with a.elements.size }
```
Callers' abstract memory includes a pointer that may point to multiple array blocks.
```
c -> { x, y }
x.elements -> { x.elements[*] with x.elements.size }
y.elements -> { y.elements[*] with y.elements.size }
```
When the callee is called with the parameter `c`, the callees memory is substituted to:
```
x.elements -> { x.elements[*] with top , y.elements[*] with top }
y.elements -> { x.elements[*] with top , y.elements[*] with top }
```
because `a.elements[*]` was substituted to `{ x.elements[*] , y.elements[*] }`
and `a.elements.size` was substituted to `top ( = x.elements.size join y.elements.size )`.
This diff tries to keep the precision in the specific case, not to join the sizes of array blocks.
So now the same callee's abstract memory is substituted to:
```
x.elements -> { x.elements[*] with x.elements.size }
y.elements -> { y.elements[*] with y.elements.size }
```
Reviewed By: ngorogiannis
Differential Revision: D18480585
fbshipit-source-id: b70e63c22
Summary: Due to the weakness of the analysis which can't detect side-effecting prop setting (e.g. as in `builder.prop1(..)`), we currently have many broken chains that do do have any `create` method in their prefixes. Let's not report on these broken chains.
Reviewed By: skcho
Differential Revision: D18503523
fbshipit-source-id: 7506e34b7
Summary:
In some apps executors are obtained by calling standard framework methods (and not by using DI with annotations).
To treat this style, we need to
- Detect calls that return such executors (`do_executor_effect`) and tag the return result with an `Attribute` indicating that it is now an executor, plus what thread it uses.
- Use that information when calling `execute`, to resolve the executor, if any, and its thread (in `do_work_scheduling` via `AttributeDomain.get_executor_constraint`).
- All this requires a new domain component, mapping variables to attributes. This extends the component previously used for remembering whether a variable is the result of a check on whether we run on the UI thread.
At the same time, I un-nested some functions from the transfer function for readability.
Reviewed By: skcho
Differential Revision: D18476122
fbshipit-source-id: bc39b5c2f
Summary:
We consider Java collections to be like c++ std::vectors and add models for
- `Collections.get(..)`
- `__cast`
Reviewed By: skcho
Differential Revision: D18449607
fbshipit-source-id: 448206c84
Summary: `equals1` and `equals2` in `SafeInvertedMap.join` are references that indicate whether given parameters and the result is physically equal or not. This diff fixes a missing update of them.
Reviewed By: ezgicicek
Differential Revision: D18450680
fbshipit-source-id: bae19cbe9
Summary:
It returns non-top value when one of the parameters of band is positive, i.e., `x & 255` returns
`[0, 255]` instead of top.
Reviewed By: ezgicicek
Differential Revision: D18448614
fbshipit-source-id: aaa298a66
Summary:
Let's introduce a set of new cost analysis issue types that are raised when the function is statically determined to run on the UI thread. For this, we rely on the existing `runs_on_ui_thread` check that is developed for RacerD. We also update the cost summary and `jsonbug.cost_item` to include whether a method is on the ui thread so that we don't repeatedly compute this at diff time for complexity increase issues.
Note that `*_UI_THREAD` cost issues are assumed to be more strict than `*_COLD_START` reports at the moment. Next, we can also consider adding a new issue type that combines both such as `*_UI_THREAD_AND_COLD_START` (i.e. for methods that are both on cold start and run on ui thread).
Reviewed By: ngorogiannis
Differential Revision: D18428408
fbshipit-source-id: f18805716
Summary:
This diff tries more narrowing during analysis in order to get preciser results on nested loops.
In the widening phase, it does narrowing a loop right after its widening, for each loops. In general, this may make the widening phase non-terminating because it keeps the abstract state from monotonely increasing to the fixed point in a finite number of iterations. To avoid that situation, this diff applies the narrowing only when the first visit of the loop in the widening phase.
Reviewed By: ngorogiannis
Differential Revision: D18400631
fbshipit-source-id: cc76f7e85
Summary: Sometimes there is a code like `for(int i = 1; i < x; i++){ l.add(); }`, where the first element in a list is addressed specifically. This case was not analyzed precisely, because the alias value is added only when `i` is initialized by 0 by heuristic. This diff extends the heuristic, so it adds a size alias between `i` and `l.size()` when `i` is initialized by 0 or 1.
Reviewed By: ezgicicek
Differential Revision: D18351867
fbshipit-source-id: e7d19a4ec
Summary:
This diff adds semantics of Java function calls of enum `values` inside class initializers.
* Java class initializer function initializes a specific field `$VALUES`, which points to the list
of enum values.
* The `values` function of enum class returns the value of `$VALUES`.
The problem is when the `values` function is called inside the class initializer, for example:
```
enum Color {
RED,
GREEN,
BLUE;
static {
for (Color c : Color.values()) {}
}
}
```
This introduces a recursive dependency: the class initializer calls `Color.values` and the function
returns `Color.$VALUES` the value of which should be initialized in the class initializer.
To address the problem, this diff finds the value of `$VALUES` in its abstract memory when
`values` is called inside the class initializer.
Reviewed By: ezgicicek
Differential Revision: D18349281
fbshipit-source-id: 21766c20f
Summary:
This diff extends bound domain to express Min/Max of another bounds, so it can keep some more
precision in `Math.min/max`.
limitation: `MinMaxB`, the constructor of the bound, can contain only linear expressions or
previous min/max expressions.
Reviewed By: ezgicicek
Differential Revision: D18395365
fbshipit-source-id: fc90d27fd
Summary: Capture locations where work is scheduled to run in parallel (here, just Executors). Also add a test file with cases the upcoming whole-program analysis for starvation should catch.
Reviewed By: dulmarod
Differential Revision: D18346880
fbshipit-source-id: 57411b052
Summary: Follow ups will include error messaging that makes the choice clear
Reviewed By: artempyanykh
Differential Revision: D18347664
fbshipit-source-id: b6f005726
Summary:
In this diff, we just load the info from the storage. Next diff will be
actually using this information to infer nullability.
`ThirdPartyAnnotationGlobalRepo.get_repo` will be used in the next diff,
hence #skipdeadcode
Reviewed By: artempyanykh
Differential Revision: D18347647
fbshipit-source-id: 82a9223c6
Summary:
This diff extends the alias domain, so each variable can have multiple aliases.
It changed `KeyLhs` can be mapped to multiple alias targets in the `AliasMap` domain:
```
before : KeyLhs.t -> KeyRhs.t * AliasTarget.t
after : KeyLhs.t -> KeyRhs.t -> AliasTarget.t
```
Reviewed By: ezgicicek
Differential Revision: D18062178
fbshipit-source-id: b325a6055
Summary:
Add precision to analysis by elaborating the thread-status domain. This is done by having unknown (bottom), UI, BG or Any (both/top) elements in the lattice. This way, when we branch on thread-identity (if I am on UI thread do this, otherwise do that), we know that in one branch we are on UI thread and on the other we are *not* on the UI thread (BG thread), where previously the other branch would just go to top.
With this knowledge we can throw away pairs that come from callees which run on a thread that is impossible, given the current caller thread identity. This can happen when annotations are used incorrectly, and since this is the purview of annot-reachability, we just drop those pairs entirely.
Reviewed By: skcho
Differential Revision: D18202175
fbshipit-source-id: be604054e
Summary:
Steal a page from RacerD (and improve interface of) on using certain calls to assert
execution on a particular thread. Reduces FPs and FNs too.
Reviewed By: dulmarod
Differential Revision: D18199843
fbshipit-source-id: 5bdff0dfe
Summary:
The zero cost of node does not make sense especially when the abstract memory is non-bottom. This
resulted in unreasonable zero cost results sometimes, e.g. when the checker could not find
appropriate control varaibles having interval values of iteration. This diff fixes this, so sets
the minimum basic cost as 1, if the abstract memory at the node is non-bottom.
Reviewed By: ezgicicek
Differential Revision: D18199291
fbshipit-source-id: b215d10e5
Summary:
Primitive types are not annotated. Because of that, we used to implicitly derive
`DeclaredNonnull` type for them. This worked fine, but this leads to errors in Strict mode, which does
not believe DeclaredNonnull type.
Now lets offifically teach nullsafe that primitive types are
non-nullable.
Reviewed By: jvillard
Differential Revision: D18114623
fbshipit-source-id: 227217931
Summary: It is now possible to push the thread status into each critical pair. This leads to higher precision, because when code branches on whether it is on the UI thread, the final abstract state of the procedure will be `AnyThread`, but pairs created in the UI thread branch should know that their status is `UIThread`, not `AnyThread`.
Reviewed By: jvillard
Differential Revision: D18114273
fbshipit-source-id: cbb99b46f
Summary:
This diff avoids making top values on unknown non-static function,
such as abstract function, calls. This is necessary because the
generated top values ruin the precision of the cost checker.
Reviewed By: ezgicicek
Differential Revision: D17418611
fbshipit-source-id: aeb759bdd
Summary:
The wrong function was used when we tried to see if the class is
annotated with NullsafeStrict. This made it work only for non-static
methods.
Now we use the proper way.
Reviewed By: ngorogiannis
Differential Revision: D18113848
fbshipit-source-id: 02b7555be
Summary:
Previously, we considered a function which modifies its parameters to be impure even though it might not be modifying the underlying value. This resulted in FPs like the following program in Java:
```
void fresh_pure(int[] a) {
a = new int[1];
}
```
Similarly, in C++, we considered the following program as impure because it was writing to `s`:
```
Simple* reassign_pure(Simple* s) {
s = new Simple{2};
return s;
}
```
This diff fixes that issue by starting the check for address equivalnce in pre-post not directly from the addresses of the stack variables, but from the addresses pointed to by these stack variables. That means, we only consider things to be impure if the actual values pointed by the parameters change.
Reviewed By: skcho
Differential Revision: D18113846
fbshipit-source-id: 3d7c712f3
Summary: We stop tracking at builder boundaries. Let's tract create methods as well so that trace is more informative.
Reviewed By: skcho
Differential Revision: D18038637
fbshipit-source-id: a99b6431f
Summary:
This is the first take on strict mode semantics.
The main invariant of strict mode is the following:
If the function passes `NullsafeStrict` check and its return value is
NOT annotated as Nullable, then the function does not indeed return
nulls, subject to unsoundness issues (which should either be fixed, or
should rarely happen in practice).
This invariant helps the caller in two ways:
1. Dangerous usages of non strict functions are visible, so the caller is enforced to check them (via assertions or conditions), or strictify them.
2. When the function is strict, the caller does not need to worry about
being defensive.
Biggest known issues so far:
1. Condition redundant and over-annotated warnings don't fully
respect strict mode, and this leads to stupid false positives. (There is
so much more stupid false positives in condition redundant anyway, so
not particularly a big deal for now).
2. Error reporting is not specific to mode. (E.g. we don't distinct real nullables and non-trusted non-nulls, which can be misleading). To be
improved as a follow up.
Reviewed By: artempyanykh
Differential Revision: D17978166
fbshipit-source-id: d6146ad71
Summary:
This is an intermediate nullability type powering future Strict mode.
See the next diff.
Reviewed By: artempyanykh
Differential Revision: D17977909
fbshipit-source-id: 2d5ab66d4
Summary:
Domain for thread-type. The main goals are
- Track code paths that are explicitly on UI thread (via annotations, or assertions).
- Maintain UI-thread-ness through the call stack (if a callee is on UI thread then the
trace any call site must be on the UI thread too).
- If we are not on the UI thread we assume we are on a background thread.
- Traces with "UI-thread" status cannot interleave but all other combinations can.
- We do not track other annotations (eg WorkerThread or AnyThread) as they can be
erroneously applied -- other checkers should catch those errors (annotation reachability).
- Top is AnyThread, and is used as the initial state for analysis.
Interestingly, by choosing the right strategy for choosing initial state and applying callee summaries gets rid of some false negatives in the tests even though we have not introduced any path sensitivity yet.
Reviewed By: jvillard
Differential Revision: D17929390
fbshipit-source-id: d72871034
Summary:
bigmacro_bender
There are 3 ways pulse tracks history. This is at least one too many. So
far, we have:
1. "histories": a humble list of "events" like "assigned here", "returned from call", ...
2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah"
3. "traces", which combine one history with one interproc action
This diff gets rid of interproc actions and makes histories include
"nested" callee histories too. This allows pulse to track and display
how a value got assigned across function calls.
Traces are now more powerful and interleave histories and interproc
actions. This allows pulse to track how a value is fed into an action,
for instance performed in callee, which itself creates some more
(potentially now interprocedural) history before going to the next step
of the action (either another call or the action itself).
This gives much better traces, and some examples are added to showcase
this.
There are a lot of changes when applying summaries to keep track of
histories more accurately than was done before, but also a few
simplifications that give additional evidence that this is the right
concept.
Reviewed By: skcho
Differential Revision: D17908942
fbshipit-source-id: 3b62eaf78
Summary:
Java method annotations are ambiguous in that there is no difference between
annotating the return value of a method, and annotating the method itself.
The disambiguation is done entirely based on the meaning of the annotation.
Here, while `UiThread`/`MainThread` are genuine method/class annotations
and not return annotations, the reverse is true for `ForUiThread`/`ForNonUiThread`.
This means that these latter annotations do not determine the thread status of
the method they are attached to.
Here we fix that misunderstanding.
Reviewed By: jvillard
Differential Revision: D17960994
fbshipit-source-id: 5aecfb124
Summary: As per title. These test pass already because the previous thread domain was sufficient to express them. This won't necessarily be true when the whole-program analysis version comes around, because we may decide to not report on the `Threaded` elements (see domain).
Reviewed By: dulmarod
Differential Revision: D17930653
fbshipit-source-id: 2174f6b22
Summary:
Eventually thread status will be stored inside every critical pair so as to allow path sensitivity. That means that the status can no longer be a whole trace, as this will quickly become intractable, because each domain element would have to maintain its own trace as well as its own thread-status trace.
This is not great, as we lose information here, but I don't see any other way around it that is not super complicated/costly (sharing will be limited when moving from callee to caller).
Other diffs up the stack will clean up infrastructure no longer used meaningfully (ie models and domains).
Reviewed By: mityal
Differential Revision: D17908908
fbshipit-source-id: 3bf353e33
Summary:
Starvation is currently path insensitive. Two special cases of sensitivity cover a large range of useful cases:
- sensitivity on whether the current thread is a UI/background thread;
- sensitivity on whether a lock can be acquired (without blocking) or not.
We add a few tests capturing some of the false positives and negatives of the current analysis.
Reviewed By: mityal
Differential Revision: D17907492
fbshipit-source-id: fbce896ac
Summary:
This diff adopts an array length evaluation function that is conservative. It is useful when our
domain cannot express length result precisely.
For example, suppose there is an array pointer `arr_locs` that may point to two arrays `a` and `b`,
and their lengths are `a.length` and `b.length` (symbols), respectively. Using the usual
evaluation, our current domain cannot express `a.length join b.length` (join of two symbolic
values), so it returns top.
In this case, we can use the conservative function intead. It evaluates the length as `[0,
a.length.ub + b.length.ub]`, since we know every array length is positive. The result is not
precise, but better than top.
Reviewed By: ezgicicek
Differential Revision: D17908859
fbshipit-source-id: 7c0b1591b
Summary:
Let's add basic Java support to impurity checker. Since impurity checker relies on pulse, we need to add Java with Pulse callback as well. Pulse doesn't officially support Java yet, but we can enable it for impurity checker for now.
Many Java primitives/operations are not yet modeled (such as creation of new objects, support for collections etc.). Still, it is good to run impurity checker on the existing tests of the purity checker. Also, it is nice to see that we can identify most of the impure functions correctly in the purity dir. There are a lot of FNs though.
Reviewed By: skcho
Differential Revision: D17906237
fbshipit-source-id: 15308d285
Summary:
This diff introduces inequality for the iterator alias target, as we
did for the size target before.
Reviewed By: ezgicicek
Differential Revision: D17879208
fbshipit-source-id: cc2f6a723
Summary:
This diff revises the semantics of hasNext model to add the lengths of
arrays, rather than join them to top.
Reviewed By: ezgicicek
Differential Revision: D17882388
fbshipit-source-id: f5edaedb3
Summary:
[androidx.collection.SimpleArrayMap](https://developer.android.com/reference/androidx/collection/SimpleArrayMap.html) also has `keySet` and `entrySet` methods which make them eligible for inefficient keyset checker. Let's add it.
Title
Reviewed By: skcho
Differential Revision: D17831594
fbshipit-source-id: 32e831e18
Summary:
The current usage has several issues reducing code maintainability and
readability:
1. Null_field_access was misleading: it was used for checking accesing
to arrays as well!
2. But actually, when checking access to array via `length`, we sometimes
pretended it is a field access (hence very tricky code in rendering the
error).
3. "Call receiver consistency" is unclear name, was not obvious that it is all about
calling a method in an object.
Let's also consolidate code.
Reviewed By: artempyanykh
Differential Revision: D17789618
fbshipit-source-id: 9b0f58c9c
Summary: Before, we didn't track litho framework callees on client code which was wrong. Now, we replace this with the following: If the callee is `build()` itself or doesn't contain a `build()` in its summary, then we want to track it in the domain. The former makes sense since we always want to track `build()` methods. The latter also makes sense since such a method could be a setter for a prop (as in the case of `prop1` in `buildPropLithoOK` which we were missing before due to the imprecise heuristic that prevented picking up callees in litho).
Reviewed By: ngorogiannis
Differential Revision: D17810704
fbshipit-source-id: 87d88e921
Summary: As a heuristic, litho library calls on non-litho callers are not tracked. This is very imprecise and results in FPs and FNs as exemplified by newly added tests. Instead, we should check to see if the summary contains a `build()` method as will be done in the next diff. This diff adds these tests and refactors the test code.
Reviewed By: skcho
Differential Revision: D17809536
fbshipit-source-id: 6dff1868c
Summary:
Improve the trace by incorporating the callees and their locations in the call chain (i.e. chain of methods starting from `build()` call)
- extend the domain to contain the callee location
- replace the test results with the new traces
This makes our job much easier to debug FPs in a big codebase.
Reviewed By: skcho
Differential Revision: D17788996
fbshipit-source-id: 31938b5fe