Summary:
Changing the order of the superclasses of a struct exposes a bug in both biabduction and the devirtualiser where a method would be resolved into a still virtual method (an interface method).
The reason is that we don't check whether a super class is an interface before exploring it, and seemingly we assume that there is only one (first) superclass worth exploring. This also ignores multiple inheritance in C++.
To fix this, refactor the resolution to a complete search (not just the first super class!) which ignores Java interface methods. Also moved it to `Tenv` so that both biabduction and the devirtualiser can use it.
Reviewed By: jvillard
Differential Revision: D22357488
fbshipit-source-id: 54b96c1f4
Summary: We recently changed the translation of NSArray literals in a way that we pass a different type of argument to `arrayWithObjects:count`, such that the biabduction model doesn't work anymore. So we remove the model for now.
Reviewed By: skcho
Differential Revision: D22691611
fbshipit-source-id: 03cd940ed
Summary:
Merging global type environments for Java needs some form of non-trivial type definition merging because:
- The frontend is likely non-deterministic, so it can capture the same type differently.
- There are classes that appear with two distinct definitions (usually ordered by inclusion) when one is produced by an ABI-like compilation process (so only public fields/methods would appear for example), and one full version.
- The frontend produces dummy versions (empty definitions), and full ones.
- The location information is variously missing/present.
This diff tries to strike a balance between a full semantic merge (which depends on the frontend/buck integration) and the current code which "merges" by clobbering old definitions with new ones.
One side-effect of this diff is that code cannot expect a special order for supers.
Reviewed By: jvillard
Differential Revision: D22630286
fbshipit-source-id: fc66c7000
Summary:
This diff adds translation of `arrayWithObjects:count:`. In the previous implementation it was
translated as if it was `arrayWithObjects:`, but their function parameters are different.
In this diff, it translates an array literal `NSArray* a = @ [ 2, 3 ];` to
```
n$1=NSNumber.numberWithInt:(2:int)
n$2=NSNumber.numberWithInt:(3:int)
temp[0]:objc_object*=n$1
temp[1]:objc_object*=n$2
n$3=NSArray.arrayWithObjects:count:(temp:objc_object* const [2*8],2:int)
a:NSArray*=n$3
```
where `temp` is an additional local variable declared as array.
See,
https://developer.apple.com/documentation/foundation/nsarray/1460145-arraywithobjectshttps://developer.apple.com/documentation/foundation/nsarray/1460096-arraywithobjects?language=objc
Reviewed By: jvillard
Differential Revision: D22631305
fbshipit-source-id: 5be0a55d4
Summary:
Add a test to the repo to try and detect perf regressions in pulse.
Currently analyzed in ~0.1s. With `--pudge`, takes ~10s.
Sledge does eager normalization and canonicalization when incorporating new facts into formula contexts and the algorithm is polynomial in the number of equalities. This example generates one equality per location in the array => boom. This bypasses the recency model of arrays because the formula needs to be constructed before it can be simplified to get rid of dead variables.
The new arithmetic is not as complete as sledge's algorithm but linear in time. We could use it to simplify the formula *before* passing it to sledge. In fact, that was the original motivation.
Reviewed By: skcho
Differential Revision: D22574366
fbshipit-source-id: e9044ae09
Summary:
When applying function summaries, we are careful not to violate the
summary's assumptions about non-aliasing. For example, the summary we
generate for `foo(x,y) { *x = *y; }` will have `x` and `y` be allocated
to two different `AbstractValue.t` in the heap, representing
disjointness.
However, the current logic is too coarse and also rejects passing the
same pure value to functions that made no assumption about them being
equal or different, eg `goo(int x,int y) { int z = x + y; }`. This is
because the corresponding `AbstractValue.t` are different in the
callee's summary, but are represented by only one same value in callers
such as `goo(i,i)`.
This diff restricts the "don't violate aliasing" condition to only
consider heap-allocated values. This is consistent with separation logic
by the way: we use the implication `x|->- * y|->- |- x≠y`, which is
valid only when both `x` and `y` are both allocated in the heap as in
the left-hand-side of `|-`.
Reviewed By: skcho
Differential Revision: D22574297
fbshipit-source-id: 206a18499
Summary:
This will allow all the analyses to be able to call closures without any special treatment: we transform the call to variables that point to closures into normal function calls. We treat only ObjC blocks at the moment, with C++ lambdas to be done as a next step.
We aimed to achieve certain results in Pulse (see tests: avoid memory leaks and NPEs FPs) while also keeping the biabduction analysis working as before.
We also checked that for the examples analyzed Pulse behaves like the correct semantics of ObjC programs with blocks.
Reviewed By: jvillard
Differential Revision: D22547333
fbshipit-source-id: efe56ed51
Summary: Lambda is called using `operator()`. We need to know the information of captured variables when `operator()` procedure is being analysed. This diff records lambda captured variables in `operator()` procdesc. The complication arises from the fact that procdesc for `operator()` is created before translating lambda expression or during the translation of lambda expression where captured variables are translated. To solve this issue, we update existing `operator()` procdesc attributes with captured variable information when we translate lambda expression.
Reviewed By: jvillard
Differential Revision: D22374495
fbshipit-source-id: 44909adea
Summary:
This diff fixes a bug that eval_arr misses the case when a stack
variable points to an array.
Reviewed By: ezgicicek, roro47
Differential Revision: D22596999
fbshipit-source-id: 7c4a13d01
Summary: Add cost model for most common `NSString` functions in cost analysis
Reviewed By: skcho
Differential Revision: D22433005
fbshipit-source-id: 2f57bbda9
Summary: If a node is unreachable and the cost of the node is Top, we were giving Top cost :( Let's fix it.
Reviewed By: skcho
Differential Revision: D22548269
fbshipit-source-id: d79743669
Summary:
We update the type of captured variables to include information about capture mode (`ByReference` or `ByValue`) both for procdesc attributes and the closure expression.
For lambda: closure expression now contains correct capture mode for capture variables. Procdesc still does not contain information about captured variables which we will address in the next diff.
For objc blocks: at the moment all captured variables have mode `ByReference`. Added TODOs to fix this.
Reviewed By: jvillard
Differential Revision: D22572054
fbshipit-source-id: 4c88678ee
Summary: This diff prints where the cost becomes top by calling `html_debug_new_node_session`. This will print them in the start node of the procedure in html. There are already printing functions in `get_instr_node_cost_record`.
Reviewed By: ezgicicek
Differential Revision: D22547578
fbshipit-source-id: 257e957c0
Summary:
The frontend was hackily adding protocols as superclasses in the tenv, with the implicit encoding that the first element in the list was the actual superclass. This was clearly very fragile.
Protocols are not used in the backend at the moment, so for now we will remove them from the list of superclasses to have more consistency in the tenv.
Reviewed By: ngorogiannis
Differential Revision: D22525078
fbshipit-source-id: 2aef1fab1
Summary: This diff extends the value domain to express multiple markers.
Reviewed By: ngorogiannis
Differential Revision: D22524864
fbshipit-source-id: b8e4af2eb
Summary:
As title
Model `NSString` as `JavaString`.
Since `NSArray` does not contain information about its type of element, we do not use associate string with collection as in Java and C++. In Java, String model is implemented using java collection, and for C++, string model is implemented using vector.
So instead, we use existing `JavaString` model.
Reviewed By: skcho
Differential Revision: D22431949
fbshipit-source-id: 7cdde1ad7
Summary:
In order to allow implementations of the single Fol interface using
multiple backend first-order logic solvers, add explicit definitions
of terms and formulas in the Fol module, and implement Context in
terms of them.
The Fol interface supports freely mixing Terms and Formulas, in
particular there is `Term.ite : cnd:Formula.t -> thn:Term.t ->
els:Term.t -> Term.t` which allows Formulas to appear in Terms. The
Fol implementation performs enough normalization to enable using an
internal representation of terms that is strictly partitioned into
"theory terms" and "formulas", which are stratified below "conditional
terms" and then below "general terms". This partitioning and
stratification enables using backend solvers that do not support
mixing formulas in terms.
Reviewed By: jvillard
Differential Revision: D22170506
fbshipit-source-id: a014ee7d7
Summary: To avoid NULLPTR_DEREFERENCE false positives we want to model some functions as returning non-null. A new flag --pulse-model-return-nonnull allows us to provide a list of such functions.
Reviewed By: ezgicicek
Differential Revision: D22431564
fbshipit-source-id: 9944c7382
Summary: Make the module interface safe wrt closing the classpath channel when done, plus reducing the exposed API.
Reviewed By: skcho
Differential Revision: D22411685
fbshipit-source-id: 11316c577
Summary: `addAll` adds elements one by one and hence takes linear time. We didn't have a model for this and considered it O(1).
Reviewed By: skcho
Differential Revision: D22375157
fbshipit-source-id: 65b82bfae
Summary: This diff prevents printing line numbers of loop in the trace description, which helps to keep the same descriptions even when the line number of a function is changed in tests.
Reviewed By: ezgicicek
Differential Revision: D22375584
fbshipit-source-id: 676d1a7cc
Summary:
This one is observed to be more memory efficient. Intuitively, maps need
to be re-allocated more often than lists for balancing. In pulse, we'll
often only ever add new values, in increasing order (when they are fresh
variables created as we symbolically execute the program), which pushes
maps into their worst-case allocation pattern. At least I suspect that's
what happens. With lists, this case is handled much better as lists are
not re-allocated when adding elements.
This is somewhat confirmed by benchmarking and observing GC stats.
Reviewed By: skcho
Differential Revision: D22140908
fbshipit-source-id: 29815112f
Summary:
Messed up the aggregation of GC stats in the previous commit.
It's cleaner to have GC stats (and analysis time) outside of
BackendStats as the rules for computing them is different than the rest,
eg notice how "analysis time" needed to be corrected at the end of the
run, and similarly for GC stats. Thus, refactor this part.
Also output different aggregations of GC stats: +/max/average.
Reviewed By: skcho
Differential Revision: D22332496
fbshipit-source-id: eefd9dd72
Summary:
Keyword `thread_local` in cpp allows us to create a variable with thread storage duration, meaning that the object's lifetime begins when the thread begins and ends when the thread ends.
We get `NULLPTR_DEREFERENCE` false positive for `thread_local` variable since we reallocate it in the `VariableLifetimeBegins` metadata instruction and we do not see further updates to the variable. To solve the issue we special case `VariableLifetimeBegins` instruction for global variables.
Reviewed By: jvillard
Differential Revision: D22284135
fbshipit-source-id: 13c14ef90
Summary: Create test for the most common unmodeled function in inferbo that acts as control variable.
Reviewed By: skcho
Differential Revision: D22331168
fbshipit-source-id: 1913682db
Summary: Add objc test for customized class and blocks. Mostly sanity test.
Reviewed By: ezgicicek
Differential Revision: D22043918
fbshipit-source-id: 917deeea7
Summary:
This diff adds a model of `File.listFiles` as returning an array with
a symbolic length.
Reviewed By: ezgicicek
Differential Revision: D22332258
fbshipit-source-id: 6ca593b8b
Summary: This diff adds support for `com.facebook.litho.sections.Section` which mimics the behavior for `com.facebook.litho.Component`.
Reviewed By: skcho
Differential Revision: D22309039
fbshipit-source-id: 3510441a8
Summary:
Following from previous diff.
**Idea** - 80% of functions with Top cost are caused by calling top-costed callees, i.e. callee's Top cost is simply propagated to its transitive callers, so the aim is to investigate such root callees along with the number of their transitive callers.
Consider the following code
```
void bar1() {
// top cost function
}
void bar2() {
// another top cost function
}
void baz(){
// baz have top cost because of bar
bar1();
}
void foo() {
// goo have top cost because of baz
baz();
bar2()
}
```
Clearly, the root cause of the foo being top cost is `bar1` and `bar2`.
1. When we are analyzing `baz`, we know that it calls `bar1`, which is top cost, so we record that `baz = { T, bar1 } `.
2. Now, say we are analyzing foo.
When we analyze the call to `baz`, we found out that the top cost of `baz` is caused by `bar1`, so we record `foo = { T, bar1 }`.
When we analyze the call to `bar2`, we know that `bar2` is top cost, but since at this stage we only want to deal with the first top cost function we met, so we ignore it.
Since we are keeping track of top cost function by examining the `Call` instruction, we would expect to see two log of `bar1` in the result. The test plan confirms it.
Reviewed By: ezgicicek
Differential Revision: D22231457
fbshipit-source-id: 45d48e4a7
Summary:
New `debug` command takes over from `explore` the `--procedures`, `--source-files` functionality and adds `--global-tenv` for printing the global type environment.
Also, uncrustify printing of type environments.
Reviewed By: jvillard
Differential Revision: D22284807
fbshipit-source-id: 9c6fb0c7a
Summary:
Log stats obtained via `Gc.stat ()` for various phases:
- capture (doesn't include child infer processes created by the build
system)
- analysis
- worker processes of the analysis, aggregated
- reporting phase
- total GC stats for the main infer process
Reviewed By: jberdine
Differential Revision: D22140131
fbshipit-source-id: b0ee39559