Summary:
This diff adds trace of closure in autoreleasepool checker. We introduce a symbolic trace value for closure variable.
* It is added to the trace when closure variable is called
* and is substituted to concrete one when actual closure is given later.
Reviewed By: ezgicicek
Differential Revision: D25025883
fbshipit-source-id: a6e246be7
Summary: At a function call, an access performed by a callee must be processed in various ways before it's added to the accesses of the caller, and several of these steps may throw away the access. Previously, this was done by effectively doing a bit of transformation, creating a new set of accesses, then folding over that to add to the caller's. This is inefficient and somewhat confusing, as this can be done with one fold and a sequence of `Option.map`s.
Reviewed By: skcho
Differential Revision: D24885117
fbshipit-source-id: 4ab61eab9
Summary:
Existing closure substitution only supported direct block calls to formals.
The following didn't work since the domain was only keeping track of loads/calls from formals, but didn't support stores.
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
This diff adds support for assigning a block to a local variable so that we can specialize the above example.
We now have a pair domain
- existing mapping from ids to block vars
- a new mapping from mangled to block specializations
the latter allows us to update the mapping in local block assignment (via store).
Reviewed By: skcho
Differential Revision: D25030234
fbshipit-source-id: 3f172341c
Summary:
Specialized closure substitution was broken for conditionals:
```
void foo(dispatch_block_t block1){
if (x){
block1(); // not replaced with specialized implementation
}
}
```
The problem was that when substituting function calls, it only used memory state at the exit node, rather than at each program point.
We could solve this by
- reverting the domain change in D24418560 (c47911359a), i.e. collecting all possible mappings conservatively (e.g. switch the domain back to `Map`)
- pass the `invariant_map` for substitutions at each program point.
We go with the second option here.
The closure substitution is still somewhat broken as exemplified by the following example:
```
void foo(dispatch_block_t block1){
dispatch_block_t local_block = block1;
local_block(); // we don't substitute the call here
}
```
Reviewed By: skcho
Differential Revision: D24993962
fbshipit-source-id: ebadddb58
Summary:
A minor but persistent annoyance: named argument of a non-ambiguous type
of almost the same name. No.
Reviewed By: skcho
Differential Revision: D24991673
fbshipit-source-id: c806f9cec
Summary:
This was left as a TODO before: where to place calls to destructors for
C++ temporaries that are only conditionally creating when evaluating an
expression. This can happen inside the branches of a conditional
operation `b?e:f` or in potentially-short-circuited conditions on the
righ-hand side of `&&` and `||` operators.
Following the compilation scheme of clang (observed by looking at the
generated LLVM bitcode), we instrument the program with "marker"
variables, so that for instance `X x = true?X():y;` becomes (following
the execution on the true branch):
```
marker1 = 0; // initialize all markers to 0
PRUNE(true) // entering true branch
X::X(&temporary); // create temporary...
marker1 = 1; // ...triggers setting its marker to 1
X::X(&x, &temporary); // finish expression
if (marker1) {
X::~X(&temporary); // conditionally destroy the temporary
}
```
In this diff, you'll find code for:
- associating markers to temporaries that need them
- code to initialize markers to 0 before full-expressions
- code to conditionally destroy temporaries based on the values of the
markers once the full-expression has finished evaluating
Reviewed By: da319
Differential Revision: D24954070
fbshipit-source-id: cf15df7f7
Summary:
The translation of `switch` cases needs to insert nodes around the
translation of each `case` sub-statement, so we need to force node
creation in these sub-statements so the nodes around it can be connected
to the translation of the sub-statements.
Also added more logging I found useful when debugging that.
Reviewed By: da319
Differential Revision: D24991455
fbshipit-source-id: d3a622142
Summary:
The current source parser is based on ocamllex only.
In order to track field declaration locations, we propose a
new parser using ocamllex/menhir. This is a more ambitious
project that closely follows the official Java syntax.
Reviewed By: jvillard
Differential Revision: D24858280
fbshipit-source-id: 22d6766e5
Summary:
In all other places, we index params from 0, but accidentally recorded
the wrong number in json. It was because of the confusion between index
and user-visible param position that we show for the user message.
This diff fixes it: now we use 0-based indices internally (but of course still
report 1-based ones in the error message).
Reviewed By: artempyanykh
Differential Revision: D24916878
fbshipit-source-id: 45532c5ff
Summary:
Sometimes there are annotations that don't correspond to the user facing
code.
Previously we would fail, now process them gracefully.
Reviewed By: artempyanykh
Differential Revision: D24890895
fbshipit-source-id: e64a866ec
Summary:
Split the translation of return more aggressively between:
1. the instruction that has to happen before the translation of the sub-expr
2. the sub-expr
3. the instruction that has to happen after the sub-expr
This is needed for the next diff which creates potentially large CFGs in
(2).
Reviewed By: da319
Differential Revision: D24954071
fbshipit-source-id: a7e7e2527
Summary: Model `folly::Optional::get_pointer` which returns an address to a value if exists or `nullptr` if empty.
Reviewed By: jvillard
Differential Revision: D24935677
fbshipit-source-id: 9d990fe07
Summary:
We deliberately stopped as soon as an error was detected when applying a
function call. This is not good as other pre/posts of the function may
apply cleanly, which would allow us to cover more behaviours of the
code.
Went on a bit of a refactoring tangeant while fixing this, to clarify
the `Ok None`/`Ok Some _`/`Error _` datatype returned by PulseInterproc.
Now we report errors as soon as we find them during function calls but
continue accumulating specs afterwards.
Reviewed By: da319
Differential Revision: D24888768
fbshipit-source-id: d5f2c29d7
Summary:
In the next diff we need to know when a destructor is needed for sure
before calling some of these auxiliary methods.
Reviewed By: da319
Differential Revision: D24832078
fbshipit-source-id: 6d4e17de2
Summary:
Communicate new facts from the arithmetic domain to the memory domain to
detect contradictions between the two.
Reviewed By: jberdine
Differential Revision: D24832079
fbshipit-source-id: 2caf8e9af
Summary:
This is several inter-connected changes together to keep the tests
happy.
The ConditionalOperator `b?t:e` is translated by first creating a
placeholder variable to temporarily store the result of the evaluation
in each branch, then the real thing we want to assign to reads that
variable. But, there are situations where that changes the semantics of
the expression, namely when the value created is a struct on the stack
(eg, a C++ temporary). This is because in SIL we cannot assign the
*address* of a program variable, only its contents, so by the time we're
out of the conditional operator we cannot set the struct value
correctly anymore: we can only set its content, which we did, but that
results in a "shifted" struct value that is one dereference away from
where it should be.
So a batch of changes concern `conditionalOperator_trans`:
- instead of systematically creating a temporary for the conditional,
use the `trans_state.var_exp_typ` provided from above if available
when translating `ConditionalOperator`
- don't even set anything if that variable was already initialized by
merely translating the branch expression, eg when it's a constructor
- fix long-standing TODO to propagate these initialization facts
accurately for ConditionalOperator (used by `init_expr_trans` to also
figure out if it should insert a store to the variable being
initialised or not)
The rest of the changes adapt some relevant other constructs to deal
with conditionalOperator properly now that it can set the current
variable itself, instead of storing stuff inside a temp variable. This
change was a problem because some constructs, eg a variable declaration,
will insert nodes that set up the variable before calling its
initialization, and now the initialization happens *before* that setup,
in the translation of the inner conditional operator, which naturally
creates nodes above the current one.
- add a generic helper to force a sequential order between two
translation results, forcing node creation if necessary
- use that in `init_expr_trans` and `cxxNewExpr_trans`
- adjust many places where `var_exp_typ` was incorrectly not reset when translating sub-expressions
The sequentiality business creates more nodes when used, and the
conditionalOperator business uses fewer temporary variables, so the
frontend results change quite a bit.
Note that biabduction tests were invaluable in debugging this. There
could be other constructs to adjust similarly to cxxNewExpr that were
not covered by the tests though.
Added tests in pulse that exercises the previous bug.
Reviewed By: da319
Differential Revision: D24796282
fbshipit-source-id: 0790c8d17
Summary: In cpp, lambda's operator() name includes line and column numbers which were not ignore in proc name when computing bug hash.
Reviewed By: ngorogiannis
Differential Revision: D24890545
fbshipit-source-id: 95e6735f3
Summary:
As per title, plus de-quadratic-ify substitution of actuals into formals.
Also, fix a bug in treatment of callee summaries where the caller lock state was updated first and then used to process accesses in the callee (so should only take into account the original caller state).
Reviewed By: jvillard
Differential Revision: D24857922
fbshipit-source-id: 07ce6999c
Summary:
Otherwise this gets serialized in an unconventional way; `nullable` serializes it as a
standard `null` or string value.
Reviewed By: artempyanykh
Differential Revision: D24827593
fbshipit-source-id: a5a9afc80
Summary: As per title, plus minor improvements in interfaces and a couple of FIXMEs.
Reviewed By: skcho
Differential Revision: D24836125
fbshipit-source-id: f7a4dc196
Summary:
The starvation domain keeps a domain element per distinct pair of lock object and source location. This was used to counteract the imprecision of implicit Quandary-style traces. Starvation has used explicit traces for a long time now, so keeping all these elements is expensive (in fact, in some cases exponential) and of no value. Now, lock object identity is the only distinguishing feature of a domain element.
Also, fix some pretty printing for debugging purposes.
Reviewed By: jvillard
Differential Revision: D24829306
fbshipit-source-id: 22e12f9c1
Summary:
This diff fixes `degree_with_term` to ignore function pointer symbols. `degree_with_term` does
* calculate the degree
* simplify the polynomial only for printing them to users
thus, there is no problem to ignore the function pointer symbols always, ie which does not affect semantics or summary values.
Reviewed By: ezgicicek
Differential Revision: D24596479
fbshipit-source-id: 1e29d2de0
Summary: We recently introduced a more precise model for constructing an optional from a value by making a shallow copy. However, this introduced Use After Delete false positives. For now, we go back to a less precise model by creating a fresh value. A proper model would be to either make a deep copy or call the copy constructor for a value. We will address this in the following diff.
Reviewed By: jvillard
Differential Revision: D24826749
fbshipit-source-id: 3e5e4edeb
Summary: Refactor `folly::Optional` models to make them easier to reuse for `std::optional`
Reviewed By: jvillard
Differential Revision: D24760053
fbshipit-source-id: f665e84c8
Summary:
- log trans_state for each instruction
- create boxes to indent logs
- hunt down "@." that would prematurely close the boxes
- improve messages
Reviewed By: ngorogiannis
Differential Revision: D24794798
fbshipit-source-id: 80d51a8c5
Summary:
They are indirectly encoded in "procedure" field already, but:
1/ To extract name and params one needs to parse the procedure on the
client side
2/ We already have class, package, and field: method and its params is a
consistent change
Reviewed By: artempyanykh
Differential Revision: D24731064
fbshipit-source-id: fae478e7e
Summary:
If the issue one of:
- Field Not Nullable
- Field Not Initialized
- Field Overannotated,
we record field_name to .json result.
NoTE: Design choice for representation. For Field Not Initialized and Field Overannotated
this is always internal (relative to the class) field, but for Field Not
Nullable it can be either internal or external. We could have a
structured output, or always output full name. I preferred to output
short name for convenience of the main usacase I am anticipating.
NOTE: not to be confused with the case where the field is nullable but
we e.g. try to dereference it. This is indirectly related to the issue
(can be several such fields for starters) and if we one day output it,
it will be provided in a separate way (similarly to how we output
nullable_methods).
Reviewed By: artempyanykh
Differential Revision: D24730320
fbshipit-source-id: c995ec221
Summary:
We never tested params dependent on things (tested only things dependend
on params).
Reviewed By: artempyanykh
Differential Revision: D24726858
fbshipit-source-id: a0861cfc3
Summary:
Since we report issues types without a prefix (e.g. ERADICATE_) and
with spaces we should also allow for prefix-less issue types in
SuppressLint, so both should work
- SuppressLint("eradicate-field-not-initialized")
- SuppressLint("Field Not Initialized")
Reviewed By: jvillard
Differential Revision: D24760341
fbshipit-source-id: 1590cf6d0
Summary:
There is a feature in Nullsafe that is interfering with "annotation
graph" feature. Because of this we would not detect provisional
violations for misuses of params of equals() (They will be recorded
as user facing rather than provisional issues).
This diff turns this feature off for annotation graph mode.
Reviewed By: artempyanykh
Differential Revision: D24726655
fbshipit-source-id: 4b7577667
Summary:
Virtual "this" invisible param exists in the annotated signature, but
does not exist in some other places, which causes a lot of annoyance in
different places.
This diff does not intend to solve all this, but makes one step forward.
1/ AnnotatedNullability now explicitly distincts normal params and
VirtualThis.
2/ AnnotatedSignature accounts for this via: a) not having redundant
fake annotation points b) having corrent param indices (those were off
by 1 in non-virtul methods).
Reviewed By: artempyanykh
Differential Revision: D24726480
fbshipit-source-id: fdb8bb0fb
Summary: This is a complex enough feature so iterating on it in a safe manner will be useful.
Reviewed By: artempyanykh
Differential Revision: D24725406
fbshipit-source-id: 81b247143
Summary: `folly::Optional::value()` returns a reference, hence an error was shown when the actual value was being accessed. Since `value()` throws an exception in case of `folly::none`, we want to show the error message at the call site of `value()`. We do this by dereferencing the result of `value()` in the model.
Reviewed By: jvillard
Differential Revision: D24702875
fbshipit-source-id: ca9f30349
Summary:
This is needed for `infer reportdiff` to report issues that are in files
in the same workspace but were captured under a different project root,
but it's also legit to do so in general as "under project root" really
means "this file belongs to the project under analysis".
Reviewed By: martintrojer
Differential Revision: D24755592
fbshipit-source-id: fa8fab127
Summary: In cpp, lambda's `operator()` name includes line and column numbers which were not ignore when computing bug hash.
Reviewed By: jberdine
Differential Revision: D24649125
fbshipit-source-id: 7a235fd3e
Summary:
The problem in Reporting.ml:log_issue_from_summary is that it merely
checks the presence of `SuppressLint` annotation on method's body to
decide whether to log or not the issue. This means that regardless of
issue types specified in `SuppressLint`, all issues on such method will
get blocked.
Here we fix that.
Reviewed By: ngorogiannis, mityal
Differential Revision: D24726604
fbshipit-source-id: c9cae3833
Summary:
This diff glues the previous work together.
The ClassLevelAnalysis finds list of provisional violation, builds the
graph based on them, and outputs this graph as a separate issue.
Reviewed By: artempyanykh
Differential Revision: D24682802
fbshipit-source-id: 8174da91a
Summary:
This change is what makes the annotation graph a _graph_.
Now we can detect places when adding one annotation causes an issue that can be
fixed by adding another annotation.
Reviewed By: artempyanykh
Differential Revision: D24650979
fbshipit-source-id: b8452b822
Summary:
In the previous diff we introduced
AnnotatedNullability.is_annotatable_as_nullable method.
Lets now use it in all places where we issue ProvisionallyNullable
annotation.
This will lead to more precise nullability graph (without attempts to
annotate primitives).
Reviewed By: artempyanykh
Differential Revision: D24650951
fbshipit-source-id: 8ea0bb97d
Summary:
This change is on par with the logic that we already have for methods
and method params.
In this logic, we explicitly distinct class under analysis & the
external class when fetching AnnotatedNullability (we needed this to support
various Nullsafe modes as well).
Here we use the same approach for fields.
NOTE: this is not intended to work with nested classes just yet.
Reviewed By: artempyanykh
Differential Revision: D24650934
fbshipit-source-id: f555bcc3f
Summary:
:
New flag (equivalent to --clang-biniou-file) to pass in AST to
frontend.
Passing in json files has the big advantage of emitting line numbers
on frontend AST errors.
Reviewed By: jvillard
Differential Revision: D23814358
fbshipit-source-id: 0ad0452ff
Summary: Cleanup `Typ` by moving all constant types to `StdTyp`. Also remove `Typ.typ` as it's just `Typ.t` now.
Reviewed By: jberdine
Differential Revision: D24620397
fbshipit-source-id: 4764f87ef
Summary:
Now we have all pieces of information needed to build the annotation
graph (to be finished as a follow up).
This diff utilizes the same approach that is done when we calculate node
promotions:
1/ AssignmentRule and DereferenceRule know that some issues are related
to provisional annotations and expose this information.
2/ ClassLevelAnalysis collects all the issues, extract violations from
them, and outputs them.
Next step will be changing the output to the meta-issue printing the
annotation graph in json.
Reviewed By: artempyanykh
Differential Revision: D24621410
fbshipit-source-id: c240da3aa
Summary:
In the previous diffs, we introduced
AnnotatedNullability.ProvisionallyNullable. This is a symmetric change
introducing the same for Nullability.t
Note how ProvisionallyNullable is made `is_nonnullish`. This means that
if we run nullsafe in --nullsafe-annotation-graph mode, the following
should start happening:
1/ A lot of new issues will be recorded - every time a local method or
non-null param is dereferenced etc.
2/ But all of those new issues will be filtered by corresponding Rules
(e.g. AssignmentRule). So those issues will not manifest to user-visible
errors. However, now we are capable to see and analyse those hidden
issues in ClassLevelAnalysis. This is exactly what we will be doing in
follow up diffs.
Note that the place of ProvisionallyNullable in the hierarchy is
somewhat arbitraily. It is important to make it less pri then Nullable -
but when it comes to comparing with other "nonnullish" modes - we can
tweak the behavior later on.
Reviewed By: artempyanykh
Differential Revision: D24621411
fbshipit-source-id: 3b99e5e55
Summary:
The fact that we did not preserve all joins (and used only the first
one) was a shorcut since the origin is used only for reporting, and
reporting the info about left joinee is enough for practical purposes.
Hovewer, to support annotation graph, we need all joinees.
This diff is a no-op since we use only the first one currently.
Reviewed By: artempyanykh
Differential Revision: D24621412
fbshipit-source-id: 8fe1174e7
Summary:
This was perhaps needed at some point, but now there is no need in this.
See the next diff that refactors InferredNullability so we store all
joinee results there.
Reviewed By: artempyanykh
Differential Revision: D24621413
fbshipit-source-id: d6f406b87
Summary:
This is the basic building blocks for the future annotation graph. This
diff introduces the first abstraction to be iterated on.
Reviewed By: artempyanykh
Differential Revision: D24621414
fbshipit-source-id: 19acdf216
Summary:
Before we were creating a fresh internal value when we were constructing `folly::Optional`. This diff models `folly::Optional` constructor more precisely by copying the given value.
There was also a missing dereference in the model of `value_or`
Reviewed By: jvillard
Differential Revision: D24621016
fbshipit-source-id: c86d3c157
Summary:
The global type environment is loaded in all analysis workers, so it is best to minimise its memory footprint. `MaxSharing` does a great job at this but
- it is very slow
- it may be involved in observed segfaults, since it uses potentially unsafe operations
This diff replaces `MaxSharing` with a post-facto poor-man's hashcons pass through the parts involved in a java type environment. It is not as efficient as `MaxSharing` at reducing footprint, but it's much faster.
Reviewed By: jberdine
Differential Revision: D24590400
fbshipit-source-id: c37100325
Summary:
Namely `enumerateObjectsUsingBlock` which takes a collection and a block as an arg and the iterates over the collection and applies the block to each element of the collection.
This is a common way to iterate over Objc collections, so let's add a model for it.
Reviewed By: skcho
Differential Revision: D24604590
fbshipit-source-id: 1ceeb4b40
Summary:
The autorelease pool size checker increases the size when an non-arc-compiled function calls
arc-compiled function. However, if the callee is declared in arc-compiled code and its body is not
actually captured or compiled, we should not increase the autorelease pool size.
Reviewed By: ezgicicek
Differential Revision: D24569661
fbshipit-source-id: 2fd707096
Summary:
This diff revises the translation of message expression's arguments in ObjC frontend. In the
frontend, it massages the arguments when calling a static method, so the class or object value is
not given to the static method as the first parameter.
The problem is that it used a raise-exception-and-catch way to detect where we remove the first
parameter. This way of using an exception is not only hard to understand, but also incorrectly
removed the first parameter, with breaking abstract semantics sometimes. (See the added test.) This diff
avoids using the exception.
Reviewed By: jvillard
Differential Revision: D24565513
fbshipit-source-id: 0a84ca394
Summary:
The `make clean` did not remove objects and dot files, so
```
infer/tests/codetoanalyze/objc/frontend$ make test
infer/tests/codetoanalyze/objc/frontend$ make clean
infer/tests/codetoanalyze/objc/frontend$ make test
```
the second `make test` did nothing. This diff adds additional regular
expressions to clean all objects and dot files generated.
Reviewed By: ngorogiannis
Differential Revision: D24566169
fbshipit-source-id: b8c50c922
Summary: This diff fixes on-demand symbolic value generation of a class that inherits NSEnumerator.
Reviewed By: ngorogiannis
Differential Revision: D24504955
fbshipit-source-id: bcb20e8aa
Summary:
This diff replaces overridden method calls in ObjC when possible, ie the first parameter of the
method has a sub-class type of the method's class. For example,
when `MyEnumerator` is a sub-class of `NSEnumerator` and there is overridden `nextObject`,
```
[my_enumerator nextObject]
```
in Sil, it was translated to like
```
NSEnumerator.nextObject(my_enumerator : MyEnumerator*)
```
and the analyzer missed the overridden method. This diff replaces the function call to
```
MyEnumerator.nextObject(my_enumerator : MyEnumerator*)
```
Reviewed By: ezgicicek
Differential Revision: D24477290
fbshipit-source-id: 6842a76f8
Summary: Model `folly::Optional::value_or(default)` to return value if not-empty and `default` if empty.
Reviewed By: jvillard
Differential Revision: D24539456
fbshipit-source-id: cc9e176cc
Summary: Can be useful, especially to dump all the summaries as json.
Reviewed By: skcho
Differential Revision: D24504253
fbshipit-source-id: 845e7d657
Summary:
Emit the crucial parts of Pulse summaries as json to enable
post-processing by external tools. Stop somewhat arbitrarily at some
datatypes that are just emitted as "opaque" values.
For example:
```
$ infer debug --procedures --procedures-summary-json --select 0
[[["pulse",[["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v7",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v7","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}],["ContinueProgram",{"post":{"heap":[["v3",[[["Dereference"],["v4","_"]]]],["v8",[[["Dereference"],["v3","_"]]]]],"stack":[[["ProgramVar",{"plain":"return","mangled":null}],["v8","_"]]],"attrs":"_"},"pre":{"heap":[],"stack":[],"attrs":"_"},"skipped_calls":"_","path_condition":"_"}]]]]]
```
Reviewed By: ezgicicek
Differential Revision: D24503387
fbshipit-source-id: 9bd08e93b
Summary:
Output summaries in json format, so that other tools can exploit the
results of infer without having to be written inside infer itself.
For now the json for a summary is just one line saying "opaque" :)
Set up the infra to generate (yo)json automatically using
ppx_yojson_conv. See it in action in the next diff.
Reviewed By: ezgicicek
Differential Revision: D24503343
fbshipit-source-id: e24a2fff3
Summary:
- output the "menu" of the interactive mode on stderr instead of stdout
so that we can pipe the results, eg
`infer debug --procedures --procedures-summary | cat`
This will be more useful when we add an option to output json, as
otherwise the menu pollutes the json.
- Allow "--select" to work for infer-debug too:
`infer debug --procedures --procedures-summary --select 0`
Reviewed By: da319
Differential Revision: D24503301
fbshipit-source-id: d7fb4b713
Summary:
This diff revises `nextObject` model to handle multiple symbolic enumerators. Instead joining the
symbolic offsets of them, which sometimes introduces top, it sums the offsets. This is a sound &
conservative semantics since they are all non-negative integers.
Reviewed By: ezgicicek
Differential Revision: D24474513
fbshipit-source-id: 6707aa907
Summary:
This diff revises memory model of enumerator in ObjC to enable passing it as a parameter.
The cost checker was not able to analyze a function precisely when it gets an enumerator as a
parameter because the offset of an enumerator was available only when the analyzer knew the correct
relation between the enumerator and an array.
This diff simplifies the enumerator to have a similar value with `array->elements`, so its offset can
be taken without the relation between enumerator and array to get them.
Reviewed By: ezgicicek
Differential Revision: D24446574
fbshipit-source-id: 27cdc051e
Summary:
This diff adds an option hiding function pointers in costs to users: `cost-suppress-func-ptr` is
true by default.
Reviewed By: ezgicicek
Differential Revision: D24448212
fbshipit-source-id: 88f6b5ea1
Summary:
Before this diff we would just propagate the callee abstract state,
which doesn't make sense in the caller. We could just remove the state
from AbortProgram altogether as Pulse itself doesn't use it, but for now
let's at least make sure it's accurate.
Also needed for upcoming hackathon that will start from Pulse error
specs to try to produce tests :)
Reviewed By: ezgicicek
Differential Revision: D24448073
fbshipit-source-id: 9100b3f79
Summary:
Detect when changed files paths are trying to escape the project root
and try to guess their relative project root (which has to be a parent
of the current one).
This is perhaps a bit too hacky but it works for the case we need it to.
Reviewed By: martintrojer
Differential Revision: D24425427
fbshipit-source-id: 018651740
Summary:
Sometimes you need several project roots (eww), this makes paths make
sense even in that case.
Reviewed By: martintrojer
Differential Revision: D24336244
fbshipit-source-id: f087d533a
Summary:
Don't wait until pre-analysis has completed before updating the task
bar with the current procedure being analysed.
Reviewed By: skcho
Differential Revision: D24418609
fbshipit-source-id: afedaf687
Summary:
Move from Map to SafeInvertedMap:
- joining two branches where only one branch had the variable set to a
given closure or type should *not* keep that information around: now
we correctly get Top instead
- the "Safe" part is an optimisation that doesn't store Top values in
the map, which is important as most values are not closures so we
don't care about storing the fact that we don't know anything about
them
Reviewed By: ngorogiannis
Differential Revision: D24418560
fbshipit-source-id: 0ac701502
Summary:
This diff adds a model for NSFileManager.contentsOfDirectoryAtURL as returning a constant-length
collection.
The analyzer cannot know files in a directory. We have some options to handle such unknown data.
1. Use `Unknown` value, ie `top`
2. Use a symbolic value
3. Use a constant value
We had been used the first option. An upside of this is that the analyzer can remain as sound.
However, a downside of this is the top value can be propagated to other procedures, making their
costs top, thus we may miss some cost changes of them.
The second option is to introduce a symbolic value, ie. that for the number of files. A problem is
that the symbolic value will never be concretized. As a result, the symbol can be propagated to
other procedures, increasing the coefficient of the complexity or making top costs. Note that handling multiple
symbols is somewhat limited in Inferbo's interval domain.
The last option is to introduce a constant value. I think this is the best approach we can take among above.
Even though we may have FNs when there are a lot of files in a directory, we cannot reason or expect about
that at the analysis time anyway.
Reviewed By: ezgicicek
Differential Revision: D24418099
fbshipit-source-id: bf8cf3538
Summary: This diff adds closure symbols to operation/allocation costs, when function pointer is called.
Reviewed By: ezgicicek
Differential Revision: D24308550
fbshipit-source-id: 6c5889d41
Summary:
This diff extended the polynomial domain to include symbols for closure calls.
When the closure symbol is added to the polynomial? Unknown closure is called inside a function
like,
```
foo() {
self->closure_field();
}
```
Thus, the cost of `foo` becomes `|self->flosure_field|`, rather than unknown. (Note that this
semantics is added only for autoreleasepool size at the moment.)
When the symbol is instantiated? `foo` is called with correct closure contexts.
```
goo() {
self->closure_field = ^(){ ... };
foo();
}
```
The summary of `goo` will have instantiated summary of the closure.
Reviewed By: ezgicicek
Differential Revision: D23992590
fbshipit-source-id: d1d228403
Summary:
Another step in the refactoring of the starvation domain:
- Main purpose is to mediate access to the set of critical pairs in a summary through a fold function (`fold_critical_pairs_of_summary`) and not through direct field access to that set. This will allow eliding storage of critical pairs entirely and dynamically generating those when folding.
- Remove optional arguments as much as possible, as this led to unused arguments not being caught.
- Helper functions distributed more logically among modules.
Reviewed By: skcho
Differential Revision: D24275399
fbshipit-source-id: d23123a48
Summary: As part of a refactor, push `thread` from the enclosing type (`CriticalPairElement`) into `Event.t`.
Reviewed By: jvillard
Differential Revision: D24161709
fbshipit-source-id: bd812f3fd
Summary:
In ObjC, `NSObject.copy` returns the object returned by `copyWithZone:` on the given class. This method must be implemented if the class complies with `NSCopying` protocol. Since we don't have access to `NSObject`'s code, to follow calls into `copyWithZone:`, we replace such `copy` calls with calls to `copyWithZone:` when a) such a method exists in the class and b) the class conforms to `NSCopying` protocol.
This is done in the preanalysis because
- we need to know if there is a `copyWithZone:` method in the class.
- so that other analyses also benefit (as opposed to doing this in cost and inferbo models).
Note that `NSObject` doesn't itself conform to `NSCopying` but all its subclasses must confrom to the protocol and support the same behavior as above.
https://developer.apple.com/documentation/objectivec/nsobject/1418807-copy
Similarly for `mutableCopy` -> `mutableCopyWithZone:` for classes implementing `NSMutableCopying` protocol.
Reviewed By: skcho
Differential Revision: D24218102
fbshipit-source-id: 42900760e
Summary:
`NonBlocking` methods have starvation errors silenced (but not deadlock ones). This is implemented by summarising as usual and then filtering out such events when the summary is finalised, if the method is annotated as such.
It's better to not record the events in the first place.
Reviewed By: ezgicicek
Differential Revision: D24237465
fbshipit-source-id: 1b24a26f0
Summary: This will be needed in the next diff so that we can find all classes that conform to `NSCopying` protocol.
Reviewed By: skcho
Differential Revision: D24216549
fbshipit-source-id: 297b527a6
Summary:
- rename the checker "Uninitialized Variable" to "Uninitialized Value"
as this is the name of the issue type
- delete timestamp XML comment from the man pages to avoid future git
churn when updating the website
- counting is hard
Reviewed By: martintrojer
Differential Revision: D24219165
fbshipit-source-id: cf3057373
Summary:
I wanted to change the default to "callgraph" but that created issues in
our tests, introducing flaky behaviours and even a failure due to trying
to run the pre-analysis multiple times (not 100% sure it was related).
Instead, document the various options and put the option in the analysis
manual so users can choose.
Reviewed By: martintrojer
Differential Revision: D24193751
fbshipit-source-id: 4b7c33a79
Summary: Model it similar to `NSArray.initWithArray` as copying from the given dictionary elements. Removes a FP as expected.
Reviewed By: ngorogiannis
Differential Revision: D24136868
fbshipit-source-id: ed31c3c8f
Summary:
This diff is a preparation to extend the polynomial domain. What I will do in the following diff is
to extend the polynomial domain to include a symbol representing "cost of function pointer call".
Then, the symbol will be instantiated to another polynomial cost at call sites.
The polynomial domain is defined as a sum of
* non negative insteger
* map from `bound` to polynomail domain
```
type poly = {const: NonNegativeInt.t; terms: poly M.t}
```
The problem is all `bound`s are adderessed as integer values, rather than polynomial, thus it
does not fit to include the symbol for the "cost of function pointer call".
In this diff, it introduces a `Key` module for the map `M`, the type of `Key.t` is the same to the previous key of `M`. The actual extension of the key type will be in the following diff.
Reviewed By: ezgicicek
Differential Revision: D23991401
fbshipit-source-id: cec64347f
Summary: Subsequent diff will push information down into `Event.t` so as preparation, turn all variant values into records.
Reviewed By: jvillard
Differential Revision: D24115201
fbshipit-source-id: d2126dd49
Summary:
`std::lock(x,y,z)` simultaneously acquires locks `x,y,z` in a deadlock free manner (essentially an unspecified fixed order).
Starvation currently deals with it by exploiting properties of the state domain. It's a map from locks to number of times the lock is held, so the count of many locks can be increased at the same time without recording any particular lock order.
In upcoming diffs the domain will be refactored into a tree of nested lock acquisitions (for other reasons) and that domain necessarily records lock order. The obvious way of doing this correctly is to allow `std::lock` as an atomic even (ie, without trying to break it into multiple acquisitions).
This diff does exactly that, by changing the `Event.LockAcquire` variant to take a list of locks.
Reviewed By: ezgicicek
Differential Revision: D24052304
fbshipit-source-id: 410c812d7
Summary:
This diff avoids some usages of functors when they are applied only once in `polynomials.ml`. While
functor is in general helpful to abstract data and make code cleaner, it makes us hard to understand
code, eg `merlin-locate` does not give an actual definition point of terms when it is from a module
parameter.
Reviewed By: ngorogiannis
Differential Revision: D23991164
fbshipit-source-id: c61289e84
Summary:
This diff extends inferbo's domain to include closure values. The goal of this extension is to
follow missing semantics where closures are handled as values, for example, a closure is assigned to
an object field, then it is got later to call.
Due to the bottom-up nature of the analyzer, sometimes we don't know which values are written in a
field, which is the same for the other non-closure values.
Reviewed By: ezgicicek
Differential Revision: D23932186
fbshipit-source-id: 4a575d0de
Summary: Dispatch & function call mechanism was so jumbled up together. Let's refactor it to be cleaner.
Reviewed By: skcho
Differential Revision: D24049889
fbshipit-source-id: 42a218016
Summary:
This makes sure we call `AbductiveDomain.summary_of_post` exactly once
per post-condition. Notice in particular in the diff:
- in Pulse.ml we remove a now-certified-useless "is_unsat_expensive"
call
- in PulseOperations.ml we add a previously-missing call to
`summary_of_post` (it's needed to remove local variables from the
symbolic state + normalize)
The price to pay is ugly type annotations and down-casting peppered in a
few places, in reasonable number.
Reviewed By: da319
Differential Revision: D24078564
fbshipit-source-id: 3102cacf0
Summary:
Take another page from the Incorrectness Logic book and refrain from reporting issues on paths unless we know for sure that this path will be taken.
Previously, we would report on paths that are merely *not impossible*. This goes very far in the other direction, so it's possible we'll want to go back to some sort of middle ground. Or maybe not. See the changes in the tests to get a sense of what we're missing.
Reviewed By: ezgicicek
Differential Revision: D24014719
fbshipit-source-id: d451faf02
Summary: We add a naive model for `forEach` idiom for Java's Iterable and Maps. The model is naive because it doesn't take the cost of the lambda into account. This will be fixed later.
Reviewed By: da319
Differential Revision: D23868203
fbshipit-source-id: 37d169c6f
Summary: Gradle produces a number of compilation units which are currently captured sequentially. This diff parallelizes this step.
Reviewed By: jvillard
Differential Revision: D23930978
fbshipit-source-id: d71c22ba3
Summary:
The problem: current enumerator semantics does not work on symbolic enumerator that is given as a
parameter.
Reviewed By: ezgicicek
Differential Revision: D24017059
fbshipit-source-id: 378e75bb0
Summary:
This can be used by additional tooling for further analysis (e.g.
codemods, autofixes, etc).
Reviewed By: ngorogiannis
Differential Revision: D23987694
fbshipit-source-id: b9fa343ac
Summary: When using the database write server process, the parent has to wait for it to fully start up before continuing. Instead of dying after a fixed amount of time, never terminate (external timeouts can take care of that) and use exponential backoff to avoid spamming.
Reviewed By: jvillard
Differential Revision: D23989959
fbshipit-source-id: ff5232c14
Summary:
Testing procedure for java source parser
- we can run directly the parser without compiling and analysing the source file
- we add a test file
Reviewed By: ngorogiannis
Differential Revision: D23705199
fbshipit-source-id: 2103c1681
Summary: The code that parses `/proc/cpuinfo` expects IDs to be sequential. This is not always the case, so keep track of which IDs appear instead of keeping the maximum, when counting.
Reviewed By: jvillard
Differential Revision: D23987776
fbshipit-source-id: 98d267560
Summary:
This diff keeps closure parameters in closure-specializated procedures.
What the closure-specialization is doing is a propagation of concrete closures. For example, it
translates:
```
foo(block b) {
b();
}
goo() {
foo(^{...});
}
```
to
```
foo_new() {
(^{...})();
}
goo() {
foo_new();
}
```
However, if `foo` addresses `b` as a normal value like
```
foo(block b) {
block c = b;
}
```
this is translated to
```
foo_new() {
block c = b;
}
```
Note that the closure parameter of `foo` is removed, thus `b` becomes a free variable. Not good.
To avoid the situation, this diff keeps the closure parameters intact.
Reviewed By: da319
Differential Revision: D23905580
fbshipit-source-id: 014989fbf
Summary: Subtle false positives and negatives in Hil make Sil preferable. This diff gets rid of the CFG-emulation of Hil, while still using Hil expressions.
Reviewed By: da319
Differential Revision: D23815026
fbshipit-source-id: 731a6d299
Summary: Without curly braces, it is declaration of globals, not fields.
Reviewed By: ezgicicek
Differential Revision: D23866604
fbshipit-source-id: dd685c8d6
Summary:
The previous diffs recorded it for the case when the unvetted value is
dereferenced or otherwise used wrongly. This case finishes the work,
recording the needed signature for the remaining case (when the
offending third party has a non-nullable param with nullable passed
inside)
Reviewed By: ngorogiannis
Differential Revision: D23706679
fbshipit-source-id: e6f641223
Summary: This diff adds a test, in which autoreleasing lambda is written in an object field, then called later. The test is not analyzed precisely because it cannot follow that a lambda value is written to a field of an object as of now.
Reviewed By: ezgicicek
Differential Revision: D23843089
fbshipit-source-id: 62ba153aa
Summary:
This diff substitutes closure parameter when it is given via variable. For example,
```
x = ^{ ... }
foo(x)
```
this diff substitutes the closure variable `x`,
```
x = ^{ ... }
foo(^{ ... })
```
so that the specialization of `foo` can be done by `CCallSpecializaedWithClosures.process`.
Reviewed By: jvillard
Differential Revision: D23814595
fbshipit-source-id: a89f1530f
Summary:
Upgrade to latest clang release, needed for xcode12.
clang-8/9 won't be able to read the Xcode 12 SDK since there's annotations that will fail compilation.
Also removing unused (and hard to compile) binary `ast_exporter_bin` from facebook-clang-plugins/libtooling.
Reviewed By: ngorogiannis
Differential Revision: D23780089
fbshipit-source-id: 2314125a9
Summary:
This diff add a model of `NSArray.index_of_object_passing_test` in
autoreleasepool size model.
Worst case autorelease cost is "array size x cost of given test function (lambda)".
Reviewed By: ezgicicek
Differential Revision: D23785165
fbshipit-source-id: 95ec9700a
Summary:
This diff simply changes the type of autorelease size models as `CostUtils.model` instead of
`unit`. In the following diff, it will add a model that has more complicated semantics that does not return the unit cost.
Reviewed By: ezgicicek
Differential Revision: D23785159
fbshipit-source-id: 300808574
Summary:
`WithBlockParameters` is generated by a pre-analysis to express concrete block parameters. However,
before this diff, the block parameters only have names, which is insufficient to find their
summaries. This diff change `WithBlockParameters` to have `Block.t` that includes the parameters of
the block, so we can find blocks' summaries.
Reviewed By: ezgicicek
Differential Revision: D23785148
fbshipit-source-id: 9034f4f8d
Summary: Also, fix some leftover meaningless logging, enable all cost issue types on diff testing so that we can see the unreachable cost issues
Reviewed By: skcho
Differential Revision: D23784563
fbshipit-source-id: 1b4a2b582
Summary:
Nullifying these leads to observable side-effects, like in the added
test.
Reviewed By: da319
Differential Revision: D23759756
fbshipit-source-id: 559a6486b
Summary: ObjC objects can be added to autorelease pool by `CFAutorelease`.
Reviewed By: ezgicicek
Differential Revision: D23625632
fbshipit-source-id: 694e5bffb
Summary:
This diff regards some methods of NSKeyedUnarchiver as autorelease, because they may call
autorelease.
Reviewed By: ezgicicek
Differential Revision: D23600232
fbshipit-source-id: a9fc1cc56
Summary:
This diff increases autoreleasepool size when
* caller is non-ARC-compiled
* callee is ARC-compiled
* return type is a pointer to objc object
To distinguish non-ARC-/ARC-compiled:
* extended `translation_unit_decl_info` to have a boolean field `is_objc_arc_on`
* then copied it to `ProcAttributes.t` for each procedures.
Reviewed By: ezgicicek
Differential Revision: D23565003
fbshipit-source-id: dee22ea82
Summary:
We in fact build NullsafeIssue.t by parts, filling needed details until
all the info is provided.
This is nicely expressed via partial functions.
See the next diff that uses this to add additional information to the created
`NullsafeIssue.t`
Reviewed By: skcho
Differential Revision: D23706674
fbshipit-source-id: ca5fac704
Summary: Structs captured both by reference or by value should have reference in their type. Struct captured by value should first call copy constructor. In this diff we fix the type of the captured variable to include reference. Copy constructor injection is left for the future.
Reviewed By: jvillard
Differential Revision: D23688713
fbshipit-source-id: d13748b5d
Summary: Variables captured without initialization do not have correct type inside lambda's body. This diff sets the correct type of captured reference variables inside procdesc and makes sure the translation of captured variables is correct. The translation of lambda's body will then take into account the type of captured var from procdesc.
Reviewed By: jvillard
Differential Revision: D23678371
fbshipit-source-id: ed16dc978
Summary: Add missing reference to the type of variable captured by reference without initialization.
Reviewed By: jvillard
Differential Revision: D23567685
fbshipit-source-id: b4e2ac0b6
Summary: Variables captured by reference without initialization are missing dereference in their type inside lambda's body. This causes the translation to miss one dereference. To fix this, we want to add missing reference to the type. However, we first need to make sure that lambdas body is translated after the translation of captured variables.
Reviewed By: jvillard
Differential Revision: D23564099
fbshipit-source-id: 6a2ae053d
Summary:
Added a function to check if a method is a cpp lambda call operator that will be used in later diffs
#skipdeadcode
Reviewed By: jvillard
Differential Revision: D23564089
fbshipit-source-id: 144c3d735
Summary:
We were missing assignment to captured variables with initializers.
Consider the following example:
```
S* update_inside_lambda_capture_and_init(S* s) {
S* object = nullptr;
auto f = [& o = object](S* s) { o = s; };
f(s);
return object;
}
```
which was translated to
```
VARIABLE_DECLARED(o:S*&);
*&o:S*&=&object
*&f =(_fun...lambda..._operator(),([by ref]&o &o:S*&))
```
However, we want to capture `o` (which is an address of `object`), rather `&o` in closure.
After the diff
```
VARIABLE_DECLARED(o:S*&);
*&o:S*&=&object
n$7=*&o:S*&
*&f =(_fun...lambda..._operator(),([by ref]n$7 &o:S*&))
```
Reviewed By: jvillard
Differential Revision: D23567346
fbshipit-source-id: 20f77acc2
Summary:
Following the previous diff, the reason we mistakingly introduced new block as complexity increases is because when we are computing the hash of block, we return the string `block` to compute the hash for all block.
This diff fixes this case by returning the non-verbose name of the block when returning hashable name.
Reviewed By: ezgicicek
Differential Revision: D23729710
fbshipit-source-id: a345d3045
Summary:
When inspecting silent introduced issues, we found that an introduced issue is about a complexity increase of a block that is only created in the current diff. Based on the trace view, we find out that this is caused by infer mistakingly consider another block that exists in the previous diff as the same block that is newly created in the current diff.
This diff adds a test case that reproduces this case, and this will be fixed in the next diff.
facebook
Trace view: https://www.internalfb.com/intern/traceview/?id=109896337
Reviewed By: ezgicicek
Differential Revision: D23681550
fbshipit-source-id: 78341268b
Summary:
If the issue is related to the use of an unvetted third party method
that can not be trusted, we record this fact.
Reviewed By: ngorogiannis
Differential Revision: D23705626
fbshipit-source-id: 851328fe5
Summary:
It was inconsistent before (we recorded it only for meta issues).
Now we always record it, which will simplify further diffs.
In the next diffs, we are going to add more fields inside `nullsafe_extra`.
Reviewed By: ngorogiannis
Differential Revision: D23705592
fbshipit-source-id: 8bbb0e7c8
Summary:
This diff introduces `NullsafeIssue.t`: information about the issue
ready to be reported (and put to `result.json`).
Its notion was already implicitly used in a lot of code.
With this change, the achitechure becomes the following:
Firstly, `TypeErr.err_instance` represents issues at the moment of registration
during the typecheck. At this moment we don't always want to report
them, but it is important to store even non-reportable ones (we use it to calculate mode promotions).
Secondly, given the current `NullsafeMode.t`, we can either report the
issue or not. This logic lives in
`TypeErr.make_nullsafe_issue_if_reportable_lazy` that normally redirects
this decision to Rules (e.g. AssignmentRule or DereferenceRule).
Thirdly, if we want to report the issue, it is time to actually figure
out what to report (e.g. the precise error message, or additional
nullsafe specific `.json` params - see next diffs adding them).
Note that the exact logic of deciding if we want to report / how to
report / what should the message or .json param be is in practice
coupled (otherwise we'd have weird bugs when we decided the issue is
reportable, but don't have a good user facing reason).
In practice such logic for complix issues leaves in Rules.
C.f. `DereferenceRule` and `AssignmentRule` code.
The tricky part is that those rules actually share some common code
responsible for reporting, e.g. when it comes to processing third
parties (so the decision making & reporting is unified). See
`ErrorRenderingUtils.mk_nullsafe_issue_for_untrusted_values` which is
called both from `AssignmentRule` and `DereferenceRule`.
`NullsafeIssue.t` glues together those shared parts of code, and make
dependencies explicit.
Check out the next diff when we add more capabilities to
`NullsafeIssue.t` (e.g. ability to store dependent third party methods).
Without this refactoring, implementing this feature would be rather
tricky.
Reviewed By: skcho
Differential Revision: D23705587
fbshipit-source-id: b5246062a
Summary:
Even though it's unused in the implementation this argument is supposed
to be `Typ.t` so spell it out.
Reviewed By: skcho
Differential Revision: D23729433
fbshipit-source-id: d05548f42
Summary:
This can be useful to make pulse forget about tricky parts of the code.
Treat "skipped" procedures as unknown so heuristics for mutating the
return value and parameters passed by reference are applied.
Reviewed By: ezgicicek
Differential Revision: D23729410
fbshipit-source-id: d7a4924a8
Summary: This diff selects an autorelease trace that has a bigger polynomial.
Reviewed By: ezgicicek
Differential Revision: D23731155
fbshipit-source-id: 243591583
Summary:
This diff reports paths under the xcode isysroot as relative in tests.
This was a problem when another machine that has a different isysroot
directory is running the test.
Reviewed By: ezgicicek
Differential Revision: D23729222
fbshipit-source-id: 4e9681f65
Summary:
ObjC++ models are a copy (symlink) into the ObjC ones, albeit with different compile time flags. However, the resulting procnames are identical, and we are left with only one copy of models anyway. This means that only one version (chosen arbitrarily by the build order) is used for analysis.
This diff deletes the ObjC++ version.
Reviewed By: jvillard
Differential Revision: D23704266
fbshipit-source-id: 1dc94251f
Summary:
Handle the case of
- computing the range of `[lb, ub]` where `lb` contains length path and `ub` is constant. E.g. `[a.length, 2]` would give `3` range
- simplifying `c1 +/- min(c2, a.length)` to `c1 +/- min(c2, 0)`
since the length of a collection/array is always nonnegative.
This also removes some existing FPs in tests results.
Facebook
Context: we stumbled upon this issue in ObjC results TV109717121 which is corresponding to the added test case.
Reviewed By: skcho
Differential Revision: D23648807
fbshipit-source-id: 1e89ea246
Summary: This diff adds trace field for autoreleasepool size. Unlike to the other checkers, eg inferbo and operation cost, the autoreleasepool size checker should have traces for constants.
Reviewed By: ezgicicek
Differential Revision: D23678084
fbshipit-source-id: 35e6cf5f5
Summary:
For complexity issues from O(m) to O(n), we only include the trace of the current complexity O(n). However, this makes it difficult to understand what the original complexity O(m) was. Especially in fixed issues where n=1, we only get a constant cost with no trace attached, so it is difficult to see how the symbol m disappeared.
This diff includes the traces for the previous cost in the cost issues.
Reviewed By: skcho
Differential Revision: D23680360
fbshipit-source-id: 3f2b21b20
Summary:
The best-fit allocator with overhead 120 is slightly faster, and
significantly reduces peak memory usages in the workers.
Reviewed By: jvillard
Differential Revision: D23678794
fbshipit-source-id: ccdfced74
Summary:
The `attr_kind` column has now exactly two possible values: defined and undefined (since D22187238 (61ae2d1e1b)). This is also what should be stored in the field `is_defined` in the procedure attributes. Finally, the cfg can be `NULL` or not. This diff makes all three of these agree, in order to allow for the removal of the `attr_kind` column up the stack, since it will be equivalent to `cfg is NOT NULL`.
The changes in the tests indicate improved correctness: previously, specialising a modelled callee would result in a dummy entry in the procedures table, because all clang procedures were given a procdesc, so the specialisation would specialise the empty procdesc. Now, the model is not shadowed by the specialised dummy procdesc, and is specialised itself (and that's why it also ends up in the capture DB, which is slightly counter-intuitive).
Reviewed By: jvillard
Differential Revision: D23536931
fbshipit-source-id: 983216cb6
Summary: So that it can be meaningfully used for diffing across infer versions, say.
Reviewed By: martintrojer
Differential Revision: D23648560
fbshipit-source-id: 98d634e37
Summary:
Since we don't yet support `.sig` files for fields, hardcoding this in Ocaml
for now.
Reviewed By: jvillard
Differential Revision: D23627390
fbshipit-source-id: 8df78068d
Summary: Polynomial category zero corresponds to unreachable but "zero" is a misnomer and rather confusing. Let's fix it.
Reviewed By: skcho
Differential Revision: D23597735
fbshipit-source-id: f0c96ed26
Summary:
Eliminate the need to serialise procnames when sending work from the restart scheduler to the workers, by sending the proc_uid instead. This is (much) shorter than the byte representation of the proc_name and it's the primary DB key of the procedures table, so it can be used by the worker to obtain the full procname.
Also, reduce GC churn by using folds in the scheduler startup instead of copying lists over and over.
Reviewed By: jberdine
Differential Revision: D23566131
fbshipit-source-id: 1472aa990
Summary:
- freshen up /docs/next/absint-framework to give sensible advice, and
delete outdated bits that are now in the API docs so they remain fresh
- delete SimpleChecker.ml as it's just a source of bitrot
- delete the "adding checkers" page as it's completely outdated and
subsumed by the "AI framework" page + the labs.
Reviewed By: jberdine
Differential Revision: D23597271
fbshipit-source-id: 78b541746
Summary:
We defined cost's plus as "zero+unreachable = unreachable" for the operation cost. The meaning of
the zero cost is that no statment is analyzed yet, and the unreachable(bottom) cost means a program
point is analyzed as unreachable. We thought "zero+unreachable" happens in very specific cases we
need to check, or due to an analyzer bug. For debugging purpose, we defined it to return more
specific unreachable cost.
However, in allocation/autoreleasepool costs, the zero cost is not very special value. If there is
no alloc/autorelease calls, they can have the zero cost. "zero+unreachable = unreachable" doesn't
make sense there.
This diff changes the plus as "zero+unreachable = zero" for the non-operation costs.
Reviewed By: ezgicicek
Differential Revision: D23578869
fbshipit-source-id: 5392eca1c
Summary: This diff adds some tests run with ARC/non-ARC compiled objective-c sources.
Reviewed By: ezgicicek
Differential Revision: D23542135
fbshipit-source-id: 8e089666b
Summary:
This would previously print that we ran out of fuel even if we didn't
and we simply reached a normal form.
Reviewed By: ezgicicek
Differential Revision: D23575571
fbshipit-source-id: 37d02ca8d
Summary:
This diff adds a new experimental checker for detecting size of objects in autorelease pool in ObjC. The basic mechanism is almost the same with the previous cost calculation:
* Autorelease pool size is increased at explicit `autorelease` call
* Autorelease pool size is set as zero by the `autoreleasepool` block.
While it only supports the explicit calls as of now, we will extend the checker to handle more cases in the following diffs.
Reviewed By: ezgicicek
Differential Revision: D23473145
fbshipit-source-id: 416488176
Summary:
This diff extends the cost_item json format to print the autoreleasepool_size field. Not yet, there
is no semantics for that code kind, so the results will always be zero with no traces.
Reviewed By: ezgicicek
Differential Revision: D23540665
fbshipit-source-id: 94442e376
Summary:
Limit communication bandwidth and serialisation burden by sending procedure filename strings (which are bounded at ~100 bytes) instead of serialising procnames through the socket to the scheduler (which are unbounded and have been seen to reach ~30kB in the worst case for templated procedures).
Context:
Under the restart scheduler, a worker working on a procedure X that discovers a race on a dependency Y it needs fails the computation of X and sends to the scheduler the procname Y. The next time X is about to be rescheduled, the scheduler checks whether Y is still being analysed, by checking if the lock for Y still exists. This check uses the procedure filename already, so we can send that instead.
Reviewed By: jvillard
Differential Revision: D23554995
fbshipit-source-id: 9828e71a2
Summary: Most of the time, when the procdesc of a callee is requested, all that is really required is the procedure attributes. However, requesting the procdesc may return `None` when the procedure is undefined (in Java, and soon for Clang too). So, change all callsites to using attributes instead, where possible.
Reviewed By: jvillard
Differential Revision: D23539422
fbshipit-source-id: 3b1a52d48
Summary: Given the brittleness of DB-indexing over marshalled values (one example exposed in D23191601), the serializer "for comparison" must be removed. Its only use is for source files, and these have an obvious, trivial (and possibly faster and less space hungry) protocol.
Reviewed By: jberdine
Differential Revision: D23499481
fbshipit-source-id: 57ad0ced6
Summary: Given the brittleness of DB-indexing over marshalled values (one example exposed in D23191601), the serializer "for comparison" must be removed. Serialized procnames are no longer used as keys, so their serializer can be converted to the one which preserves sharing (saving some space in the process).
Reviewed By: jberdine
Differential Revision: D23425123
fbshipit-source-id: 9c82da778
Summary:
Store model summaries in the `model_specs` database table instead of in spec files.
This table is populated when a new database is created by loading a dump of the `specs` table in the models database. This avoids the perf and reliability implications of ATTACHing the same, non-read-only models-DB by many processes.
- `BiabductionModels` is moved into `IR` so that `JsonReports` can access it.
- The binary `sqlite3` is now required on the host compiling infer.
Reviewed By: skcho
Differential Revision: D23191601
fbshipit-source-id: 1532481ee
Summary: Since the unique string ID used as a key is also human-readable, remove the now-redundant `proc_name_hum` field, which is only used for debug purposes anyway.
Reviewed By: jberdine
Differential Revision: D23446223
fbshipit-source-id: 5027066ee
Summary: There is still a bug in using marshalled values as keys (exposed up the diff stack) where structurally equal procnames get serialised to distinct keys. This diff addresses that bug by using the existing functionality for string unique IDs.
Reviewed By: ezgicicek
Differential Revision: D23133761
fbshipit-source-id: 3cbafb51b
Summary: There is no reason to write back to the database here, as the resulting procdesc is anyway stored in the summary. This is a source of non-determinism.
Reviewed By: skcho
Differential Revision: D23500220
fbshipit-source-id: 7434e6239
Summary: We don't report and IO Cost and are not planning to do so in the near future. Let's remove it from cost kinds.
Reviewed By: ngorogiannis
Differential Revision: D23499160
fbshipit-source-id: 7281da3af
Summary:
As title.
Facebook
Found this case by examining db.
>
D23394545
Time complexity of `_registerTabPreloadables` has **decreased** from `Top` to `O(1)`. Please make sure this is an expected change. You can inspect the trace to understand the complexity increase:
Reviewed By: ezgicicek
Differential Revision: D23448099
fbshipit-source-id: 108fd1ef2
Summary:
This diff adds translation of `dictionaryWithObjects:forKeys:count:`. In the previous implementation it was
translated as if it was `dictionaryWtihObjectsAndKeys:`, but their function parameters are different.
In this diff, it translates an array literal `NSDictionary* a = @ [ @"firstName": @"Foo", @"lastName":@"Bar" ];` to
```
n$1=NSString.stringWithUTF8:(@"firstName")
n$2=NSNumber.stringWithUTF8:(@"Foo")
n$3=NSNumber.stringWithUTF8:(@"lastName")
n$4=NSNumber.stringWithUTF8:(@"Bar")
temp1[0]:objc_object*=n$1
temp1[1]:objc_object*=n$3
temp2[0]:objc_object*=n$2
temp2[1]:objc_object*=n$4
n$3=NSDictionary.dictionaryWithObjects:forKeys:count:(temp2:objc_object* const [2*8], temp1:objc_object*const [2*8], 2:int)
```
where `temp` is an additional local variable declared as array.
See,
https://developer.apple.com/documentation/foundation/nsdictionary/1574184-dictionarywithobjects?language=objchttps://developer.apple.com/documentation/foundation/nsdictionary/1574181-dictionarywithobjectsandkeys
{F316854542}
Reviewed By: skcho
Differential Revision: D23447042
fbshipit-source-id: 14b7c3f2b
Summary: Added a model for copy constructor for `std::function`. In most cases, the SIL instruction `std::function::function(&dest, &src)` gives us pointers to `dest` and `src`, hence, we model the copy constructor as a shallow copy. However, in some cases, e.g. `std::function f = lambda_literal`, SIL instruction contains the closure itself `std::function::function(&dest, (operator(), captured_vars)`, hence, we need to make sure we copy the right value.
Reviewed By: ezgicicek
Differential Revision: D23396568
fbshipit-source-id: 0acb8f6bc
Summary:
Move Sqlite registered statements inside the functions that use them, since these are only ever used in one place.
Also, reformat some SQL.
Also, make `mark_all_source_files_stale` use an un-prepared statement, as it is used at most once per run, wasting resources if registered.
Reviewed By: skcho
Differential Revision: D23423844
fbshipit-source-id: 07362d19c
Summary: We only added suffix-stripped prefixes of props but this causes FPS if a component declares props that actually contain suffixes since we strip them when we are adding builder calls. The fix is to add both normal and stripped versions.
Reviewed By: skcho, Katalune
Differential Revision: D23375405
fbshipit-source-id: e99cb4dd8
Summary:
This diff finishes the migration from the specialization of methods that take blocks as arguments. Here we delete all the old code and change the way we model dispatch functions so that the tests pass.
- Remove the code for specializing the methods in biabduction.
- Remove the call flags `cf_with_block_parameters` that was only used in this algorithm.
- Removes models for dispatch functions.
- Adds models for dispatch functions as program transformation only in biabduction. To be added in other checkers in the future.
Reviewed By: ngorogiannis
Differential Revision: D23345342
fbshipit-source-id: b5e8542ce
Summary:
This preanalysis in general aims to create specialized clones of methods that have blocks as arguments and that are called with concrete closures, and then call these clone methods instead of the original ones.
One complication is what happens with the captured variables in the closure. What we do is we add them to the formals of the cloned method and passed them through to the concrete blocks.
We do this transformation in two steps:
1. Go through all the callers of methods with blocks as parameters, and create the clone methods. In this preanalysis we only create the attributes for the new method, not the code. We also update the call instructions in the callers to represent a call to the cloned method with updated arguments: we don't need to pass closures arguments anymore, we instead pass the captured variables as new arguments.
2. We add the corresponding code to the newly created clones: this means swapping the call to the block variable with a call to the corresponding block. Moreover, we add some of the new formals (that correspond to the captured variables) to the arguments of the call.
This diff implements step 1 of the analysis. The next diff D23216021 implements step 2.
Reviewed By: ngorogiannis
Differential Revision: D23109204
fbshipit-source-id: 91a5eb16b
Summary: There was a mismatch between formals and actuals in `std::function::operator()` because we were not passing the first argument corresponding to the closure.
Reviewed By: ezgicicek
Differential Revision: D23372104
fbshipit-source-id: d0f9b27d6
Summary: When we evaluate lambdas in pulse, we create a closure object with `fake` fields to store captured variables. However, during the function call we were not linking the captured values from the closure object. We address this missing part here.
Reviewed By: jvillard
Differential Revision: D23316750
fbshipit-source-id: 14751aa58
Summary:
Report errors found by running Topl on top of Pulse, when using
--topl-pulse. Topl tests now run on top of Pulse.
Reviewed By: jvillard
Differential Revision: D23030771
fbshipit-source-id: 8770c2902
Summary:
Two issues are fixed
1. For this diff, when the condition `curr_langauge_is Java` is removed in some part of the analysis, some c bufferoverrun tests are broken. This is fixed in this diff by inspecting if we are dealing with a `Size` alias referring to an `objc_internal_collection_array`.
2. Previously, when we are modifying mutable array in a loop, e.g. adding element to the array or removing element from the array, we are unable to give an estimable size of the array after exiting the loop. This is now fixed, and the corresponding FPs are resolved.
Reviewed By: skcho
Differential Revision: D23342350
fbshipit-source-id: 200f5261c
Summary:
This diff fixes `--reanalyze` option that is to reanalyze specific procedures by removing their
summaries. It was broken because it tried to store an empty summary with `Status.Analyzed`.
Reviewed By: ezgicicek
Differential Revision: D23344633
fbshipit-source-id: 1c4eca6c0
Summary:
The generated code used to contain Prune statements that had boolean
connectives in their conditions. After this commit, all conditions
should have no boolean connective (LNot, LOr, LAnd) at top-level; that
is, prune conditions should be atomic.
The main motivation behind this change is that (a) frontends follow this
convention, and (b) Pulse assumes this convention.
Reviewed By: jvillard
Differential Revision: D23022273
fbshipit-source-id: 1313328e4
Summary:
In the previous diffs, we implement enumerator in order to estimate the cost of for-each loop in ObjC, but when we have FP case when enumerator is used not in for-each loop. For example, the following code has top cost before the fix.
```
void nsarray_enumerator_linear_FP(NSArray* array) {
id obj;
NSInteger sum = 0;
NSEnumerator* enumerator = [array objectEnumerator];
while (obj = [enumerator nextObject]) {
sum += (NSInteger)obj;
}
}
```
Reviewed By: skcho
Differential Revision: D23294895
fbshipit-source-id: 50c7b359f
Summary:
When normalizing discovers new linear arithmetic facts in
`normalize_linear_eqs` we go around once more. Do the same when atoms
become linear equalities.
Reviewed By: skcho
Differential Revision: D23264425
fbshipit-source-id: b355875f3
Summary:
Mostly cosmetic except for a change in [solve_eq] to try harder at
normalization (improves unit tests!). Add more comments and do minor
renamings.
Reviewed By: skcho
Differential Revision: D23243629
fbshipit-source-id: 55bdaf8a8
Summary:
This function had become a bit hard to read and the part about embedded
atoms was not very clear and also a bit incomplete (need to handle "= 1"
and "≠ 1" too).
Reviewed By: skcho
Differential Revision: D23242216
fbshipit-source-id: 239fade97
Summary:
This does a bunch of things at once (sorry):
- Refactor atom/term normalisation so that terms that are really just
atoms become atoms.
- Use this to not bother adding special cases in the functions exported
in the .mli: `and_less_than`, `and_equal_binop`, `prune_binop`, etc.
all had special cases to avoid introducing terms that could be atoms.
That's not great because the same smarts wasn't applied to terms that
would only become atom-like after some normalisation, and led to weird
and duplicated code. Now it's much cleaner: just add the most
straighforward fact and normalise!
- Fix a bug: adding a new equality `x = linear` should *not* be done
using `Normalizer.merge_var_linarith` as this is an internal function
that assumes that `x` is the right representative in `x - linear`.
Instead, for abitrary equations of that form, `solve_eq` should be used.
- When `normalize_linear_eqs` discovers new linear equalities, normalize
again. Add fuel there too to avoid spending too much time doing that.
It could be that we don't need/want fuel there but then we'd need to
think very hard about why there's no infinite recursion possible and
that seems harder.
Reviewed By: skcho
Differential Revision: D23241282
fbshipit-source-id: e5b8c4759
Summary:
This is used for variable substitution and will often be a no-op when
normalising terms over and over again (after the first normalisation,
the expression should stay the same). The equivalent function for terms
was already being careful about not re-allocating identical terms so
extend that care to linear expression.
Reviewed By: skcho
Differential Revision: D23241601
fbshipit-source-id: b365eb87a
Summary:
This allows further normalisation now that terms contain linear
expressions in normal form.
Reviewed By: skcho
Differential Revision: D23241499
fbshipit-source-id: f8e4e759c
Summary:
Linear arithmetic is able to simplify more atoms, eg `x+y <= x+y`
becomes `True` by normalising to "lhs - rhs <= 0". This does the first
step of normalisation, but to get True in this example we also need to
substitute inside atoms according to the linear equalities, which is the
next diff (for now we only substitute variables inside atoms for other
variables or for constants).
Reviewed By: skcho
Differential Revision: D23241457
fbshipit-source-id: 0da0b545c
Summary:
More scaffolding, nothing creates `Linear _` terms yet. Some changes to
variables substitution to allow substituting variables for linear terms
(as well as constants and other variables).
Reviewed By: skcho
Differential Revision: D23241461
fbshipit-source-id: fc870255e
Summary:
This is needed for the rest of the stack that introduces a `Linear of
LinArith.t` variant in `Term.t` to enable more normalisation inside of
terms.
Reviewed By: skcho
Differential Revision: D23241353
fbshipit-source-id: ad765cd13
Summary:
Make term simplification a bit more structured and separate the
"simplification" phase from the "evaluating constant expressions" phase.
Also implement the latter for all possible terms.
Reviewed By: skcho
Differential Revision: D23241334
fbshipit-source-id: 2964aa477
Summary: Not much to see here, extracted to make further changes more readable.
Reviewed By: da319
Differential Revision: D23241335
fbshipit-source-id: 81181f23a
Summary:
`delete` works exactly like `free` so merge both models together. Also
move the `free(0)` test to nullptr.cpp as it seems more appropriate.
Reviewed By: da319
Differential Revision: D23241297
fbshipit-source-id: 20a32ac54
Summary:
Since this is where almost all of the reasoning is concentrated, let's
make sure we use it at every opportunity!
Reviewed By: skcho
Differential Revision: D23194224
fbshipit-source-id: fedb2811e
Summary:
Fix the FP when iterating through constant collection.
facebook
This fix is a hack for now.
Reviewed By: ezgicicek, skcho
Differential Revision: D23241338
fbshipit-source-id: e2e0c05f8
Summary:
As title.
This diff is co-authored by SungKeun Cho and me.
facebook
This diff is co-authored by skcho and me.
Original comments from skcho
"For the record:
1. Rory and I tried to write models for ObjC iterator.
2. We could not use Java's iterator semantics: In Java's, `hasNext` returns the size of collection, rather than a boolean, and which is used as a control variable. On the other hand, in ObjC, it calls only `nextObject`, not calling `hasNext`, and the return value of which is being checked as `null`.
3. We added an artificial field `objc_iterator_offset` to keep the index of the iterator, and the models added in this diff are handling that integer value.
A problem is that `array.objc_iterator_offset` is not included in the control variables, since the condition of the loop is `nextObject() != null` that does not include the iterator offset. We need to make `array.objc_iterator_offset` as a control variable, by changing the part collecting control variables.
"
Reviewed By: ezgicicek, skcho
Differential Revision: D22944278
fbshipit-source-id: 7e71b79c1
Summary:
`Obj.reachable_words` can be very slow on large values, so only call it in debug mode.
Also, measure the time we spend for compressing/storing the global type environment.
Reviewed By: jvillard
Differential Revision: D23264532
fbshipit-source-id: 4a9456ab7
Summary: Use `SqliteUtils.exec` where appropriate, reformat some queries for readability.
Reviewed By: ezgicicek
Differential Revision: D23240945
fbshipit-source-id: 24d921a3a
Summary:
Reset the state before each test so that adding tests doesn't affect
other tests by shifting the ids of their anonymous variables.
Reviewed By: skcho
Differential Revision: D23194171
fbshipit-source-id: 7b717f160
Summary:
Before: 3 modes: (where "lenient" build has warnings not crash the
build, while "strict" build errors on warning):
- opt (default): flambda optimisations + lenient build
- dev (recommand for dev): lenient build, no flambda
- test: strict build (used in tests), no flambda
Now:
- dev (default): *strict* build, no flambda
- opt: lenient build, flambda
- dev-noerror: lenient build, no flambda (use when you want to test
infer but there are build warnings)
The goal is to give faster feedback to infer developers and reduce the
amount of times diffs are sent with build warnings. Also it's now faster
to alternate between changing infer and running unit tests since test
mode is just dev mode.
Reviewed By: ngorogiannis
Differential Revision: D23167416
fbshipit-source-id: d663b6054
Summary: Moving specs to the DB missed out cleaning out all specs when reanalysing. This is the fix.
Reviewed By: jvillard
Differential Revision: D23188958
fbshipit-source-id: 5b50fdda8
Summary:
These are the only ones we need, it turns out the other types (string,
proc names, ...) were dead code. The changes the integer constants to
rational constants, to match the domain of the linear arithmetic engine.
Reviewed By: skcho
Differential Revision: D23164136
fbshipit-source-id: 755c3f526
Summary: In the frontend captured variables for blocks are added as formal parameters in procdesc at the beginning.
Reviewed By: dulmarod
Differential Revision: D23163619
fbshipit-source-id: 2bcbe9b9c
Summary:
There was a syntax error in the comment as `[code]` is interpreted as
code but backticks were used instead. This made `ocamlformat` produce a
warning but no error.
Reviewed By: skcho
Differential Revision: D23167273
fbshipit-source-id: a7fad10d5
Summary:
Instead of alternating between a normal form and a tree structure,
always keep a normal form. Except the normal form is not always fully
normalized. Overall, it's a bit faster than the previous iteration,
while being more precise! In particular, linear arithmetic aims at being
much more complete.
Reviewed By: skcho
Differential Revision: D23134209
fbshipit-source-id: 5f9ec6ece
Summary:
When implementing iterator, we find out that because some semantics of inferbo is Java-specific, we cannot simply use Java's `Collection` model for `NSCollection`.
So this diff writing `NSCollection` model separately.
This diff also extracts the common parts of `NSCollection` and `Collection` into `AbstractCollection` to combine the duplicate the parts.
Reviewed By: ezgicicek
Differential Revision: D22975159
fbshipit-source-id: daed3f99f
Summary:
We already take into account inheritance if the method inherits a known
modelled initializer method (e.g. Activity.onCreate()).
But if the method is explicitly marked as Initializer, we require its
overrides to be also marked.
This diffs fixes the behavior and makes it consistent, now this is
enough to annotate only parent class.
Reviewed By: jvillard
Differential Revision: D23135177
fbshipit-source-id: a21ff4a0e
Summary:
This is part of work aimed to reduce usage of language-agnostics modules
in Java-specific parts of nullsafe.
Reviewed By: ngorogiannis
Differential Revision: D23075333
fbshipit-source-id: 4029732b4
Summary:
This is one of two files in nullsafe that remains interface-less.
Let's fix it!
Reviewed By: jvillard
Differential Revision: D23056324
fbshipit-source-id: a6592f7ae
Summary: Before we were modelling `vector.end()` as returning a fresh pointer every time is was called. It is common to check if an iterator is not the `end()` iterator and proceed to dereference the iterator in that case. In such code pattern `vector.end()` is called twice and returns different fresh values which causes false positives. To fix this, we add a special internal field `__infer_model_backing_array_pointer_to_last_element` to a vector to denote its end. Now, every time we call `vector.end()` we return the value of this field. We introduce a new attribute `EndOfCollection` to mark `end` iterator as the existing `EndIterator` invalidation is not suitable when we need to read the same value multiple times.
Reviewed By: jvillard
Differential Revision: D23101185
fbshipit-source-id: fa8a33b58
Summary:
Constructing the report is done by reading all the summaries, and using certain parts thereof. However, the payloads, which typically account for the greatest size, are not used (with the exception of costs).
This diff splits the storage of summaries into analysis and report summaries, and only reads and deserialises the latter for the report phase. This makes a big difference for runs with a large number of procedures.
Reviewed By: jvillard
Differential Revision: D23105072
fbshipit-source-id: 359067a0f
Summary:
As preparation for splitting summaries into some of their components, and then iterating over only those when reporting (thus gaining performance) we need procnames in the table.
Also, this fixes the now-broken use case of `infer report` with spec files, by using `--procedures-filter` to restrict printing of summaries accordingly.
Reviewed By: skcho
Differential Revision: D23101853
fbshipit-source-id: 1ae878d8e
Summary: Implement specs storage in DB, apart from biabduction models which are still left in specs files.
Reviewed By: skcho
Differential Revision: D22795638
fbshipit-source-id: 140801d3f
Summary: Extra info helps in debugging frontend issues and such.
Reviewed By: ngorogiannis
Differential Revision: D23101854
fbshipit-source-id: 27024675d
Summary:
At the end of analysing a procedure we call `simplify
~keep:vars_live_in_pre_post`. Any variable not in
`vars_live_in_pre_post` is not mentioned anywhere else in the state and
therefore is not going to contribute constraints in callers of the
procedure (in other words: they're dead). We want to also forget
arithmetic facts about these variables as this is a good opportunity to
make the path condition smaller, sometimes by a lot!
The main issue is that dead variables may be useful intermediate terms
in the formula, eg trying to keep only facts about `x` in `y = x + 1 &&
y = 0` is going to lose a lot of precision. But, if a variable not in
`keep` is only mentioned in a simple atom `z = 42` atom, for example,
it's safe to forget about it, eg it's safe to remember only `x=0` in
`x=0 && z=42` (if only `x` is live).
In other words, we can get rid of all atoms containing variables not
transitively involved in other atoms that eventually involve live
variables. A graph problem! This is guaranteed not to forget anything
important and can still trim a lot of atoms in certain situations.
Reviewed By: skcho
Differential Revision: D22921313
fbshipit-source-id: 6d5db7cbe
Summary:
Perhaps a bit overkill to introduce all this extra complexity but it
makes the unit tests much more readable. In fact, this uncovered a bug
in the dead variable elimination!
Reviewed By: dulmarod
Differential Revision: D22925548
fbshipit-source-id: d1f411683
Summary:
Do not always add parens around sub-terms, and add more parens around
terms in atoms and normal forms when they can be confused with the atom
or normal form structure.
Reviewed By: skcho
Differential Revision: D22925549
fbshipit-source-id: 8646e96a5
Summary: These will change to more interesting outputs in the next diff.
Reviewed By: dulmarod
Differential Revision: D22921349
fbshipit-source-id: c58c6240a
Summary:
Add unit tests to pulse in order to write tests for the arithmetic
solver, because it is a pain to write programs to do that end to end.
Reviewed By: ezgicicek
Differential Revision: D22864607
fbshipit-source-id: 0a20a3593
Summary:
This is needed to make dune auto-updating of unit tests introduced in
the next diff cohabit peacefully with our tests to make sure code stays
correctly formatted wrt ocamlformat.
Also, more auto-formatting = better.
Reviewed By: da319
Differential Revision: D22865004
fbshipit-source-id: 91c47ab08
Summary:
Normalization is potentially expensive and its result should be
remembered if the formula keeps being used. In the future we might use
this to make normalization more incremental.
Also rename PathCondition.satisfiable -> is_unsat to match
PulseFormula.is_unsat.
Reviewed By: skcho
Differential Revision: D22728264
fbshipit-source-id: 7759b33ac
Summary:
Now that this is a cheap operation, use it whenever we are checking the
satisfiability of the path condition.
Reviewed By: skcho
Differential Revision: D22724373
fbshipit-source-id: df31c6010
Summary:
Pausing the experiment in favour of new PulseFormula. Can be resurrected
later.
Reviewed By: skcho
Differential Revision: D22576274
fbshipit-source-id: 76529d767
Summary:
This time it's personal.
Roll out pulse's own arithmetic domain to be fast and be able to add
precision as needed. Formulas are precise representations of the path
condition to allow for good inter-procedural precision. Reasoning on
these is somewhat ad-hoc (except for equalities, but even these aren't
quite properly saturated in general), so expect lots of holes.
Skipping dead code in the interest of readability as this (at least
temporarily) doesn't use pudge anymore. This may make a come-back as
pudge has/will have better precision: the proposed implementation of
`PulseFormula` is very cheap so can be used any time we could want to
prune paths (see following commits), but this comes at the price of some
precision. Calling into pudge at reporting time still sounds like a good
idea to reduce false positives due to infeasible paths.
#skipdeadcode
Reviewed By: skcho
Differential Revision: D22576004
fbshipit-source-id: c91793256
Summary:
Current handling of lambdas is quite rudimentary. Looking at test
results we can see that errors are all over the place: False Positives,
False Negatives and just plain wrong results.
These tests can be grouped in 2 sets:
1. Basic support which implies:
- understanding method signatures,
- providing comprehensible error messages.
2. Extended support with implies:
- understanding scoping of values captures in lambdas (needs proper aliasing analysis).
- understanding parametric nullability in generics (needs "some"
support for Generics in our Java frontend).
With follow-up patches I'll attempt to implement "Basic" support for
Lambdas. "Extended" support will be out of scope unless there's
significant demand.
Reviewed By: mityal
Differential Revision: D23058673
fbshipit-source-id: 621551cca
Summary:
Absence of kotlin-annotations led to warnings messages during tests compilation:
```
$ make
warning: unknown enum constant MigrationStatus.STRICT
reason: class file for kotlin.annotations.jvm.MigrationStatus not found
```
This doesn't affect test results but is annoying, hence the fix.
Reviewed By: mityal
Differential Revision: D23051769
fbshipit-source-id: e45fbe7be
Summary:
This is part of work aimed to reduce usage of language-agnostics modules
in Java-specific parts of nullsafe.
As usual, in this diff we don't convert everything and take some
shorthands.
Reviewed By: ngorogiannis
Differential Revision: D23054169
fbshipit-source-id: 70913ddfd
Summary:
Note that the current implementation lists classes and methods (not
<class:method> pairs.
This is bit clowny, but works for now (given the list of classes is small).
So I'll leave it out of the scope of this diff.
Reviewed By: ngorogiannis
Differential Revision: D23076671
fbshipit-source-id: 71c94ebd9
Summary: This diff separates purity analysis and its reporting, since sometimes we want to use the purity analysis results in other checkers, but don't want to report purity issues.
Reviewed By: ezgicicek, jvillard
Differential Revision: D23054913
fbshipit-source-id: 12cc1fc42
Summary: Since there is no discernible downside in using the write daemon unless in single-thread mode or in buck, make it only depend on these circumstances, not a command line flag.
Reviewed By: skcho
Differential Revision: D23004451
fbshipit-source-id: 5c1d06ed1
Summary:
This is part of work aimed to reduce usage of language-agnostics modules
in Java-specific parts of nullsafe.
Reviewed By: artempyanykh
Differential Revision: D23052773
fbshipit-source-id: aacd07f27
Summary:
This is part of work aimed to reduce usage of language-agnostics modules
in Java-specific parts of nullsafe.
Reviewed By: artempyanykh
Differential Revision: D23052339
fbshipit-source-id: 665126957
Summary:
This is part of work aimed to reduce usage of language-agnostics modules
in Java-specific parts of nullsafe.
This diff:
1/ migrates AssignmentRule
2/ passes Procname.Java.t around from the start of analysis.
For now we have places where we have both procname and java procname.
This is fine as we can get rid of it in follow up diffs.
Reviewed By: artempyanykh
Differential Revision: D23052226
fbshipit-source-id: 0125de297
Summary:
This is part of work aimed to reduce number of language-agnostic methods
used in Nullsafe codebase.
Reviewed By: artempyanykh
Differential Revision: D23052328
fbshipit-source-id: 2b69f5f7a
Summary:
This diff adds a false positive test that is introduced imprecise loop
invariant detection.
In the example, `i` and a temp variable `$irvar = get_size(arr)` are
all included in the control variables, so the complexity becomes
quadratic. The problem is that `$irvar` is not addressed as a loop
invariant:
* `is.read` is analyzed as modifying a global
* all return variables, including `$irvar`, of unmodeled functions are invalidated
Reviewed By: ezgicicek
Differential Revision: D23051414
fbshipit-source-id: 011fd086b
Summary: To avoid dead store false positives we skip initialization of a variable that has an `unused` attribute. However, this causes uninitialized value false positives when the variable is later used in macros. To fix this, instead of skipping initialization we record the information about `unused` attribute in local variable data that we can later use for filtering out dead store issues.
Reviewed By: jvillard
Differential Revision: D22868050
fbshipit-source-id: 4a2d0e680
Summary:
In the current implementation, we record 3 states: modelled internally,
modelled intenally, not modelled.
The follow up diffs will need to distinct first- and third- party, given
annotated signature. This diff makes it possible.
Reviewed By: artempyanykh
Differential Revision: D22977962
fbshipit-source-id: b8f3616b5
Summary:
We get duplicated variable declaration instruction for primitive type variable initialized using list initializer, e.g.
```
int* p{nullptr};
```
This happens because we add variable declaration instruction when we translate both `DeclStmt` and `InitListExpr`. To fix this, we do not add the duplicated variable declaration when we translate `InitListExpr`.
Reviewed By: jvillard
Differential Revision: D22844726
fbshipit-source-id: 422806924
Summary:
Synthetic/autogenerated methods/fields usually contain `$` in their names.
Reporting nullability violations on such code doesn't make much sense
since the violations are not actionable for users and likely need to be
resolved on another level.
This diff contributes:
1. Several test cases that involve synthetic code of different
complexity.
2. Code that handles some particular types of errors (but not all!).
Reviewed By: mityal
Differential Revision: D22984578
fbshipit-source-id: d25806209
Summary:
Use `Typ.t` for parameter types in procnames, instead of `JavaSplitName.t`. This allows more precise and usable types stored in procnames, as well as not using the string-based representation of `JavaSplitName` which meant parsing strings whenever one needed a `Typ.t`.
This also makes `JavaSplitName` dead code, so it is removed.
Reviewed By: skcho
Differential Revision: D20500423
fbshipit-source-id: a72728e3f
Summary: Since types from the java frontend are a subtype of `Typ.t` provide the means to check and enforce that for return types in procnames.
Reviewed By: ezgicicek
Differential Revision: D20495527
fbshipit-source-id: b99c784af
Summary:
`java_type` aka `JavaSplitName.t` is a pair of strings. It's used in a procname to store the return type. Replace that with `Typ.t`.
This is step one of deleting `JavaSplitName`. Step two will be to replace parameters with `Typ.t`.
Reviewed By: skcho
Differential Revision: D20323748
fbshipit-source-id: f0029c3ca
Summary:
This diff adds an option `max-jobs` that restrict the number of jobs
running simultaneously.
Reviewed By: ngorogiannis
Differential Revision: D22978328
fbshipit-source-id: 544153c1c
Summary:
Currently, the notion of a third party method signature (corresponding to
a single record in a .sig file) is scattered between unique_repr and
nullability.
Logically, ThirdPartyMethod.t should not "know" about unique_repr
because this is a detail needed only for searching for a method
definition in `.sig` storage.
This diff:
1. Introduces ThirdPartyMethod.t which exposes exactly information
stored in .sig file. We are going to use this abstraction more in follow
up diffs.
2. Moves functionality operating with `unique_repr` to the module
responsible for searching in the repository.
3. Deletes `ThirdPartyMethod.nullability` abstraction - we don't need it as we can just store ThirdPartyMethod.t where was previously `nullability` stored.
4. Introduces `to_canonical_string` method that prints back third party
method in a format needed for `.sig` file, and a test ensuring that
parsing works back and forth consistently. We are going to use this
method in follow up diffs as well.
Reviewed By: artempyanykh
Differential Revision: D22950706
fbshipit-source-id: a0e72ccdb
Summary: As per title. Eases next diffs by making Summary the only source of truth for how spec files are accessed/stored.
Reviewed By: ezgicicek
Differential Revision: D22794742
fbshipit-source-id: 0ee20ec1c
Summary: New, experimental for now, integration with buck on Java using the `#infer-java-capture` flavor.
Reviewed By: artempyanykh
Differential Revision: D22187748
fbshipit-source-id: 62cdafe6b
Summary:
This diff removes a dead field `Struct.subs`, which was used in
heuristics finding methods from sub-classes.
Reviewed By: ezgicicek
Differential Revision: D22945346
fbshipit-source-id: 4b3bf0093
Summary: This diff is on par with this change, with the same motivation
Reviewed By: artempyanykh
Differential Revision: D22924891
fbshipit-source-id: 578ca5869
Summary:
This change will simplify the further refactoring I am doing in follow
up diffs.
Logically, AssignmentRule is about passing actual value to the declared
one, and the proper abstactions for this is exactly AnnotatedNullability
and InferredNullability.
This will also make the client less error prone (no way to call if you
don't have a declaration and actual value to compare).
In the next diff, we will do the same in DereferenceRule.
Reviewed By: artempyanykh
Differential Revision: D22923779
fbshipit-source-id: 5f8f0931c
Summary: Before, `NSCollection` are modelled like array, but this will have issue when we want to say add object to array. This diff changes `NSCollection`'s model to use Java's Collection model.
Reviewed By: ezgicicek
Differential Revision: D22840079
fbshipit-source-id: b944b743b
Summary:
In Cost/Inferbo checkers, it tried to find a subclass with heuristics when a method of interface or
abstract class is called. While it makes preciser analysis results in general, sometimes introduced
tricky FPs since the behavior of the heuristics depends on targets. This diff revert the heuristics
to suppress the FPs.
Reviewed By: ngorogiannis
Differential Revision: D22924485
fbshipit-source-id: 9f151231f
Summary: Implement `alloc` and implement initialisation method that uses the return value of `alloc`, i.e. `NSString.init`.
Reviewed By: ezgicicek
Differential Revision: D22840080
fbshipit-source-id: 47a7523e3
Summary:
This diff translates for-in block in objc as a simple for-loop. For example,
`for (item_type item in items) { body }` is translated to
```
NSEnumerator *enumerator = [items objectEnumerator];
item_type item;
while (item = [enumerator nextObject]) { body }
```
Reviewed By: ezgicicek
Differential Revision: D22841524
fbshipit-source-id: 296ee84df
Summary:
Capture of certain Kotlin files fails at the javalib level (probably,
some unanticipated bytecode pattern). We however can't just completely
skip Kotlin class-files during capture as those have NotNull/Nullable
annotations on methods/params that are important for Java <- Kotlin
interop (using Kt classes from Java).
Instead we can skip translation of concrete methods from Kotlin classes
while retaining method signatures with annots in the TEnv.
Reviewed By: ngorogiannis
Differential Revision: D22897638
fbshipit-source-id: 67909aa43
Summary:
There are two ways to suppress it.
1. Field level suppression annotation (was already tested). This will
apply to all constructors.
2. Constructor level annotation (this is what this test does). Sometimes
there are "fake" constructors that are not intended to be called in
prod, they might leave some fields not initialized.
Note that there are two ways to add a suppression; one has a known
problem that is documented here.
Reviewed By: artempyanykh
Differential Revision: D22864869
fbshipit-source-id: f95aaa26a
Summary: `alloc` actually allocates the array, not the `init`. Let's reflect that in our models.
Reviewed By: roro47
Differential Revision: D22864198
fbshipit-source-id: 687f9f247
Summary:
It's typically used inside another ~fold argument and it gets too
verbose.
Reviewed By: da319
Differential Revision: D22846501
fbshipit-source-id: 2fdd4271f
Summary: `Array.copyOf` model has nothing to do with Collections. Refactor it to the appropriate place
Reviewed By: roro47
Differential Revision: D22844638
fbshipit-source-id: 48dde4d53
Summary:
When expressions use generics or typecasts, CFG contains intermediate `_fun_cast` nodes which break some nullsafe heuristics and lead to false positives.
This affects different types of checks (null-check on assignment expressions, `map.containsKey` checks in assignment expressions, regular typecasts, etc.
See added tests for examples.
Reviewed By: mityal
Differential Revision: D22815631
fbshipit-source-id: 80d444b1c
Summary:
We check for supertypes in Java. Why not ObjC?
Would be good to get dulmarod's input here.
Reviewed By: roro47
Differential Revision: D22817126
fbshipit-source-id: 52c1c3f3c
Summary: This diff adds an option to shard spec files in `infer-out/specs`. For some big analysis targets, there can be too many of spec files in the one directory, which slows down IO speed for reading the spec files.
Reviewed By: jvillard
Differential Revision: D20002128
fbshipit-source-id: bd7722883
Summary: There used to be `JoinAfter n` mode where we would try to join `n` states instead of always making disjunctions. It got deleted in D14258485 and Pulse's underlying (pre-disjuncts) domain doesn't even have a join operation. `NeverJoin` mode is not useful in Pulse anymore: pulse will diverge or OOM if we don't limit the number of disjuncts. It is also not used by any other analyzer. Let's remove it.
Reviewed By: jvillard
Differential Revision: D22817425
fbshipit-source-id: 1e658f11d
Summary: This diff refactors Java specific `PatternMatch` functions into its own module. When `PatternMatch.ml` was originally created, it was mainly for Java but now it also supports ObjC. Let's refactor it to reflect the Java/ObjC separation: move all functions that operate on Java procnames into Java submodule.
Reviewed By: jvillard
Differential Revision: D22816504
fbshipit-source-id: ff6b64b29
Summary:
This avoids wasting potentially large amount of work in some
pathological situations. Suppose `foo()` has D specs/disjuncts in its
Pulse summary, and consider a node that calls `foo()` N times, starting
with just one disjunct:
```
[x]
foo()
[x1, ..., xD]
foo()
[y1, ..., yD^2]
foo()
...
```
At the end we get `D^N` disjuncts. Except, currently (before this diff),
we prune the number of disjuncts to the limit L at each step, so really
what happens is that we (very) quickly reach the L limit, then perform
`L*D` work at each step (where we take "apply one pre/post pair to the
current state" to be one unit of work), thus doing `O(L*D*n)` amount of
work.
Instead, this diff counts how many disjuncts we get after each
instruction is executed, and we already reched the limit L then doesn't
bother accumulating more disjuncts (as they'd get discarded any way),
and crucially also doesn't bother executing the current instruction on
these extra disjuncts. This means we only do `O(L*n)` work now, which is
exactly how we expect pulse to scale: execute each instruction (in
loop-free code) at most L times.
This helps with new arithmetic domains that are more expensive and
exhibit huge slowdowns without this diff but no slowdown with this diff
(at least on a few examples).
Reviewed By: skcho
Differential Revision: D22815241
fbshipit-source-id: ce9928e7c
Summary:
The old --topl-only is now --topl-biabd-only, and there's also
--topl-pulse-only. This is WIP: the latter runs pulse, but it doesn't yet
extract Topl errors from pulse summaries. (The citv part of pulse path
conditions appears to have the necessary information.)
Reviewed By: jvillard
Differential Revision: D22815250
fbshipit-source-id: a01792945
Summary:
This diff:
1. Adds general capability to model any field as nullable /
non-nullable.
2. Uses it for Boolean.TRUE and Boolean.FALSE
Reviewed By: artempyanykh
Differential Revision: D22794226
fbshipit-source-id: 95f586592
Summary: D17500386 had added the ability to give symbolic values on functions returning exceptions. However, this might cause FPs or cryptic complexity reports (especially with subclass heuristics). This diff aims to revert it back.
Reviewed By: skcho
Differential Revision: D22764266
fbshipit-source-id: 1615544d8
Summary: We model internal builtin `__new` function to return a non-null value. This fixes nullptr_dereference false positives where we explicitly check the result of a function call for nullptr when the function returns a newly created object.
Reviewed By: jvillard
Differential Revision: D22772217
fbshipit-source-id: 37d209697
Summary:
This stopped compiling on my Debian and it seems hard to fix. It was
already having compilation issues between osx and Linux but here I don't
know how to detect which type it wants since the OS is Linux too.
Reviewed By: ezgicicek
Differential Revision: D22728282
fbshipit-source-id: 818ae87e6
Summary:
This step does extra normalization so it's useful to see what's going on
when debugging. Log stuff in the html debug of the exit node.
Reviewed By: da319
Differential Revision: D22596248
fbshipit-source-id: cde3bbb6c
Summary:
Pulse has models for iterators that make them use a fake field to
remember the element of the collection they point to. But, not all
methods are modelled, and some of them look at the real field, eg
`operator==`. Since we don't update the real field in the model, this
causes imprecision.
The imprecision was visible in pudge.
Reviewed By: skcho
Differential Revision: D22576003
fbshipit-source-id: 2af6be646
Summary:
The java frontend used an unsound flow insensitive class analysis to devirtualize
some virtual calls. We remove it and let the recent devirtualizer preanalysis do the job.
This unsoudness in the Java frontend may have been here for a long time. Removing it may
modify several analysis results (specially Nullsafe) where virtual calls may look different
now.
Reviewed By: jvillard
Differential Revision: D22662739
fbshipit-source-id: c45296dce
Summary:
Changing the order of the superclasses of a struct exposes a bug in both biabduction and the devirtualiser where a method would be resolved into a still virtual method (an interface method).
The reason is that we don't check whether a super class is an interface before exploring it, and seemingly we assume that there is only one (first) superclass worth exploring. This also ignores multiple inheritance in C++.
To fix this, refactor the resolution to a complete search (not just the first super class!) which ignores Java interface methods. Also moved it to `Tenv` so that both biabduction and the devirtualiser can use it.
Reviewed By: jvillard
Differential Revision: D22357488
fbshipit-source-id: 54b96c1f4
Summary: We recently changed the translation of NSArray literals in a way that we pass a different type of argument to `arrayWithObjects:count`, such that the biabduction model doesn't work anymore. So we remove the model for now.
Reviewed By: skcho
Differential Revision: D22691611
fbshipit-source-id: 03cd940ed
Summary:
Merging global type environments for Java needs some form of non-trivial type definition merging because:
- The frontend is likely non-deterministic, so it can capture the same type differently.
- There are classes that appear with two distinct definitions (usually ordered by inclusion) when one is produced by an ABI-like compilation process (so only public fields/methods would appear for example), and one full version.
- The frontend produces dummy versions (empty definitions), and full ones.
- The location information is variously missing/present.
This diff tries to strike a balance between a full semantic merge (which depends on the frontend/buck integration) and the current code which "merges" by clobbering old definitions with new ones.
One side-effect of this diff is that code cannot expect a special order for supers.
Reviewed By: jvillard
Differential Revision: D22630286
fbshipit-source-id: fc66c7000
Summary:
This diff adds translation of `arrayWithObjects:count:`. In the previous implementation it was
translated as if it was `arrayWithObjects:`, but their function parameters are different.
In this diff, it translates an array literal `NSArray* a = @ [ 2, 3 ];` to
```
n$1=NSNumber.numberWithInt:(2:int)
n$2=NSNumber.numberWithInt:(3:int)
temp[0]:objc_object*=n$1
temp[1]:objc_object*=n$2
n$3=NSArray.arrayWithObjects:count:(temp:objc_object* const [2*8],2:int)
a:NSArray*=n$3
```
where `temp` is an additional local variable declared as array.
See,
https://developer.apple.com/documentation/foundation/nsarray/1460145-arraywithobjectshttps://developer.apple.com/documentation/foundation/nsarray/1460096-arraywithobjects?language=objc
Reviewed By: jvillard
Differential Revision: D22631305
fbshipit-source-id: 5be0a55d4
Summary:
Add a test to the repo to try and detect perf regressions in pulse.
Currently analyzed in ~0.1s. With `--pudge`, takes ~10s.
Sledge does eager normalization and canonicalization when incorporating new facts into formula contexts and the algorithm is polynomial in the number of equalities. This example generates one equality per location in the array => boom. This bypasses the recency model of arrays because the formula needs to be constructed before it can be simplified to get rid of dead variables.
The new arithmetic is not as complete as sledge's algorithm but linear in time. We could use it to simplify the formula *before* passing it to sledge. In fact, that was the original motivation.
Reviewed By: skcho
Differential Revision: D22574366
fbshipit-source-id: e9044ae09
Summary:
When applying function summaries, we are careful not to violate the
summary's assumptions about non-aliasing. For example, the summary we
generate for `foo(x,y) { *x = *y; }` will have `x` and `y` be allocated
to two different `AbstractValue.t` in the heap, representing
disjointness.
However, the current logic is too coarse and also rejects passing the
same pure value to functions that made no assumption about them being
equal or different, eg `goo(int x,int y) { int z = x + y; }`. This is
because the corresponding `AbstractValue.t` are different in the
callee's summary, but are represented by only one same value in callers
such as `goo(i,i)`.
This diff restricts the "don't violate aliasing" condition to only
consider heap-allocated values. This is consistent with separation logic
by the way: we use the implication `x|->- * y|->- |- x≠y`, which is
valid only when both `x` and `y` are both allocated in the heap as in
the left-hand-side of `|-`.
Reviewed By: skcho
Differential Revision: D22574297
fbshipit-source-id: 206a18499
Summary:
This will allow all the analyses to be able to call closures without any special treatment: we transform the call to variables that point to closures into normal function calls. We treat only ObjC blocks at the moment, with C++ lambdas to be done as a next step.
We aimed to achieve certain results in Pulse (see tests: avoid memory leaks and NPEs FPs) while also keeping the biabduction analysis working as before.
We also checked that for the examples analyzed Pulse behaves like the correct semantics of ObjC programs with blocks.
Reviewed By: jvillard
Differential Revision: D22547333
fbshipit-source-id: efe56ed51
Summary: Lambda is called using `operator()`. We need to know the information of captured variables when `operator()` procedure is being analysed. This diff records lambda captured variables in `operator()` procdesc. The complication arises from the fact that procdesc for `operator()` is created before translating lambda expression or during the translation of lambda expression where captured variables are translated. To solve this issue, we update existing `operator()` procdesc attributes with captured variable information when we translate lambda expression.
Reviewed By: jvillard
Differential Revision: D22374495
fbshipit-source-id: 44909adea
Summary:
This diff fixes a bug that eval_arr misses the case when a stack
variable points to an array.
Reviewed By: ezgicicek, roro47
Differential Revision: D22596999
fbshipit-source-id: 7c4a13d01
Summary: Add cost model for most common `NSString` functions in cost analysis
Reviewed By: skcho
Differential Revision: D22433005
fbshipit-source-id: 2f57bbda9
Summary: If a node is unreachable and the cost of the node is Top, we were giving Top cost :( Let's fix it.
Reviewed By: skcho
Differential Revision: D22548269
fbshipit-source-id: d79743669
Summary:
We update the type of captured variables to include information about capture mode (`ByReference` or `ByValue`) both for procdesc attributes and the closure expression.
For lambda: closure expression now contains correct capture mode for capture variables. Procdesc still does not contain information about captured variables which we will address in the next diff.
For objc blocks: at the moment all captured variables have mode `ByReference`. Added TODOs to fix this.
Reviewed By: jvillard
Differential Revision: D22572054
fbshipit-source-id: 4c88678ee
Summary: This diff prints where the cost becomes top by calling `html_debug_new_node_session`. This will print them in the start node of the procedure in html. There are already printing functions in `get_instr_node_cost_record`.
Reviewed By: ezgicicek
Differential Revision: D22547578
fbshipit-source-id: 257e957c0
Summary:
The frontend was hackily adding protocols as superclasses in the tenv, with the implicit encoding that the first element in the list was the actual superclass. This was clearly very fragile.
Protocols are not used in the backend at the moment, so for now we will remove them from the list of superclasses to have more consistency in the tenv.
Reviewed By: ngorogiannis
Differential Revision: D22525078
fbshipit-source-id: 2aef1fab1
Summary: This diff extends the value domain to express multiple markers.
Reviewed By: ngorogiannis
Differential Revision: D22524864
fbshipit-source-id: b8e4af2eb
Summary:
As title
Model `NSString` as `JavaString`.
Since `NSArray` does not contain information about its type of element, we do not use associate string with collection as in Java and C++. In Java, String model is implemented using java collection, and for C++, string model is implemented using vector.
So instead, we use existing `JavaString` model.
Reviewed By: skcho
Differential Revision: D22431949
fbshipit-source-id: 7cdde1ad7
Summary:
In order to allow implementations of the single Fol interface using
multiple backend first-order logic solvers, add explicit definitions
of terms and formulas in the Fol module, and implement Context in
terms of them.
The Fol interface supports freely mixing Terms and Formulas, in
particular there is `Term.ite : cnd:Formula.t -> thn:Term.t ->
els:Term.t -> Term.t` which allows Formulas to appear in Terms. The
Fol implementation performs enough normalization to enable using an
internal representation of terms that is strictly partitioned into
"theory terms" and "formulas", which are stratified below "conditional
terms" and then below "general terms". This partitioning and
stratification enables using backend solvers that do not support
mixing formulas in terms.
Reviewed By: jvillard
Differential Revision: D22170506
fbshipit-source-id: a014ee7d7
Summary: To avoid NULLPTR_DEREFERENCE false positives we want to model some functions as returning non-null. A new flag --pulse-model-return-nonnull allows us to provide a list of such functions.
Reviewed By: ezgicicek
Differential Revision: D22431564
fbshipit-source-id: 9944c7382
Summary: Make the module interface safe wrt closing the classpath channel when done, plus reducing the exposed API.
Reviewed By: skcho
Differential Revision: D22411685
fbshipit-source-id: 11316c577
Summary: `addAll` adds elements one by one and hence takes linear time. We didn't have a model for this and considered it O(1).
Reviewed By: skcho
Differential Revision: D22375157
fbshipit-source-id: 65b82bfae
Summary: This diff prevents printing line numbers of loop in the trace description, which helps to keep the same descriptions even when the line number of a function is changed in tests.
Reviewed By: ezgicicek
Differential Revision: D22375584
fbshipit-source-id: 676d1a7cc
Summary:
This one is observed to be more memory efficient. Intuitively, maps need
to be re-allocated more often than lists for balancing. In pulse, we'll
often only ever add new values, in increasing order (when they are fresh
variables created as we symbolically execute the program), which pushes
maps into their worst-case allocation pattern. At least I suspect that's
what happens. With lists, this case is handled much better as lists are
not re-allocated when adding elements.
This is somewhat confirmed by benchmarking and observing GC stats.
Reviewed By: skcho
Differential Revision: D22140908
fbshipit-source-id: 29815112f
Summary:
Messed up the aggregation of GC stats in the previous commit.
It's cleaner to have GC stats (and analysis time) outside of
BackendStats as the rules for computing them is different than the rest,
eg notice how "analysis time" needed to be corrected at the end of the
run, and similarly for GC stats. Thus, refactor this part.
Also output different aggregations of GC stats: +/max/average.
Reviewed By: skcho
Differential Revision: D22332496
fbshipit-source-id: eefd9dd72
Summary:
Keyword `thread_local` in cpp allows us to create a variable with thread storage duration, meaning that the object's lifetime begins when the thread begins and ends when the thread ends.
We get `NULLPTR_DEREFERENCE` false positive for `thread_local` variable since we reallocate it in the `VariableLifetimeBegins` metadata instruction and we do not see further updates to the variable. To solve the issue we special case `VariableLifetimeBegins` instruction for global variables.
Reviewed By: jvillard
Differential Revision: D22284135
fbshipit-source-id: 13c14ef90
Summary: Create test for the most common unmodeled function in inferbo that acts as control variable.
Reviewed By: skcho
Differential Revision: D22331168
fbshipit-source-id: 1913682db
Summary: Add objc test for customized class and blocks. Mostly sanity test.
Reviewed By: ezgicicek
Differential Revision: D22043918
fbshipit-source-id: 917deeea7
Summary:
This diff adds a model of `File.listFiles` as returning an array with
a symbolic length.
Reviewed By: ezgicicek
Differential Revision: D22332258
fbshipit-source-id: 6ca593b8b
Summary: This diff adds support for `com.facebook.litho.sections.Section` which mimics the behavior for `com.facebook.litho.Component`.
Reviewed By: skcho
Differential Revision: D22309039
fbshipit-source-id: 3510441a8
Summary:
Following from previous diff.
**Idea** - 80% of functions with Top cost are caused by calling top-costed callees, i.e. callee's Top cost is simply propagated to its transitive callers, so the aim is to investigate such root callees along with the number of their transitive callers.
Consider the following code
```
void bar1() {
// top cost function
}
void bar2() {
// another top cost function
}
void baz(){
// baz have top cost because of bar
bar1();
}
void foo() {
// goo have top cost because of baz
baz();
bar2()
}
```
Clearly, the root cause of the foo being top cost is `bar1` and `bar2`.
1. When we are analyzing `baz`, we know that it calls `bar1`, which is top cost, so we record that `baz = { T, bar1 } `.
2. Now, say we are analyzing foo.
When we analyze the call to `baz`, we found out that the top cost of `baz` is caused by `bar1`, so we record `foo = { T, bar1 }`.
When we analyze the call to `bar2`, we know that `bar2` is top cost, but since at this stage we only want to deal with the first top cost function we met, so we ignore it.
Since we are keeping track of top cost function by examining the `Call` instruction, we would expect to see two log of `bar1` in the result. The test plan confirms it.
Reviewed By: ezgicicek
Differential Revision: D22231457
fbshipit-source-id: 45d48e4a7
Summary:
New `debug` command takes over from `explore` the `--procedures`, `--source-files` functionality and adds `--global-tenv` for printing the global type environment.
Also, uncrustify printing of type environments.
Reviewed By: jvillard
Differential Revision: D22284807
fbshipit-source-id: 9c6fb0c7a
Summary:
Log stats obtained via `Gc.stat ()` for various phases:
- capture (doesn't include child infer processes created by the build
system)
- analysis
- worker processes of the analysis, aggregated
- reporting phase
- total GC stats for the main infer process
Reviewed By: jberdine
Differential Revision: D22140131
fbshipit-source-id: b0ee39559
Summary:
We already had a heuristic to deal with assignment expressions, but it
relied on the very previous CFG node to have a non-empty list of instrs.
In some cases, however, this previous node is a Join_node with no instrs,
so we need to take one more step back to find what we're looking for.
I've also added a bit more logging around this functionality, so it's
easier to debug/tune in future.
Reviewed By: ngorogiannis
Differential Revision: D22282930
fbshipit-source-id: 024eec145
Summary:
This model is very important in the analysis of ObjC classes because the pattern
```
- (instancetype)init {
if (self = [super init]) {
...
}
return self;
}
```
is very common, so we need to know that if the super class is `NSObject`, the implementation of `init` is returning `self`, otherwise it's a skip function and we don't get the correct spec for the function. We fix some memory leak FP with this model, see test.
Reviewed By: ezgicicek
Differential Revision: D22259281
fbshipit-source-id: 3ee48c827
Summary:
We need to check if `folly::Optional` is not `folly::none` if we want to retrieve the value, otherwise a runtime exception is thrown:
```
folly::Optional<int> foo{folly::none};
return foo.value(); // bad
```
```
folly::Optional<int> foo{folly::none};
if (foo) {
return foo.value(); // ok
}
```
This diff adds a new issue type that reports if we try to access `folly::Optional` value when it is known to be `folly::none`.
Reviewed By: ezgicicek
Differential Revision: D22053352
fbshipit-source-id: 32cb00a99
Summary: This linters were not used much anymore, so we can delete them.
Reviewed By: ngorogiannis
Differential Revision: D22233895
fbshipit-source-id: f31180a05
Summary: There is now a compilation check for UNAVAILABLE_API_IN_SUPPORTED_IOS_SDK so this check is less useful. Also the check REGISTERED_OBSERVER_BEING_DEALLOCATED is useful only in an old version of iOS.
Reviewed By: ngorogiannis
Differential Revision: D22231851
fbshipit-source-id: 72151fef5
Summary:
Extend BasicCost to BasicCostWithReason which contains a record of the form
```{cost: BasicCost.t; proc_name_list: Procname.t list}```
This is done so that we can keep track of top cost function.
So the idea is that 80% of functions with Top cost are caused by calling top-costed callees, i.e. callee's Top cost is simply propagated to its transitive callers, so the aim is to investigate such root callees along with the number of their transitive callers.
Therefore, we create an extension that match `cost` to the root cause function.
This diff only handles the extension. Details about how we update the root cause function is in the next diff.
Reviewed By: skcho
Differential Revision: D22158717
fbshipit-source-id: 6498d904f
Summary:
This diff tries to support a specific form of linked list iteration in Java.
```
while (p != null) {
p = p.getNext();
}
```
This example was a constant cost before because the cost checker could not detect that it is an iteration on a linked list.
The heuristic this diff implemented is:
(1) `p = p.getNext()`: It tries to find this specific form of assignment. Then, it increments `p.linked_list_index` by 1. Note that `linked_list_index` is a virtual field for keeping an index in the linked list. Its initial value is always 0.
(2) At `p != null`, it tries to prune the value of `p.linked_list_index`: the upper-bound of `p.linked_list_index` is pruned by `<= p.linked_list_length`. Here again, `p.linked_list_length` is also a virtual field to denote the length of the linked list.
Reviewed By: ezgicicek
Differential Revision: D22234892
fbshipit-source-id: 2fee176bb
Summary: Log unmodeled function in cost analysis and send result to scuba.
Reviewed By: ezgicicek
Differential Revision: D22158510
fbshipit-source-id: c6eade67e
Summary: This continues on the previous diff by removing the model for `__bridge_transfer` in biabduction. This also had the name __free_cf which we kept for compatibility with biabduction until now but that we can now change.
Reviewed By: ezgicicek
Differential Revision: D22207396
fbshipit-source-id: 7a175eca6
Summary: These models for Memory Leaks have been ported to Pulse, so we can remove the models in biabduction and corresponding tests.
Reviewed By: skcho
Differential Revision: D22206287
fbshipit-source-id: e17499ad3
Summary:
Move the implementation of implicit getters and setters from the biabduction to the clang frontend so these methods are accessible to all the checkers.
*Background*: In Objective-C when properties are created in the interface of a class, the compiler creates automatically the instance variable for it and also the getter and setter in the implementation of the class. In the frontend we collect the information about which method is the implicit getter and setter of which instance variable (we get the method declaration but not the implementation), and here we add the implicit implementation.
Reviewed By: skcho
Differential Revision: D22187238
fbshipit-source-id: 76e0508ed
Summary: Let's make package name match the directory name to follow Java's file lookup conventions
Reviewed By: skcho
Differential Revision: D22183964
fbshipit-source-id: b9958b975
Summary:
Document FP due to imprecision in tracking outer lock release. In a nested `synchronized` block the outer release is not registered by the abstract domain. The reason is that HIL is not resolving what `$bcvarX` is pointing to (in this case to `lockE`).
Reported by Andreea Costea.
Reviewed By: ezgicicek
Differential Revision: D22186240
fbshipit-source-id: 84e5e72b1
Summary:
There is a lot of subtlety in our parsing of buck targets on the command line, that is then just thrown away. Push this one level up, getting rid of the special case where in Clang mode if we only have "normal" targets we don't resolve them.
Also introduce a proper variant for buck target types.
Reviewed By: skcho
Differential Revision: D22160490
fbshipit-source-id: 500c1b12c
Summary:
This diff revises assignment semantics, so it can store/load from the
heap location.
Reviewed By: ezgicicek
Differential Revision: D22042823
fbshipit-source-id: 20d91bfc5
Summary:
Nullability of the assignment result is not refined in code snippets
like:
```
while ((a = foo.getA()) != null) {
nonNullableVal = a;
}
```
Let's add a test for this.
Reviewed By: jvillard
Differential Revision: D22136218
fbshipit-source-id: 206c368d6
Summary:
Better API for creating issue types:
- distinguish hidden/normal/dynamic issue types
- normal issue types should always be documented
- add "TODO" to missing documentation
- dynamic issue types are the only ones that can be created outside of
IssueType.ml
I had to document the new CCBM and the resource leak lab exercise to
keep Help.ml happy, did `make doc-publish`.
Reviewed By: ngorogiannis
Differential Revision: D22118766
fbshipit-source-id: 3d0194518
Summary:
A bug in docusaurus makes relative URLs fail depending on how the page
was accessed, because the URL of a page in docs/ will end in / if
accessed directly or via hyperlink, but that / will be omitted when
clicking on the sidebar. The final / makes all the difference when
interpreting relative URLs so relative URLs are essentially broken.
See https://github.com/facebook/docusaurus/issues/2832 for more details.
This changes URL generation to generate URLs /docs/next/..., and
manually substitute relative URLs that had been written by hand.
Also fix a few other things about outdated links/comments.
Finally, `make doc-publish`.
Reviewed By: dulmarod
Differential Revision: D22117187
fbshipit-source-id: 32e2ba7e1
Summary: Buck uses its own estimate for how many workers to spawn, there is no need to pass our own estimate for capture.
Reviewed By: ezgicicek
Differential Revision: D22065565
fbshipit-source-id: 4c062a9aa
Summary:
Needed to remove user_documentation for the new
CONFIG_CHECK_BETWEEN_MARKERS issue type otherwise it violated the
invariant that the corresponding checker should be documented too but
its development has just started.
Reviewed By: skcho
Differential Revision: D22065820
fbshipit-source-id: 4b3a58850
Summary:
Add objc test for ```NSArray``` and ```NSMutableArray```.
```NSMutableArray``` is a subclass of ```NSArray```.
For documentation of ```NSArray```, https://developer.apple.com/documentation/foundation/nsarray?language=objc
For documentation of ```NSMutableArray```, https://developer.apple.com/documentation/foundation/nsmutablearray?language=objc
The underlying mechanism for ```NSMutableArray``` is quite complicated. It changes the underlying data structure during runtime, so it is possible to have say O(log n) complexity for accessing element in array. (See here https://opensource.apple.com/source/CF/CF-855.11/CFArray.h) However, this is unlikely to happen if the engineer does not abuse the usage of the class ```NSMutableArray``` according to at least two ios engineers. So here the complexity is set to match the normal expectation of the complexity.
Reviewed By: ezgicicek
Differential Revision: D22041277
fbshipit-source-id: c27f43167
Summary:
This diff adds a prototype of a new checker that collects config checkings between markers.
Basically, what the checker is doing is a taint analysis.
* Taint source: function calls of "marker start"
* Taint sink: function calls of config checking
* Taint remove: function calls of "marker end"
By the taint analysis, the analysis results will say that which taints can reach to the sink. In other words, which marker ID that has been started can reach to the config checks, before marker's ending.
I am separating the diff into three steps:
(1/3) Add basic abstract semantics
(2/3) Add trace information
(3/3) Add reporting with test examples
Reviewed By: jvillard
Differential Revision: D21998170
fbshipit-source-id: e7759f62f
Summary:
This diff adds a prototype of a new checker that collects config checkings between markers.
Basically, what the checker is doing is a taint analysis.
* Taint source: function calls of "marker start"
* Taint sink: function calls of config checking
* Taint remove: function calls of "marker end"
By the taint analysis, the analysis results will say that which taints can reach to the sink. In other words, which marker ID that has been started can reach to the config checks, before marker's ending.
I am separating the diff into three steps:
(1/3) Add basic abstract semantics
(2/3) Add trace information
(3/3) Add reporting with test examples
Reviewed By: jvillard
Differential Revision: D21998150
fbshipit-source-id: 337a8938a
Summary:
This diff adds a prototype of a new checker that collects config checkings between markers.
Basically, what the checker is doing is a taint analysis.
* Taint source: function calls of "marker start"
* Taint sink: function calls of config checking
* Taint remove: function calls of "marker end"
By the taint analysis, the analysis results will say that which taints can reach to the sink. In other words, which marker ID that has been started can reach to the config checks, before marker's ending.
I am separating the diff into three steps:
(1/3) Add basic abstract semantics
(2/3) Add trace information
(3/3) Add reporting with test examples
Reviewed By: jvillard
Differential Revision: D21935546
fbshipit-source-id: 092abb92c
Summary: Most of them had some form of documentation in source comments.
Reviewed By: ngorogiannis
Differential Revision: D22020016
fbshipit-source-id: 468f86658
Summary: Pulse has now a better version of this check, so let's delete it.
Reviewed By: ngorogiannis
Differential Revision: D22019247
fbshipit-source-id: 344678225
Summary: Remove unsafe invocations of `String.rsplit2_exn` as this code may run on C++.
Reviewed By: dulmarod, jvillard
Differential Revision: D22042750
fbshipit-source-id: b2879e17b