Summary:
This will allow all the analyses to be able to call closures without any special treatment: we transform the call to variables that point to closures into normal function calls. We treat only ObjC blocks at the moment, with C++ lambdas to be done as a next step.
We aimed to achieve certain results in Pulse (see tests: avoid memory leaks and NPEs FPs) while also keeping the biabduction analysis working as before.
We also checked that for the examples analyzed Pulse behaves like the correct semantics of ObjC programs with blocks.
Reviewed By: jvillard
Differential Revision: D22547333
fbshipit-source-id: efe56ed51
Summary: Lambda is called using `operator()`. We need to know the information of captured variables when `operator()` procedure is being analysed. This diff records lambda captured variables in `operator()` procdesc. The complication arises from the fact that procdesc for `operator()` is created before translating lambda expression or during the translation of lambda expression where captured variables are translated. To solve this issue, we update existing `operator()` procdesc attributes with captured variable information when we translate lambda expression.
Reviewed By: jvillard
Differential Revision: D22374495
fbshipit-source-id: 44909adea
Summary: Add cost model for most common `NSString` functions in cost analysis
Reviewed By: skcho
Differential Revision: D22433005
fbshipit-source-id: 2f57bbda9
Summary: If a node is unreachable and the cost of the node is Top, we were giving Top cost :( Let's fix it.
Reviewed By: skcho
Differential Revision: D22548269
fbshipit-source-id: d79743669
Summary:
We update the type of captured variables to include information about capture mode (`ByReference` or `ByValue`) both for procdesc attributes and the closure expression.
For lambda: closure expression now contains correct capture mode for capture variables. Procdesc still does not contain information about captured variables which we will address in the next diff.
For objc blocks: at the moment all captured variables have mode `ByReference`. Added TODOs to fix this.
Reviewed By: jvillard
Differential Revision: D22572054
fbshipit-source-id: 4c88678ee
Summary:
As title
Model `NSString` as `JavaString`.
Since `NSArray` does not contain information about its type of element, we do not use associate string with collection as in Java and C++. In Java, String model is implemented using java collection, and for C++, string model is implemented using vector.
So instead, we use existing `JavaString` model.
Reviewed By: skcho
Differential Revision: D22431949
fbshipit-source-id: 7cdde1ad7
Summary: To avoid NULLPTR_DEREFERENCE false positives we want to model some functions as returning non-null. A new flag --pulse-model-return-nonnull allows us to provide a list of such functions.
Reviewed By: ezgicicek
Differential Revision: D22431564
fbshipit-source-id: 9944c7382
Summary: `addAll` adds elements one by one and hence takes linear time. We didn't have a model for this and considered it O(1).
Reviewed By: skcho
Differential Revision: D22375157
fbshipit-source-id: 65b82bfae
Summary: This diff prevents printing line numbers of loop in the trace description, which helps to keep the same descriptions even when the line number of a function is changed in tests.
Reviewed By: ezgicicek
Differential Revision: D22375584
fbshipit-source-id: 676d1a7cc
Summary:
Keyword `thread_local` in cpp allows us to create a variable with thread storage duration, meaning that the object's lifetime begins when the thread begins and ends when the thread ends.
We get `NULLPTR_DEREFERENCE` false positive for `thread_local` variable since we reallocate it in the `VariableLifetimeBegins` metadata instruction and we do not see further updates to the variable. To solve the issue we special case `VariableLifetimeBegins` instruction for global variables.
Reviewed By: jvillard
Differential Revision: D22284135
fbshipit-source-id: 13c14ef90
Summary: Create test for the most common unmodeled function in inferbo that acts as control variable.
Reviewed By: skcho
Differential Revision: D22331168
fbshipit-source-id: 1913682db
Summary: Add objc test for customized class and blocks. Mostly sanity test.
Reviewed By: ezgicicek
Differential Revision: D22043918
fbshipit-source-id: 917deeea7
Summary:
This diff adds a model of `File.listFiles` as returning an array with
a symbolic length.
Reviewed By: ezgicicek
Differential Revision: D22332258
fbshipit-source-id: 6ca593b8b
Summary: This diff adds support for `com.facebook.litho.sections.Section` which mimics the behavior for `com.facebook.litho.Component`.
Reviewed By: skcho
Differential Revision: D22309039
fbshipit-source-id: 3510441a8
Summary:
We already had a heuristic to deal with assignment expressions, but it
relied on the very previous CFG node to have a non-empty list of instrs.
In some cases, however, this previous node is a Join_node with no instrs,
so we need to take one more step back to find what we're looking for.
I've also added a bit more logging around this functionality, so it's
easier to debug/tune in future.
Reviewed By: ngorogiannis
Differential Revision: D22282930
fbshipit-source-id: 024eec145
Summary:
This model is very important in the analysis of ObjC classes because the pattern
```
- (instancetype)init {
if (self = [super init]) {
...
}
return self;
}
```
is very common, so we need to know that if the super class is `NSObject`, the implementation of `init` is returning `self`, otherwise it's a skip function and we don't get the correct spec for the function. We fix some memory leak FP with this model, see test.
Reviewed By: ezgicicek
Differential Revision: D22259281
fbshipit-source-id: 3ee48c827
Summary:
We need to check if `folly::Optional` is not `folly::none` if we want to retrieve the value, otherwise a runtime exception is thrown:
```
folly::Optional<int> foo{folly::none};
return foo.value(); // bad
```
```
folly::Optional<int> foo{folly::none};
if (foo) {
return foo.value(); // ok
}
```
This diff adds a new issue type that reports if we try to access `folly::Optional` value when it is known to be `folly::none`.
Reviewed By: ezgicicek
Differential Revision: D22053352
fbshipit-source-id: 32cb00a99
Summary: This linters were not used much anymore, so we can delete them.
Reviewed By: ngorogiannis
Differential Revision: D22233895
fbshipit-source-id: f31180a05
Summary: There is now a compilation check for UNAVAILABLE_API_IN_SUPPORTED_IOS_SDK so this check is less useful. Also the check REGISTERED_OBSERVER_BEING_DEALLOCATED is useful only in an old version of iOS.
Reviewed By: ngorogiannis
Differential Revision: D22231851
fbshipit-source-id: 72151fef5
Summary:
This diff tries to support a specific form of linked list iteration in Java.
```
while (p != null) {
p = p.getNext();
}
```
This example was a constant cost before because the cost checker could not detect that it is an iteration on a linked list.
The heuristic this diff implemented is:
(1) `p = p.getNext()`: It tries to find this specific form of assignment. Then, it increments `p.linked_list_index` by 1. Note that `linked_list_index` is a virtual field for keeping an index in the linked list. Its initial value is always 0.
(2) At `p != null`, it tries to prune the value of `p.linked_list_index`: the upper-bound of `p.linked_list_index` is pruned by `<= p.linked_list_length`. Here again, `p.linked_list_length` is also a virtual field to denote the length of the linked list.
Reviewed By: ezgicicek
Differential Revision: D22234892
fbshipit-source-id: 2fee176bb
Summary: These models for Memory Leaks have been ported to Pulse, so we can remove the models in biabduction and corresponding tests.
Reviewed By: skcho
Differential Revision: D22206287
fbshipit-source-id: e17499ad3
Summary:
Move the implementation of implicit getters and setters from the biabduction to the clang frontend so these methods are accessible to all the checkers.
*Background*: In Objective-C when properties are created in the interface of a class, the compiler creates automatically the instance variable for it and also the getter and setter in the implementation of the class. In the frontend we collect the information about which method is the implicit getter and setter of which instance variable (we get the method declaration but not the implementation), and here we add the implicit implementation.
Reviewed By: skcho
Differential Revision: D22187238
fbshipit-source-id: 76e0508ed
Summary: Let's make package name match the directory name to follow Java's file lookup conventions
Reviewed By: skcho
Differential Revision: D22183964
fbshipit-source-id: b9958b975
Summary:
Document FP due to imprecision in tracking outer lock release. In a nested `synchronized` block the outer release is not registered by the abstract domain. The reason is that HIL is not resolving what `$bcvarX` is pointing to (in this case to `lockE`).
Reported by Andreea Costea.
Reviewed By: ezgicicek
Differential Revision: D22186240
fbshipit-source-id: 84e5e72b1
Summary:
Nullability of the assignment result is not refined in code snippets
like:
```
while ((a = foo.getA()) != null) {
nonNullableVal = a;
}
```
Let's add a test for this.
Reviewed By: jvillard
Differential Revision: D22136218
fbshipit-source-id: 206c368d6
Summary:
Add objc test for ```NSArray``` and ```NSMutableArray```.
```NSMutableArray``` is a subclass of ```NSArray```.
For documentation of ```NSArray```, https://developer.apple.com/documentation/foundation/nsarray?language=objc
For documentation of ```NSMutableArray```, https://developer.apple.com/documentation/foundation/nsmutablearray?language=objc
The underlying mechanism for ```NSMutableArray``` is quite complicated. It changes the underlying data structure during runtime, so it is possible to have say O(log n) complexity for accessing element in array. (See here https://opensource.apple.com/source/CF/CF-855.11/CFArray.h) However, this is unlikely to happen if the engineer does not abuse the usage of the class ```NSMutableArray``` according to at least two ios engineers. So here the complexity is set to match the normal expectation of the complexity.
Reviewed By: ezgicicek
Differential Revision: D22041277
fbshipit-source-id: c27f43167
Summary:
This diff adds a prototype of a new checker that collects config checkings between markers.
Basically, what the checker is doing is a taint analysis.
* Taint source: function calls of "marker start"
* Taint sink: function calls of config checking
* Taint remove: function calls of "marker end"
By the taint analysis, the analysis results will say that which taints can reach to the sink. In other words, which marker ID that has been started can reach to the config checks, before marker's ending.
I am separating the diff into three steps:
(1/3) Add basic abstract semantics
(2/3) Add trace information
(3/3) Add reporting with test examples
Reviewed By: jvillard
Differential Revision: D21998170
fbshipit-source-id: e7759f62f
Summary: Pulse has now a better version of this check, so let's delete it.
Reviewed By: ngorogiannis
Differential Revision: D22019247
fbshipit-source-id: 344678225
Summary:
This issue type was not giving good results and can be replaced by
Pulse's version.
Reviewed By: ngorogiannis
Differential Revision: D22019551
fbshipit-source-id: 5cf3db46d
Summary:
Add objc test case for ```NSInteger``` and ```NSString```.
The test cases are adapted from java test case: ```IntTest.java```, ```StringBuilder.java```, and ```StringTest.java```.
Inspection of the record will be done later.
Reviewed By: ezgicicek
Differential Revision: D21994620
fbshipit-source-id: 0c1d7b34e
Summary: To avoid NULLPTR_DEREFERENCE false positives we want to treat some functions as `abort`. A new flag `--pulse-model-abort` allows us to provide a list of such functions.
Reviewed By: ezgicicek
Differential Revision: D21962555
fbshipit-source-id: d46b93c99
Summary: The new memory leaks analysis is now ready to be enabled by default and turned on in production. This also replaces the biabduction one which is now disabled.
Reviewed By: jvillard
Differential Revision: D21998666
fbshipit-source-id: 9cd95e894
Summary:
This models ARC implementation of dealloc, see https://clang.llvm.org/docs/AutomaticReferenceCounting.html#dealloc. Dealloc methods can be added to ObjC classes to free C memory for example, but the deallocation of the ObjC instance variables of the object is done automatically. So here we add this explicitly to Infer:
1. First, we add an empty dealloc method when it is not written explicitly.
2. For each dealloc method (including the implicitly added ones) we add calls to dealloc of the ObjC instance variables.
Reviewed By: jvillard
Differential Revision: D21883546
fbshipit-source-id: f5d4930f2
Summary: We don't rely on `external-java-packages` in the inferconfig anymore. Let's remove it altogether.
Reviewed By: jvillard
Differential Revision: D21997962
fbshipit-source-id: 7a2e13cfe
Summary:
Similarly as for issue types, we want to generate the website
documentation from infer itself so we can easily cross-reference
checkers and the issue types they report.
This imports the website documentation written for some (very few) of
the checkers. I wrote some cursory one-liners for the rest.
Reviewed By: dulmarod
Differential Revision: D21934375
fbshipit-source-id: 8c9dc2b08
Summary:
See previous diff: issues are always reported with the same severity so
recognise that and just use their default severity in "modern" checkers.
Reviewed By: ngorogiannis
Differential Revision: D21904591
fbshipit-source-id: fb5387e35
Summary:
This diff implements part of the memory management for Objective-C classes in ARC, namely that `dealloc` is called when the objects become unreachable. In reality the semantics of ARC says that this happens when their reference count becomes 0, but we are not modelling this yet in Pulse. However, we could in the future.
This fixes false positives memory leaks when the memory is freed in dealloc.
`dealloc` is often implicit in Objective-C, it also calls the dealloc of instance variables and superclass. None of this is implemented yet, and will be done in a future diff. This will be added in the frontend probably, similarly to how it's done for C++ destructors.
This is an important part of modelling Objective-C semantics in Infer, I looked at whether this should be a preanalysis to be used by all analyses but this needs Pulse. So the idea is that any analysis that needs to understand Objective-C memory model well, should have Pulse as a preanalysis.
Reviewed By: jvillard
Differential Revision: D21762292
fbshipit-source-id: ced014324
Summary:
Since Java8, interfaces mays contain implementations
(default methods). We modify the resolve algorith in the Java frontend
to take care of that.
Reviewed By: jvillard
Differential Revision: D21785182
fbshipit-source-id: ffab8124c
Summary: We do not use an arbitrary threshold to test cost results anymore but instead rely on `cost-issues` which do not have any trace attached. This diff adds traces to `costs-report.json` so that we can test cost issues with traces.
Reviewed By: skcho
Differential Revision: D21858846
fbshipit-source-id: e73321a92
Summary:
Now that we have a way to write cost issues, let's not rely on some arbitrary threshold (and also get rid of `EXPENSIVE_EXECUTION_TIME` issues in tests).
One consequence of this is that we will loose the cost traces in tests since `costs-report.json` doesn't have any traces. Next diff fixes that.
Reviewed By: skcho
Differential Revision: D21837574
fbshipit-source-id: 86b4d028d
Summary:
The model returns an array the length of which is the same to that of enum entries.
It takes the length of enum entries from the summary of `Enum.values` because it is not written in `tenv`. In order to do that, the model semantics should be able to request the summary of the function with `get_summary`, so I extended `model_env` to include the functionality.
Reviewed By: ezgicicek
Differential Revision: D21843319
fbshipit-source-id: d6f10eb91
Summary:
The model returns an array the length of which is the number of known
fields in `tenv`.
Reviewed By: ezgicicek
Differential Revision: D21840375
fbshipit-source-id: 891517c6e
Summary: D21816312 forgot to add the new cost testing mechanism to `fb-performance` and `performance-exlusive` directories. This diff fixes that.
Reviewed By: skcho
Differential Revision: D21837912
fbshipit-source-id: 407dafcd3
Summary: The models were too naive before since they invalidated the underlying array completely (copying C++'s push_back model), causing spurious vector invalidation issues in Java. This diff adds more reasonable models.
Reviewed By: skcho
Differential Revision: D21787543
fbshipit-source-id: a5a59ff69
Summary:
In order to test cost analysis results, currently we rely on having an arbitrary cost threshold (200) and report issues that exceed this cost. For instance, a cost of 201 is considered expensive and reported as `EXPENSIVE_EXECUTION_TIME` issue in cost tests.
This means, if we change the cost analysis in a slight way that results in some constant cost increase under 200, we wouldn't able to detect it. I find this unsatisfactory and somewhat hacky.
This diff adds the ability to write the result of `costs-report.json` into a separate `cost-issues.exp` and then compare the actual costs (not only than relying on this arbitrary threshold reporting mechanism).
Reviewed By: skcho
Differential Revision: D21816312
fbshipit-source-id: 93b531928
Summary: Add models for `View` methods that schedule on the UI thread.
Reviewed By: skcho
Differential Revision: D21767954
fbshipit-source-id: 015441ea7
Summary:
Android OS calls certain overridden class methods always on the UI thread. The function changed here attempted to build a list of all these methods, one by one. It's much more complete to simply consider a method as callable on the UI thread if it's an override of an Android library method, and it starts with "on". Only a single instance is known not to obey this pattern, so it's easier to blacklist than to whitelist.
Also clarify the name to `is_android_lifecycle_method`.
Reviewed By: ezgicicek
Differential Revision: D21703365
fbshipit-source-id: 41ca3e998
Summary:
I think `Analysis_stops` ought to achieve roughly the same thing (except
that weird filtering logic which I removed).
Reviewed By: dulmarod
Differential Revision: D21686562
fbshipit-source-id: 53d40729f
Summary: Assigning `nullptr` to `std::function` was causing `NULLPTR_DEREFERENCE` as our model was expecting to get an object in the right hand side of the assignment (`std::function::operator=`) and was dereferencing that object. Assigning `nullptr` to `std::function` removes callable object from it. We model this special case by creating a fresh value.
Reviewed By: skcho
Differential Revision: D21685318
fbshipit-source-id: 2d4af1933
Summary:
Introduce BIABD_ prefixes for a few issue types that were duplicated
between analyses, and also prefix the lab exercise issue type to avoid
sharing with biabduction.
Reviewed By: ngorogiannis
Differential Revision: D21660226
fbshipit-source-id: 3435916e6
Summary:
This seems to make sense as it's a separate analysis (that depends on
biabduction). This introduces unpleasant `|| is_checker_enabled TOPL`
whenever we try to figure out if biabduction will run. I think this is a
more general problem that deserves a more general solution to express
the fact that checkers can depend on others, so that, eg,
`is_checker_enabled Purity` is true when we pass `--loop-hoisting`. Will
address in another diff.
Reviewed By: ngorogiannis
Differential Revision: D21618460
fbshipit-source-id: 8b0c9a015
Summary: Later on, this can be changed again or made customizable.
Reviewed By: artempyanykh
Differential Revision: D21618730
fbshipit-source-id: fe517c766
Summary:
Start with tests about dynamic dispatch to test the upcoming
pre-analysis.
Reviewed By: ezgicicek
Differential Revision: D21594496
fbshipit-source-id: 1771ea968
Summary:
This function had been computing the name for ObjC methods wrong, with only the class name. This was causing wrong error messages in Pulse.
The main issue was that `Procname.to_simplified_string` was writing `Classname::methodname` for ObjC methods, which is not the convention. This confused the `hashable_name` funtion. So changing the method name to `Classname.methodname` which is more standard, and this also fixes `hashable_name`.
Reviewed By: ngorogiannis, jvillard
Differential Revision: D21570880
fbshipit-source-id: 13ed62cf8
Summary: In an intra-procedural analysis, we assume that parameters passed by reference to a function will be initialized inside that function. To do that we use the type information of a formal parameter to initialize the fields of the struct. This was causing false positives if the formal parameter in function signature had type `void*`. To solve this, we used type information from local variables instead. However, we also get false positives for any kind of pointer if we use cast. We fix this by using type information of local variables as in `void*` case.
Reviewed By: jvillard
Differential Revision: D21522979
fbshipit-source-id: 4222ff134
Summary:
Previous translation of enum constants were wrong since they assumed that the enum constant didn't include any global variable (hence they just looked up the enum exp from the map, forgetting to tie the respective instructions into the cfg).
```
const int gvar = 0;
enum {
evar = gvar,
};
int dangling() {
return evar;
}
```
as a result, the CFG was missing the instruction for the load of the `gvar`.
{F237004587}
This diff fixes this issue by hooking up the instructions that load the enum constant in to the CFG. Note that in this example, it is only a load instruction but there could be more instructions (e.g. if we had `gvar > 1`, we would have prune +join).
{F237004493}
Reviewed By: ezgicicek
Differential Revision: D21549781
fbshipit-source-id: 525534fb2
Summary:
Just like `CFBridgingRelease` we want to be able to model functions that are specific to a given codebase that make a transfer of memory ownership so that developers don't need to worry about releasing that memory anymore, and hence, we don't want to report leaks on that memory.
Things get a little more complicated, because some of the functions we want to model are in a specific namespace, so with this flag we take both cases into account, when we are dealing with namespaces or not.
Reviewed By: jvillard
Differential Revision: D21404409
fbshipit-source-id: c36bd7afc
Summary: Currently we get false positive if we apply `operator--` to the `end()` iterator. To solve this, we model iterator `operator--` not to raise an error for the `EndIterator` invalidation, but to create a fresh element in the underlying array.
Reviewed By: ezgicicek
Differential Revision: D21476353
fbshipit-source-id: 5c722372e
Summary:
It is undefined behavior to dereference end iterator.
To catch end iterator dereferencing issues we change iterator model: instead of having `internal pointer` storing the current index, we model it as a pointer to a current index. This allows us to model `end()` iterator as having an invalid pointer and there is no need to create an invalidated element in the vector itself.
Reviewed By: ezgicicek
Differential Revision: D21178441
fbshipit-source-id: fd6a94b0b
Summary: We mistakenly invalidated the set element which causes spurious vector invalidation errors. Instead, we should modify it without any invalidation.
Reviewed By: jvillard
Differential Revision: D21521943
fbshipit-source-id: 67963967e
Summary:
This diff adds semantics of assume null, heuristics. When `assume(x == null)`, it removes the
methods called on the builder `*x` from the abstract state.
```
x -> {p}
p -> {method1 called}
assume(x == null)
x -> {p}
```
This heuristics is unsound: Even though `x` (a pointer to builder object) points-to an builder
object, which cannot imply that the object `p` does not exist in the concrete semantics. The
unsoundness may appear when there is an alias (see the FP test added).
Reviewed By: ezgicicek
Differential Revision: D21502923
fbshipit-source-id: 2e392bd89
Summary: Java's iterator models were wrong. This causes `VECTOR_INVALIDATION` errors in fbandroid projects. This diff aims to fix it by modeling Java iterators with a current pointer and an underlying collection array.
Reviewed By: skcho
Differential Revision: D21448322
fbshipit-source-id: 7d44354b5
Summary:
These 2 methods are automatically supplied for all enums, with
predefined behavior and nullability: https://www.geeksforgeeks.org/enum-in-java/
(Note that they are not part of java.lang.Enum class).
This will allow using them in unvetted third part and under strict mode.
Reviewed By: artempyanykh
Differential Revision: D21501716
fbshipit-source-id: 104082d15
Summary:
Because in the real semantics CFRelease can be used more than once, and also the variables can be used after CFRelease in general, modelling this as `free` causes many `USE_AFTER_FREE` errors. Now we change the model to not add the `Invalid CFree` attribute, but to just remove the `Allocated` attribute. So we can model memory leaks in the simple case of `Create` and not `CFRelease` before going out of scope, but we avoid the `USE_AFTER_FREE`.
Since the model for CFRelease now diverges from free, changed the command line option for modelling to `pulse-model-release-pattern`.
Reviewed By: jvillard
Differential Revision: D21324895
fbshipit-source-id: ab323d981
Summary: The contract for reporting races in C++ is to flag races between writes under lock with reads without a lock. This diff restores that contract which had been violated by recent changes.
Reviewed By: jberdine
Differential Revision: D21383876
fbshipit-source-id: 6a84e1506
Summary:
This diff changes way we treat classes w.r.t. to Nullsafe modes when
issuing meta-issues.
Previously, we considered nested class independently of the outer one.
This was leading to a tricky case: when the class is clean but nested
class needs fixing, meta-info told that class can be Nullsafe.
This is counter-intutive and lead to problems when users tried to follow
wrong nullsafe suggestions for this case.
After this diff we:
1. Start calculating meta-issues only on the outermost level. This will simplify
reasoning about nullsafe stats.
2. Aggregate all nested issues counters to corresponding outer-level class.
Among others, CLASS_CAN_BE_NULLSAFE Advice will finally become
actionable in all known cases.
Reviewed By: artempyanykh
Differential Revision: D21353607
fbshipit-source-id: a17c6958a
Summary:
List of things happening in this unreviewable diff:
- moved PulsePathCondition to PulseSledge
- renamed --pulse-path-conditions to --pudge
- PulsePathCondition now contains all the arithmetic of pulse
(inferbo+concrete intervals+pudge). In particular, moved arithmetic
attributes into PulsePathCondition.t. PulsePathCondition plays the
role of PulseArithmetic (combining all domains).
- added tests for a false positive involving free()
- PulseArithmetic is now just a thin wrapper around PulsePathCondition
to operate on states directly (instead of on path conditions).
- The rest is mostly moving code into PulsePathCondition (eg, from
PulseInterproc) and adjusting it.
Reviewed By: jberdine
Differential Revision: D21332073
fbshipit-source-id: 184c8e0a9
Summary:
We have a common entry point where we skip analysis in nullsafe.
This logic is copied from `Reporting.log_issue_from_summary`.
I believe this should not exist in Reporting: it is not the right place
to decide whether to suppress issues: we should not try to report it in
first place.
Because of that we falsely report "needs improvement" meta-issue while
we don't issue any (they were suppressed but participated in needs
improvement count calculations).
Now this change will make meta-issue to be synced with what the user
actually sees.
Down the line we should have a more reliable fix for that.
So far I reviewed suppressing code and looks like we should not suppress
anything else (unless explicitly SuppressLint-ed, which is fine).
Reviewed By: artempyanykh
Differential Revision: D21328634
fbshipit-source-id: 120ce06d1
Summary:
Add a new data structure and use it for the map of memory accesses to
limit the number of destinations reachable from a given address. This
avoids remembering details of each index in large arrays, or even each
field in large structs.
Reviewed By: skcho
Differential Revision: D18246091
fbshipit-source-id: 5d3974d9c
Summary:
The C++ tests were a bit of a mess. This diff tries to enforce the following principles:
- mark every function with `_ok` or `_bad` so that when a function appears in `issues.exp` it's easy to figure out the intention;
- mark every false negative and positive with `FP_` and `FN_` to document expectations;
- make every function access one field and participate in at most one issue report so that it's easier to assess changes.
Reviewed By: jvillard
Differential Revision: D21278627
fbshipit-source-id: 9698f716f
Summary:
We were invalidating "*(vec.__infer_backing_array)" instead of the
address of the field itself.
Reviewed By: ezgicicek
Differential Revision: D21280357
fbshipit-source-id: 48b984800
Summary:
The directory was created to have several sets of nullsafe tests but
there is only one in the end. Remove the redundant "-default".
Reviewed By: mityal
Differential Revision: D21300205
fbshipit-source-id: 46ed8b032
Summary:
The directory names had some interesting variety due to historical
reasons.
- {c,cpp,objc,objcpp}/errors/ date from the time when infer was only
biabduction
- java/infer/ dates from the time when we had an "--analyzer" option and
"infer" was one of them (sic), and eg another was "eradicate".
- c/biabduction/ dates from the time when the biabduction analysis was
being migrated to the "checkers" (AI) framework. For some reasons the
tests there are not a subset of c/infer/ but seem to be entirely new
tests.
The convention now dictates that we should name all of these
*/biabduction/. This diff moves the existing tests from c/biabduction/
into c/biabduction/misc/.
Reviewed By: mityal
Differential Revision: D21300147
fbshipit-source-id: 516d1cb15
Summary: We currently don't support abducing the spec that we need to delete an attribute, that makes the model for `CFBridgingRelease` work les well when it is, for instance, wrapped in a method. We show examples of how this doesn't work at the moment.
Reviewed By: jvillard
Differential Revision: D21176108
fbshipit-source-id: 79aed7a5d
Summary:
We model `malloc` in Objective-C as `malloc_not_fail` I think because the null case is not normally handled in iOS apps because the OS will just killed the app after giving some memory warnings.
So adding `malloc_not_fail` model to Pulse.
Reviewed By: jvillard
Differential Revision: D21278527
fbshipit-source-id: 17a5008fe
Summary:
This translates the construct `ObjCBridgedCastExpr` when the cast_kind is `OBC_BridgeTransfer`, or in syntax, the cast (`__bridge_transfer`).
This cast means that the object is passed from manual memory management to ARC, so one doesn't need to call `release` manually. It is important to model this to avoid false positives.
It translates it as a builtin that we then model in Pulse, the same way we modelled `CFBridgingRelease` which does the same thing.
The name of the builtin is `__free_cf` which is not ideal but I left it like that for compatibility with biabduction. We can change it once we remove this check from biabduction.
update-submodule: facebook-clang-plugins
Reviewed By: jvillard
Differential Revision: D21176337
fbshipit-source-id: 736ceeb9b
Summary:
In the previous diffs, nullsafe behavior was changed to the following:
nested class mode is inherited from the outer class mode.
Though it is possible to e.g. make nested class Nullsafe and outer not,
or make nested class STRICT and outer just LOCAL, this is an edge case
and we don't want to recommend annotating nested classes by default. The
right way is to make outer class Nullsafe instead.
In this decision, we take into account user experience and codebase
readability.
Reviewed By: ngorogiannis
Differential Revision: D21255806
fbshipit-source-id: 0200cb555
Summary:
With this change, set of possibilities will be more actionable. Most
importantly, this will also educate users and make them realize how
Nullsafe trust works.
NOTE: yes, parenthesis are bit clumsy, but it was the easiest way to
make this change and let the phrase remain grammatically correct.
Reviewed By: artempyanykh
Differential Revision: D21231468
fbshipit-source-id: 4b5349fb5
Summary:
In the previous diff we changed the semantics of nested classes w.r.t.
to Nullsafe.
Let's make it clear if users will attempt to misuse it.
Reviewed By: artempyanykh
Differential Revision: D21230717
fbshipit-source-id: 0ecc0dd06
Summary: Similarly to Enum.name, we model Class.getCanonicalName as returning an arbitrary non-empty string.
Reviewed By: ngorogiannis
Differential Revision: D21207120
fbshipit-source-id: 1e2dbd1fd
Summary:
From the user perspective, the current behavior is confusing.
The users intutitively expect the inner class to inherit Nullsafe mode
from the outer one.
Having a class that is Nullsafe but the inner is not is hence dangerous
and misleading.
For the sake of completeness and to support gradual strictification, we
allow the nested class to improse additional strictness. Particularly,
the inner class can be Nullsafe but the outer can be not.
A follow up to this diff will include warnings telling about redundant and wrong usages of nested annotations.
Reviewed By: artempyanykh
Differential Revision: D21228055
fbshipit-source-id: 75755ad1d
Summary: Iterator invalidation traces were based on vector rather than iterator itself.
Reviewed By: ezgicicek
Differential Revision: D21202047
fbshipit-source-id: 62ce8a488
Summary:
We model Enum.name as returing a constant name, rather than getting real field names. We did this
because we couldn't think of any big gains, in terms of analysis precision/performance, from getting
the real names.
Reviewed By: ezgicicek
Differential Revision: D21201730
fbshipit-source-id: a2dc01a44
Summary: This diff revises the models of Collection.set and get to handle its elements.
Reviewed By: ezgicicek
Differential Revision: D21201242
fbshipit-source-id: 9c248453d
Summary:
We ignored allocator models for vectors, and were not able to initialize vectors properly. This diff fixes this issue.
It also adds a test which was a FN before.
Reviewed By: skcho, jvillard
Differential Revision: D21089492
fbshipit-source-id: 6906cd1d1
Summary: D21155014 replaced `skip` call with a Load but this was not right. Instead, let's add a new builtin function (rather than skip) so that other analyses can freely model it as they want.
Reviewed By: jvillard
Differential Revision: D21178286
fbshipit-source-id: c214ccfb0
Summary: Java has this pattern of wrapping non-thread-safe containers in factory methods producing identically-typed results, but wrapped in a synchronised shell. This diff teaches RacerD about some common factory methods and uses the attribute domain to track the dynamic type of their results.
Reviewed By: ezgicicek
Differential Revision: D21155538
fbshipit-source-id: 42ebe6251
Summary: Complete the set of models for java containers that Infer should not report thread safety violations.
Reviewed By: ezgicicek
Differential Revision: D21138280
fbshipit-source-id: 01e1944b6
Summary: Models were partial and/or simply missing (`Map` writes!). Now the modelled containers use inheritance for conciseness (`List` reads are only those not caught by the `Collection` matcher, etc). Also, add URLs to documentation sources.
Reviewed By: ezgicicek
Differential Revision: D21132069
fbshipit-source-id: fefb360f0
Summary: `CFBridgingRelease` and `__bridge_transfer` which I'll model later, transfer the memory model from manual memory ref count to ARC (automatic ref count), so to avoid false positives this needs to be modelled. We can simply remove the Allocated attribute from the state, which means we won't try to track that memory anymore.
Reviewed By: skcho
Differential Revision: D21088218
fbshipit-source-id: 3520a0d59
Summary: This diff suppresses cost issues on lambda and auto-generated procedures, since they were too noisy.
Reviewed By: ezgicicek
Differential Revision: D21153619
fbshipit-source-id: 65ad6dcc3
Summary:
Replace horrible hack with ok hack.
The main difficulty in implementing the disjunctive domain is to avoid
the quadratic time complexity of executing the same disjuncts over and
over again when going around loops:
First time around a loop, assuming for example a single disjunct `d`:
```
[d]
loop body
[d1' \/ d2']
```
Second time around the same loop: the new pre will be the join of the
posts of predecessor nodes, so `old_pre \/ post(loop,old_pre)`, i.e.
`d \/ d1' \/ d2'`. Now we need to execute `loop body` again
*without running the symbolic execution of `d` again* (and the time after
that we'll want to not execute `d`, `d1'`, or `d2'`).
Horrible hack (before): Disjuncts have a boolean "visited" attached
that does its best to keep track of whether a given disjunct is old or
new. When executing a single *instruction* look at the flag and skip the
state if it's old. Of course we have no way to know for sure so it turns
out it was often wrongly re-executing old disjuncts. This was also
producing the wrong results over even simple loops: only the last
iteration would make it outside the loop for some reason. Overall, the
semantics were pretty untractable and shady at best.
New hack (this diff): only run instructions of a given *node* on
disjuncts that are not physically equal to the "pre" ones already in the
invariant map for the current node.
This gives the correct result over simple loops and a nice performance
improvement in general (probably the old heuristic was hitting the
quadratic bad case more often).
Reviewed By: skcho
Differential Revision: D21154063
fbshipit-source-id: 5ee38c68c
Summary:
We translated the expression `CXXStdInitializerListExpr` naively in D3058895 as a call to
a skip function, with the hope that it would be translated better in the future. However, the naive means that we lose access to the initialized list/array because we are simply skipping it. So, even if we want to model the initializer properly, we have to deal with the skip specially.
This diff tries to solve this problem by removing the skip call whenever
possible. Instead, we translate the underlying array/list as a Load, so
that when it is passed to the constructor, we can pick it up.
For the following initialization:
``` std::vector<int*> vec = {nullptr};
```
Before, we translated it as
```
*&0$?%__sil_tmpSIL_materialize_temp__n$7[0]:int* const =null
n$8=_fun___infer_skip_function(&0$?%__sil_tmpSIL_materialize_temp__n$7:int* const [1*8] const )
n$9=_fun_std::vector<int*,std::allocator<int*>>::vector(&vec:std::vector<int*,std::allocator<int*>>*,n$8:std::initializer_list<int*>)
```
However, this means, `n$8` would be result of something skipped which we can't reason about. Instead, we just pass the underlying initialized array now, so we get the following translation:
```
*&0$?%__sil_tmpSIL_materialize_temp__n$7[0]:int* const =null
n$8=*&0$?%__sil_tmpSIL_materialize_temp__n$7:int* const [1*8] const
n$9=_fun_std::vector<int*,std::allocator<int*>>::vector(&vec:std::vector<int*,std::allocator<int*>>*,n$8:std::initializer_list<int*>)
```
Reviewed By: jvillard
Differential Revision: D21155014
fbshipit-source-id: 75850b1e6
Summary:
When encountering a constant, pulse creates an abstract value (a
variable) to represent it, and remembers that it's equal to it. The
problem is that pulse doesn't yet know how to deal with the fact that
some variables are going to be equal to each other.
This hacks around this issue in the case of constants, within the same
procedure, by remembering which constants have been assigned to which
place-holder variables, and serving those variables again when the same
constant is translated again.
Limitation: this doesn't work across procedure calls as the "constant
maps" are not saved in summaries.
Something to look out for: we don't want to make `if (p == NULL)` create
a path where `p` is invalid (we only make null invalid when we see an
assignment from 0, i.e. `p = NULL;`).
Reviewed By: ezgicicek
Differential Revision: D21089961
fbshipit-source-id: 5ebb85d0a
Summary:
1. Package will make the error too verbose.
2. We don't even need to say it is "class" because we say it in the error
description ("Class has 0 issues and can be marked Nullsafe").
Reviewed By: artempyanykh
Differential Revision: D21131998
fbshipit-source-id: 6ccca7615
Summary:
One source of false positives on container races is when the container member field is initialised to a concurrent version in a constructor, but the static type of the field doesn't reflect the thread safety of it.
This solution
- tracks flows from constructors of safe data structures to abstract addresses;
- initialises the initial attribute state when analysing a non-constructor method to that achieved by all constructors/class-initializers.
- checks for that attribute when recording container accesses.
Reviewed By: jvillard
Differential Revision: D21089428
fbshipit-source-id: 02a88f6e8
Summary: Modeling vector iterator with two internal fields: an internal array and an internal pointer. The internal array field points to the internal array field of a vector; the internal pointer field represents the current element of the array. For now `operator++` creates a fresh element inside the array.
Reviewed By: ezgicicek
Differential Revision: D21043304
fbshipit-source-id: db3be49ce
Summary:
Add a path condition to each symbolic state, represented in sledge's arithmetic domain. This gives a precise account of arithmetic constraints. In particular, it is relation and thus is more robust in the face of inter-procedural analysis.
This is gated behind a flag for now as there are performance issues with the new arithmetic.
Reviewed By: jberdine
Differential Revision: D20393947
fbshipit-source-id: b780de22a
Summary:
There are two types of anonymous classes (not user defined classes):
- classic anonymous classes (defined as $<int> suffixes)
- lambda classes (corresponding to lambda expressions). Experimentally,
they all have form `$Lambda$_<int>_<int>`, but the code just uses
`$Lambda$` as a heuristic so it is potentially more robust.
# Problem this diff solves
When generate meta-issues for nullsafe, we are interested only in
user-defined classes, so we merge all nested anonymous stuff to
corresponding user-defined classes and hence aggregate the issues.
Without this diff, for each lambda in the code, we would report this as
a separate meta-issue, which would both screw up stats and be confusing
for the user (when we start reporting mode promo suggestions!).
Reviewed By: artempyanykh
Differential Revision: D21042928
fbshipit-source-id: a7be266af
Summary:
This diff revises how to handle the unknown location in inferbo in two ways:
* stop appending field to the `Unknown` location, e.g. `Unknown.x.a` is evaluated to `Unknown`
* redesign the abstract of multiple locations, like `Bottom` < `Unknown` < `Known` locations
I am doing them in one diff since applying only one of them showed bad results.
Background: `Unknown` was adopted for abstracting all unknown concrete locations, so we could avoid missing semantics of assignments to unknown locations. We tried to keep soundness. However, it brought some other problems related to precision and performance.
1. Sometimes especially when Inferbo failed to reason precise pointer values, `Unknown` may point to many other abstract locations.
2. At that time, value assignments to `*Unknown` makes the situation worse: many abstract locations are updated with imprecise values.
This problem harmed not only its precision, but also its performance since it introduced more location entries in the abstract memory.
Reviewed By: jvillard
Differential Revision: D21017789
fbshipit-source-id: 0bb6bd8b5
Summary: The flags `--biabduction-fallback-model-alloc-pattern` and `--biabduction-fallback-model-free-pattern` were unused because we removed the models from .inferconfig a while ago because of too many false positives. We are implementing a better memory leak check based on Pulse, and are adding the similar flags `--pulse-model-alloc-pattern` and `--pulse-model-free-pattern`.
Reviewed By: jvillard
Differential Revision: D21061511
fbshipit-source-id: 1b3476c22
Summary:
See the code comment re: why don't we also recommend "strict" at this
stage. We can always change it later when we think users are happy with
strict.
Reviewed By: artempyanykh
Differential Revision: D21039553
fbshipit-source-id: 758ccf32c
Summary:
This diff is a step forward to the state when the list of type violations is
independent of the mode (and we use mode solely to decide re: whether to
report or not).
This fixes a case when we incorrectly defined possible promo mode (see
the test payload)
Reviewed By: artempyanykh
Differential Revision: D20948897
fbshipit-source-id: 616b96f96
Summary:
See the comments in the code why it makes logical sense.
This diff is a step forward the state when list of type violations is
independent of the mode (and we use mode solely to decide re: whether to
report or not).
This fixes majority of cases in ModePromotions.java
Reviewed By: artempyanykh
Differential Revision: D20948656
fbshipit-source-id: 82c0d530b
Summary:
Currently we exlude only if the method is based on deprecated config
packages.
Lets use the proper method, which covers both cases (config +
user-defined third party repo).
Reviewed By: artempyanykh
Differential Revision: D20946506
fbshipit-source-id: c3332667f
Summary:
Previously, we learned to detect if Default mode class can be made
Nullsafe(LOCAL).
Lets generalize it and calculate the precise mode.
NOTE 1: We don't distinct shades of "Trust some". We also don't
recommend trust some and recommend "Trust all" instead.
NOTE 2: As you can see from the test payload (see ModePromotions.java),
the precise calculation is not working as expected. This is due to a bug
in nullsafe implementation/design. See follow up diffs that will fix
this test.
Reviewed By: artempyanykh
Differential Revision: D20941345
fbshipit-source-id: 2255359ba
Summary: Unify the models of malloc and for the Create and Copy functions for Core Graphics. This add the null case from the malloc model to the Core Graphics models.
Reviewed By: jvillard
Differential Revision: D20890956
fbshipit-source-id: 278ac9d2f
Summary:
As soon as pulse detects an error, it completely stops the analysis and loses the state where the error occurred. This makes it difficult to debug and understand the state the program failed. Moreover, other analyses that might build on pulse (e.g. impurity), cannot access the error state.
This diff aims to restore and display the state at the time of the error in `PulseExecutionState` along with the diagnostic by extending it as follows:
```
type exec_state =
| represents the state at the program point that caused an error *)
```
As a result, since we don't immediately stop the analysis as soon as we find an error, we detect both errors in conditional branches simultaneously (see test result changes for examples).
NOTE: We need to extend `PulseOperations.access_result` to keep track of the failed state as follows:
```
type 'a access_result = ('a, Diagnostic.t * t [denoting the exit state] ) result
```
Reviewed By: jvillard
Differential Revision: D20918920
fbshipit-source-id: 432ac68d6
Summary: Consider functions that simply exit as impure by extending the impurity domain with `AbstractDomain.BooleanOr` that signifies whether the program exited.
Reviewed By: skcho
Differential Revision: D20941628
fbshipit-source-id: 19bc90e66
Summary:
This information can be useful for tooling responsible for further
processing (e.g. metric calculation and logging)
Reviewed By: artempyanykh
Differential Revision: D20914583
fbshipit-source-id: 61804d88f
Summary: The heuristics is to find a method in non-abstract sub-classes. See D20647101.
Reviewed By: jvillard
Differential Revision: D20491461
fbshipit-source-id: 759713ef4
Summary:
This diff distinguishes array declaration and size-setting in trace. For example, when there is an
assume statement on an array size, the array size can be pruned to another value. In which case, we
want to see "Set array size" in the trace, instead of "Array declaration".
Reviewed By: jvillard
Differential Revision: D20914930
fbshipit-source-id: 0253fb69e
Summary:
This diff lifts the `PulseAbductiveDomain.t` in `PulseExecutionState` by tracking whether the program continues the analysis normally or exits unusually (e.g. by calling `exit` or `throw`):
```
type exec_state =
| ContinueProgram of PulseAbductiveDomain.t (** represents the state at the program point *)
| ExitProgram of PulseAbductiveDomain.t
(** represents the state originating at exit/divergence. *)
```
Now, Pulse's actual domain is tracked by `PulseExecutionState` and as soon as we try to analyze an instruction at `ExitProgram`, we simply return its state.
The aim is to recover the state at the time of the exit, rather than simply ignoring them (i.e. returning empty disjuncts). This allows us to get rid of some FNs that we were not able to detect before. Moreover, it also allows the impurity analysis to be more precise since we will know how the state changed up to exit.
TODO:
- Impurity analysis needs to be improved to consider functions that simply exit as impure.
- The next goal is to handle error state similarly so that when pulse finds an error, we recover the state at the error location (and potentially continue to analyze?).
Disclaimer: currently, we handle throw statements like exit (as was the case before). However, this is not correct. Ideally, control flow from throw nodes follows catch nodes rather than exiting the program entirely.
Reviewed By: jvillard
Differential Revision: D20791747
fbshipit-source-id: df9e5445a
Summary:
Malloc returns either an allocated object or a null pointer if there is no memory available. Modelling that.
This has always been a bit contentious because this leads to NPEs that people often ignores because they don't care. But if we don't model this, then we have FPs when people do take this into account when freeing the memory.
Reviewed By: jvillard
Differential Revision: D20791692
fbshipit-source-id: 6fd259f12
Summary:
This diff limits the depth of abstract location by a constant.
problem: Inferbo generated too many of abstract locations, especially when struct types had many pointer fields and Inferbo was not able to analyze the objects precisely. Since the number of generated abstract locations were exponential to the number of fields, it resulted in OOM in the end.
(reported by zyh1121 in https://github.com/facebook/infer/issues/1246)
Reviewed By: jvillard
Differential Revision: D20818471
fbshipit-source-id: f8af27e5c
Summary:
Currenlty the cost issue is printed at the first node of a function, which is usually the first
statment of the function. This may give a wrong impression that the cost of the statement is
changed.
This diff re-locate where to print issues with heuristics. Going backward from the first node
lines, it looks up a line satisfying,
1. A line should start with <fname> or should include " <fname>".
2. The <fname> found in 1 should be followed by a space, '<', '(', or end of line.
Reviewed By: jvillard
Differential Revision: D20766876
fbshipit-source-id: b4fee3180
Summary:
It's easy to create large arrays in code, eg `int x[1UL << 16];`, but
these can generate huge nodes in SIL because zero-initialization is
translated by zero-ing structures element by element. Introduce a
builtin to use instead. Keep the naive method for small structures (with
a configurable limit on "small").
Reviewed By: dulmarod
Differential Revision: D20836836
fbshipit-source-id: 6bf5410f8
Summary: Modelling `CG.*Release ` and `CFRelease` as `free`. This is what we were doing in biabduction.
Reviewed By: skcho
Differential Revision: D20767174
fbshipit-source-id: c77c1cdc6