Summary: Instead of variable having the value of a single location on stack, we now allow variables to have multiple locations. Consequently, we also allow a memory location to point to a set of locations in the heap. We enforce a limit on a maximum number of locations in a set (currently 5).
Reviewed By: jvillard
Differential Revision: D13190876
fbshipit-source-id: 5cb5ba9a6
Summary: It is not used yet and still manages to cause false positives.
Reviewed By: mbouaziz
Differential Revision: D13102948
fbshipit-source-id: 2122666c2
Summary:
It's useful for checkers to know when variables go out of scope to
perform garbage collection in their domains, especially for complex
domains with non-trivial joins. This makes the analyses more precise at
little cost.
This could have been added as a custom function call to a builtin, but I
decided against it because this instruction doesn't have the semantics
of any function call. It's better for each checker to explicitly not
deal with the custom instruction instead.
Reviewed By: jberdine
Differential Revision: D13102951
fbshipit-source-id: 33be22fab
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52
Summary: In this diff, it passes the parameter of integer type widths to evaluation functions. The parameter which will be used for casting in the following diff.
Reviewed By: mbouaziz
Differential Revision: D12920581
fbshipit-source-id: 48bbc802b
Summary:
It enables the translation of casting expression. As of now, it
translates only the castings of pointers to integer types, in order to
avoid too much of change, which may mess the checkers up.
Reviewed By: jvillard
Differential Revision: D12920568
fbshipit-source-id: a5489df24
Summary:
It turns out keeping attributes (such as invalidation facts) separate
from the memory is a bad idea and leads to loss of precision and false
positives, as seen in the new test (which previously generated a
report).
Allow me to illustrate on this example, which is a stylised version of
the issue in the added test: previously we'd have:
```
state1 = { x = 1; invalids={} }
state2 = { x = 2; invalids ={1} }
join(state1, state2) = { x = {1, 2}; invalids={{1, 2}} }
```
So even though none of the states said that `x` pointed to an invalid
location, the join state says it does because `1` and `2` have been
glommed together. The fact `x=1` from `state1` and the fact "1 is
invalid" from `state2` conspire together and `x` is now invalid even
though it shouldn't.
Instead, if we record attributes as part of the memory we get that `x`
is still valid after the join:
```
state1 = { x = (1, {}) }
state2 = { x = (2, {}) }
join(state1, state2) = { x = ({1, 2}, {}) }
```
Reviewed By: mbouaziz
Differential Revision: D12958130
fbshipit-source-id: 53dc81cc7
Summary:
I hear that this scheduler is better. I want the best scheduler
possible. Also pulse's join is a bit complex so it might matter one day.
whydididothis
Reviewed By: mbouaziz
Differential Revision: D12958131
fbshipit-source-id: 3bd77ccba
Summary: For a general case of `operator=` we want to create a fresh location for the first parameter as `operator=` behaves as copy assignment.
Reviewed By: jvillard
Differential Revision: D12940635
fbshipit-source-id: 89c6e530d
Summary:
Whenever `vec.reserve(n)` is called, remember that the vector is
"reserved". When doing `vec.push_back(x)` on a reserved vector, assume
enough size has been reserved in advance and do not invalidate the
underlying array.
This gets rid of false positives.
Reviewed By: mbouaziz
Differential Revision: D12939837
fbshipit-source-id: ce6354fc5
Summary:
Instead of keeping at most one invalidation fact for each address, keep
a set of them and call them "attributes". Keeping a set of invalidation
facts is redundant since we always only want the smallest one, but
makes the implementation simpler, especially once we add more kinds of
attributes (used for modelling, see next diffs).
Reviewed By: mbouaziz
Differential Revision: D12939839
fbshipit-source-id: 4a54c2132
Summary:
Copied on the ownership checker logic: return the initial value of the
domain as return. This can probably be improved.
Reviewed By: mbouaziz
Differential Revision: D12888102
fbshipit-source-id: 9e2dac7fc
Summary:
When initialising a variable via semi-exotic means, the frontend loses
the information that the variable was initialised. For instance, it
translates:
```
struct Foo { int i; };
...
Foo s = {42};
```
as:
```
s.i := 42
```
This can be confusing for backends that need to know that `s` actually
got initialised, eg pulse.
The solution implemented here is to insert of dummy call to
`__variable_initiazition`:
```
__variable_initialization(&s);
s.i := 42;
```
Then checkers can recognise that this builtin function does what its
name says.
Reviewed By: mbouaziz
Differential Revision: D12887122
fbshipit-source-id: 6e7214438
Summary:
Now that arrays are dealt with separately (see previous diff), we can
turn the join back into an over-approximation as far as invalid
locations are concerned.
Reviewed By: skcho
Differential Revision: D12881989
fbshipit-source-id: fd85e49c0
Summary:
Arrays are the main source of false positives that prevent us from
having a better (less under-approximate) join in general. The next diff
improves join and I split this off to make it easier to review.
Reviewed By: mbouaziz
Differential Revision: D12881986
fbshipit-source-id: 5f52dea27
Summary:
This prevents the join from wrongly assuming that we haven't seen a
variable on one side of the join.
Reviewed By: skcho
Differential Revision: D12881987
fbshipit-source-id: 42a776adb
Summary:
Smaller numbers are easier to read and abstract addresses should never
be shared across functions anyway.
Reviewed By: da319
Differential Revision: D12881988
fbshipit-source-id: f9bcfa343
Summary:
The upcoming ocamlformat has the ability to parse and format
docstrings. This requires that the docstrings conform to the ocamldoc
spec a bit more strongly. If a docstring does not parse, it is left
alone, but if it is morally ill-formed but parses by chance, it can be
reformatted incorrectly. This patch fixes the existing instances of
this problem.
Reviewed By: mbouaziz
Differential Revision: D12911937
fbshipit-source-id: 1c2eb590b
Summary: For `operator=(lhs, rhs)` we want to model it as an assignment if rhs is materialized temporary created in the constructor.
Reviewed By: jvillard
Differential Revision: D10462510
fbshipit-source-id: 998341e69
Summary: Do not create a new location for placement new argument if it already exists.
Reviewed By: jvillard
Differential Revision: D12839942
fbshipit-source-id: 758b67a82
Summary:
Get rid of `USE_AFTER_LIFETIME`. This could be useful to deploy pulse
alongside the ownership checker too.
Reviewed By: da319
Differential Revision: D12857477
fbshipit-source-id: 8e2a2a37c
Summary: Make the whole type private, introduce constructors for each variant, and deal with the consequences.
Reviewed By: da319
Differential Revision: D12825810
fbshipit-source-id: a01922812
Summary:
Keep `USE_AFTER_LIFETIME` for unclassified errors (for now it contains
vector invalidation too because I can't think of a good name for
them, and maybe it makes sense to wait until we have more types of them
to decide on a name).
Reviewed By: da319
Differential Revision: D12825060
fbshipit-source-id: bd75ef698
Summary:
Getting this right will be long and complex so for now the easiest is to
underreport and only consider as invalid the addresses we know to be invalid on
both sides of a join. In fact the condition for an address to be invalid after
a join is more complex than this: it is invalid only if *all* the addresses in
its equivalence class as discovered by the join are invalid.
Reviewed By: skcho
Differential Revision: D12823925
fbshipit-source-id: 2ca109356
Summary: Similarly as for destructors, we provide an address of an object as a first parameter to constructors. When constructor is called we want to create a fresh location for a new object.
Reviewed By: jvillard
Differential Revision: D10868433
fbshipit-source-id: b60f32953
Summary: We provide an address of an object as a parameter to destructor. When destructor is called the object itself is invalidated, but not the address.
Reviewed By: jvillard
Differential Revision: D12824032
fbshipit-source-id: 516eebcf8
Summary:
Seems useful to know when we're printing one instruction only, but not when we
print lots of them for readability.
Reviewed By: mbouaziz
Differential Revision: D12823481
fbshipit-source-id: 2beb339f2
Summary:
Turns out once a vector array became invalid it stayed that way, instead
of the vector getting a new valid internal array.
Reviewed By: skcho
Differential Revision: D10853532
fbshipit-source-id: f6f22407f
Summary:
Now the domain can reason about `&` and `*` too. When recording `&`
between two locations also record a back-edge `*`, and vice-versa.
Reviewed By: mbouaziz
Differential Revision: D10509335
fbshipit-source-id: 8091b6ec0
Summary: This is more flexible and allows us to give more details when reporting.
Reviewed By: mbouaziz
Differential Revision: D10509336
fbshipit-source-id: 79c3ac1c8
Summary: Just to organise PulseDomain a bit more since it's quite big.
Reviewed By: mbouaziz
Differential Revision: D10509334
fbshipit-source-id: a81b36aa6
Summary:
Invalidating addresses for destructors to catch use after destructor errors.
To pass ownership tests for use after destructor errors, we still need to:
(1) fix pointer arithmetic false positives
(2) add model for placement new to fix false positives
(3) add model for operator= to fix false positives
(4) support inter-procedural analysis for destructor_order_bad test
Reviewed By: jvillard
Differential Revision: D10450912
fbshipit-source-id: 2d9b1ee68
Summary:
Instead of the non-sensical piecewise join we had until now write
a proper one. Hopefully the comments explain what it does. Main one:
```
(* high-level idea: maintain some union-find data structure to identify locations in one heap
with locations in the other heap. Build the initial join state as follows:
- equate all locations that correspond to identical variables in both stacks, eg joining
stacks {x=1} and {x=2} adds "1=2" to the unification.
- add all addresses reachable from stack variables to the join state heap
This gives us an abstract state that is the union of both abstract states, but more states
can still be made equal. For instance, if 1 points to 3 in the first heap and 2 points to 4
in the second heap and we deduced "1 = 2" from the stacks already (as in the example just
above) then we can deduce "3 = 4". Proceed in this fashion until no more equalities are
discovered, and return the abstract state where a canonical representative has been chosen
consistently for each equivalence class (this is what the union-find data structure gives
us). *)
```
Reviewed By: mbouaziz
Differential Revision: D10483978
fbshipit-source-id: f6ffd7528
Summary:
Instead of propagating a partial state give up the analysis of the
function entirely on error. The state after an error is mostly
non-sensical so until we know better just giving up makes sure the
analysis remains sensible and produce fewer spurious warnings.
Reviewed By: mbouaziz
Differential Revision: D10483979
fbshipit-source-id: 171ec8469
Summary: First version of an analyzer collecting classes transitively touched.
Reviewed By: mbouaziz
Differential Revision: D10448025
fbshipit-source-id: 0ddfefd46
Summary:
`Location` was clashing with the `Location` module, so use `Address`
instead.
When invalidating an address, remember the "actor" of its invalidation,
i.e. the access expression leading to the address and the source
location of the corresponding instruction.
When checking accesses, also pass the actor responsible for the access,
so that when we raise an error we know:
1. when and why a location was invalidated
2. when and why we tried to read it after that
Reviewed By: mbouaziz
Differential Revision: D10446282
fbshipit-source-id: 3ca4fb3d4
Summary:
Model `x[y]` and `x.push_back(i)` to catch the classic bug of "take
reference inside vector, invalidate, then use again".
Reviewed By: da319
Differential Revision: D10445824
fbshipit-source-id: 21ffd9677
Summary:
Do the intersection of the heap and stack domains, and the union of the
invalid location sets. This forgets invalid locations that appear only
in one heap, unfortunately. We can start with this and improve later.
Reviewed By: mbouaziz
Differential Revision: D10445825
fbshipit-source-id: cc24460af
Summary:
New analysis in foetal form to detect invalid use of C++ objects after their
lifetime has ended. For now it has:
- A domain consisting of a graph of abstract locations representing the heap, a map from program variables to abstract locations representing the stack, and a set of locations known to be invalid (their lifetime has ended)
- The heap graph is unfolded lazily when we resolve accesses to the heap down to an abstract location. When we traverse a memory location we check that it's not known to be invalid.
- A simple transfer function reads and updates the stack and heap in a rudimentary way for now
- C++ `delete` is modeled as adding the location that its argument resolves to to the set of invalid locations
- Also, the domain has a really crappy join and widening for now (see comments in the code)
With this we already pass most of the "use after delete" tests from the
Ownership checker. The ones we don't pass are only because we are missing
models.
Reviewed By: mbouaziz
Differential Revision: D10383249
fbshipit-source-id: f414664cb
Summary:
Using debugging on uninit raised an exception. A file was opened twice and closed twice.
This happened because the two abstract interpreters (SIL, LowerHIL) conflicted.
Let's use the LowerHIL-AI directly
Reviewed By: jvillard
Differential Revision: D10126442
fbshipit-source-id: 113c9e131
Summary: Use `PerfEvent` to record the execution time of individual checkers.
Reviewed By: jeremydubreil, mbouaziz
Differential Revision: D9832102
fbshipit-source-id: 678fca155
Summary:
Callsites of `Reporting.log_error/warning` always use `Exceptions.Checkers`, let's simplify the API.
Under the hood it still creates an exception, but this can be cleaned up later.
Reviewed By: jeremydubreil
Differential Revision: D9799860
fbshipit-source-id: 6492a60b4
Summary: We had a special case for fixing false positives on constexpr implicitly captured by lambdas. However, we do not report dead stores on constexpr anymore, hence, do not need the special case anymore. Moreover, the special case was not only capturing constexpr in lambdas, but also any variables which type had `const` (see new test `capture_const_bad` which was not being reported before this diff)
Reviewed By: mbouaziz
Differential Revision: D9654848
fbshipit-source-id: 882fd2804
Summary:
It simplifies abstract memory instantiations of function calls. Now it instantiates callee memories by directly evaluating symbol paths, rather than constructing `subst_map`.
main changes are:
- no construction of `subst_map` and `trace_map`
- no symbol table in Inferbo's summary
- no `Symbol_not_found` exception (for when a required symbol was unavailable in `subst_map`)
Reviewed By: mbouaziz
Differential Revision: D9495597
fbshipit-source-id: 18cdcd6f7
Summary: We report dead store false positives in template arguments when constexpr is used. To remove the false positives, with the expense of some false negatives, we do not report dead stores on constexpr anymore.
Reviewed By: mbouaziz
Differential Revision: D9608095
fbshipit-source-id: 91b0c71c4
Summary:
Lambdas can capture references to locals of the enclosing method as long as
they are not propagated outside the method. However to keep things simple
always allow them to capture locals of the enclosing method at the price of
some false negatives.
Reviewed By: da319
Differential Revision: D8974434
fbshipit-source-id: 957ae44bd
Summary:
In SIL, (1) some program variables (e.g., array parameter) are used as pointers to heap addresses and (2) the other program variables (e.g., local array) are used as addresses themselves. So, the values of (1) are retrieved by the `Load` command, while that of (2) are by `Exp.Lvar` expressions directly.
To address them differently, we had managed two maps (`Mem.Stack` and `Mem.Heap`), but which introduced function duplications on abstract memory and increased complexity. This diff merges the two maps, and instead a location set is used for distinguishing two types of abstract locations during analysis.
Reviewed By: mbouaziz
Differential Revision: D9420388
fbshipit-source-id: 13f824850
Summary:
`Errlog` will merge similar issues (same severity, name, description) reported at the same location, so let's make sure the locaiton is mandatory.
Issues:
- errors happening in `Ondemand` still use the `State` which makes sense only for biabduction and eradicate
- a case of `NullabilitySuggest` didn't have a location, I did my best to patch it but I'm sure the location could be more precise
Reviewed By: jvillard
Differential Revision: D9332840
fbshipit-source-id: ee7898146
Summary:
Before we would convert it to string in `Reporting` and pass it to `Errlog` which would use it only to 'log events'.
I guess the reason is that there was a cyclic dependency between `Errlog` and `clang_method_kind` defined in `ProcAttributes`.
This diff:
- moves it to its own module
- defers the conversion to string
Reviewed By: jvillard
Differential Revision: D9332819
fbshipit-source-id: 43a028b61
Summary: C++17 introduce guaranteed copy elision which omits constructor calls. In ownership analysis, we depended on these constructor calls to acquire ownership. In particular, when a method returns struct, previously, a constructor was used to acquire ownership. In this diff, we acquire ownership of the returned structs directly.
Reviewed By: mbouaziz
Differential Revision: D9244302
fbshipit-source-id: ae8261b99
Summary:
To keep up with the times. Changes consist of new features and moving modules
around so shouldn't change anything on our side.
Depends on D9239803
The controller you requested could not be found.: facebook-clang-plugins
Reviewed By: da319
Differential Revision: D9239817
fbshipit-source-id: d02a2076a
Summary:
Use `ignore` instead, as this will warn if the argument is an arrow type,
unlike `let _ = ...`. This makes the code more future-proof: if an argument is
added to a function called in `let _ = f x` then the compiler will complain
instead of silently turning a value into a partial evaluation.
Also got rid of particularly irksome `let _ = <stuff returning unit> in` where I could.
Reviewed By: mbouaziz
Differential Revision: D9217176
fbshipit-source-id: 3be463405
Summary:
- changes the `Ondemand` callbacks to take the execution environment instead of a `get_proc_desc` function.
- removes all the cases passing `get_proc_desc` as parameter to use `Ondemand.get_proc_desc` instead.
Reviewed By: jvillard
Differential Revision: D9200583
fbshipit-source-id: d16c218b5
Summary: This should be functionally equivalent but removes one call to `Summary.get`
Reviewed By: jvillard
Differential Revision: D9153924
fbshipit-source-id: d49789d2f
Summary: It uses a SymbolPath map to Symbol in Inferbo's summary instead of an entry memory of callee, which is used for instantiations of the abstract memories on function calls.
Reviewed By: mbouaziz
Differential Revision: D9081631
fbshipit-source-id: 478cda0de
Summary:
It adds relational domains to Inferbo: octagon of Apron and polyhedra of Elina.
- Each Mem domain value includes one relational value containing relations among symbols. The relational values are modified by the `Prune` and `Store` commands.
- Each abstract value includes three symbols, which represent integer value, array offset, and array size of an abstract value.
The relational domain is deactivated by default. Use the `--bo-relational-domain {oct, poly}` option for the activation, though Inferbo with the relational domains does not work at this point because some modifications of Apron and Elina we made has not been applied to their opam repositories yet.
Reviewed By: jvillard
Differential Revision: D8874102
fbshipit-source-id: 08e5883cb
Summary:
It adds relational domains to Inferbo: octagon of Apron and polyhedra of Elina.
- Each `Mem` domain value includes one relational value containing relations among *symbols*. The relational values are modified by the `Prune` and `Store` commands.
- Each abstract value includes three *symbols*, which represent integer value, array offset, and array size of an abstract value.
The relational domain is deactivated by default, so this diff should not make any differences in CI.
Use `--bo-relational-domain {oct, poly}` for the activation, though Inferbo with the relational domains does not work at this point because some modifications of Apron and Elina we made has not been applied to their opam repositories yet.
Reviewed By: mbouaziz, jvillard
Differential Revision: D8478542
fbshipit-source-id: 510ff53
Summary:
The addresses of global variables do not need initialisation to exist and be valid as they are part of the code or data segment of the program. This means that taking the address of a global is not in itself a danger for SIOF. However, dereferencing such an address would be. In order to avoid false positives but avoid being too unsound, only ignore them when the address is taken only to set another global. The general case would require a more complicated abstract domain.
Fixes#866
Reviewed By: ngorogiannis
Differential Revision: D8055627
fbshipit-source-id: 92307b2
Summary:
I realized that control variable analysis was broken when we had multiple back-edges for the same loop. This is often the case when we have a switch statement combined with continue in a loop (see `test_switch` in `switch_continue.c`) or when we have disjunctive guards in do-while loops.
This diff fixes that by
- defining a loop by its loophead (the target of its backedges) rather than its back-edges. Then it converts back-edge list to a map from loop_head to sources of the loop's back-edges.
- collecting multiple guard nodes that come from potentially multiple exit nodes per loop head
In addition, it also removes the wrong assumption that an exit node belongs to a single loop head.
Reviewed By: mbouaziz
Differential Revision: D8398061
fbshipit-source-id: abaf288
Summary: Introduce an annotation that forces the summary of a method to be free of blocking events, without suppressing other reports.
Reviewed By: jeremydubreil
Differential Revision: D8276787
fbshipit-source-id: be9eed8
Summary:
`make doc` will use `jbuilder` (which in turn uses `odoc`) to generate the
documentation for infer's modules. This is useful to browse the APIs of infer
and gives a more discoverable place to host more general documentation about
infer's internals.
Besides the actual plumbing necessary to generate the docs, this diff also
- Moves the various infer/src/*/README.md to index.mld files that make it to the generated docs
- Fixes some doc comments that would anger `ocamldoc`
Closes#435
Reviewed By: mbouaziz
Differential Revision: D8314572
fbshipit-source-id: 4a5c70e
Summary: We get a lot of false positives for union types as union fields are treated as separate memory locations at the moment. For now we do not treat union fields as uninitialised.
Reviewed By: mbouaziz
Differential Revision: D8277363
fbshipit-source-id: efe5b4a
Summary:
Having the `Node` module including in the `CFG` one is confusing.
Let's keep it separate.
Reviewed By: ngorogiannis
Differential Revision: D8185754
fbshipit-source-id: 62077e6
Summary:
Change the license of the source code from BSD + PATENTS to MIT.
Change `checkCopyright` to reflect the new license and learn some new file
types.
Generated with:
```
git grep BSD | xargs -n 1 ./scripts/checkCopyright -i
```
Reviewed By: jeremydubreil, mbouaziz, jberdine
Differential Revision: D8071249
fbshipit-source-id: 97ca23a
Summary:
For now: just moving this list behind an abstract type.
Next: changing the internal representation.
Reviewed By: ngorogiannis
Differential Revision: D8140926
fbshipit-source-id: 5b959b0
Summary:
Preparing for the future change, we won't see instructions as lists but as an abstract type.
This change may be a very minor perf regression: does a few more (but bounded by a constant) instructions traversals only for the nodes involving a Printf-like function call, only for the PrintfArgs checker...
Reviewed By: jvillard
Differential Revision: D8094124
fbshipit-source-id: e2e2c5e
Summary:
We never really need the list of nodes/succs/preds, we only need to fold over them.
This will reduce garbage for computed lists like in the Exceptional CFG or the OneInstrPerNode CFG.
Reviewed By: ngorogiannis
Differential Revision: D8185665
fbshipit-source-id: d042beb
Summary:
Moving away from C++ include-based models means that we cannot reliably detect
anymore whether a file includes <iostream> or not. In order not to be too
spammy, let's always assume standard streams are initialized for now when the
include models are off.
Recent versions of libstdc++ make these models redundant so there is hope that in a
bright future the analysis of std streams initialisation will work correctly without infer
having to have its own models anyway.
Reviewed By: mbouaziz
Differential Revision: D8043467
fbshipit-source-id: d118043
Summary: We want both pointer and pointer dereference to be uninitialised at the beginning. Forgot to add the expression of type pointer when updating the analysis from access paths to access expressions.
Reviewed By: ddino
Differential Revision: D8117011
fbshipit-source-id: 534f7ef
Summary: Set arguments of pointer type as initialised for indirect function calls.
Reviewed By: mbouaziz
Differential Revision: D8097895
fbshipit-source-id: 830f568
Summary: Treat array accesses as initialised if they are passed by reference.
Reviewed By: jvillard
Differential Revision: D8071247
fbshipit-source-id: 5480e90
Summary: Use AccessExpressions instead of AccessPath in uninit analysis. This will allow us to distinguish between pointers and their dereferences.
Reviewed By: jvillard
Differential Revision: D8042359
fbshipit-source-id: 604bcbc
Summary:
Preparing for bigger changes...
- Rename `payload` field to `payloads`
- Move `payload` type to `Payloads.t`
- `SummaryPayload`s only have to implement a change on `Payloads.t` rather than `Summary.t`
Reviewed By: sblackshear
Differential Revision: D7987211
fbshipit-source-id: c9d7a74
Summary:
Before we were computing the size of an abstract state (`range`) using the `NonNegativeBound` domain but it wasn't able to express product of symbolic values.
This diff introduces a domain for that.
The range of an interval is still computed in `NonNegativeBound` but then the product is done in `TopLiftedPolynomial` so all costs end up being of that type.
The //symbols// of a polynomial are `NonNegativeBound` (so the polynomial only represent non-negative values, perfect for a cost), which handles substitution correctly, i.e. it gives zero instead of negative values.
Reviewed By: ddino
Differential Revision: D7397229
fbshipit-source-id: 6868bb7
Summary:
Attempt at a better naming scheme:
- `Specs.summary` are now `Summary.t`. The `Summary` module (replacing `Specs`) contains the summary of a procedure: the results of all the analyses, etc.
- `Summary.ml` is now `SummaryPayload.ml`. This concerns how each (AI) analysis extracts its payload from the master summary.
- Accordingly, checkers now define a `Payload` module where previously they defined a `Summary` module. The type is also cleaned up to use `t` instead of `payload`, etc.
- Cleaned up some names as a result, for instance `Specs.get_summary` -> `Summary.get`, etc.
Reviewed By: ngorogiannis
Differential Revision: D7935883
fbshipit-source-id: 1766545
Summary:
This is an attempt to make things more consistent, and maybe save some work
from the `Format` module in case flambda doesn't have our backs.
Reviewed By: jberdine
Differential Revision: D7775496
fbshipit-source-id: 59a6314
Summary:
The Cost analysis uses `Bound` for non-negative values only, let's make it a separate module (and abstract type).
This also separates the abstract domain part of `Bound` which we wanted anyway.
Depends on D7844267
Depends on D7843351
Depends on D7782184
Reviewed By: ddino
Differential Revision: D7844572
fbshipit-source-id: 0e6b620
Summary: We were wrongly using the underapproximation of `min` rather than the overapproximation
Reviewed By: ddino
Differential Revision: D7844267
fbshipit-source-id: c9d9247
Summary:
This simplifies the frontends and backends in most cases. Before this diff,
returning `void` could be modelled either with a `None` return, or a dummy
return variable with type `Tvoid`. Now it's always the latter.
Reviewed By: sblackshear, dulmarod
Differential Revision: D7832938
fbshipit-source-id: 0a403d1
Summary:
Add warning 60 (unused module) to the list of fatal warnings. Whitelisting
modules at toplevel is tricky (see inline comments) but doable.
Reviewed By: mbouaziz
Differential Revision: D7790073
fbshipit-source-id: 6f591c4
Summary:
We want instr-granular invariant maps so let's use the OneInstrPerNode CFG in the AI analyzers.
This requires specializing the TransferFunctions.
Keep using the normal CFG where we only need node-granular informations.
Depends on D7587241
Depends on D7608526
Reviewed By: sblackshear
Differential Revision: D7618320
fbshipit-source-id: 73918f0
Summary:
Upgrade ocamlformat, and base which needs to be done in sync in order to build
ocamlformat, and the other deps can come for the ride.
Reviewed By: jvillard
Differential Revision: D7663537
fbshipit-source-id: 3e90970
Summary: We already suppress race reports if the field is marked in this way; makes sense to do the same thing for these reports.
Reviewed By: ngorogiannis
Differential Revision: D7589275
fbshipit-source-id: 8f0aeab
Summary:
Now that everything can run at the same time and we have preanalyses, it can be quite hard to read debug sessions.
Here come session names!
Depends on D7607336
Reviewed By: sblackshear
Differential Revision: D7607481
fbshipit-source-id: 676af86
Summary:
Use an adhoc type for `StructuralConstraints` instead of hacky `Exp`s.
Also use a cleaner `Node.IdMap` instead of `Int.Map`.
Depends on D7586645
Reviewed By: ddino
Differential Revision: D7587241
fbshipit-source-id: f9d65bb
Summary:
So we can share stuff between analyses using the same CFG and node representation.
Depends on D7586302
Depends on D7586348
Depends on D7568701
Reviewed By: sblackshear
Differential Revision: D7586645
fbshipit-source-id: ed64b2c
Summary:
If an aggregate `a` has a field `f` whose type has a constructor (e.g., `std::string`), we translate creating a local aggregate `A { "hi" }` as `string(&(a.f), "hi")`.
This diff makes sure that we recognize this as initializing `a`.
Reviewed By: jeremydubreil
Differential Revision: D7404624
fbshipit-source-id: 0ba90a7
Summary:
Limit the scope of what gets included into IStd.ml to only values that we want
to shadow. New values go into other files.
Also, build istd/ with `Core` open.
Reviewed By: mbouaziz
Differential Revision: D7382111
fbshipit-source-id: 969f0e8
Summary: This makes the code cleaner, and also makes it easier to look at the type of a var (needed in a successor).
Reviewed By: jeremydubreil
Differential Revision: D7359466
fbshipit-source-id: 5acdb7a
Summary:
Show where the invalidation occurred in the trace.
Should make things easier to understand.
Reviewed By: jeremydubreil
Differential Revision: D7312182
fbshipit-source-id: 44ba9cc
Summary: Should be no semantic change, just trying to avoid code duplication.
Reviewed By: jeremydubreil
Differential Revision: D7268588
fbshipit-source-id: 8b00125
Summary: Add new clang_method_kind field to AnalysisIssue, logged similarly to the existing one in AnalysisStats
Reviewed By: dulmarod
Differential Revision: D7273660
fbshipit-source-id: d1ca79b
Summary:
Aggregate initialization (e.g., `S s{1, 2}`) doesn't invoke a contructor.
Our frontend translates aggregation initialization as assigning to each field in the struct.
To avoid the appearance of the struct being uninitialized, count any assignment to a field of an aggregate struct as initializing the struct.
Reviewed By: jeremydubreil
Differential Revision: D7189671
fbshipit-source-id: ace02fc
Summary: If a `Closure` expression `e` captures variable `x`, consider `e` as borrowing from `x`. When the closure is invoked via `operator()`, check that the borrow is still valid.
Reviewed By: jeremydubreil
Differential Revision: D7071839
fbshipit-source-id: d923a6a
Summary: Add a new command-line option `--external-java-packages` which allows the user to specify a list of Java package prefixes for external packages. Then the analysis will not report non-actionable warnings on those packages (e.g., inconsistent `Nullable` annotations in external packages).
Reviewed By: jeremydubreil
Differential Revision: D7126960
fbshipit-source-id: c4f3c7c
Summary:
Fairly simple approach here:
- If the RHS of an assignment is a frontend-generated temporary variable, assume it transfers ownership to the LHS variable
- If the RHS of an assignment is a program variable, assume that the LHS variable is borrowing from it.
- If we try to access a variable that has borrowed from a variable that is now invalid, complain.
Reviewed By: jeremydubreil
Differential Revision: D7069947
fbshipit-source-id: 99b8ee2
Summary:
Enrich capability domain with borrowing info. Inline comments explain what the domain is doing.
The analyzer isn't actually using the borrowing functionality yet--that comes in the successor.
Reviewed By: jeremydubreil
Differential Revision: D7067942
fbshipit-source-id: 4c03c69
Summary:
Before D7100561, the frontend translated capture-by-ref and capture-by-value in the same way.
Now we can tell the difference and report bugs in the capture-by-value case.
Reviewed By: jeremydubreil
Differential Revision: D7102214
fbshipit-source-id: e9d3ac7
Summary:
You can capture a variable by reference in a lambda, assign to it, and then invoke the lambda.
This looks like a dead store from the perspective of the current analysis.
This diff mitigates the problem by computing an additional analysis that tracks variables captured by ref at each program point.
It refuses to report a dead store on a variable that has already been captured by reference.
Later, we might want to incorporate the results of this analysis directly into the liveness analysis instead of just using it to gate reporting.
Reviewed By: jeremydubreil
Differential Revision: D7090291
fbshipit-source-id: 25eeffa
Summary: Just some minor renaming to be more consistent with other modules. I was about to use these modules and was too lazy to type `Ident.IdentSet`.
Reviewed By: da319, avarun42
Differential Revision: D6999808
fbshipit-source-id: c24edef
Summary:
A simple intraprocedural analysis that tracks when a storage location is read or deleted.
For now, this works only with local variable storage locations; field and array accesses are ignored.
In order to test this, I added a new "use-after-lifetime" warning. It complains when a variable is read or deleted after it has already been deleted.
Reviewed By: jeremydubreil
Differential Revision: D6961314
fbshipit-source-id: 75e95a2
Summary: More preparation for extending HIL with dereference and address of. We need left hand side of the assignment to also include dereference and address of.
Reviewed By: sblackshear
Differential Revision: D6976150
fbshipit-source-id: 47d1d76
Summary:
See comment--`Prop(resType = blah) myProp` will generate `.myProp`, `.myPropRes`, and `.myPropAttr`, and any of them can be used to set the prop.
Because our annotation parameter parsing is a bit primitive, handle this by simply checking the `Res` and `Attr` suffixes for every `Prop`.
This shouldn't lead to false negatives because these methods will only exist if the `resType` annotation is specified anyway.
Reviewed By: jeremydubreil
Differential Revision: D6955362
fbshipit-source-id: ec59b21
Summary:
1) Fixes some false negatives when a method annotated with `nullable` in the header is not annotated in the implementation and the attribute lookup returns the implementation. In that case, we should follow the information given in the header.
2) Fixes some false positives when annotations are in the other way around, i.e. annotated in the implementation but not in the headers. For now, there should be no report in this case, but the analysis should be extended to report the inconsistency between the header and the implementation
3) Fixes some cases of weird reports caused by name conflicts where the method in the include has the same name has another method annotated differently.
Reviewed By: jvillard
Differential Revision: D6935379
fbshipit-source-id: 3577eb0
Summary: Preparing to extend HIL with Dereference and AddressOf expressions. Next steps: (1) change SIL -> HIL translation to preserve address of and dereference; (2) adapt analyses based on HIL to make use access expressions.
Reviewed By: jeremydubreil
Differential Revision: D6961928
fbshipit-source-id: 51da919
Summary:
- Combine two fields from ProcAttributes.t into a single field `method_kind` with more information
- New field details whether the procedure is an `OBJC_INSTANCE`, `CPP_INSTANCE`, `OBJ_CLASS`, `CPP_CLASS`, `BLOCK`, or `C_FUNCTION`
- `is_objc_instance_method` and `is_cpp_instance_method` fields no longer necessary
- Changed `is_instance` field in CMethod_signature to `method_kind` field of type ProcAttributes.method_kind
Reviewed By: dulmarod
Differential Revision: D6884402
fbshipit-source-id: 4b916c3
Summary:
- small optimization by starting deconstructing procnames/types in the dispatcher rather than the matchers
- as a consequence, returns fast for unhandled constructs like Java procnames or types
- Java is still not handled but at least does not crash
- re-enable Inferbo for Java
Reviewed By: jberdine
Differential Revision: D6912304
fbshipit-source-id: 76e95a8
Summary:
Make dead code detection part of `make test` so that dead code stops creeping
in. It's only enabled if all the analysers are enabled and if this is a
facebook build, because the dead code detection will have false positives
otherwise.
Reviewed By: mbouaziz
Differential Revision: D6807395
fbshipit-source-id: ebbd835
Summary: It was getting a bit difficult to tell which functions belonged where.
Reviewed By: jvillard
Differential Revision: D6764764
fbshipit-source-id: f9faada
Summary:
This lets us fix the limitation of reporting false positives when a `private` function calls `build()` on a parameter without passing all of the required props.
We will now report such issues in the caller only if it fails to pass the required props.
An unfortunate consequence of this change is that we lose track of where the actual call to `build` occurs--we now report on the declaration of the caller function rather than on the call site of `build`.
I'll work on addressing that in a follow-up.
Reviewed By: jeremydubreil
Differential Revision: D6764153
fbshipit-source-id: 3b173e5
Summary:
Also make it optional, since it's only used for debug messages. Name a couple
more of these for other similar functions.
Reviewed By: sblackshear
Differential Revision: D6797385
fbshipit-source-id: e6e9b2e
Summary:
I needed to do this for something, now I don't know if I want to do the thing
anymore but this seems generally useful to decrease a little bit the size of
Config.ml.
Reviewed By: sblackshear, mbouaziz
Differential Revision: D6796427
fbshipit-source-id: d9c009d
Summary: At each call to `Component$Builder.build()`, checks that the required props for `Component` have been set via prior calls to the `Builder`. Does not yet handle `Prop(optional = true)`, but will address that in a follow-up.
Reviewed By: jeremydubreil
Differential Revision: D6735524
fbshipit-source-id: 0c812fd
Summary:
and add mli. We already had the logic for iterating over call chains, but it was overfitted to the should-update analysis.
Will use the generalized version in a follow-up.
Reviewed By: jvillard
Differential Revision: D6740692
fbshipit-source-id: 8c0d89f
Summary:
Was trying to decide where to add a new Java utility function and realized that things are a bit disorganized.
Some operations on `Typ.Name.t`'s live in `Typ.Procname`, and some live inside an inner `Java` module whereas some are outside of the module with a `java_` prefix.
Let's move toward putting all Java/C/Objc/C++-specific functions in dedicated modules.
This diff does some of the work for Java.
There are Java-specific functions that operate on `Typ.Procname.t`'s that will have to be converted to work on `Typ.Procname.Java.t`'s, but changing those clients will be more involved.
Will also move C/Objc/C++ functions in a follow-up.
Reviewed By: jeremydubreil
Differential Revision: D6737724
fbshipit-source-id: cdd6e68
Summary: Use the Hashtbl functions directly as `Cfg` knows that a cfg is a hashtbl.
Reviewed By: sblackshear, jeremydubreil
Differential Revision: D6727732
fbshipit-source-id: 2cdda91
Summary:
Found the dead code with the script in the next commit, iteratively until no
warnings remained.
Methodology:
1. I kept pretty-printers for values, which can be useful to use from infer's REPL (or
when printf-debugging infer in general)
2. I kept functions that formed some consistent API (but not often, so YMMV), for instance if it looked like `Set.S`, or if it provides utility functions for stuff in development (mostly the procname dispatcher functions)
3. I tried not to lose comments associated with values no longer exported: if the value is commented in the .mli and not the .ml, I moved the comment
4. Some comments needed updating (not claiming I caught all of those)
5. Sometimes I rewrote the comments a bit when I noticed mis-attached comments
Reviewed By: mbouaziz
Differential Revision: D6723482
fbshipit-source-id: eabaafd
Summary:
Almost all the files are supposed to do this. When they don't, the build adds
it automatically anyway, but it's better to always include it to avoid
confusion (and help other automated tools).
Reviewed By: mbouaziz
Differential Revision: D6723285
fbshipit-source-id: 0fe8a16
Summary:
In Java, static variables are distinguished by package/class:
the file where they are defined doesn't matter.
Fixes#831.
Closes https://github.com/facebook/infer/pull/833
Reviewed By: jeremydubreil
Differential Revision: D6661240
Pulled By: sblackshear
fbshipit-source-id: beeb2f9
Summary: This should avoid making copies of procedure descriptions which are mutable data-stuctures.
Reviewed By: sblackshear
Differential Revision: D6658527
fbshipit-source-id: 688a142
Summary: The checker should only propagate the nullablility on the lhs when of pointer type.
Reviewed By: sblackshear
Differential Revision: D6630294
fbshipit-source-id: 07fe3d6
Summary: There was several implementations of the same function accross the codebase
Reviewed By: sblackshear
Differential Revision: D6658266
fbshipit-source-id: e12507b
Summary:
Upgrade ocamlformat to 0.3, and (necessarily) base to v0.10.0.
- Fix accumulated mis-formatting
- Update opam.lock to unbreak clean build
- Update to base v0.10.0
- Update opam.lock for base
- Update offline opam repo
- Everyone should already have removed their ocamlformat pin
- ocamlformat 0.3 supports output to stdout natively
- bump version of ocamlformat
Reviewed By: jeremydubreil
Differential Revision: D6636741
fbshipit-source-id: 41a56a8
Summary: This is to allow the bi-abduction analysis and the nullable checker for Clang languages to run together without stepping on each other toes.
Reviewed By: sblackshear
Differential Revision: D6567934
fbshipit-source-id: a318c33
Summary: There was a back and forth conversion between `string` and `IssueType.t` which was not necessary.
Reviewed By: sblackshear
Differential Revision: D6562747
fbshipit-source-id: 70b57a2
Summary: Local `CKComponentScope`'s are often created purely for their side effects, so it's fine for them to be unread.
Reviewed By: jeremydubreil
Differential Revision: D6475475
fbshipit-source-id: 17e869a
Summary: This would allow the checker to detect indirect nullable violations, i.e. violations that are involving intermediate method calls on potentially `nil` values.
Reviewed By: sblackshear
Differential Revision: D6464900
fbshipit-source-id: 3663729
Summary: NSDictionary initialization will crash when using `nil` as a key or as a value
Reviewed By: dulmarod
Differential Revision: D6466349
fbshipit-source-id: 57bb012
Summary: I always get confused by `accessPath.ml` not being next to HIL when trying to open files
Reviewed By: sblackshear
Differential Revision: D6462980
fbshipit-source-id: 8ba9b71
Summary:
As da319 points out, we did not handle this case correctly before. There were a few reasons why:
(1) An assignment like `struct S s = mk_s()` gets translated as `tmp = mk_s(); S(&s, tmp)`, so we didn't see the write to `s`.
(2) We counted uses of variables in destructors and dummy `_ = *s` assignments as reads, which meant that any struct values were considered as live.
This diff fixes these limitations so we can report on dead stores of struct values.
Reviewed By: da319
Differential Revision: D6327564
fbshipit-source-id: 2ead4be
Summary: Adding a null key or a null value will cause a runtime exception.
Reviewed By: sblackshear
Differential Revision: D6378618
fbshipit-source-id: 8bd27c6
Summary:
Naming a variable `_foo` makes the compiler not warn about them if they are
unused, but there are lots of instances of such variables in the code where
they are in fact used, defeating the warning and introducing confusion for
those used to this naming convention.
Basically `sed -i -e "s/ _\([a-zA-Z][a-zA-Z0-9_']*\)/ \1_/g" **/*.ml` followed
by manual fixing of compilation errors (lots of `compare__foo` ->
`compare_foo_`).
Reviewed By: mbouaziz
Differential Revision: D6358837
fbshipit-source-id: 7ffb4ac
Summary: Adding a nil object to an NSArray will crash. Adding this case to the checker.
Reviewed By: sblackshear
Differential Revision: D6346241
fbshipit-source-id: 3fe6f20
Summary: The clang compiler introduces a materialized temporary expression which should be treated similarly to the Infer internal temporary variables.
Reviewed By: sblackshear
Differential Revision: D6331237
fbshipit-source-id: 81d8196
Summary:
We need to use the procedure description of the callees for lazy dynamic dispatch and for the resolution of the lambda. We may also need this information in other analyses, e.g. for RacerD. This diff makes the procedure description of the callees as part of the summary.
The procedure description has been part of the summary for a while already without noticeable decrease in performance.
Reviewed By: mbouaziz
Differential Revision: D6322038
fbshipit-source-id: 84101cb
Summary:
Change ocamlformat installation procedure to use opam instead of
pinning.
Reformat all code with v0.2, which has a few improvements.
Reviewed By: jvillard
Differential Revision: D6292057
fbshipit-source-id: 759967f
Summary: The checker should not report unitinialzed values on the throw branch.
Reviewed By: ddino
Differential Revision: D6267019
fbshipit-source-id: 05768f1
Summary: Better error message for the direct dereference of nullable method without intermediate variable.
Reviewed By: sblackshear
Differential Revision: D6244494
fbshipit-source-id: 2ca2d22
Summary: This is a hack to removes most of the false positives of this checker in Objective C.
Reviewed By: sblackshear
Differential Revision: D6239914
fbshipit-source-id: 1cf05de
Summary:
The issue is with classes defining static data members:
```
$ cat foo.h
struct A {
static int foo;
};
$ cat foo.cpp
#include "foo.h"
int A::foo = 12;
int f() { return A::foo; // should see A::foo as defined in this translation unit
$ cat bar.cpp
#include "foo.h"
void g() { return A::foo; // should see A::foo defined externally
```
Previously, both foo.cpp and bar.cpp would see `A::foo` as defined within their
translation unit, because it comes from the header. This is wrong, and static
data members should be treated as extern unless they're defined in the same
file.
This doesn't change much except for frontend tests. SIOF FP fix in the next diff.
update-submodule: facebook-clang-plugins
Reviewed By: da319
Differential Revision: D6221744
fbshipit-source-id: bef88fd
Summary: This only works for Java at the moment but we can re-organise the code later to add the Objective C equivalent of these assertion methods.
Reviewed By: mbouaziz
Differential Revision: D6230588
fbshipit-source-id: 46ee98e
Summary: The checker should not report nullable violations on repeated calls
Reviewed By: sblackshear
Differential Revision: D6195471
fbshipit-source-id: 16ff76d
Summary: A stepping stone to have descriptive issue types for each kind of flow rather that lumping everything into `QUANDARY_TAINT_ERROR`.
Reviewed By: mbouaziz
Differential Revision: D6126690
fbshipit-source-id: a7230c0
Summary:
Before this change, analyses using HIL needed to pass `IdAcessPathMapDomain.empty` to abstract interpreter, and would get back the map as part of the post.
This is a confusing API and was a pain point for Dino in trying to use HIL.
This diff adds a HIL wrapper around the abstract interpreter that hides these details.
It replaces `LowerHIL.makeDefault` as the new "simplest possible way" to use HIL.
Reviewed By: jberdine
Differential Revision: D6125597
fbshipit-source-id: 560856b
Summary:
Looked at some problematic summaries and am noticing some common patterns.
Adding some dynamic checks to be run in debug mode in order to make sure my fixes for these patterns are real.
Reviewed By: jeremydubreil
Differential Revision: D5779593
fbshipit-source-id: 9de6497
Summary:
Install ocamlformat from github as part of `make devsetup`, and use it
for formatting OCaml (and jbuild) code.
Reviewed By: jvillard
Differential Revision: D6092464
fbshipit-source-id: 4ba0845
Summary: Sinks weren't being printed when passthroughs are empty (which, for now, is always). Oops!
Reviewed By: jvillard
Differential Revision: D6110164
fbshipit-source-id: 4488ab0
Summary: This will make it easier to generalize the checker to handling uninitialized struct fields.
Reviewed By: ddino
Differential Revision: D6099484
fbshipit-source-id: b9c534b
Summary:
Refactor `RegisterCheckers` to give a record type to checkers instead of a tuple type.
Print active checkers with their per-language information.
Improve the manual entries slightly.
Reviewed By: sblackshear
Differential Revision: D6051167
fbshipit-source-id: 90bcb61
Summary:
Whenever we see a use of a lock, infer that the current method can run in a multithreaded context. But only report when there's a write under a lock that can be read or written without synchronization elsewhere.
For now, we only infer this based on the direct usage of a lock; we don't assume a caller runs in a multithreaded context just because its (transitive) callee can.
We can work on that trickier case later, and we can work on smarter inference that takes reads under sync into account. But for now, warning on unprotected writes of reads that occur under sync appears to be too noisy.
Reviewed By: jberdine
Differential Revision: D5918801
fbshipit-source-id: 2450cf2
Summary:
This will allow most of the checkers, except the bi-abduction, to skip the analysis on the specialized clone of the methods used to handle dynamic dispatch. Doing this, we can run the bi-abduction analysis using:
infer -a checkers --biabduction
without risk of conflicts on the resolution of dynamic dispatch.
Reviewed By: sblackshear
Differential Revision: D6052347
fbshipit-source-id: 0c75bf3
Summary:
It looks like the old code for expanding access paths assumed that `FormalMap.get_formals_indexes` assumed the returned list tuples would be sorted by index, but it's actually sorted by var name.
As a consequence, formals might be expanded into the wrong actuals.
This diff fixes the problem by not relying on `get_formals_indexes`.
Reviewed By: jberdine
Differential Revision: D6056365
fbshipit-source-id: 09f3208
Summary:
This was a crutch from the days before ownership analysis.
We shouldn't need it anymore, and it was actually causing FP's because we were skipping analysis of `ImmutableList.builder()` and not understanding that the return value is owned.
Reviewed By: jeremydubreil
Differential Revision: D6035631
fbshipit-source-id: afa0ade
Summary:
9c7fc65 introduced a large performance regression, this diff eliminates it and a bit more.
Instead of constructing the quotiented access list map in a two-step process of first constructing a map of all accesses and then quotienting it, the quotiented map is constructed directly by using a coarser comparison function on keys. Partitioning the access map O(number of access paths) times, using an apparently expensive partition predicate, seems to be causing trouble based on rough profile data.
Reviewed By: da319
Differential Revision: D6005262
fbshipit-source-id: 077846c
Summary: Stack-allocated variables cannot be raced on in cpp as every thread has its own stack. At the beginning of the analysis we add ownership to the local variables.
Reviewed By: jberdine
Differential Revision: D6020506
fbshipit-source-id: 0a90a97
Summary: Now that we report write-write races involving more than one write, we need to improve the traces accordingly.
Reviewed By: jberdine
Differential Revision: D6026845
fbshipit-source-id: b1366dd
Summary: The may alias analysis relation that the thread safety analysis uses is very specific to Java and causes many false alarms for C++ code. This diff disables it when analyzing C++ code. Improving it to avoid false negatives is left for later.
Reviewed By: sblackshear
Differential Revision: D5974182
fbshipit-source-id: 9c7fc65
Summary:
The analyzer currently does not understand the control flow of
Singletons, which leads to false alarms. This diff is an unsound hack
that simply ignores any read or write accesses made when computing the
value of a singleton.
Reviewed By: sblackshear
Differential Revision: D5979639
fbshipit-source-id: 34caecb
Summary:
Model folly::SharedMutex lock and unlock operations, some
apache::thrift::concurrency::ReadWriteMutex operations, some
folly::RWSpinLock operations, and folly::MicroSpinLock operations.
Reviewed By: sblackshear
Differential Revision: D5974225
fbshipit-source-id: 19e2816
Summary:
Previously, annotating something ThreadSafe meant "check that it is safe to run all of this procedure's methods in parallel with each other" (including self-parallelization).
This makes sense, but it means that if the user writes no annotations, we do no checking.
I'm moving toward a model of inferring when an access might happen on a thread that can run concurrently with other threads, then automatically checking that it is thread-safe w.r.t to all other accesses to the same memory (on or off the current thread thread).
This will let us report even when there are no `ThreadSafe` annotations.
Any method that is known to run on a new thread (e.g., `Runnable.run`) will be modeled as running on a thread that can run in parallel with other threads, and so will any method that is `synchronized` or acquires a lock.
In this setup, adding `ThreadSafe` to a method just means: "assume that the current method can run in parallel with any thread, including another thread that includes a different invocation of the same method (a self race) unless you see evidence to the contrary" (e.g., calling `assertMainThread` or annotating with `UiThread`).
The key step in this diff is changing the threads domain to abstract *what threads the current thread may run in parallel with* rather than *what the current thread* is. This makes things much simpler.
Reviewed By: jberdine
Differential Revision: D5895242
fbshipit-source-id: 2e23d1e
Summary:
Indicate if read or write is protected, and do not print only the
field but also the object involved in the race.
Reviewed By: sblackshear
Differential Revision: D5974250
fbshipit-source-id: 351a576
Summary:
Expanding traces currently works in the following way:
Given a `TraceElem.Kind` `k` we want to report in `foo`, we look for a callee `C` of `foo` that has a `TraceElem.Kind` equal to `k` in its summary, grab the summary for `C`, then repeat until we bottom out.
This isn't very flexible: it insists on equality between `TraceElem.Kind`'s as the criteria for expanding a trace.
This diff introduces a new `matches` function for deciding when to expand a trace from a caller into a callee.
Clients that don't want strict equality can implement a fuzzier kind of equality inside this function.
I've gone ahead and done this for the trace elemes of thread-safety.
In the near future, equivalent access paths won't always compare equal from caller to callee, so we want to match their suffixes instead.
Reviewed By: jvillard
Differential Revision: D5914118
fbshipit-source-id: 233c603
Summary: Not using this for now, and it seems good to simplify the complex domain as much as we can.
Reviewed By: jberdine
Differential Revision: D5970233
fbshipit-source-id: a451503
Summary:
Inject a marker using a global variable in <iostream>, and whitelist it so that
the frontend translates it.
Use the marker in the SIOF checker to tell whether a file includes <iostream>.
If so, start the analysis of its methods assuming that the standard streams are
initialised.
Reviewed By: sblackshear
Differential Revision: D5941343
fbshipit-source-id: 3388d55
Summary:
The previous domain for SIOF was duplicating some work with the generic Trace
domain, and basically was a bit confused and confusing. A sink was a set of
global accesses, and a state contains a set of sinks. Then the checker has to
needlessly jump through hoops to normalize this set of sets of accesses into a
set of accesses.
The new domain has one sink = one access, as suggested by sblackshear. This simplifies
a few things, and makes the dedup logic much easier: just grab the first report
of the list of reports for a function.
We only report on the fake procedures generated to initialise a global, and the
filtering means that we keep only one report per global.
Reviewed By: sblackshear
Differential Revision: D5932138
fbshipit-source-id: acb7285
Summary:
The only language types we have are Java/Clang/Python. The unit of analysis is a source file, and you can't write a source file that mixes two or more of these languages (to the best of my knowledge).
This diff simplifies using the assumption that all procedures in a file are written in the same language.
Reviewed By: jeremydubreil
Differential Revision: D5886942
fbshipit-source-id: 88c3759
Summary:
The only language types we have are Java/Clang/Python. The unit of analysis is a source file, and you can't write a source file that mixes two or more of these languages (to the best of my knowledge).
This diff simplifies using the assumption that all procedures in a file are written in the same language.
Reviewed By: jeremydubreil
Differential Revision: D5886942
fbshipit-source-id: 8555a16
Summary: Only Eradicate uses this, no need to create it for every checker.
Reviewed By: jeremydubreil
Differential Revision: D5886775
fbshipit-source-id: 7242437
Summary:
A Java cluster checker currently defines a "cluster" as all of the procedures in the same class.
But the cluster checker actually knows about all the procedures defined in the same source file.
In some checkers (such as thread-safety), we want to aggregate results across classes in the same file, not just methods in the same class.
This refactoring leaves the behavior the same for now, but will make it easier to do this in the near future.
Reviewed By: jeremydubreil
Differential Revision: D5885896
fbshipit-source-id: 0815fca
Summary: Handling the utility functions for asserting that we're on background thread.
Reviewed By: jberdine
Differential Revision: D5863435
fbshipit-source-id: 3ad95b5
Summary:
Previously, we just tracked a boolean representing whether we were possibly on the main thread (true) or definitely not on the main thread (false).
In order to start supporting `Thread.start`, `Runnable.run`, etc., we'll need something more expressive.
This diff introduces a lattice:
```
Any
/ \
Main Background
\ /
Unknown
```
as the new threads domain. The initial value is `Unknown`, and we introduce `Main` in situations where we would have introduced `true` before.
This (mostly) preserves behavior: the main difference is that before code like
```
if (*) {
assertMainThread()
} else {
x.f = ...
}
```
would have recorded that the access to `x.f` was on the main thread, whereas now we'll say that it's on an unknown thread.
Reviewed By: peterogithub
Differential Revision: D5860256
fbshipit-source-id: efee330
Summary:
It's useful to be able to disable de-duplication on the command line with `--no-filtering`.
Gate de-duplication with `Config.filtering` and move the de-duplication tests to a new directory under the build systems tests.
Reviewed By: jeremydubreil
Differential Revision: D5865329
fbshipit-source-id: 5094f5b
Summary:
Suggesting to add `_Nullable` on the fields checked for, or assigned to, `nullptr` will allow the biabduction analysis to report null dereferences that are related to the lifetime of objects.
Depends on D5832147
Reviewed By: sblackshear
Differential Revision: D5836538
fbshipit-source-id: c1b8e48
Summary: That match branch is Java-only but could be reached with a C++ method, causing a crash.
Reviewed By: jberdine, mbouaziz
Differential Revision: D5814041
fbshipit-source-id: 6b1b501
Summary: It's useful to be able to configure both max depth and max width
Reviewed By: jeremydubreil
Differential Revision: D5801567
fbshipit-source-id: 0138cd7
Summary:
Races on the internal implementation of data structures defined in
system headers are currently not detected, since the memory accesses
are in procedures that are not analyzed.
This diff adds models for a few std::map operations that indicate if
they read or write the underlying representation.
Reviewed By: da319
Differential Revision: D5804293
fbshipit-source-id: 55ff28c
Summary:
Read the documentation and it doesn't seem like these functions are guaranteed to choose the same value in different runs.
I hypothesize that these may be the source of flakiness in the thread-safety tests/smoke tests.
Reviewed By: jvillard
Differential Revision: D5794384
fbshipit-source-id: 02b7a96
Summary:
We supported globals as sources before, but we did so by allowing ClangTrace etc. to match against any access path in the footprint of the trace.
This is very powerful/flexible, but it's ultimately not a good idea because it leads to traces that are hard to read.
This is because a footprint source doesn't have any information about its provenance: we might know that the value came from a global, but we don't know where the read occurred.
The mechanism for handling procedure calls as sources already knows how to solve this problem.
This diff implements globals as sources as a special case of procedure call sources instead.
This will give us much nicer traces with full provenance of the read from the global.
Reviewed By: mbouaziz
Differential Revision: D5772299
fbshipit-source-id: 491ae81
Summary:
When a lambda has an `auto` parameter, the inferred type of the parameter because part of the name.
Our heuristic for identifying lambda was checking if the lambda's name was exactly `operator()`, which won't catch this case.
Reviewed By: jvillard
Differential Revision: D5753323
fbshipit-source-id: 85ff75a
Summary:
- failwith police: no more `failwith`. Instead, use `Logging.die`.
- Introduce the `SimpleLogging` module for dying from modules where `Logging`
cannot be used (usually because that would create a cyclic dependency).
- always log backtraces, and show backtraces on the console except for usage errors
- Also point out in the log file where the toplevel executions of infer happen
Reviewed By: jeremydubreil
Differential Revision: D5726362
fbshipit-source-id: d7a01fc
Summary:
I'm working on parameterizing access trees with a config that will limit max depth + perhaps width if needed.
This is a stepping stone to enforcing max depth.
Reviewed By: jvillard
Differential Revision: D5693453
fbshipit-source-id: c15b0ee
Summary:
Rather than printing the footprint using its actual representation (an access trie with bool nodes), print it as a set of access paths.
This makes it easier to read sources in Quandary specs/debug Quandary.
Reviewed By: jeremydubreil
Differential Revision: D5682630
fbshipit-source-id: ac55b7f
Summary:
Add `is_empty` to `AbstractDomain.WithBottom` sig and use the empty checks for nicer printing of access trees: don't print empty nodes/traces.
This should make it easier to debug Quandary; it's pretty hard to stare at an access tree and see what's going on right now.
Reviewed By: jberdine
Differential Revision: D5682248
fbshipit-source-id: 56d2a9d
Summary:
This simplifies the jbuild files: no need to list these files explicitly
anymore, nor to exclude them explicitly from the main `InferModules` library
(due to their different compilation flags).
Isolate common parts into jbuild.common do `cat`-based code inclusion into
jbuild files to factorize code.
Reviewed By: jberdine
Differential Revision: D5678328
fbshipit-source-id: 6d7d925
Summary:
We now represent the footprint with an access trie, so this code is no longer required.
This lets us simplify things a bit
Reviewed By: jberdine
Differential Revision: D5664484
fbshipit-source-id: c35edf2
Summary:
In looking at summaries that Quandary took a long time to compute, one thing I notice frequently is redundancy in the footprint sources (e.g., I might see `Footprint(x), Footprint(x.f), Footprint(x*)`).
`sudo perf top` indicates that joining big sets of sources is a major performance bottleneck, and a large number of footprint sources is surely a big part of this (since we expect the number of non-footprint sources to be small).
This diff addresses the redundancy issue by using a more complex representation for a set of sources. The "known" sources are still in a set, but the footprint sources are now represented as a set of access paths (via an access trie).
The access path trie is a minimal representation of a set of access paths, so it would represent the example above as a simple `x*`.
This should make join/widen/<= faster and improve performance
Reviewed By: jberdine
Differential Revision: D5663980
fbshipit-source-id: 9fb66f8
Summary:
The previous widening operator added stars to the *end* of paths that existed in `next` but not `prev`. This is not enough to ensure termination in the case where the trie is growing both deeper and wider at the same time.
The newly added test demonstrates this issue. In the code, there's an ever-growing path of the form `tmp.prev.next.prev.next...` that wasn't summarized by the previous widening operator. The new widening is much more aggressive: it replaces *any* node present in `next` but not `prev` with a `*` (rather than trying to tack a star onto the end). This fixes the issue.
This issue was causing divergence on tricky doubly-linked list code in prod.
Reviewed By: jeremydubreil
Differential Revision: D5665719
fbshipit-source-id: 1310a92
Summary:
Instead of a whitelist and blacklist and default issue types and default
blacklist and filtering, consider a simpler semantics where
1. checkers can be individually turned on or off on the command line
2. most checkers are on by default
3. `--no-filtering` turns all issue types on, but they can then be turned off again by further arguments
This provides a more flexible CLI and is similar to other options in the infer
CLI, where "global" behaviour is generally avoided.
Dynamically created checkers (eg, AL linters) cause some complications in the
implementation but I think the semantics is still clear.
Also change the name of the option to mention "issue types" instead of
"checks", since the latter can be confused with "checkers".
Reviewed By: jberdine
Differential Revision: D5583238
fbshipit-source-id: 21de476
Summary:
We previously lumped ownership predicates in with all other predicates. That limited us to a flat ownership domain.
This diff separates out the ownership predicates so we can have a richer lattice of predicates with each access path.
This lets us be more precise; for example, we can now show that
```
needToOwnBothParams(Obj o1, Obj o2) {
Obj alias;
if (*) { alias = o1; } else { alias = o2; }
alias.f = ... // both o1 and o2 need to be owned for this to be safe
}
void ownBothParamsOk() {
needToOwnBothParams(new Obj(), new Obj()); // ok, would have complained before
}
```
is safe.
Reviewed By: jberdine
Differential Revision: D5589898
fbshipit-source-id: 9606a46
Summary: Useful for identifying user-controlled array accesses that could lead to buffer overflows
Reviewed By: mbouaziz
Differential Revision: D5520985
fbshipit-source-id: 92984f6
Summary:
Record the list of access paths (if any) used in the index expression for each array access.
This will make it possible to use array accesses as sinks in Quandary
Reviewed By: jeremydubreil
Differential Revision: D5531356
fbshipit-source-id: 8204909
Summary:
It's nice to have "raw" as the default kind of access path, since it's used much more often than the abstraction.
This is also a prereq for supporting index expressions in access paths, since we'll need mutual recursion between accesses and access paths.
Reviewed By: jeremydubreil
Differential Revision: D5529807
fbshipit-source-id: cb3f521
Summary:
Works the same way as read/write races on fields, except that are more relaxed (er, unsound) in deciding whether two containers may alias.
This is needed to avoid reporting a ton of FP's; full explanation in comments.
Reviewed By: da319
Differential Revision: D5493404
fbshipit-source-id: 0a5d8b1
Summary:
The way we represented container writes before was pretty hacky: just use a dummy field for the name of the method that performs the container write.
This diff introduces a new access kind for container writes that is much more structured.
This will make it easier to soundly handle aliasing between containers and support container reads in the near future.
Reviewed By: da319
Differential Revision: D5465747
fbshipit-source-id: e021ec2
Summary: This gives us more expressive power when defining sources--we can use heuristics like "`foo(o)` only returns a source when `o` is not a constant".
Reviewed By: jvillard
Differential Revision: D5467935
fbshipit-source-id: f3d581d
Summary:
Pretty basic: warn when we see an assignment instruction `x = ...` and `x` is not live in the post of the instruction.
Only enabled for Clang at the moment because linters already warn on this for Java. But we can enable it later if we want to (should be fully generic).
Reviewed By: jeremydubreil
Differential Revision: D5450439
fbshipit-source-id: 693514c
Summary: Making it simple to add a new access type for "un-annotated interface call" in an upcoming diff.
Reviewed By: da319
Differential Revision: D5445914
fbshipit-source-id: f29e342
Summary:
This just makes the warnings silent for now. We may improve the analysis to check if the null check on the captured fields are consistent with the annotation on the corresponding parameters.
Eradicate also has the same issue. I added a test to outline this. The biabduction analysis will also probably fail on the same of annotation lookup. We may want implement the proper fix at the level of `Annotation.field_has_annot`.
Reviewed By: sblackshear
Differential Revision: D5419243
fbshipit-source-id: 6460de8
Summary: This will allow us to gradually get rid of the exceptions thrown during the analysis while detecting the regressions earlier
Reviewed By: jberdine, jvillard
Differential Revision: D5385154
fbshipit-source-id: 605e3f5
Summary:
Conversion and reformat of infer source using ocamlformat
auto-formatting tool.
Current status:
- Because Reason does not handle docstrings, the output of the
conversion is not 'Warning 50'-clean, meaning that there are
docstrings with ambiguous placement. I'll need to manually fix
them just before landing.
Reviewed By: jvillard
Differential Revision: D5225546
fbshipit-source-id: 3bd2786