Summary: This is more flexible and allows us to give more details when reporting.
Reviewed By: mbouaziz
Differential Revision: D10509336
fbshipit-source-id: 79c3ac1c8
Summary:
Invalidating addresses for destructors to catch use after destructor errors.
To pass ownership tests for use after destructor errors, we still need to:
(1) fix pointer arithmetic false positives
(2) add model for placement new to fix false positives
(3) add model for operator= to fix false positives
(4) support inter-procedural analysis for destructor_order_bad test
Reviewed By: jvillard
Differential Revision: D10450912
fbshipit-source-id: 2d9b1ee68
Summary:
It uses platform-dependent integer type widths information when
constructing Sizeof expressions which have a field(`nbytes`)
representing the static results of the evaluation of `sizeof(typ)`.
Reviewed By: mbouaziz
Differential Revision: D10504715
fbshipit-source-id: 0c79d37d8
Summary:
Instead of the non-sensical piecewise join we had until now write
a proper one. Hopefully the comments explain what it does. Main one:
```
(* high-level idea: maintain some union-find data structure to identify locations in one heap
with locations in the other heap. Build the initial join state as follows:
- equate all locations that correspond to identical variables in both stacks, eg joining
stacks {x=1} and {x=2} adds "1=2" to the unification.
- add all addresses reachable from stack variables to the join state heap
This gives us an abstract state that is the union of both abstract states, but more states
can still be made equal. For instance, if 1 points to 3 in the first heap and 2 points to 4
in the second heap and we deduced "1 = 2" from the stacks already (as in the example just
above) then we can deduce "3 = 4". Proceed in this fashion until no more equalities are
discovered, and return the abstract state where a canonical representative has been chosen
consistently for each equivalence class (this is what the union-find data structure gives
us). *)
```
Reviewed By: mbouaziz
Differential Revision: D10483978
fbshipit-source-id: f6ffd7528
Summary: This diff changes pp of binary operation condition in order to avoid a `make test` failure. For the same `uint64_t` type, it is translated to `unsigned long long` in 64bit mac, but `unsigned long` in 64bit linux, which made a `make test` failure.
Reviewed By: mbouaziz
Differential Revision: D10459466
fbshipit-source-id: 449ab548e
Summary:
`Location` was clashing with the `Location` module, so use `Address`
instead.
When invalidating an address, remember the "actor" of its invalidation,
i.e. the access expression leading to the address and the source
location of the corresponding instruction.
When checking accesses, also pass the actor responsible for the access,
so that when we raise an error we know:
1. when and why a location was invalidated
2. when and why we tried to read it after that
Reviewed By: mbouaziz
Differential Revision: D10446282
fbshipit-source-id: 3ca4fb3d4
Summary:
Model `x[y]` and `x.push_back(i)` to catch the classic bug of "take
reference inside vector, invalidate, then use again".
Reviewed By: da319
Differential Revision: D10445824
fbshipit-source-id: 21ffd9677
Summary:
Do the intersection of the heap and stack domains, and the union of the
invalid location sets. This forgets invalid locations that appear only
in one heap, unfortunately. We can start with this and improve later.
Reviewed By: mbouaziz
Differential Revision: D10445825
fbshipit-source-id: cc24460af
Summary:
New analysis in foetal form to detect invalid use of C++ objects after their
lifetime has ended. For now it has:
- A domain consisting of a graph of abstract locations representing the heap, a map from program variables to abstract locations representing the stack, and a set of locations known to be invalid (their lifetime has ended)
- The heap graph is unfolded lazily when we resolve accesses to the heap down to an abstract location. When we traverse a memory location we check that it's not known to be invalid.
- A simple transfer function reads and updates the stack and heap in a rudimentary way for now
- C++ `delete` is modeled as adding the location that its argument resolves to to the set of invalid locations
- Also, the domain has a really crappy join and widening for now (see comments in the code)
With this we already pass most of the "use after delete" tests from the
Ownership checker. The ones we don't pass are only because we are missing
models.
Reviewed By: mbouaziz
Differential Revision: D10383249
fbshipit-source-id: f414664cb
Summary:
It avoids raising an exception when unexpected arguments are given to
placement new. We will revert this after fixing the frontend to parse
user defined `new` correctly in the future.
Reviewed By: mbouaziz
Differential Revision: D10378136
fbshipit-source-id: d494f781b
Summary:
It unsets `var_exp_typ` of `trans_state` during the translations of
placement parameters, so they are translated independently against the
target variable and class of the `new` function.
Reviewed By: mbouaziz, jvillard
Differential Revision: D10161419
fbshipit-source-id: 7f588a91c
Summary: It enables placement_new to get three parameters, which happens when placement_new is overloaded (e.g. Boost).
Reviewed By: mbouaziz
Differential Revision: D10100324
fbshipit-source-id: 0ecb0a404
Summary:
Fix the logic for computing duplicate symbols. It was broken at some point and some duplicate symbols creeped into our tests. Fix these, and add a test to avoid duplicate symbols detection to regress again.
Also, this removes one use of `Cfg.load`, on the way to removing file-wide CFGs from the database.
Reviewed By: ngorogiannis
Differential Revision: D10173349
fbshipit-source-id: a0d2365b3
Summary:
New clang in the plugin \o/
Changes that were needed:
- (minor) Some extra AST nodes
- defining a lambda and calling it in the same line (`[&x]() { x = 1; }()`) used to get translated as a call of the literal but now an intermediate variable gets created, which confuses uninit in one test. I added another test to showcase the limitation this is hitting: storing the lambda in a variable then calling it will not get caught by the checker.
The controller you requested could not be found.: facebook-clang-plugins
Reviewed By: jeremydubreil
Differential Revision: D10128626
fbshipit-source-id: 8ffd19f3c
Summary:
Goal of the stack: deprecate the `--analyzer` option in favour of turning
individual features on and off. This option is a mess: some of the options are
now subcommands (compile, capture), others are aliases (infer and checkers),
and they can all be replicated using some straightforward combination of other
options.
This diff: stop using `--analyzer` in tests. It's mostly `checkers` everywhere,
which is already the default. `linters` becomes `--no-capture --linters-only`.
`infer` is supposed to be `checkers` already. `crashcontext` is
`--crashcontext-only`.
Reviewed By: mbouaziz
Differential Revision: D9942689
fbshipit-source-id: 048281761
Summary: We had a special case for fixing false positives on constexpr implicitly captured by lambdas. However, we do not report dead stores on constexpr anymore, hence, do not need the special case anymore. Moreover, the special case was not only capturing constexpr in lambdas, but also any variables which type had `const` (see new test `capture_const_bad` which was not being reported before this diff)
Reviewed By: mbouaziz
Differential Revision: D9654848
fbshipit-source-id: 882fd2804
Summary: We report dead store false positives in template arguments when constexpr is used. To remove the false positives, with the expense of some false negatives, we do not report dead stores on constexpr anymore.
Reviewed By: mbouaziz
Differential Revision: D9608095
fbshipit-source-id: 91b0c71c4
Summary:
Lambdas can capture references to locals of the enclosing method as long as
they are not propagated outside the method. However to keep things simple
always allow them to capture locals of the enclosing method at the price of
some false negatives.
Reviewed By: da319
Differential Revision: D8974434
fbshipit-source-id: 957ae44bd
Summary: C++17 introduce guaranteed copy elision which omits constructor calls. In ownership analysis, we depended on these constructor calls to acquire ownership. In particular, when a method returns struct, previously, a constructor was used to acquire ownership. In this diff, we acquire ownership of the returned structs directly.
Reviewed By: mbouaziz
Differential Revision: D9244302
fbshipit-source-id: ae8261b99
Summary: Errors that include temporary variables are difficult to understand. Do not report stack variable address escape on temporary variables.
Reviewed By: jvillard
Differential Revision: D9117517
fbshipit-source-id: 9ebd75ecc
Summary: Exceptional successors were not meant to be created for return nodes, but they were created if try block had a single return statement.
Reviewed By: jvillard
Differential Revision: D8913371
fbshipit-source-id: 6ac85b21d
Summary:
Do no computation of stability abstract state if not explicitly requested via the command line flag.
Also, simplify the reporting.
Reviewed By: jeremydubreil
Differential Revision: D8614885
fbshipit-source-id: 25dd9de
Summary:
The addresses of global variables do not need initialisation to exist and be valid as they are part of the code or data segment of the program. This means that taking the address of a global is not in itself a danger for SIOF. However, dereferencing such an address would be. In order to avoid false positives but avoid being too unsound, only ignore them when the address is taken only to set another global. The general case would require a more complicated abstract domain.
Fixes#866
Reviewed By: ngorogiannis
Differential Revision: D8055627
fbshipit-source-id: 92307b2
Summary:
These just point to expressions that we know how to translate.
Fixes#950
Reviewed By: mbouaziz
Differential Revision: D8713784
fbshipit-source-id: 9eafa39
Summary:
- Do not add actuals of a call as unstable.
- Replace access trie with simple set of paths, which is easier to debug/argue correct.
- Fix bug where a prefix path was searched, as opposed to a *proper* prefix.
- Restrict interface to the minimum so that alternative implementations are easier.
Reviewed By: ilyasergey
Differential Revision: D8573792
fbshipit-source-id: 4c4e174
Summary: C/C++ code can, in some cases, generate a large number of temporary (Sil) variables. Since we are already not reporting races on these, not recording them gives some perf back.
Reviewed By: mbouaziz, jvillard
Differential Revision: D8566999
fbshipit-source-id: 148ac46
Summary: We were missing reads of `a` if it was used in void cast, i.e. `(void) a;` This caused dead store false positives: we were not using `exp` that was the result of translating `a`. This diff creates a call to built-in skip function with `exp` as its argument, which causes the analyses to see reads of `exp`.
Reviewed By: mbouaziz
Differential Revision: D8332092
fbshipit-source-id: f3b0e10
Summary: There is a number of dangling pointer dereference false positives coming from our treatment of union in c/cpp. For now, do not treat union fields as uninitialised.
Reviewed By: mbouaziz
Differential Revision: D8279802
fbshipit-source-id: a339b0e
Summary: We get a lot of false positives for union types as union fields are treated as separate memory locations at the moment. For now we do not treat union fields as uninitialised.
Reviewed By: mbouaziz
Differential Revision: D8277363
fbshipit-source-id: efe5b4a
Summary:
It's useful to test that the bucket a given error is classified as doesn't
change over time without notice.
This records the bucket for *all* the tests, even though some never produce a
bucket. This is to be on the safe size instead of risking to forget adding the
bucket information when the test changes, or when copy/pasting from a test that
doesn't have buckets to one that does.
The implementation is pretty crude: it greps the beginning of the qualifier
string for a `[bucket]`.
Reviewed By: mbouaziz
Differential Revision: D8236393
fbshipit-source-id: b3b1eb9
Summary:
Change the license of the source code from BSD + PATENTS to MIT.
Change `checkCopyright` to reflect the new license and learn some new file
types.
Generated with:
```
git grep BSD | xargs -n 1 ./scripts/checkCopyright -i
```
Reviewed By: jeremydubreil, mbouaziz, jberdine
Differential Revision: D8071249
fbshipit-source-id: 97ca23a
Summary:
Moving away from C++ include-based models means that we cannot reliably detect
anymore whether a file includes <iostream> or not. In order not to be too
spammy, let's always assume standard streams are initialized for now when the
include models are off.
Recent versions of libstdc++ make these models redundant so there is hope that in a
bright future the analysis of std streams initialisation will work correctly without infer
having to have its own models anyway.
Reviewed By: mbouaziz
Differential Revision: D8043467
fbshipit-source-id: d118043
Summary: Treat array accesses as initialised if they are passed by reference.
Reviewed By: jvillard
Differential Revision: D8071247
fbshipit-source-id: 5480e90
Summary: Use AccessExpressions instead of AccessPath in uninit analysis. This will allow us to distinguish between pointers and their dereferences.
Reviewed By: jvillard
Differential Revision: D8042359
fbshipit-source-id: 604bcbc
Summary:
Labels inside switch statements were causing havoc (see test), and the translation of switch statements in general could be improved to handle more cases.
It turns out that `case` (and `default`) statements are more or less fancy labels into the code. In other words, if you erase all the `case XXX:` and `default:` strings in the `switch` statement you get the real structure of the program, and `switch` just jumps straight to the first `case` directives (and to the second if the first one is not satisfied, etc. until all `case`/`default` have been considered).
This suggests an alternative implementation: translate the body of the `switch` and simply record the list of switch cases inside that body, along with where they point to. Then post-process this list to construct the control flow of the `switch`, which points into the control-flow of the `body`. In order not to modify every function in `CTrans` to propagate the current list of cases, I created an ugly `ref` inside `SwitchCase` instead (but it cannot be directly accessed and it's guaranteed to be well-parenthesised wrt nested switches by the `SwitchCase` API so it's not too bad).
[unrelated] Also make translation failures output more information about what exactly in the source code is causing the crash, and the ancestors in the AST that lead to the crash site.
Reviewed By: martinoluca
Differential Revision: D8011046
fbshipit-source-id: 8455090
Summary:
This diff:
- translates C++ `catch` blocks
- adds an exceptional control-flow edge from the end of a `try` block to the beginning of a `catch` block
This obviously doesn't reflect the way exceptions actually work, but I think it is better than what we have now. For one thing, we'll see/translate code inside `catch` blocks, which were opaque before. If Clang analyses don't want this behavior, they can simply use `ProcCfg.Normal` (which, up until this diff, behaved identically to `ProcCfg.Exceptional`.
In the future, we can extend `trans_state` to track blocks that might throw an exception, and have each of these blocks transition to `catch` instead.
Reviewed By: jvillard
Differential Revision: D7814521
fbshipit-source-id: 67b86a6
Summary:
Previously, the type of `trans_result` contained a list of SIL expressions.
However, most of the time we expect to get exactly one, and getting a different
number is a soft(!) error, usually returning `-1`.
This splits `trans_result` into `control`, which contains the information
needed for temporary computation (hence when we don't necessarily know the
return value yet), and a new version of `trans_result` that includes `control`,
the previous `exps` list but replaced by a single `return` expression instead,
and a couple other values that made sense to move out of `control`. This allows
some flexibility in the frontend compared to enforcing exactly one return
expression always: if they are not known yet we stick to `control` instead (see
eg `compute_controls_to_parent`).
This creates more garbage temporary identifiers, however they do not show up in
the final cfg. Instead, we see that temporary IDs are now often not
consecutive...
The most painful complication is in the treatment of `DeclRefExpr`, which was
actually returning *two* expressions: the method name and the `this` object.
Now the method name is a separate (optional) field in `trans_result`.
Reviewed By: mbouaziz
Differential Revision: D7881088
fbshipit-source-id: 41ad3b5
Summary:
This is an attempt to make things more consistent, and maybe save some work
from the `Format` module in case flambda doesn't have our backs.
Reviewed By: jberdine
Differential Revision: D7775496
fbshipit-source-id: 59a6314
Summary: std::lock allows for locking multiple lockable objects, while avoiding deadlock. This will fix some FPs in C++.
Reviewed By: da319
Differential Revision: D7844198
fbshipit-source-id: 2b7140a
Summary:
This simplifies the frontends and backends in most cases. Before this diff,
returning `void` could be modelled either with a `None` return, or a dummy
return variable with type `Tvoid`. Now it's always the latter.
Reviewed By: sblackshear, dulmarod
Differential Revision: D7832938
fbshipit-source-id: 0a403d1
Summary: Returning the list of sub-expressions is not right and can cause assertion failures elsewhere in the frontend.
Reviewed By: dulmarod
Differential Revision: D7813493
fbshipit-source-id: 33ac9c1
Summary:
When looking at large CFGs, at least in `xdot`, it's often difficult to find
the procedure you're looking for. Sorting the proc names puts them in
alphabetical order, which makes searching one procedure easier.
Reviewed By: mbouaziz
Differential Revision: D7758521
fbshipit-source-id: 8e9997f
Summary: Currently when we look for already abduced expression and find an assertion [exp|->strexp:typexp], we use typexp rather than strexp.
Reviewed By: sblackshear
Differential Revision: D7617193
fbshipit-source-id: c089720
Summary:
This information is already available in the trace, and can contain absolute
paths to system includes (or infer's own clang runtime), which confuses the
diff analysis.
Reviewed By: mbouaziz
Differential Revision: D7534609
fbshipit-source-id: 5bd8f8b
Summary:
If an aggregate `a` has a field `f` whose type has a constructor (e.g., `std::string`), we translate creating a local aggregate `A { "hi" }` as `string(&(a.f), "hi")`.
This diff makes sure that we recognize this as initializing `a`.
Reviewed By: jeremydubreil
Differential Revision: D7404624
fbshipit-source-id: 0ba90a7
Summary:
Show where the invalidation occurred in the trace.
Should make things easier to understand.
Reviewed By: jeremydubreil
Differential Revision: D7312182
fbshipit-source-id: 44ba9cc
Summary: It adds an issue type, `BUFFER_OVERRUN_U5`, for alarms involving unknown values, i.e., when the trace set includes an unknown function call.
Reviewed By: mbouaziz
Differential Revision: D7178841
fbshipit-source-id: bfe857b
Summary:
Aggregate initialization (e.g., `S s{1, 2}`) doesn't invoke a contructor.
Our frontend translates aggregation initialization as assigning to each field in the struct.
To avoid the appearance of the struct being uninitialized, count any assignment to a field of an aggregate struct as initializing the struct.
Reviewed By: jeremydubreil
Differential Revision: D7189671
fbshipit-source-id: ace02fc
Summary:
Show some `SymAssign`s (corresponding to parameters) in the trace.
Depends on D7194448
Reviewed By: skcho
Differential Revision: D7194479
fbshipit-source-id: 0deff6c
Summary: It simply resizes the target structure instead of allocating new heap memories and copying values.
Reviewed By: mbouaziz
Differential Revision: D7179353
fbshipit-source-id: 9c20f64
Summary: If a `Closure` expression `e` captures variable `x`, consider `e` as borrowing from `x`. When the closure is invoked via `operator()`, check that the borrow is still valid.
Reviewed By: jeremydubreil
Differential Revision: D7071839
fbshipit-source-id: d923a6a
Summary: It collects array accesses from all sub expressions in commands.
Reviewed By: mbouaziz
Differential Revision: D7165098
fbshipit-source-id: 584dc80
Summary: It does not only malloc a new heap memory, but also copy its contents.
Reviewed By: mbouaziz
Differential Revision: D7152194
fbshipit-source-id: 58cba5e
Summary: This is to make sure than the analysis produces the same results independently from the order in which the members of a call cycle are analyzed.
Reviewed By: sblackshear, mbouaziz
Differential Revision: D6881971
fbshipit-source-id: 23872e1
Summary:
Fairly simple approach here:
- If the RHS of an assignment is a frontend-generated temporary variable, assume it transfers ownership to the LHS variable
- If the RHS of an assignment is a program variable, assume that the LHS variable is borrowing from it.
- If we try to access a variable that has borrowed from a variable that is now invalid, complain.
Reviewed By: jeremydubreil
Differential Revision: D7069947
fbshipit-source-id: 99b8ee2
Summary:
At function calls, it copies a subset of heap memory that is newly
allocated by callees and is reachable from the return value.
Reviewed By: mbouaziz
Differential Revision: D7081425
fbshipit-source-id: 1ce777a
Summary:
Before D7100561, the frontend translated capture-by-ref and capture-by-value in the same way.
Now we can tell the difference and report bugs in the capture-by-value case.
Reviewed By: jeremydubreil
Differential Revision: D7102214
fbshipit-source-id: e9d3ac7
Summary:
The `may_last_field` boolean value in the `decl_sym_val` function presents that the location *may* (not *must*) be a flexible array member.
By the modular analysis nature, it is impossible to determine whether a given argument is a flexible array member or not---because of lack of calling context. For example, there are two function calls of `foo` below: (2) passes a flexible array member as an argument and (1) passes a non-flexible array, however it is hard to notice when analyzing the `foo` function.
```
struct T {
int c[1];
};
struct S {
struct T a;
struct T b;
};
void foo(struct T x) { ... }
void goo () {
struct S* x = (struct S*)malloc(sizeof(struct S) + 10 * sizeof(int));
foo(&(x->a)); // (1)
foo(&(x->b)); // (2)
}
```
We assume that any given arguments may stem from the last field of struct, i.e., flexible array member. (This is why `decl_sym_val` is called with `may_last_field:true` at the first time.) With some tests, we noticed that the assumption does not harm the analysis precision, because whether regarding a parameter as a flexible array member or not is about using a symbolic array size instead of a constant array size written in the type during the analysis of callee. Therefore still it can raise correct alarms if the actual parameter is given in its caller.
Reviewed By: mbouaziz
Differential Revision: D7081295
fbshipit-source-id: a4d57a0
Summary:
Switch to the current stable branch for clang.
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D7067890
fbshipit-source-id: aedff90
Summary:
You can capture a variable by reference in a lambda, assign to it, and then invoke the lambda.
This looks like a dead store from the perspective of the current analysis.
This diff mitigates the problem by computing an additional analysis that tracks variables captured by ref at each program point.
It refuses to report a dead store on a variable that has already been captured by reference.
Later, we might want to incorporate the results of this analysis directly into the liveness analysis instead of just using it to gate reporting.
Reviewed By: jeremydubreil
Differential Revision: D7090291
fbshipit-source-id: 25eeffa
Summary:
It supports flexible array member using the following heuristic:
- a memory for a class is allocated by `malloc(sizeof(C) + n * sizeof(T))` format
- the last field of the class is an array
- the static size of the last field is one, i.e., `T field_name[1]`
When allocating and initializing members of classes, it sets the size of flexible array to `n+1` if the above conditions are met.
Reviewed By: mbouaziz
Differential Revision: D7056291
fbshipit-source-id: 31c5868
Summary:
The semantics of "placement new" is defined simply as an assignment.
For example, `C* x = new (y) C();` is analyzed as if `C* x = y;`.
Reviewed By: mbouaziz
Differential Revision: D7054007
fbshipit-source-id: 1c6754f
Summary:
The struct fields in Cil have been sorted for long time, however the
checkers do not seem to depend on the sortedness.
Reviewed By: sblackshear
Differential Revision: D7027858
fbshipit-source-id: 9e7ab96
Summary:
This commit improves precision of symbol instantiations.
When a return value of a callee is `[s1 + s2, _]` and if we want to
instantiate `s1` to `c3 + max(c4, s5)`, the lower bound was
substituted to `-oo` because our domain cannot express `c3 + max(c4,
s5) + s2`.
However, we can have instantiations that are preciser than `-oo`:
(1) `c3 + c4 + s2`
(2) or `c3 + s5 + s2`
because they are smaller than the ideal instantiation, `c3 + max(c4,
s5) + s2` and it is on the lower bound position.
For now, the implementation instantiates to (1) between the two ones,
because constant values introduced by `assert` or `assume`(`if`)
command are often used as safety conditions, e.g., `assert(index >=
0);` can place before array accesses. (We can change the stratege
later if we find that it doesn't work on some other cases.)
Reviewed By: mbouaziz
Differential Revision: D7020063
fbshipit-source-id: 62fb390
Summary:
A simple intraprocedural analysis that tracks when a storage location is read or deleted.
For now, this works only with local variable storage locations; field and array accesses are ignored.
In order to test this, I added a new "use-after-lifetime" warning. It complains when a variable is read or deleted after it has already been deleted.
Reviewed By: jeremydubreil
Differential Revision: D6961314
fbshipit-source-id: 75e95a2
Summary: We do not inject a destructor call if the destructor declaration does not contain a body in AST. We miss all the cases where the destructor is declared in `.h` file and defined in `.cpp` file as other files include `.h` file and do not contain the body of the destructor when destructor calls are being injected based on AST information. After this diff we inject destructor calls even if we do not have body for the destructor in AST.
Reviewed By: sblackshear
Differential Revision: D6796567
fbshipit-source-id: 1c187ec