Summary:
This diff differentiates proof obligations by allocsites. Sizes and
offsets of arrays were joined when making proof obligations.
Reviewed By: mbouaziz
Differential Revision: D14163149
fbshipit-source-id: cb6608c16
Summary:
This diff adds a constant to the set of widening thresholds if the
constant is compared to an abstract value in condition expressions.
Each abstract value has its own set of thresholds.
Reviewed By: mbouaziz
Differential Revision: D14147150
fbshipit-source-id: ca0db34d4
Summary:
In this diff, it avoids a precision-losing pruning, which was needed
to keep effects of assume commands.
```
unsigned int c = a + b; // (1)
if (c > 0) { // (2)
char result[c];
result[c - 1] = 0; // (4)
}
```
For example, in the example, `c` is assigned by `[a+b,a+b]` at (1),
then it tried to prune the lower bound of `c` to 1 at (2) while losing
precision, in order to say `c - 1` at (4) is safe in terms of integer
underflow. Instead, it could not say that `c - 1` is smaller than `c`
in the buffer access, because the former is analyzed to `[0,a+b-1]` and
the latter `[1,a+b]` at (4).
Now, the situation has changed. By adopting conditional proof
obligation (D13749914), the FP of integer overflow can be suppressed
without the precision-losing pruning.
Reviewed By: mbouaziz
Differential Revision: D14122770
fbshipit-source-id: 634744e99
Summary: Since Inferbo's current min/max domain can keep only a single symbol, e.g. min(constant, symbol) , it loses precision when trying to prune a value including multiple symbols.
Reviewed By: ezgicicek
Differential Revision: D14099399
fbshipit-source-id: 71d677b75
Summary: Record where each symbol in a polynomial is coming from: either a loop, function call or a modeled call.
Reviewed By: mbouaziz
Differential Revision: D14047420
fbshipit-source-id: 56d0bd926
Summary: It keeps alias of simple plus/minus arithmetic in order to pruning the value of "++i" expression.
Reviewed By: mbouaziz
Differential Revision: D14080230
fbshipit-source-id: d3af32a32
Summary:
- There is no need to use AI to compute a dot product: let's just fold over all nodes, but still do it in order (using the WTO) to report at the right place
- The previous version was computing a dot product on nodes for each node, which was quadratic, the new version is linear
- Report only once, the first time the threshold is reached (if in a loop, report at the loop head)
Reviewed By: ddino
Differential Revision: D14028171
fbshipit-source-id: b4a840c6e
Summary:
If the result of `fgets` is not-null, the length of `s` should be
bigger than or equal to 1.
Reviewed By: mbouaziz
Differential Revision: D13939994
fbshipit-source-id: 298fe33f4
Summary:
It adds a semantics for evaluation of abstract location of literal
string, which was missing.
Reviewed By: mbouaziz
Differential Revision: D13915265
fbshipit-source-id: 741df843b
Summary:
This diff extends the abstract domain to keep binary conditions on
prunings, so Inferbo can suppress more proof obligations (i.e., false
positives) that are known to be unreachable according to the binary
conditions.
Depends on D13729600
Reviewed By: mbouaziz
Differential Revision: D13749914
fbshipit-source-id: 314f048f1
Summary:
`memcpy` should copy the contents of the source to the destination.
Depends on D13634754
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D13668414
fbshipit-source-id: cb0ff2010
Summary: It extends the abstract location for C string length, i.e., the first index of the null character in character array.
Reviewed By: mbouaziz
Differential Revision: D13634241
fbshipit-source-id: d2727d5f5
Summary:
This diff prevents deduplications of issues when they have different
conditions to reach.
Reviewed By: mbouaziz
Differential Revision: D13596220
fbshipit-source-id: 95f802edb
Summary: This diff unset powloc and arrayblk values of p when assume(p==Null).
Reviewed By: mbouaziz, jvillard
Differential Revision: D13415366
fbshipit-source-id: a491a957f
Summary:
For abstract values representing one concrete value, create only one symbol instead of two.
Still create two symbols (lb, ub) for abstract values representing multiple concrete values (like array cells).
As a consequence, comparisons of symbolic values are more precise (we can even prove equality). I expect to remove a bunch of FPs.
Another consequence is the disappearance of `.lb` and `.ub` in many reports.
Reviewed By: skcho
Differential Revision: D13072084
fbshipit-source-id: 9bc0b9881
Summary:
It weakens canonical path in order to avoid an explosion of locations when a struct type has pointers to struct.
For example:
```
struct Tree {
struct Tree *root;
struct Tree *left;
struct Tree *right;
};
```
It was able to generate lots of abstract locations before this diff:
```
t->root
t->left
t->right
t->root->left
t->root->right
t->left->root
t->left->right
t->right->root
t->right->left
t->root->left->right
t->root->right->left
t->left->root->right
t->left->right->root
t->right->root->left
t->right->left->root
```
By this diff, pointer fields that have the same type are (unsoundly) canonicalized to the same one. For example,
```
t->root
t->root->left
t->root->right
t->root->left->right
t->root->right->left
```
are canonicalized to `t->root`. This is definitely unsound but I believe it is better than the location explosion by which analyses do not terminate in a reasonable time or giving a fixed limit of depth to the field access.
Reviewed By: mbouaziz
Differential Revision: D13503808
fbshipit-source-id: 867018712
Summary: It weakly updates array when there are more than two contents.
Reviewed By: mbouaziz
Differential Revision: D13318443
fbshipit-source-id: fa740d8b1
Summary:
It materializes symbolic values of function parameters on-demand. The on-demand materialization is triggered when finding a value from an abstract memory and joining/widening abstract memories.
Depends on D13294630
Main idea:
* Symbolic values are on-demand-ly generated by a symbol path and its type
* In order to avoid infinite generation of symbolic values, symbol paths are canonicalized by structure types and field names (which means they are abstracted to the same value). For example, in a linked list, a symbolic value `x->next->next` is canonicalized to `x->next` when the structures (`*x` and `*x->next`) have the same structure type and the same field name (`next`).
Changes from the previous code:
* `Symbol.t` does not include `id` and `pname` for distinguishing symbols. Now, all symbols are compared by `path:SymbolPath.partial` and `bound_end`.
* `SymbolTable` is no longer used, which was used for generating symbolic values with new `id`s.
Reviewed By: mbouaziz
Differential Revision: D13294635
fbshipit-source-id: fa422f084
Summary:
`eval_locs` is like `eval |> get_all_locs` but avoids computing things that aren't necessary.
The goal was not to be an optimization but is needed for Java where array blocks don't have offsets.
Reviewed By: skcho
Differential Revision: D13190939
fbshipit-source-id: 1cc0e6338
Summary: It instantiates not only symbols for bound but also symbols for locations at function calls.
Reviewed By: mbouaziz
Differential Revision: D13231291
fbshipit-source-id: ce23a943b
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52
Summary:
It modifies sizes and offsets of array values on pointer castings.
Currently, it supports only simple castings of pointer-to-integers.
Reviewed By: mbouaziz
Differential Revision: D12920589
fbshipit-source-id: a5ba831b8
Summary:
It enables the translation of casting expression. As of now, it
translates only the castings of pointers to integer types, in order to
avoid too much of change, which may mess the checkers up.
Reviewed By: jvillard
Differential Revision: D12920568
fbshipit-source-id: a5489df24
Summary:
Useful to understand the changes in the pre-analysis, or to inspect the
CFG that checkers actually get.
This means that the pre-analysis always runs when we output the dotty,
but I don't really see a reason why not. In fact, we could probably
*always* store the CFGs as pre-analysed.
Reviewed By: mbouaziz
Differential Revision: D13102952
fbshipit-source-id: 89f3102ec
Summary: It fixes the conditions of the `check` function to address `is_collection_add` cases correctly.
Reviewed By: mbouaziz
Differential Revision: D13081281
fbshipit-source-id: 39ae5ef03
Summary:
When initialising a variable via semi-exotic means, the frontend loses
the information that the variable was initialised. For instance, it
translates:
```
struct Foo { int i; };
...
Foo s = {42};
```
as:
```
s.i := 42
```
This can be confusing for backends that need to know that `s` actually
got initialised, eg pulse.
The solution implemented here is to insert of dummy call to
`__variable_initiazition`:
```
__variable_initialization(&s);
s.i := 42;
```
Then checkers can recognise that this builtin function does what its
name says.
Reviewed By: mbouaziz
Differential Revision: D12887122
fbshipit-source-id: 6e7214438
Summary:
For more deduplications of issues, this diff loosens the condition of
similar bounds. The previous condition of similar bounds was too
strict, so [0,0] and [0,+oo] were not similar.
Depends on D10851762
Reviewed By: mbouaziz
Differential Revision: D10866127
fbshipit-source-id: 4ba912a88
Summary:
It terminates narrowing when new and old states are not comparable.
Since current narrowing does not use meet operations guaranteeing
termination of narrowing, it tries to terminate narrowing more
conservatively.
Reviewed By: mbouaziz
Differential Revision: D12815419
fbshipit-source-id: e8b45199e
Summary: It tries division on minmax value approximately, rather than just returning infinities. For example, `[0,2+min(6,s)] / 2` returns `[0,4]`.
Reviewed By: mbouaziz
Differential Revision: D10867091
fbshipit-source-id: d3f49987b
Summary:
This diff preserves values of offset and index separately, rather than
one value of their addition, because premature addition results in
imprecise FPs by the limited expressiveness of the domain.
Reviewed By: mbouaziz
Differential Revision: D10851393
fbshipit-source-id: 1685ead36
Summary:
It avoids checking integer overflow when it definitely cannot happen.
For example, it does not check integer overflow of addition when one
of parameters is a negative number, or underflow of subtraction when
its first parameter is a positive number.
Reviewed By: mbouaziz
Differential Revision: D10446161
fbshipit-source-id: b8c86e1b2
Summary: We assume multiplication of 1 is safe. It happens sometimes by multiplying `sizeof(char)`.
Reviewed By: mbouaziz
Differential Revision: D10444680
fbshipit-source-id: 2f33be280
Summary: This diff changes pp of binary operation condition in order to avoid a `make test` failure. For the same `uint64_t` type, it is translated to `unsigned long long` in 64bit mac, but `unsigned long` in 64bit linux, which made a `make test` failure.
Reviewed By: mbouaziz
Differential Revision: D10459466
fbshipit-source-id: 449ab548e
Summary:
Fix the logic for computing duplicate symbols. It was broken at some point and some duplicate symbols creeped into our tests. Fix these, and add a test to avoid duplicate symbols detection to regress again.
Also, this removes one use of `Cfg.load`, on the way to removing file-wide CFGs from the database.
Reviewed By: ngorogiannis
Differential Revision: D10173349
fbshipit-source-id: a0d2365b3
Summary: It uses big int, instead of 63bits int of OCaml, in the interval domain in order to get preciser numeric values in the future.
Reviewed By: jvillard
Differential Revision: D10123364
fbshipit-source-id: c217f4366
Summary:
Goal of the stack: deprecate the `--analyzer` option in favour of turning
individual features on and off. This option is a mess: some of the options are
now subcommands (compile, capture), others are aliases (infer and checkers),
and they can all be replicated using some straightforward combination of other
options.
This diff: stop using `--analyzer` in tests. It's mostly `checkers` everywhere,
which is already the default. `linters` becomes `--no-capture --linters-only`.
`infer` is supposed to be `checkers` already. `crashcontext` is
`--crashcontext-only`.
Reviewed By: mbouaziz
Differential Revision: D9942689
fbshipit-source-id: 048281761
Summary:
The model for `getcwd` assumes the first argument should be non-null when in fact a NULL pointer is legitimate and results in allocation:
> As an extension to the POSIX.1-2001 standard, glibc's getcwd() allocates the buffer dynamically using mal‐
> loc(3) if buf is NULL. In this case, the allocated buffer has the length size unless size is zero, when buf
> is allocated as big as necessary. The caller should free(3) the returned buffer.
I suggest this glibc extension be used for the getcwd model to reduce false positives.
Pull Request resolved: https://github.com/facebook/infer/pull/925
Reviewed By: mbouaziz
Differential Revision: D9830450
Pulled By: jvillard
fbshipit-source-id: 95c4862b1
Summary:
It simplifies abstract memory instantiations of function calls. Now it instantiates callee memories by directly evaluating symbol paths, rather than constructing `subst_map`.
main changes are:
- no construction of `subst_map` and `trace_map`
- no symbol table in Inferbo's summary
- no `Symbol_not_found` exception (for when a required symbol was unavailable in `subst_map`)
Reviewed By: mbouaziz
Differential Revision: D9495597
fbshipit-source-id: 18cdcd6f7
Summary: When a typedef-ed structure is defined in another source file, `tenv` returns a structure with empty fields.
Reviewed By: mbouaziz
Differential Revision: D9629200
fbshipit-source-id: 8859803f9
Summary:
It returns unknown values on non-const function calls like on unknown
function calls.
Reviewed By: mbouaziz
Differential Revision: D9478862
fbshipit-source-id: 4b795ec55
Summary: `CONDITION_ALWAYS_**` can be introduced by global constants.
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D9478528
fbshipit-source-id: 7b1a46e7a
Summary:
When we see `pthread_create(..., ..., foo, ...)`, we want to call the function
`foo` to check that its precondition is met. The initial goal was to get rid
of the uncouth call to `Summary.get` when what we really want is to analyse
`foo` instead of just betting on the fact that it has been analysed already.
Besides switching to `Ondemand.analyze_proc_name`, this also changes the
matching of the function pointer in the arguments of `pthread_create()` to
detect the common case of a constant function name. I also added tests.
Reviewed By: jeremydubreil
Differential Revision: D9195159
fbshipit-source-id: dfec79f14
Summary:
It adds relational domains to Inferbo: octagon of Apron and polyhedra of Elina.
- Each Mem domain value includes one relational value containing relations among symbols. The relational values are modified by the `Prune` and `Store` commands.
- Each abstract value includes three symbols, which represent integer value, array offset, and array size of an abstract value.
The relational domain is deactivated by default. Use the `--bo-relational-domain {oct, poly}` option for the activation, though Inferbo with the relational domains does not work at this point because some modifications of Apron and Elina we made has not been applied to their opam repositories yet.
Reviewed By: jvillard
Differential Revision: D8874102
fbshipit-source-id: 08e5883cb
Summary:
Infer does the right thing now, make sure it doesn't regress.
https://github.com/facebook/infer/issues/86
Reviewed By: mbouaziz, dulmarod
Differential Revision: D8442855
fbshipit-source-id: 3df29b88c
Summary:
It adds relational domains to Inferbo: octagon of Apron and polyhedra of Elina.
- Each `Mem` domain value includes one relational value containing relations among *symbols*. The relational values are modified by the `Prune` and `Store` commands.
- Each abstract value includes three *symbols*, which represent integer value, array offset, and array size of an abstract value.
The relational domain is deactivated by default, so this diff should not make any differences in CI.
Use `--bo-relational-domain {oct, poly}` for the activation, though Inferbo with the relational domains does not work at this point because some modifications of Apron and Elina we made has not been applied to their opam repositories yet.
Reviewed By: mbouaziz, jvillard
Differential Revision: D8478542
fbshipit-source-id: 510ff53
Summary: The frontend got better and now `assert` is correctly translated.
Reviewed By: dulmarod
Differential Revision: D8731877
fbshipit-source-id: ba11b0c
Summary: Record that it is not supported: https://github.com/facebook/infer/issues/8
Reviewed By: mbouaziz, dulmarod
Differential Revision: D8442762
fbshipit-source-id: fa271cb
Summary:
I realized that control variable analysis was broken when we had multiple back-edges for the same loop. This is often the case when we have a switch statement combined with continue in a loop (see `test_switch` in `switch_continue.c`) or when we have disjunctive guards in do-while loops.
This diff fixes that by
- defining a loop by its loophead (the target of its backedges) rather than its back-edges. Then it converts back-edge list to a map from loop_head to sources of the loop's back-edges.
- collecting multiple guard nodes that come from potentially multiple exit nodes per loop head
In addition, it also removes the wrong assumption that an exit node belongs to a single loop head.
Reviewed By: mbouaziz
Differential Revision: D8398061
fbshipit-source-id: abaf288