Summary:
The issue type `ZERO_EXECUTION_TIME` actually corresponds to bottom state but has been mistakenly used to mean
- unreachable nodes (program never reaching exit state)
- having zero cost (e.g. for allocations).
Note that, for execution costs, the latter doesn't make sense since we always incur a unit cost for the start node. Hence, a function with empty body will have unit cost. For allocations or IO however, we only incur costs for specific primitives, so a function with no allocations/IO could have a zero cost. However, there is no point reporting functions with zero cost as a specific issue type. Instead, what we want to track is the former, i.e. functions whose cost becomes 0 due to program never reaching exit state.
This diff aims to split these cases into two by only reporting on the latter and adds traces to bottom/unreachable cost by creating a special category in polynomials.
Next diff will rename `ZERO_XXX` to `XXX_UNREACHABLE_AT_EXIT`.
Reviewed By: skcho
Differential Revision: D20005774
fbshipit-source-id: 46b9abd5a
Summary:
Current domain of Inferbo cannot handle float values. This diff evaluates float constants to the top
interval.
Reviewed By: ezgicicek
Differential Revision: D20116361
fbshipit-source-id: e6e398bbd
Summary: We had no tests that resulted in `ZERO_EXECUTION_COST`. Let's fix that.
Reviewed By: skcho
Differential Revision: D20097504
fbshipit-source-id: 56c23fea0
Summary:
This diff fixes the clang translation for switch statement. It assumed that `default:` comes always
at last, which introduced some unreachable nodes inadvertently, e.g. when `default:` comes at first.
Reviewed By: dulmarod
Differential Revision: D19793138
fbshipit-source-id: 1e8b52c0d
Summary:
Inferbo analyzed some program points unreachable incorrectly, because of unsound semantics of band
operator, which did not handle the case when given parameters are pointer values.
Reviewed By: ngorogiannis
Differential Revision: D19392705
fbshipit-source-id: dd590508c
Summary:
A plugin update allows infer to know when a function doesn't return
according to its attributes. This propagates this info all the way to
the attributes of each function, and then use this information in a new
pre-analysis that cuts the links to successor nodes of each `Call`
instruction to a function that does not return.
NOTE: The "no_return" `CallFlag.t` was dead code, following diffs deal
with that (by removing it).
Reviewed By: dulmarod
Differential Revision: D18573922
fbshipit-source-id: 85ec64eca
Summary:
This also prints the CFGs *after* pre-analysis for individual procedures
in infer-out/captured/<filename>/<proc>.dot. One can also look up the
CFGs before pre-analysis in infer-out/captured/proc_cfgs_frontend.dot.
Context: I want to add a pre-analysis that needs to look at proc
attributes inter-procedurally. For this to make sense it has to happen
*after* all of capture, and before analysis.
Thus, this diff brings back the lazy running of the pre-analysis like in
D15803492, except that we still make sure to run the pre-analyses
systematically regardless of the checkers being run by running the
pre-analysis from ondemand.ml. Also we don't need to re-introduce the
"did_preanalysis" proc attribute for the same reason that the
pre-analysis is now run once and for all by ondemand.ml (instead of each
individual checker back in the days).
This has the benefit of running the pre-analysis only when needed, and
the drawback that several concurrent processes analysing the same proc
descs will duplicate work. Since pre-analyses are supposed to be very
fast I assume that neither is a big deal. If they become more expensive
then the benefit gets bigger and the drawback is just the same as with
regular analyses.
Reviewed By: skcho
Differential Revision: D18573920
fbshipit-source-id: de350eaef
Summary:
- more flexible API
- less error-prone thanks to named parameters
- also takes care of adjusting predecessors of the previous successors!
This fixes some (probably harmless) bugs in the frontends.
Reviewed By: dulmarod
Differential Revision: D18573923
fbshipit-source-id: ad97b3607
Summary:
This diff tries to make less imprecise division by constants results.
For example, the results of the division `[l, u] / c`, where `c` is a positive constant, are:
1. If `l/c` or `u/c` is representable in the bound domain, it uses the precise bounds, i.e., `[l/c, u/c]`.
2. If it is not representable, it tries to make conservative results:
if `0<=l<=u`, it returns `[0, u]` because `0 <= [l/c, u/c] <= u`
if `l<=u<=0`, it returns `[l, 0]` because `l <= [l/c, u/c] <= 0`
if `l<=0<=u`, it returns `[l, u]` because `l <= [l/c, u/c] <= u`
3. otherwise, it returns top, `[-oo, +oo]`
Reviewed By: ezgicicek
Differential Revision: D18270380
fbshipit-source-id: 8fb14c0e4
Summary:
This diff revises widening functions of bounds that have a linear form and a min/max form.
For example, for lower bounds,
* 3 ▽ (1+min(2, x)) = (1+min(2, x))
* 3+x ▽ (3+min(2, x)) = (3+min(2, x))
Reviewed By: jvillard
Differential Revision: D17420786
fbshipit-source-id: ff9eebed3
Summary:
This diff ignores character symbols in the cost results, in order to
avoid FPs from parser code.
Reviewed By: ezgicicek
Differential Revision: D17132053
fbshipit-source-id: d9cf8bd26
Summary:
This diff avoids that an integer value is pruned to the bottom by
comparing to a pointer.
For example, before this diff,
assume((int*)x == p);
assume((int*)x != p);
where x is an integer, x is pruned to the bottom in both of the assume
cases. So, there were some, unintentional and false, unreachable
code.
Depends on D16960199
Reviewed By: ezgicicek
Differential Revision: D16964735
fbshipit-source-id: 90a3c8c80
Summary: Functions with empty body have unit cost, not zero. The unit cost comes from the start node.
Reviewed By: skcho
Differential Revision: D16855642
fbshipit-source-id: 6b5181faf
Summary: We want to keep big O notation as simple as possible in cost analysis reports (especially in diff time). Therefore, let's not show constants/min/max in big O notations even though the resulting asymptotic bound might be inaccurate. Developers can click on the trace and see the actual cost.
Reviewed By: skcho
Differential Revision: D16731351
fbshipit-source-id: 2e16f7eca
Summary: In order to test changes to bigO notation, let's record them in test results.
Reviewed By: skcho
Differential Revision: D16763972
fbshipit-source-id: c1376909b
Summary: Sometimes programmers use integer underflow to get a maximum number of that type. This diff assumes that integer underflows from the syntactical form `(unsigned 0) - constant` is intended by the programmer, and suppresses the alarms of which.
Reviewed By: ezgicicek
Differential Revision: D16560639
fbshipit-source-id: 206f30dbc
Summary:
Before this diff, it gave up pruning of linear bound by minmax bound.
For example, `overapprox_min (x+c1, c2+min(d1,y))` was `x+c1`.
However, we can get a bit more preciser value as follows.
```
overapprox_min (x+c1, c2+min(d1,y))
<= min (x+c1, c2+d1)
= c1+min(c2+d1-c1, x)
```
Reviewed By: ezgicicek
Differential Revision: D16543837
fbshipit-source-id: 8fdbce097
Summary: This diff tries to do weak update for the abstract locations by pointer arithmetic, e.g. `p[n]` or `p+n`, even if the type of `p` is declared as a simple pointer, not an array.
Reviewed By: ezgicicek
Differential Revision: D16458367
fbshipit-source-id: 3b4cdd7e4
Summary:
We were skipping some instructions before and that was a problem for
pulse. See added pulse test.
Reviewed By: mbouaziz
Differential Revision: D16030150
fbshipit-source-id: 9c62e6213
Summary:
- Add allocation costs to `costs-report.json` and enable diffing over allocation costs.
- Also, let's be more consistent and modular in naming our cost issues.
- introduce a generic issue type `X_TIME_COMPLEXITY_INCREASE` where `X` can be one of the cost kinds. If the function is on the cold start, issue can have the `COLD_START` suffix. Similarly for infinite/zero/expensive calls.
- Change `PERFORMANCE_VARIATION` -> `EXECUTION_TIME_COMPLEXITY_INCREASE`
- Add new issue type for `ALLOCATION_COMPLEXITY_INCREASE_COLD_START` which will be enabled by default
- Refactor cost issues to be more modular and succinct. This also makes addition of a new cost kind very easy by adding the kind into the `enabled_cost_kinds` list in `CostKind.ml`
Reviewed By: mbouaziz
Differential Revision: D15822681
fbshipit-source-id: cf89ece59
Summary: In its new form it actually tests that infer takes the correct branch.
Reviewed By: mbouaziz
Differential Revision: D15494297
fbshipit-source-id: 7b9bb8f75
Summary:
- take advantage more structured attributes in the exported AST
- circumvent new format of `if` and `switch`
- a few new features/nodes but nothing major there
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz, martintrojer
Differential Revision: D15453572
fbshipit-source-id: c0c24345f
Summary:
Thanks to the newly added `StarField`, path length is better controlled before ondemand is used.
Hence there is no need to (unsoundly) canonicalize paths then anymore.
Reviewed By: ezgicicek
Differential Revision: D15409716
fbshipit-source-id: 9ea7b4717
Summary:
Some edge case involving casting field pointers to the structure type itself generated arbitrarily long paths when used in a loop.
Without changing the widening, this diff avoids repetitions of fields in paths by abstracting them with a star.
E.g. `x.a.b.c.b` will become `x.a.b.c*.b`, and so will `x.a.b.c.a.b`, `x.a.b.c.c.b`, or `x.a.b.c.b.b`.
Reviewed By: ngorogiannis
Differential Revision: D15352143
fbshipit-source-id: 5ea426c5e
Summary:
update-submodule: facebook-clang-plugins
We used to translate `offsetof` by an unknown value.
This fixes it. It is now translated like an integer literal.
Reviewed By: ddino
Differential Revision: D15317799
fbshipit-source-id: ae89e0ec5
Summary:
Increases precision a bit. I didn't observe speed problems on what I tested. (But, who knows?)
Closes https://github.com/facebook/infer/pull/799
Reviewed By: jvillard
Differential Revision: D6284206
Pulled By: rgrig
fbshipit-source-id: 6f1e8631f
Summary:
Instead of emitting an ad-hoc builtin on variable declaration emit a new
metadata instruction. This allows us to remove the code matching on that
ad-hoc builtin that had to be inserted in several checkers.
Inferbo & pulse used that information meaningfully and had to undergo
some minor changes to cope with the new metada instruction.
Reviewed By: ezgicicek
Differential Revision: D14833100
fbshipit-source-id: 9b3009d22
Summary:
This diff propagates LatestPrune on function calls.
Depends on D14321605
Reviewed By: mbouaziz
Differential Revision: D14321618
fbshipit-source-id: cb2e1b547
Summary:
Given a pointer-typed parameter, Inferbo assumes that it is an array
block. However, when a pointer is given as an actual parameter, it
failed the substitution of the array block value of the parameter, thus
which made some return values to bottom unexpectedly.
This diff revises the substitution of array block, so it can
substitute array block values with actual pointers correctly when it
is possible.
Reviewed By: mbouaziz
Differential Revision: D14663475
fbshipit-source-id: 0477de1ba
Summary:
While adding a footprint frame during rearrangement, the footprint
variables should be fresh with respect to the current state too, not
only with respect to he footprint, because the frame is added to the
state.
Reviewed By: jberdine
Differential Revision: D14401026
fbshipit-source-id: 20ea4485a
Summary: Unknown locations in the alias domain resulted in unexpected unreachable code.
Reviewed By: mbouaziz
Differential Revision: D14339412
fbshipit-source-id: a5dca6489
Summary:
This diff differentiates proof obligations by allocsites. Sizes and
offsets of arrays were joined when making proof obligations.
Reviewed By: mbouaziz
Differential Revision: D14163149
fbshipit-source-id: cb6608c16
Summary:
This diff adds a constant to the set of widening thresholds if the
constant is compared to an abstract value in condition expressions.
Each abstract value has its own set of thresholds.
Reviewed By: mbouaziz
Differential Revision: D14147150
fbshipit-source-id: ca0db34d4
Summary:
In this diff, it avoids a precision-losing pruning, which was needed
to keep effects of assume commands.
```
unsigned int c = a + b; // (1)
if (c > 0) { // (2)
char result[c];
result[c - 1] = 0; // (4)
}
```
For example, in the example, `c` is assigned by `[a+b,a+b]` at (1),
then it tried to prune the lower bound of `c` to 1 at (2) while losing
precision, in order to say `c - 1` at (4) is safe in terms of integer
underflow. Instead, it could not say that `c - 1` is smaller than `c`
in the buffer access, because the former is analyzed to `[0,a+b-1]` and
the latter `[1,a+b]` at (4).
Now, the situation has changed. By adopting conditional proof
obligation (D13749914), the FP of integer overflow can be suppressed
without the precision-losing pruning.
Reviewed By: mbouaziz
Differential Revision: D14122770
fbshipit-source-id: 634744e99
Summary: Since Inferbo's current min/max domain can keep only a single symbol, e.g. min(constant, symbol) , it loses precision when trying to prune a value including multiple symbols.
Reviewed By: ezgicicek
Differential Revision: D14099399
fbshipit-source-id: 71d677b75
Summary: Record where each symbol in a polynomial is coming from: either a loop, function call or a modeled call.
Reviewed By: mbouaziz
Differential Revision: D14047420
fbshipit-source-id: 56d0bd926
Summary: It keeps alias of simple plus/minus arithmetic in order to pruning the value of "++i" expression.
Reviewed By: mbouaziz
Differential Revision: D14080230
fbshipit-source-id: d3af32a32
Summary:
- There is no need to use AI to compute a dot product: let's just fold over all nodes, but still do it in order (using the WTO) to report at the right place
- The previous version was computing a dot product on nodes for each node, which was quadratic, the new version is linear
- Report only once, the first time the threshold is reached (if in a loop, report at the loop head)
Reviewed By: ddino
Differential Revision: D14028171
fbshipit-source-id: b4a840c6e
Summary:
If the result of `fgets` is not-null, the length of `s` should be
bigger than or equal to 1.
Reviewed By: mbouaziz
Differential Revision: D13939994
fbshipit-source-id: 298fe33f4
Summary:
It adds a semantics for evaluation of abstract location of literal
string, which was missing.
Reviewed By: mbouaziz
Differential Revision: D13915265
fbshipit-source-id: 741df843b
Summary:
This diff extends the abstract domain to keep binary conditions on
prunings, so Inferbo can suppress more proof obligations (i.e., false
positives) that are known to be unreachable according to the binary
conditions.
Depends on D13729600
Reviewed By: mbouaziz
Differential Revision: D13749914
fbshipit-source-id: 314f048f1
Summary:
`memcpy` should copy the contents of the source to the destination.
Depends on D13634754
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D13668414
fbshipit-source-id: cb0ff2010
Summary: It extends the abstract location for C string length, i.e., the first index of the null character in character array.
Reviewed By: mbouaziz
Differential Revision: D13634241
fbshipit-source-id: d2727d5f5
Summary:
This diff prevents deduplications of issues when they have different
conditions to reach.
Reviewed By: mbouaziz
Differential Revision: D13596220
fbshipit-source-id: 95f802edb
Summary: This diff unset powloc and arrayblk values of p when assume(p==Null).
Reviewed By: mbouaziz, jvillard
Differential Revision: D13415366
fbshipit-source-id: a491a957f
Summary:
For abstract values representing one concrete value, create only one symbol instead of two.
Still create two symbols (lb, ub) for abstract values representing multiple concrete values (like array cells).
As a consequence, comparisons of symbolic values are more precise (we can even prove equality). I expect to remove a bunch of FPs.
Another consequence is the disappearance of `.lb` and `.ub` in many reports.
Reviewed By: skcho
Differential Revision: D13072084
fbshipit-source-id: 9bc0b9881
Summary:
It weakens canonical path in order to avoid an explosion of locations when a struct type has pointers to struct.
For example:
```
struct Tree {
struct Tree *root;
struct Tree *left;
struct Tree *right;
};
```
It was able to generate lots of abstract locations before this diff:
```
t->root
t->left
t->right
t->root->left
t->root->right
t->left->root
t->left->right
t->right->root
t->right->left
t->root->left->right
t->root->right->left
t->left->root->right
t->left->right->root
t->right->root->left
t->right->left->root
```
By this diff, pointer fields that have the same type are (unsoundly) canonicalized to the same one. For example,
```
t->root
t->root->left
t->root->right
t->root->left->right
t->root->right->left
```
are canonicalized to `t->root`. This is definitely unsound but I believe it is better than the location explosion by which analyses do not terminate in a reasonable time or giving a fixed limit of depth to the field access.
Reviewed By: mbouaziz
Differential Revision: D13503808
fbshipit-source-id: 867018712
Summary: It weakly updates array when there are more than two contents.
Reviewed By: mbouaziz
Differential Revision: D13318443
fbshipit-source-id: fa740d8b1
Summary:
It materializes symbolic values of function parameters on-demand. The on-demand materialization is triggered when finding a value from an abstract memory and joining/widening abstract memories.
Depends on D13294630
Main idea:
* Symbolic values are on-demand-ly generated by a symbol path and its type
* In order to avoid infinite generation of symbolic values, symbol paths are canonicalized by structure types and field names (which means they are abstracted to the same value). For example, in a linked list, a symbolic value `x->next->next` is canonicalized to `x->next` when the structures (`*x` and `*x->next`) have the same structure type and the same field name (`next`).
Changes from the previous code:
* `Symbol.t` does not include `id` and `pname` for distinguishing symbols. Now, all symbols are compared by `path:SymbolPath.partial` and `bound_end`.
* `SymbolTable` is no longer used, which was used for generating symbolic values with new `id`s.
Reviewed By: mbouaziz
Differential Revision: D13294635
fbshipit-source-id: fa422f084
Summary:
`eval_locs` is like `eval |> get_all_locs` but avoids computing things that aren't necessary.
The goal was not to be an optimization but is needed for Java where array blocks don't have offsets.
Reviewed By: skcho
Differential Revision: D13190939
fbshipit-source-id: 1cc0e6338
Summary: It instantiates not only symbols for bound but also symbols for locations at function calls.
Reviewed By: mbouaziz
Differential Revision: D13231291
fbshipit-source-id: ce23a943b
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52
Summary:
It modifies sizes and offsets of array values on pointer castings.
Currently, it supports only simple castings of pointer-to-integers.
Reviewed By: mbouaziz
Differential Revision: D12920589
fbshipit-source-id: a5ba831b8
Summary:
It enables the translation of casting expression. As of now, it
translates only the castings of pointers to integer types, in order to
avoid too much of change, which may mess the checkers up.
Reviewed By: jvillard
Differential Revision: D12920568
fbshipit-source-id: a5489df24
Summary:
Useful to understand the changes in the pre-analysis, or to inspect the
CFG that checkers actually get.
This means that the pre-analysis always runs when we output the dotty,
but I don't really see a reason why not. In fact, we could probably
*always* store the CFGs as pre-analysed.
Reviewed By: mbouaziz
Differential Revision: D13102952
fbshipit-source-id: 89f3102ec