Summary:
Turns out the mistake was pretty simple: we just forgot to keep the
history of the return value in the callee and add it to the caller's.
Reviewed By: skcho
Differential Revision: D28385941
fbshipit-source-id: 40fe09c99
Summary:
This diff avoids dereference of C struct, in its frontend and its semantics of Pulse. In SIL, C
struct is not first-class value, thus dereferencing on it does not make sense.
Reviewed By: ezgicicek
Differential Revision: D27953258
fbshipit-source-id: 348d56338
Summary: Added a new issue type for sending a message to nil when its return type is non-POD. To distinguish these issues from other nullptr dereference issues, we extend the `MustBeValid` attribute to contain the reason of why an address must be valid. For now a reason can only have `SelfOfNonPODReturnMethod` as it's value, but in the future we will use it for other nullability issue types, such as nil insertion into collections.
Reviewed By: jvillard
Differential Revision: D27762333
fbshipit-source-id: 689e5a431
Summary:
Before this diff, TOPL had 3 implementations:
1. a post-processing of biabduction summaries
2. a post-processing of pulse summaries
3. a deep embedding in pulse
1 and 2 additionally require instrumenting SIL to generate monitors for
the TOPL properties. 3 is faster than both 1 and 2, by a good lot, and
doesn't require instrumenting the SIL code. Thus, delete 1 and 2!
Also harmonise the CLI so that TOPL is activated by --topl, which
actives it as a checker, like other analyses.
Reviewed By: rgrig
Differential Revision: D27270178
fbshipit-source-id: e86cf972b
Summary:
See updated tests and code comments: this changes many arithmetic
operations to detect when a contradiction "p|->- * p=0" is about to be
detected, and generate a latent issue instead. It's hacky but it does
what we want. Many APIs change because of this so there's some code
churn but the overall end result is not much worse thanks to monadic
operators.
Reviewed By: skcho
Differential Revision: D26918553
fbshipit-source-id: da2abc652
Summary:
This first commit introduces test cases and the new summary type, in
particular how it is propagated during function calls. We don't yet
actually generate these summary types, this is for the next diff.
The goal is to catch this pattern:
```
foo(p) {
if(p) {}
*p = 42;
}
goo() { foo(NULL); }
```
We went foo(p) to be a latent error when p=0. Right now we detect a
contradiction p|->- * p=0 |- false. The next diff will fix it.
Reviewed By: skcho
Differential Revision: D26918552
fbshipit-source-id: 6614db17b
Summary:
This changes the results. I think it's because we cut short paths to
ISL errors sooner now, before they are duplicated and moved. I could not
really assess what was going on though so could be wrong.
On OpenSSL 1.0.2d:
Before: 106 issues
After: 90 issues
Reviewed By: ezgicicek
Differential Revision: D26822331
fbshipit-source-id: e861e7fc2
Summary:
This will enable further improvements: basically we want to be able to
abort the symbolic execution of a single disjunct whenever an error is
detected. Right now there is only one kind of error, which is now
explicitly called `ReportableError`.
The next diff refactors Pulse.ISL to add its own error type so that we
are able to get rid of the isl_status field (ISLOk/ISLError) inside
abductive states. ISLError states are really `Error _` states but
previously it would have been too much of an API change to expose that.
Now it's all going to be part of `AccessResult.t`.
A further change will add another error type for when a value is found
to be 0 after the fact by the arithmetic.
Reviewed By: ezgicicek
Differential Revision: D26821178
fbshipit-source-id: 2923db8e7
Summary:
Pretty minor, it's more convenient to make it return the state and will
be used in a later diff when that function will actually sometimes
modify the state.
Reviewed By: skcho
Differential Revision: D26488376
fbshipit-source-id: a21eaf008
Summary: This diff revises the trace generation of the uninitialized value checker, by introducing a new diagnostics for it.
Reviewed By: jvillard
Differential Revision: D25433775
fbshipit-source-id: 1279c0de4
Summary:
There was a bug where we forgot to mark these values as reachable. In
particular we would forget their arithmetic value as a result.
For example, now we remember that the array access is at an index equal
to 5 in the summary of this function:
```
foo(int a[]) {
a[5] = 42;
}
```
Reviewed By: skcho
Differential Revision: D25430468
fbshipit-source-id: 4acf09842
Summary:
Address a long-standing embarassing TODO in a minimal way: array indices
are values and when applying a summary we didn't actually bother
translating callee values to caller values. Fix that in a simple way by
just using the current mapping between callee and caller values and
otherwise freshen callee values to avoid clashes with caller values.
Reviewed By: ezgicicek
Differential Revision: D25424013
fbshipit-source-id: 03ca59b9f
Summary:
I wrote an entire diff trying to fix the "bug" that this wasn't needed
so I think this warrants a comment ;)
Reviewed By: ezgicicek
Differential Revision: D25423958
fbshipit-source-id: 414038e40
Summary:
This diff supports inter-procedural uninit analysis in pulse.
* Added `MustBeInitialized` attribute to pre state when an address is read
* Remove `Uninitialized` attribute when callee has `WrittenTo` for the
same address
Reviewed By: jvillard
Differential Revision: D25368492
fbshipit-source-id: cbc74d4dc
Summary:
Use the new module to represent both Sat/Unsat from Pulse formulas, and
FeasiblePath/InfeasiblePath from PulseReport.
Reviewed By: jberdine
Differential Revision: D25277566
fbshipit-source-id: 9f8412ca9
Summary:
Until then we mostly ignored aliasing constraints added by callees,
except some of the cases where the aliasing was incompatible with the
current heap. But, we should add `v_caller = v_caller'` any time both
of these (caller) variables are equal to the same callee variable.
These situations are hard to create at the moment since all values in
the pre-condition heap are created distinct and never change. The next
diff introduces canonicalisation of states and merges equal variables,
thus needs this change.
Reviewed By: skcho
Differential Revision: D25092213
fbshipit-source-id: 9fa7b8b53
Summary:
PulseTopl.large_step is now implemented
All active tests are migrated now to topl-in-pulse.
Reviewed By: jvillard
Differential Revision: D25179556
fbshipit-source-id: dc1136bab
Summary:
A Topl "small step" is a call to a method that is of interest to the
automaton. When such a call of interest is made, the topl component of
PulseAbductiveDomain.t is updated. This means that intra-procedural
Topl should now work entirely inside Pulse, without instrumenting Sil.
Main TODOs:
- add error extraction
- implement inter-procedural (PulseTopl.large_step)
Reviewed By: jvillard
Differential Revision: D25028286
fbshipit-source-id: e31a96d13
Summary:
When a procedure is called, we must evolve the topl component of the
PulseAbductiveDomain. This commit just inserts a call to a dummy
PulseTopl.large_step in the right place. The [large_step] function still
needs to be done.
Reviewed By: jvillard
Differential Revision: D24980825
fbshipit-source-id: 0eb280145
Summary:
Put hooks into Pulse for a faster Topl:
- done: PulseAbductiveDomain now tracks a Topl state
- todo: PulseTopl needs some transfer function (now they're dummies)
Reviewed By: jvillard
Differential Revision: D23815497
fbshipit-source-id: f3f0cf9ef
Summary:
We deliberately stopped as soon as an error was detected when applying a
function call. This is not good as other pre/posts of the function may
apply cleanly, which would allow us to cover more behaviours of the
code.
Went on a bit of a refactoring tangeant while fixing this, to clarify
the `Ok None`/`Ok Some _`/`Error _` datatype returned by PulseInterproc.
Now we report errors as soon as we find them during function calls but
continue accumulating specs afterwards.
Reviewed By: da319
Differential Revision: D24888768
fbshipit-source-id: d5f2c29d7
Summary:
Communicate new facts from the arithmetic domain to the memory domain to
detect contradictions between the two.
Reviewed By: jberdine
Differential Revision: D24832079
fbshipit-source-id: 2caf8e9af
Summary:
Take another page from the Incorrectness Logic book and refrain from reporting issues on paths unless we know for sure that this path will be taken.
Previously, we would report on paths that are merely *not impossible*. This goes very far in the other direction, so it's possible we'll want to go back to some sort of middle ground. Or maybe not. See the changes in the tests to get a sense of what we're missing.
Reviewed By: ezgicicek
Differential Revision: D24014719
fbshipit-source-id: d451faf02
Summary: When we evaluate lambdas in pulse, we create a closure object with `fake` fields to store captured variables. However, during the function call we were not linking the captured values from the closure object. We address this missing part here.
Reviewed By: jvillard
Differential Revision: D23316750
fbshipit-source-id: 14751aa58
Summary:
It's typically used inside another ~fold argument and it gets too
verbose.
Reviewed By: da319
Differential Revision: D22846501
fbshipit-source-id: 2fdd4271f
Summary:
When applying function summaries, we are careful not to violate the
summary's assumptions about non-aliasing. For example, the summary we
generate for `foo(x,y) { *x = *y; }` will have `x` and `y` be allocated
to two different `AbstractValue.t` in the heap, representing
disjointness.
However, the current logic is too coarse and also rejects passing the
same pure value to functions that made no assumption about them being
equal or different, eg `goo(int x,int y) { int z = x + y; }`. This is
because the corresponding `AbstractValue.t` are different in the
callee's summary, but are represented by only one same value in callers
such as `goo(i,i)`.
This diff restricts the "don't violate aliasing" condition to only
consider heap-allocated values. This is consistent with separation logic
by the way: we use the implication `x|->- * y|->- |- x≠y`, which is
valid only when both `x` and `y` are both allocated in the heap as in
the left-hand-side of `|-`.
Reviewed By: skcho
Differential Revision: D22574297
fbshipit-source-id: 206a18499
Summary:
This one is observed to be more memory efficient. Intuitively, maps need
to be re-allocated more often than lists for balancing. In pulse, we'll
often only ever add new values, in increasing order (when they are fresh
variables created as we symbolically execute the program), which pushes
maps into their worst-case allocation pattern. At least I suspect that's
what happens. With lists, this case is handled much better as lists are
not re-allocated when adding elements.
This is somewhat confirmed by benchmarking and observing GC stats.
Reviewed By: skcho
Differential Revision: D22140908
fbshipit-source-id: 29815112f
Summary:
List of things happening in this unreviewable diff:
- moved PulsePathCondition to PulseSledge
- renamed --pulse-path-conditions to --pudge
- PulsePathCondition now contains all the arithmetic of pulse
(inferbo+concrete intervals+pudge). In particular, moved arithmetic
attributes into PulsePathCondition.t. PulsePathCondition plays the
role of PulseArithmetic (combining all domains).
- added tests for a false positive involving free()
- PulseArithmetic is now just a thin wrapper around PulsePathCondition
to operate on states directly (instead of on path conditions).
- The rest is mostly moving code into PulsePathCondition (eg, from
PulseInterproc) and adjusting it.
Reviewed By: jberdine
Differential Revision: D21332073
fbshipit-source-id: 184c8e0a9
Summary:
Add a new data structure and use it for the map of memory accesses to
limit the number of destinations reachable from a given address. This
avoids remembering details of each index in large arrays, or even each
field in large structs.
Reviewed By: skcho
Differential Revision: D18246091
fbshipit-source-id: 5d3974d9c
Summary:
The idea was to keep track of why we know certain facts but actually
these traces are never read. Other arithmetic facts (BoItv and the path
condition) don't have histories so remove them from concrete intervals
too.
Reviewed By: dulmarod
Differential Revision: D21303353
fbshipit-source-id: eecf07b05
Summary:
Computing sledge's equality relation and normalising terms is costly. We
can avoid doing that most of the time by keeping the sledge path
condition lazily evaluated and only forcing it down to a value at two
critical points in the analysis:
1. Summary creation, to avoid storing unsatisfiable pre/posts that will have
to be needlessly executed by callers. This also saves us from having
to serialise the closures involved in the uncomputed form of lazy
values inside the pulse summaries.
2. Before reporting errors we check in the state is in fact satisfiable.
If not we just prune it away at that point.
This yields ~4x speedup on some targets.
Reviewed By: ezgicicek
Differential Revision: D21129759
fbshipit-source-id: a75fdd3bc
Summary:
Add a path condition to each symbolic state, represented in sledge's arithmetic domain. This gives a precise account of arithmetic constraints. In particular, it is relation and thus is more robust in the face of inter-procedural analysis.
This is gated behind a flag for now as there are performance issues with the new arithmetic.
Reviewed By: jberdine
Differential Revision: D20393947
fbshipit-source-id: b780de22a
Summary:
The "interface" modules define short forms for the internals of pulse
and also serve as a guide of which modules you are supposed to use at
which "level" in the pulse domains (base domain vs abductive domain vs
higher-level PulseOperations.ml). Make sure they are used.
Reviewed By: skcho
Differential Revision: D21022927
fbshipit-source-id: f890df245
Summary:
PulseAbductiveDomain.ml can be split into two distinct parts:
1. The definition of the "abductive domain" itself. This remains in that
file.
2. How to apply a given pre/post pair to the current state (during a
function call). This is about the same size as 1. in terms of lines
of code(!) and is now in PulseInterproc.ml.
Reviewed By: ezgicicek
Differential Revision: D21022921
fbshipit-source-id: 431fe061e