infer_clone

Commit Graph

Author	SHA1	Message	Date
Jules Villard	5ec898a4f3	[pulse] suppress leaks that are not leaks due to pointer arithmetic Summary: This seems to be a popular source of false positives. Reviewed By: skcho Differential Revision: D28576767 fbshipit-source-id: d8d4d60d6	4 years ago
Jules Villard	b7ee374d00	[pulse] values equal to live values are not dead Summary: This fixes a memory leak false positive. When collecting unreachable values we should be careful to take the equality relation into account. Equal values are normally canonicalised but only with respect to "known" equalities. This makes sure variables that are live thanks to the "pruned" equalities are not discarded from the state. Reviewed By: skcho Differential Revision: D28382642 fbshipit-source-id: 2b898d754	4 years ago
Jules Villard	312d4a2c0f	[pulse] change the API of `simplify` to take `~can_be_pruned ~keep` Summary: I should have listened to skcho on D28002725 (`7207e05682`). Needed for the next diff. Reviewed By: skcho Differential Revision: D28121203 fbshipit-source-id: 96c01b141	4 years ago
Jules Villard	7207e05682	[pulse] discard "pruned" atoms that refer to variables outside the pre Summary: See added test: pulse sometimes insisted that an issue was latent even though the condition that made it latent could not be influenced (hence the issue could never become manifest) by callers because it was unrelated to the pre, i.e. it came from a mutation inside the function. In these cases, we want to report the issue straight away instead of keeping it latent. Reviewed By: skcho Differential Revision: D28002725 fbshipit-source-id: ce9e6f190	4 years ago
Jules Villard	e549103d75	[pulse] use term_eqs Summary: Whenever an equality "t = v" (t an arbitrary term, v a variable) is added (or "v = t"), remember the "t -> v" mapping after canonicalising t and v. Use this to detect when two variables are equal to the same term: `t = v` and `t = v'` now yields `v = v'` to be added to the equality relation of variables. This increases the precision of the arithmetic engine. Interestingly, the impact on most code I've tried is: 1. mostly same perfs as before, if a bit slower (could be within noise) 2. slightly more (latent) bugs reported in absolute numbers I would have expected it to be more expensive and yield fewer bugs (as fewer false positives), but there could be second-order effects at play here where we get more coverage. We definitely get more latent issues due to dereferencing pointers after testing nullness, as can be seen in the unit tests as well, which may alone explain (2). There's some complexity when adding term equalities where the term is linear, as we also need to add it to `linear_eqs` but `term_eqs` and `linear_eqs` are interested in slightly different normal forms. Reviewed By: skcho Differential Revision: D27331336 fbshipit-source-id: 7314e127a	4 years ago
Jules Villard	5a363c9b07	[pulse][arith] small normalization improvement Summary: It's better (=possibly more efficient) to take the opportunity to normalize linear terms when we can instead of possibly having to apply the same normalization over and over on individual terms until the next round of proper normalization. Reviewed By: skcho Differential Revision: D27464885 fbshipit-source-id: 0dc01a089	4 years ago
Jules Villard	8602b709ef	[pulse][arith] change bit shifts by a constant factor into multiplications Summary: When we don't know the value being shifted it may help to translate bit-shifting into multiplication by a constant as it might surface linear terms, eg `x<<1` is `2*x`. Reviewed By: skcho Differential Revision: D27464847 fbshipit-source-id: 9b3b5f0d0	4 years ago
Jules Villard	8e9bc54c4a	[pulse][arith] eval constant terms before other simplifications Summary: The simplifications done by `simplify_shallow` are all taken care of by `eval_const_shallow` as well, they just also happen to help when not all of the term is a constant. However, they might be less precise/efficient than in the constant case, in particular in the next diff that translates `x << c` into `x * 2^c` when `c` is constant. Reviewed By: skcho Differential Revision: D27464805 fbshipit-source-id: 452bc6ab1	4 years ago
Jules Villard	d1b3e56574	[pulse] cap the size of literals in formulas Summary: On some pathological examples of crypto primitives like libsodium, later diffs make pulse grind to a halt due to an explosion in the size of literals. This is at least partly due to the fact the arithmetic doesn't operate modulo 2^64. Due to the fact the arithmetic is confused in any case when we reach such large numbers, cap them, currently at 2^128. This removes pathological cases for now, even now on libsodium Pulse is ~5 times faster than before! Take this opportunity to put the modified Q/Z modules in their own files. Reviewed By: jberdine Differential Revision: D27463933 fbshipit-source-id: 342d941e2	4 years ago
Jules Villard	2d83dfdcb0	[pulse] add a term_eqs field to formulas Summary: Just some scaffolding to save a bit of churn from the next diff. Reviewed By: skcho Differential Revision: D27328348 fbshipit-source-id: 4f5bfcc65	4 years ago
Gabriela Cunha Sampaio	68b4b5cc27	[pulse] IsInstanceOf simplification for null obj Summary: Fixing `IsInstanceOf` term simplification for null case. Before, this was only being done if value was known to be null at the moment of the call to `instanceof`. Otherwise, the `IsInstanceOf` term would remain in the formula unnecessarily. Reviewed By: jvillard Differential Revision: D27361025 fbshipit-source-id: 2d958a757	4 years ago
Jules Villard	c07af055eb	[topl] delete shallow implementations in favour of a single Pulse one Summary: Before this diff, TOPL had 3 implementations: 1. a post-processing of biabduction summaries 2. a post-processing of pulse summaries 3. a deep embedding in pulse 1 and 2 additionally require instrumenting SIL to generate monitors for the TOPL properties. 3 is faster than both 1 and 2, by a good lot, and doesn't require instrumenting the SIL code. Thus, delete 1 and 2! Also harmonise the CLI so that TOPL is activated by --topl, which activates it as a checker, like other analyses. Reviewed By: rgrig Differential Revision: D27270178 fbshipit-source-id: e86cf972b	4 years ago
Jules Villard	30de9be354	[pulse] protect against Z exceptions Summary: There could still be divisions by zero, eg in the "mod" case: consider "x mod (1/2)" (doesn't matter what x is). Then we'd check "1/2 =? 0" and since it's false conclude that it's safe to take the modulo... oops! To make things safer, harden `Z` to not throw anymore. Also add a layer of defense in depth by wrapping the functions that do Z/Q operations in another layer of exception catching because we really don't want to crash the entire analysis due to that. Reviewed By: martintrojer Differential Revision: D27262569 fbshipit-source-id: e22187ca0	4 years ago
Jules Villard	36ebf276a3	[pulse] simplify IsInstanceOf inside sub-terms too Summary: Previously we would only simplify when the term is exactly IsInstanceOf, and skip sub-terms. Most of the time this is the case but in the future this could change. Reviewed By: skcho Differential Revision: D27156519 fbshipit-source-id: bd10574e0	4 years ago
Jules Villard	f56f18350d	[pulse] bump base_fuel to 10 to avoid under-normalising formulas Summary: 10 seems better at no visible CPU cost. Not very scientific as this is only one data point, but neither was choosing 5 in the first place. Measurements on OpenSSL using Pulse.ISL: ``` $ time infer --pulse-only --scheduler callgraph -j 2 --pulse-report-latent-issues --pulse-isl | fuel | user time (s) | under-normalisation | latent issues | |------+---------------+---------------------+---------------| | 5 | 163 | 3074 | 160 | | 10 | 158 | 85 | 160 | | 15 | 174 | 32 | 160 | | 20 | 186 | 20 | 160 | ``` Reviewed By: skcho Differential Revision: D27156497 fbshipit-source-id: 1114b8677	4 years ago
Jules Villard	4436265f6b	[pulse] fold linear normalization into normalization Summary: This is a refactoring for a later change. This change alters behaviour slightly to make it less chaotic: instead of normalization doing: """ do normalize(phi) until phi doesn't change anymore normalize(phi): do normalize_linear_part(phi) until this doesn't change phi anymore do other normalizations """ we now do """ do normalize(phi) until phi doesn't change anymore normalize(phi): normalize_linear_part(phi) do other normalizations if linear didn't change """ In particular we no longer spend potentially-quadratic amounts of fuel during normalization. Reviewed By: skcho Differential Revision: D26450391 fbshipit-source-id: 9f63e1a04	4 years ago
Jules Villard	4bcf013859	[pulse] fix some new_eqs propagation issues Summary: - add a pp_new_eq function to help people who want to printf-debug stuff - fix one case where new_eqs were reset to `[]` instead of propagated - do not add to `new_eqs` when nothing changes during normalisation. This avoids duplicated new_eqs that arise from regenerating the linear equality relation multiple times during normalisation. Reviewed By: da319 Differential Revision: D27156042 fbshipit-source-id: 59b093ec8	4 years ago
Ezgi Çiçek	432a970432	[refactor] Remove `then ()` Reviewed By: jvillard Differential Revision: D26978263 fbshipit-source-id: c7c684d4b	4 years ago
Gabriela Cunha Sampaio	e739099a40	[pulse] Model for Java instanceof Summary: Adding support for the Java instanceof operator in Pulse. Reviewed By: jvillard Differential Revision: D26275046 fbshipit-source-id: 8ba608cca	4 years ago
Jules Villard	1a1668f2e1	[pulse] avoid division by zero Summary: Difficult to repro as most of the time other simplifications catch this before we actually get to dividing by zero. Nonetheless... shamecube Reviewed By: da319 Differential Revision: D26758187 fbshipit-source-id: b8718c515	4 years ago
Gabriela Cunha Sampaio	97bce99c03	[pulse] Adding IsInstanceOf predicate Summary: As a first step to support the Java `instanceof` operator, this change allows the path condition to be appended with `IsInstanceOf(var, typ)`. Reviewed By: jvillard Differential Revision: D26664009 fbshipit-source-id: cd19dce83	4 years ago
Jules Villard	e7124511dc	[pulse] use only known facts for variable substitutions Summary: Using more than the "known" part of the arithmetic could accidentally leak "pruned" information into certain facts. I noticed this when adding more term equality reasoning to pulse in another diff. At the moment this has little effect but is still more correct conceptually. Reviewed By: ezgicicek Differential Revision: D26450333 fbshipit-source-id: eb31da344	4 years ago
Jules Villard	abc36fe97f	[pulse] add a bunch of equal and compare functions Summary: This is all dead code but I had to do this to try something else and I don't want to have to do that again :) Reviewed By: skcho Differential Revision: D26022111 fbshipit-source-id: 622ca10b9	4 years ago
Jules Villard	77d508328f	[pulse][formula] swap order of constant and linear sum Summary: It is better for the derived comparison functions to start by comparing the single offset `Q.t` instead of the map. The order of the pair doesn't matter so the easiest way to achieve that is by putting the offset first. Reviewed By: skcho Differential Revision: D26022080 fbshipit-source-id: 874ea5c66	4 years ago
Jules Villard	b5bd85c967	[pulse] quantifier elimination using var_eqs Summary: First stab at quantifier elimination done poorly but fast :) Easy one: when we know "x = y", and we want to keep x but not y, then replace y by x everywhere. Reviewed By: skcho Differential Revision: D25432207 fbshipit-source-id: 81b142b96	4 years ago
Jules Villard	8b2b797136	[pulse] minor rename: eq -> lin_eq Summary: We'll need more kinds of "eq"s at some point. Reviewed By: skcho Differential Revision: D25430933 fbshipit-source-id: 545d1923c	4 years ago
Jules Villard	ab2813e355	[pulse] canonicalize wrt equality relation Summary: When we know `v1 = v2`, canonicalize `v2 -> v1 * v3 -> v2` to `v1 -> v1 * v3 -> v1`. Only do this when creating summaries (and so also when reporting errors) for now. This only takes into account the equality relation between variables for now. It needs to be extended to take into account other ways variables can be equal, eg when two variables are equal to the same constant or the same term. Reviewed By: skcho Differential Revision: D25092158 fbshipit-source-id: 9e589b631	4 years ago
Jules Villard	98b562c844	[pulse][refactor] extract and reuse a `SatUnsat` module Summary: Use the new module to represent both Sat/Unsat from Pulse formulas, and FeasiblePath/InfeasiblePath from PulseReport. Reviewed By: jberdine Differential Revision: D25277566 fbshipit-source-id: 9f8412ca9	4 years ago
Radu Grigore	009f3b651c	[topl] Small steps in Pulse Summary: A Topl "small step" is a call to a method that is of interest to the automaton. When such a call of interest is made, the topl component of PulseAbductiveDomain.t is updated. This means that intra-procedural Topl should now work entirely inside Pulse, without instrumenting Sil. Main TODOs: - add error extraction - implement inter-procedural (PulseTopl.large_step) Reviewed By: jvillard Differential Revision: D25028286 fbshipit-source-id: e31a96d13	4 years ago
Jules Villard	578583f2ab	[pulse] check that new arithmetic facts are consistent with the heap Summary: Communicate new facts from the arithmetic domain to the memory domain to detect contradictions between the two. Reviewed By: jberdine Differential Revision: D24832079 fbshipit-source-id: 2caf8e9af	4 years ago
Jules Villard	e1cadb12b0	[pulse] emit formula of path conditions in json output Summary: Needed for REDOCS. Reviewed By: ngorogiannis Differential Revision: D24568404 fbshipit-source-id: 30fed9879	4 years ago
Jules Villard	7fdb33b710	[pulse] report errors only when the PRUNE nodes along the path are true Summary: Take another page from the Incorrectness Logic book and refrain from reporting issues on paths unless we know for sure that this path will be taken. Previously, we would report on paths that are merely not impossible. This goes very far in the other direction, so it's possible we'll want to go back to some sort of middle ground. Or maybe not. See the changes in the tests to get a sense of what we're missing. Reviewed By: ezgicicek Differential Revision: D24014719 fbshipit-source-id: d451faf02	5 years ago
Jules Villard	b62c3f55b9	[pulse] fix fuel debug message Summary: This would previously print that we ran out of fuel even if we didn't and we simply reached a normal form. Reviewed By: ezgicicek Differential Revision: D23575571 fbshipit-source-id: 37d02ca8d	5 years ago
Radu Grigore	9591276541	[topl] Cheap port to Pulse. Summary: Report errors found by running Topl on top of Pulse, when using --topl-pulse. Topl tests now run on top of Pulse. Reviewed By: jvillard Differential Revision: D23030771 fbshipit-source-id: 8770c2902	5 years ago
Jules Villard	5cceead7ae	[pulse] normalize again when we discover new linear eqs Summary: When normalizing discovers new linear arithmetic facts in `normalize_linear_eqs` we go around once more. Do the same when atoms become linear equalities. Reviewed By: skcho Differential Revision: D23264425 fbshipit-source-id: b355875f3	5 years ago
Jules Villard	50b94dbbd6	[pulse] cleanup arithmetic Summary: Mostly cosmetic except for a change in [solve_eq] to try harder at normalization (improves unit tests!). Add more comments and do minor renamings. Reviewed By: skcho Differential Revision: D23243629 fbshipit-source-id: 55bdaf8a8	5 years ago
Jules Villard	8b23fee8f8	[pulse] refactor Atom.eval_atom Summary: This function had become a bit hard to read and the part about embedded atoms was not very clear and also a bit incomplete (need to handle "= 1" and "≠ 1" too). Reviewed By: skcho Differential Revision: D23242216 fbshipit-source-id: 239fade97	5 years ago
Jules Villard	ecdb153579	[pulse] streamline atom normalization Summary: This does a bunch of things at once (sorry): - Refactor atom/term normalisation so that terms that are really just atoms become atoms. - Use this to not bother adding special cases in the functions exported in the .mli: `and_less_than`, `and_equal_binop`, `prune_binop`, etc. all had special cases to avoid introducing terms that could be atoms. That's not great because the same smarts wasn't applied to terms that would only become atom-like after some normalisation, and led to weird and duplicated code. Now it's much cleaner: just add the most straightforward fact and normalise! - Fix a bug: adding a new equality `x = linear` should not be done using `Normalizer.merge_var_linarith` as this is an internal function that assumes that `x` is the right representative in `x - linear`. Instead, for arbitrary equations of that form, `solve_eq` should be used. - When `normalize_linear_eqs` discovers new linear equalities, normalize again. Add fuel there too to avoid spending too much time doing that. It could be that we don't need/want fuel there but then we'd need to think very hard about why there's no infinite recursion possible and that seems harder. Reviewed By: skcho Differential Revision: D23241282 fbshipit-source-id: e5b8c4759	5 years ago
Jules Villard	7df30b0c4e	[pulse] preserve physical equality on var subst in LinArith Summary: This is used for variable substitution and will often be a no-op when normalising terms over and over again (after the first normalisation, the expression should stay the same). The equivalent function for terms was already being careful about not re-allocating identical terms so extend that care to linear expression. Reviewed By: skcho Differential Revision: D23241601 fbshipit-source-id: b365eb87a	5 years ago
Jules Villard	eb37d2ced5	[pulse] substitute entire linear expressions Summary: This allows further normalisation now that terms contain linear expressions in normal form. Reviewed By: skcho Differential Revision: D23241499 fbshipit-source-id: f8e4e759c	5 years ago
Jules Villard	36af901d79	[pulse] normalize any linear atom Summary: Linear arithmetic is able to simplify more atoms, eg `x+y <= x+y` becomes `True` by normalising to "lhs - rhs <= 0". This does the first step of normalisation, but to get True in this example we also need to substitute inside atoms according to the linear equalities, which is the next diff (for now we only substitute variables inside atoms for other variables or for constants). Reviewed By: skcho Differential Revision: D23241457 fbshipit-source-id: 0da0b545c	5 years ago
Jules Villard	69995cebb6	[pulse] add a Linear variant to terms Summary: More scaffolding, nothing creates `Linear _` terms yet. Some changes to variables substitution to allow substituting variables for linear terms (as well as constants and other variables). Reviewed By: skcho Differential Revision: D23241461 fbshipit-source-id: fc870255e	5 years ago
Jules Villard	45894a7dd9	[pulse] move LinArith before Term Summary: This is needed for the rest of the stack that introduces a `Linear of LinArith.t` variant in `Term.t` to enable more normalisation inside of terms. Reviewed By: skcho Differential Revision: D23241353 fbshipit-source-id: ad765cd13	5 years ago
Jules Villard	1d56705cd4	[pulse] evaluate all constant expressions Summary: Make term simplification a bit more structured and separate the "simplification" phase from the "evaluating constant expressions" phase. Also implement the latter for all possible terms. Reviewed By: skcho Differential Revision: D23241334 fbshipit-source-id: 2964aa477	5 years ago
Jules Villard	bcba7c8475	[pulse][minor] moving some arithmetic stuff around Summary: Not much to see here, extracted to make further changes more readable. Reviewed By: da319 Differential Revision: D23241335 fbshipit-source-id: 81181f23a	5 years ago
Jules Villard	af64d5dafe	[pulse] detect when atoms become linear arithmetic Summary: Since this is where almost all of the reasoning is concentrated, let's make sure we use it at every opportunity! Reviewed By: skcho Differential Revision: D23194224 fbshipit-source-id: fedb2811e	5 years ago
Jules Villard	3e7bf4343b	[pulse] make unit tests more robust to adding more tests Summary: Reset the state before each test so that adding tests doesn't affect other tests by shifting the ids of their anonymous variables. Reviewed By: skcho Differential Revision: D23194171 fbshipit-source-id: 7b717f160	5 years ago
Jules Villard	6fae5f641e	[pulse] change constants to be rationals Summary: These are the only ones we need, it turns out the other types (string, proc names, ...) were dead code. This changes the integer constants to rational constants, to match the domain of the linear arithmetic engine. Reviewed By: skcho Differential Revision: D23164136 fbshipit-source-id: 755c3f526	5 years ago
Jules Villard	0433e9592e	[pulse] new new arithmetic Summary: Instead of alternating between a normal form and a tree structure, always keep a normal form. Except the normal form is not always fully normalized. Overall, it's a bit faster than the previous iteration, while being more precise! In particular, linear arithmetic aims at being much more complete. Reviewed By: skcho Differential Revision: D23134209 fbshipit-source-id: 5f9ec6ece	5 years ago
Jules Villard	7b743ceb1a	[pulse][formula] forget dead facts Summary: At the end of analysing a procedure we call `simplify ~keep:vars_live_in_pre_post`. Any variable not in `vars_live_in_pre_post` is not mentioned anywhere else in the state and therefore is not going to contribute constraints in callers of the procedure (in other words: they're dead). We want to also forget arithmetic facts about these variables as this is a good opportunity to make the path condition smaller, sometimes by a lot! The main issue is that dead variables may be useful intermediate terms in the formula, eg trying to keep only facts about `x` in `y = x + 1 && y = 0` is going to lose a lot of precision. But, if a variable not in `keep` is only mentioned in a simple atom `z = 42`, for example, it's safe to forget about it, eg it's safe to remember only `x=0` in `x=0 && z=42` (if only `x` is live). In other words, we can get rid of all atoms containing variables not transitively involved in other atoms that eventually involve live variables. A graph problem! This is guaranteed not to forget anything important and can still trim a lot of atoms in certain situations. Reviewed By: skcho Differential Revision: D22921313 fbshipit-source-id: 6d5db7cbe	5 years ago
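
The "graph problem" described in the last commit above (7b743ceb1a, forget dead facts) can be sketched as a reachability computation. This is only an illustration of the idea in Python, not the OCaml code from that commit: the function name `forget_dead_atoms` and the representation of an atom as a `(label, variables)` pair are assumptions made for this sketch.

```python
from collections import deque


def forget_dead_atoms(atoms, keep):
    """Return the atoms transitively connected to a variable in `keep`.

    `atoms` is a list of (label, variables) pairs (a hypothetical
    representation). Two variables are linked when they occur in the
    same atom; an atom survives if it is reachable from a live variable.
    """
    # Index: variable -> indices of the atoms that mention it.
    atoms_of_var = {}
    for i, (_, variables) in enumerate(atoms):
        for v in variables:
            atoms_of_var.setdefault(v, []).append(i)

    reachable = set(keep)   # variables known to matter
    kept = set()            # indices of atoms to retain
    worklist = deque(keep)
    while worklist:
        v = worklist.popleft()
        for i in atoms_of_var.get(v, []):
            if i in kept:
                continue
            kept.add(i)
            # Every variable sharing an atom with a live one becomes relevant.
            for w in atoms[i][1]:
                if w not in reachable:
                    reachable.add(w)
                    worklist.append(w)
    # Preserve the original order of the surviving atoms.
    return [atoms[i] for i in sorted(kept)]


# `z = 42` is unrelated to the live variable `x`, so it is dropped.
simple = [("x = 0", {"x"}), ("z = 42", {"z"})]
simple_result = forget_dead_atoms(simple, {"x"})

# `y` is dead but links `x` to `y = 0`, so both atoms about `y` are kept.
chained = [("y = x + 1", {"x", "y"}), ("y = 0", {"y"}), ("z = 42", {"z"})]
chained_result = forget_dead_atoms(chained, {"x"})
```

With these inputs, `simple_result` keeps only `x = 0`, while `chained_result` keeps `y = x + 1` and `y = 0` because `y` is an intermediate term connecting them to the live variable `x`.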


54 Commits (5ec898a4f3a3f390fb2f87f7c2e16affa241b78e)