infer_clone

Commit Graph

Author	SHA1	Message	Date
Jules Villard	b20c22a5ee	[pulse] abduce arithmetic facts Summary: This does several things because it was hard to split it more: 1. Split most of the arithmetic reasoning to PulseArithmetic.ml. This doesn't need to be reviewed thoroughly because an upcoming diff changes the domain from just `EqualTo of Const.t` to an interval domain! 2. When going through a prune node intra-procedurally, abduce arithmetic facts to the pre (instead of just propagating them). This is the "assume as assert" trick used by biabduction 1.0 too and allows to propagate arithmetic constraints to callers. 3. Use 2 when applying summaries by pruning specs whose preconditions have un-satisfiable arithmetic constraints. This changes one of the tests! Pulse now does a bit more work to find the false positive, as can be seen in the longer trace. Reviewed By: skcho Differential Revision: D18117160 fbshipit-source-id: af3b2c8c0	5 years ago
Jules Villard	16c88e282d	[pulse] some tests about values Summary: In preparation for improvements to the arithmetic reasoning. Reviewed By: dulmarod Differential Revision: D17977207 fbshipit-source-id: ee98e0772	5 years ago
Jules Villard	6a738045fd	[pulse] interprocedural histories and traces Summary: bigmacro_bender There are 3 ways pulse tracks history. This is at least one too many. So far, we have: 1. "histories": a humble list of "events" like "assigned here", "returned from call", ... 2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah" 3. "traces", which combine one history with one interproc action This diff gets rid of interproc actions and makes histories include "nested" callee histories too. This allows pulse to track and display how a value got assigned across function calls. Traces are now more powerful and interleave histories and interproc actions. This allows pulse to track how a value is fed into an action, for instance performed in callee, which itself creates some more (potentially now interprocedural) history before going to the next step of the action (either another call or the action itself). This gives much better traces, and some examples are added to showcase this. There are a lot of changes when applying summaries to keep track of histories more accurately than was done before, but also a few simplifications that give additional evidence that this is the right concept. Reviewed By: skcho Differential Revision: D17908942 fbshipit-source-id: 3b62eaf78	5 years ago
Jules Villard	669383d315	[pulse] more details about variable declaration events Summary: - add the variable being declared so we can report it back in the trace in addition to its location - distinguish between local vars and formals Reviewed By: skcho Differential Revision: D17930348 fbshipit-source-id: a5b863e64	5 years ago
Jules Villard	96c96a8dc6	[pulse] remember equalities found in branches Summary: When we make the decision to go into a branch "v = N" where some abstract value is compared to a constant, remember the corresponding equality. This allows to prune simple infeasible paths intra-procedurally. Further work is needed to make this useful interprocedurally, for instance either or both of these ideas could be explored: - abduce v=N in the precondition and do not apply summaries when the equalities in the pre are not satisfied - prune post-conditions that lead to unsat states where a value has to be equal to several different constants Reviewed By: skcho Differential Revision: D17906166 fbshipit-source-id: 5cc84abc2	5 years ago
Jules Villard	3ac8e27062	[pulse] use constant equality to prune unfeasible paths Summary: When we know "x = 3" and we have a condition "x != 3" we know we can prune the corresponding path. Reviewed By: skcho Differential Revision: D17665472 fbshipit-source-id: 988958ea6	5 years ago
Jules Villard	362e9cc622	[pulse] do not print `()` after functions Summary: Unfortunately it is very hard to predict when `Typ.Procname.describe` will add `()` after the function name, so we cannot make sure it is always there. Right now we report clowny stuff like "error while calling `foo()()`", which this change fixes. Reviewed By: ezgicicek Differential Revision: D17665470 fbshipit-source-id: ef290d9c0	5 years ago
Ezgi Çiçek	127902222d	[pulse] Filter AddressOfStackVariable from read only heuristic check Reviewed By: skcho Differential Revision: D16518259 fbshipit-source-id: 92a631a82	5 years ago
Ezgi Çiçek	09ab685c7e	[pulse] Handle stack refs escaping their scope via pointer Summary: Pulse didn't treat local variables going out of scope as invalidating the corresponding address in memory. This diff fixes that by - marking all local variables that exits the scope with the attribute `AddressOfStackVariable` - before we write the summary for the proc, we make sure to invalidate all such addresses local to the procedure as `Invalid.` If such an address is read, then we would raise a use-after-lifetime issue. Reviewed By: jvillard Differential Revision: D16458355 fbshipit-source-id: 3686524cb	5 years ago
Jules Villard	a504a67ec2	[pulse] model some of `std::basic_string` Summary: A common gotcha is the new test. Model the minimum amount of `std::basic_string` to catch it. Reviewed By: mbouaziz, ngorogiannis Differential Revision: D16121090 fbshipit-source-id: 66f06cb43	5 years ago
Jules Villard	14b9975cf3	[pulse] support modelling destructors Summary: We want to detect that variables and C++ temporaries go out of scope even when their destructor happens to be modelled. We lost a test to that because `std::function::~function` was poorly modeled as deleting the lambda itself which would now cause a double invalidation. This has to be modelled better now as something that invalidates something inside the lambda, and also model `operator()` as something that accesses that something, to recover that test. It's not a vital test though, so Do It Later©. Reviewed By: ngorogiannis Differential Revision: D16121091 fbshipit-source-id: 6b777ca18	5 years ago
Jules Villard	d9aadf5df2	[pulse] allow models in invalidation traces Summary: Be more flexible in what type of function calls are allowed in `ViaCall ...` actions to be able to include models. Also get rid of `here here` in traces /o\ As a side-effect, get more precise (=qualified) procedure names in traces (but not in messages so as not to be too verbose). Reviewed By: mbouaziz, ngorogiannis Differential Revision: D16121092 fbshipit-source-id: fb51b02f8	5 years ago
Jules Villard	ef26e8bb28	[clang] NamespaceAliasDecl is just a no-op Summary: Fixes #1123. Reviewed By: mbouaziz, ngorogiannis Differential Revision: D16163589 fbshipit-source-id: 10d2d8010	5 years ago
Jules Villard	e803a30c2d	[clang] fix translation of `initListExpr` again Summary: So it turns out we need to translate even more cases. Pulse had a FP before that this fixes. Reviewed By: ezgicicek Differential Revision: D16073629 fbshipit-source-id: c03460b5a	5 years ago
Jules Villard	14ce445f81	[pulse] run tests against C++17 Summary: This is needed to test some functionality in the next diff. Only one test changes (no longer a FN), which is now documented. Also, stop including the "header models" meant for biabduction! Maybe one day we'll need to have several test modes for different C++ versions. Seems overkill for now, so let's wait until we see some actual issues (eg FPs) that manifest in one version but not the other. Reviewed By: mbouaziz Differential Revision: D16073630 fbshipit-source-id: 1cfdfc933	5 years ago
Jules Villard	86decb83f6	[pulse] record attributes of address not edge-reachable in the post Summary: Sometimes the post of a function call has attributes on addresses that were mentioned in the pre but are no longer reachable in the post. We don't want to forget these, see added test. Reviewed By: mbouaziz Differential Revision: D16050050 fbshipit-source-id: 1ce522b97	6 years ago
Jules Villard	58b1df6bb9	[clang] fix destructor placement for temporaries in conditionals Summary: The previous code would call the destructor for the C++ temporary before the prune nodes, which then try to dereference it. Wrong. Quick fix: don't destroy temporaries in conditionals. Reviewed By: mbouaziz Differential Revision: D16030735 fbshipit-source-id: e11abad58	6 years ago
Jules Villard	3a3c93140e	[pulse] translate initListExpr in more cases Summary: We were skipping some instructions before and that was a problem for pulse. See added pulse test. Reviewed By: mbouaziz Differential Revision: D16030150 fbshipit-source-id: 9c62e6213	6 years ago
Jules Villard	d96ab2458d	[pulse] model lambda destructor Summary: Not sure if anyone uses this but there, now it's modelled. Reviewed By: mbouaziz Differential Revision: D16008162 fbshipit-source-id: f4795dcba	6 years ago
Jules Villard	91a2e2986b	[pulse] model lambda capture by value Summary: Prevent false positives about variables captured by value gone out of scope. Reviewed By: ezgicicek Differential Revision: D16008165 fbshipit-source-id: d70e47db4	6 years ago
Jules Villard	433c144840	[pulse] calling known lambdas calls the corresponding proc name Summary: We know how to do interprocedural calls so let's use that! Reviewed By: mbouaziz Differential Revision: D16008164 fbshipit-source-id: 4c34bf704	6 years ago
Jules Villard	2bf6852b95	[pulse] model `std::function::operator=` Summary: `function::operator=` is called whenever we assign a literal lambda to a variable, so it's pretty useful to be able to report anything on lambdas. Reviewed By: mbouaziz Differential Revision: D16008163 fbshipit-source-id: a9d07668d	6 years ago
Jules Villard	f15d9915a0	[pulse] better types to avoid `_fun_` prefix to proc names in bug traces Summary: Printing `Exp.Const (Cfun proc_name)` adds `_fun_` in front of the procedure name, eg `_fun_foo` instead of `foo`. This showed up in pulse traces. Reviewed By: mbouaziz Differential Revision: D16004606 fbshipit-source-id: 72ac6866f	6 years ago
Jules Villard	a3311fb751	[pulse] C++ temporaries bound to globals do not "escape" Summary: Fixes a false positive where the address of a C++ temporary is bound to a static const reference variable then returned. The fix doesn't try to establish that the variable is a const reference so could lead to false negatives but that can be addressed later. Reviewed By: ezgicicek Differential Revision: D16004538 fbshipit-source-id: e403dbefe	6 years ago
Jules Villard	7f12ced394	[pulse] move to SIL proper Summary: [apologies for the unreviewable diff...] Get rid of HIL expressions in pulse. This finishes the HIL -> SIL migration. The first step made pulse start from SIL instructions but would translate most accesses to HIL to re-use most of the existing pulse code. This diff gets rid of the intermediate translation of SIL expressions to HIL expressions. Big changes: 1. `PulseOperations` mostly rewritten, driven by using `Exp.t` instead of `HilExp.AccessExpression.t` for everything. 2. Stop trying to reverse-engineer what addresses mean in terms of access paths from program variables. Rely on the trace pointing at the right places in the code to be enough. This is because it wasn't that useful (and could even be misleading when wrong) but could be prohibitively expensive in degenerate cases (eg nodes with tens of thousands of successive array accesses...) 3. `PulseAbductiveDomain.apply_post` now returns the computed return value instead of recording it itself. 4. Change of vocabulary: `materialize` -> `eval`, `crumb` -> `event` 5. Function calls arguments are now evaluated prior to doing anything else, which saves everything else from having to (remember to) do that. In particular, this changes how models look quite a bit. Reviewed By: mbouaziz Differential Revision: D15986373 fbshipit-source-id: 1d79935de	6 years ago
Jules Villard	04233ee49b	[clang] destroy C++ temporaries Summary: Inject destructor calls to destroy a temporary when its lifetime ends. Reviewed By: mbouaziz Differential Revision: D15674209 fbshipit-source-id: 0f783a906	6 years ago
Jules Villard	0592bac25e	[pulse] explain SIL logical variables in terms of program access paths Summary: Now that HIL doesn't help us anymore we need to reconstruct its mapping "SIL logical var -> program access path". We already have everything we need in pulse: it suffices to walk the current memory graph starting from program variables until we find the value of the temporary we are interested in. This diff also builds some type machinery to make sure all accesses are explained. Reviewed By: mbouaziz Differential Revision: D15824959 fbshipit-source-id: 722c81b39	6 years ago
Jules Villard	c9f4768be7	[pulse] move to SIL Summary: It turns out HIL gets in the way of a precise heap analysis. For instance, instead of: ``` n$0 = &x.f _ = delete(&x) &y = n$0 ``` HIL tries hard to forget about intermediate variables and shows instead ``` _ = delete(&x) &y = &x.f ``` Oops, that's a use-after-delete, whereas the original code was safe. While it's easy to write SIL programs that are completely unsound for HIL, they are not generated very often from the frontends. In fact, the problem became apparent only when making the clang frontend translate C++ temporaries destructors, which produces the situation above routinely. This diff makes the minimal amount of change to make Pulse build and produce equivalent results (minus HIL bugs) starting from SIL instead of HIL. The reporting sucks for now because we need to translate SIL temporaries back into program access paths. This is done in the next diff. Reviewed By: mbouaziz Differential Revision: D15824961 fbshipit-source-id: 8e4e2a3ed	6 years ago
Jules Villard	6f5cb512db	[pulse] add example of FN in const-ref-bound temporary Summary: This one isn't caught because we don't destruct temporaries that are bound to a const reference. According to the C++ standard these should get destroyed when the const reference gets destroyed but instead we just don't destroy them for now. Reviewed By: mbouaziz Differential Revision: D15760209 fbshipit-source-id: 32c935ec0	6 years ago
Jules Villard	e14809baa8	[pulse] fix temporaries test code Summary: A test was claiming to be ok but wasn't. Reviewed By: mbouaziz Differential Revision: D15695944 fbshipit-source-id: 58772a793	6 years ago
Jules Villard	21f66dd197	[pulse] do not model `operator=` as assignment Summary: In a next diff temporaries will get destructed at the end of their lifetimes and that naive model would be causing false positives. The flipside is that we lose all reports on closures for now, will need to model them separately later. Reviewed By: mbouaziz Differential Revision: D15695943 fbshipit-source-id: c2c482c02	6 years ago
Jules Villard	db800f138b	[clang] rewrite scope computations Summary: This started as an attempt to understand how to modify the frontend to inject destructors for C++ temporaries (see next diffs). This diff rewrites the existing logic for computing the list of variables that should be destroyed at the end of each statement, either because it's the end of their syntactic scope or because control flow branches outside of their syntactic scope. The frontend translates a function from the last instructions to the first, but scope computation needs to be done in the other direction, so it's done in a separate pass before the main translation happens. That first pass creates a map from statements in the AST to the list of variables that should be destroyed at the end of these statements. This is still the case now. Before, that map would be computed in a bit of a weird way: scopes are naturally a stack but instead of that the structure maintained was a flat list + a counter to know where the current scope ended in that list. In this diff, redo the computation maintaining a stack of scopes instead, which is a bit cleaner. Also treat more instructions as introducing a new scope, eg if, for, ... Reviewed By: mbouaziz Differential Revision: D15674208 fbshipit-source-id: c92429e82	6 years ago
Jules Villard	c3d55817b1	[pulse] another test for temporaries Summary: I rewrote the test so it doesn't need any C++ headers so that: - it's easier to see what's going on - it's easier to debug: the whole AST is now somewhat readable vs before the headers made it impossibly long Reviewed By: ezgicicek Differential Revision: D15674213 fbshipit-source-id: d98941983	6 years ago
Josh Berdine	cfc1c8be36	[copyright] Remove years Reviewed By: jvillard Differential Revision: D15771884 fbshipit-source-id: e2997e3a3	6 years ago
Peter O'Hearn	9b8a908ad3	[Pulse] model folly delayed destruction Reviewed By: jvillard Differential Revision: D15508919 fbshipit-source-id: f6073ef7c	6 years ago
Jules Villard	d586630edf	[pules] do not print templated part of function names Summary: This messes with the deduplication heuristic when templated function names show up in the error messages, since the heuristic demands that the error messages are the same. Reviewed By: mbouaziz Differential Revision: D15374333 fbshipit-source-id: 70232d254	6 years ago
Jules Villard	5de9bc29d2	[pulse] better error messages Summary: Improve the error messages, change is more or less documented in the code. Reviewed By: mbouaziz Differential Revision: D15374334 fbshipit-source-id: f1dd54180	6 years ago
Jules Villard	b700af9ffb	[hil] do not put parens around trivial expressions Summary: `(x)` -> `x` `&(x)` -> `&x` everything else unchanged Reviewed By: mbouaziz Differential Revision: D15374360 fbshipit-source-id: af5ef4e66	6 years ago
Jules Villard	6364199b94	[pulse] traces record how values were constructed Summary: Before: the trace would explain how a value was invalidated and accessed, but not how the value that was invalidated had been constructed. Now: `PulseTrace.t` records breadcrumbs of how the value was constructed in addition to the interproc "action" trace leading to the invalidation or access action. Concretely: ``` void bad(X &x) { X y = x; X z = x; delete y; access(z); } ``` will produce the trace: Invalidation part: y = x delete y Access part: z = x access(z) access to z->f inside of access(z) Before this diff the "Access part" would be missing the "z = x" part of the trace, so it might be confusing why `z` has anything to do with `y`. However, such "breadcrumbs" are not recorded in the inter-procedural part, only the sequence of calls is. This is a trade-off for simplicity, maybe it's enough for developers maybe it isn't, we'll find out later. Reviewed By: jberdine Differential Revision: D15354438 fbshipit-source-id: 8d0aed717	6 years ago
Jules Villard	b5589661ce	[pulse] improve error messages and traces Summary: Feedback from peterogithub: - mention which access path is being invalidated and accessed in the message - mention the line at which it was invalidated (the line at which it's accessed is already the line at which we report) - traces for stack variable/C++ temporary address escapes - delete double implementation of the same functionality in `PulseTrace`: `location_of_action_start` is the same as `outer_location_of_action`... Reviewed By: jberdine Differential Revision: D14800294 fbshipit-source-id: 3d9ab9b3d	6 years ago
Jules Villard	9dbbd68472	[pulse] apply summaries to globals too Summary: Similarly to function parameters (and the return value), we need to apply the pre/post of a function call to the globals mentioned in its summary. - tighten summaries further to remember only abducible variables in the post (as well as in the pre) - take globals into account when applying pre/post pairs Reviewed By: jberdine Differential Revision: D14780800 fbshipit-source-id: fc0d180bb	6 years ago
Jules Villard	3ba05b8cee	[pulse] be more careful about what to consider as a variable going out of scope Summary: The heuristic to detect variables going out of scope was to detect any access expression passed as argument to an injected destructor call. However destructor calls are also injected in destructor bodies to destruct each field of an object, so the heuristic would detect fields going out of scope, which, erm, doesn't make sense. Limit the heuristic to local program variables. Reviewed By: jberdine Differential Revision: D14771454 fbshipit-source-id: ffa3c9fe3	6 years ago
Jules Villard	31c2a39e81	[pulse] tighten up summaries Summary: Only throw values to the pre if they can be followed from "abducible" variables: formals of the current method and globals. Because figuring out if a `Pvar.t` is a formal of the current procedure is actually a giant pain, hack something not too bad instead: pre-register all formals at the start of the analysis of the procedure. Then the only other variables we care about in the precondition are globals, which we can detect easily. This is mostly an optimisation (summaries won't include irrelevant "abduced" facts about the procedure's local variables anymore), but it also fixes a bug where we would sometimes overwrite things in the pre. I think that's why the tests improved. Reviewed By: ngorogiannis Differential Revision: D14753493 fbshipit-source-id: 08e73637f	6 years ago
Jules Villard	7c90480758	[pulse] do not create `&` back-edges eagerly Summary: This mostly doesn't make sense. The only thing this would have been good for was to give the most accurate result on access paths such as `*(&(x.f))`, but these are normalised anyway (into `x.f`) so we actually never see these. That said there might be some use to some similar logic in the future, but in the meantime let's delete the current feature as it wasn't thought through. Reviewed By: ezgicicek Differential Revision: D14753492 fbshipit-source-id: 597cec027	6 years ago
Jules Villard	ada032ee2c	[pulse] improve error messages and traces Summary: The previous message formatting had regressed and produced non-sensical messages. More importantly, remove template parameters from error messages to trigger the heuristic in `InferPrint` that deduplicates errors that are on the same line with the same error type and message. Without this we get hundreds of reports that correspond to as many instantiations of the same code. Reviewed By: ngorogiannis Differential Revision: D14747979 fbshipit-source-id: 3c4aad2b1	6 years ago
Jules Villard	db4e1ea433	[pulse] reallocate variables on initialisation Summary: We see the magic function `__variable_initialization` at the point where the variable is declared, eg `int x = foo()`. It's safe to reset `&x` at that point. This circumvents an issue that pops up in some rare cases where the ternary conditional operator `?:` and variable initialization conspire to produce weird frontend results. Some test becomes a FN again, but I think it was being reported for the wrong reasons; will investigate more later. Reviewed By: ngorogiannis Differential Revision: D14747980 fbshipit-source-id: e75d6e30f	6 years ago
Jules Villard	3ce095a288	[pulse] more efficient representation of attributes Summary: This ensures that each attribute type can only be present once per address. Makes ~80x time improvement on pathological cases such as Duff's device. This introduces a new kind of Set in `PrettyPrintable`. Reviewed By: mbouaziz Differential Revision: D14645091 fbshipit-source-id: c7f9b760c	6 years ago
Jules Villard	d57ed5086e	[pulse] better treatment of variables going out of scope Summary: Detect when a variable goes out of scope. When that's the case, mark its address and its contents as invalid. Give subsequent uses a USE_AFTER_LIFETIME error type instead of USE_AFTER_DESTRUCTOR. Reviewed By: jberdine Differential Revision: D14387147 fbshipit-source-id: a2c530fda	6 years ago
Jules Villard	53b1577b4c	[pulse][interproc 3/3] interproc call Summary: biggest_diff Reviewed By: jberdine Differential Revision: D14387150 fbshipit-source-id: 6d6ddeffc	6 years ago
Jules Villard	686231ec6e	[SIL] change `variable_initialization()` builtin to a new auxiliary instruction Summary: Instead of emitting an ad-hoc builtin on variable declaration emit a new metadata instruction. This allows us to remove the code matching on that ad-hoc builtin that had to be inserted in several checkers. Inferbo & pulse used that information meaningfully and had to undergo some minor changes to cope with the new metadata instruction. Reviewed By: ezgicicek Differential Revision: D14833100 fbshipit-source-id: 9b3009d22	6 years ago
Jules Villard	ebe5028ca1	[SIL] add `Skip` metadata instruction Summary: springcleaning2 Reviewed By: ezgicicek Differential Revision: D14827673 fbshipit-source-id: 0d3cf730b	6 years ago
Jules Villard	b665e1c575	[SIL][HIL] distinguish auxiliary instructions as `Metadata` Summary: Bundle all non-semantic-bearing instructions into a `Metadata _` instruction in SIL. - On a documentation level this makes clearer the distinction between instructions that encode the semantics of the program and those that are just hints for the various backend analysis. - This makes it easier to add more of these auxiliary instructions in the future. For example, the next diff introduces a new `Skip` auxiliary instruction to replace the hacky `ExitScope([], Location.dummy)`. - It also makes it easier to surface all current and future such auxiliary instructions to HIL as the datatype for these syntactic hints can be shared between SIL and HIL. This diff brings `Nullify` and `Abstract` to HIL for free. Reviewed By: ngorogiannis Differential Revision: D14827674 fbshipit-source-id: f68fe2110	6 years ago
David Lively	5d4a27ea54	RFC: stop using _ to separate ObjC/C++ class name from method in Typ.Procname.to_string Reviewed By: jvillard Differential Revision: D14736442 fbshipit-source-id: 500df354b	6 years ago
Jeremy Dubreil	261f1ba171	[infer] update the Pulse tests expected output Reviewed By: ngorogiannis Differential Revision: D14650123 fbshipit-source-id: ad5e0d7a8	6 years ago
Jules Villard	605bc5e01a	[pulse] fix some tests and add interproc tests Summary: Some of these tests were wrong, eg `~lambda()` calls `lambda()` then... takes the bitwise complement or something? The intent was to call the destructor. Add interprocedural tests for later. Reviewed By: jberdine Differential Revision: D14324762 fbshipit-source-id: 40d2c32f5	6 years ago
Jules Villard	4cdb65c237	[pulse] \|- is now true only of isomorphic graphs Summary: Previously we would say that `lhs <= rhs` (or `lhs \|- rhs`) when a mapping existed between the abstract addresses of `lhs` and `rhs` such that `mapping(lhs)` was a supergraph of `rhs`. In particular, we had that `x \|-> x' * x' \|-> x'' \|- x \|-> x'`. This is not entirely great, in particular once we get pairs of state representing footprint + current state. I'm not sure I have an extremely compelling argument why though, except that it's not the usual way we do implication in SL, but there wasn't a compelling argument for the previous state of affairs either. This changes `\|-` to be true only when `mapping(lhs) = rhs` (modulo only considering the addresses reachable from the stack variables). Reviewed By: jberdine Differential Revision: D14568272 fbshipit-source-id: 1bb83950e	6 years ago
Jules Villard	4988523104	[AI] make join and widen use the same argument order Summary: This helps convergence when `<=` is based on physical equality for example, and widening is implemented as `widen ~prev ~next = join prev next`. Reviewed By: skcho Differential Revision: D14568270 fbshipit-source-id: ded5ed296	6 years ago
Jules Villard	363d69430d	[ai][pulse] use subgraph-based implication between states Summary: When joining two lists of disjuncts we try to ensure there isn't a state that under-approximates another already in the list. This helps reduce the number of disjuncts that are generated by conditionals and loops. Before we would always just add more disjuncts unless they were physically equal but now we do a subgraph computation to assess under-approximation. We only do this half-heartedly for now however, only taking into consideration the "new" disjuncts vs the "old" ones. It probably makes sense to do a full quadratic search to minimise the number of disjuncts from time to time but this isn't done here. Reviewed By: mbouaziz Differential Revision: D14258482 fbshipit-source-id: c2dad4889	6 years ago
Jules Villard	a19db6605c	[AI][pulse] lists of disjuncts instead of sets Summary: The disjunctive domain shouldn't really be a set in the first place as comparing abstract states for equality is expensive to do naively (walking the whole maps representing the abstract heap). Moreover in practice these sets have a small max size (currently 50 for pulse, the only client), so switching them to plain lists makes sense. Reviewed By: mbouaziz Differential Revision: D14258489 fbshipit-source-id: c512169eb	6 years ago
Jules Villard	44007f054c	[pulse] collect garbage (unreachable) heap parts from time to time Summary: It's useful to keep the size of states down, especially when humans are trying to read it. It will also help keep the size of summaries down in the inter-procedural pulse. Reviewed By: mbouaziz Differential Revision: D14258486 fbshipit-source-id: 45ebcac67	6 years ago
Sungkeun Cho	0e5a902ac6	[inferbo] Add model of String::length Reviewed By: mbouaziz Differential Revision: D13547914 fbshipit-source-id: 7d496d11a	6 years ago
Jules Villard	4c1ee2a485	[pulse] add traces to the domain Summary: Record per-location traces. Actually, that doesn't quite make sense as a location can be accessed in many ways, so associate a trace to each edge in the memory graph. For instance, when doing `x->f = y`, we want to take the history of the `<val of y> ----> ..` edge, add "assigned at location blah" to it and store this extended history to the edge `<val of x> --f--> ..`. Use this machinery to print nicer traces in `infer explore` and better error messages too (include the last assignment, like biabduction messages). Reviewed By: da319 Differential Revision: D13518668 fbshipit-source-id: 0a62fb55f	6 years ago
Daiva Naudziuniene	b19ad38dae	[pulse] Example of use after destructor for temporaries Summary: Adding a test case for use after destructor for temporaries. At the moment pulse does not find it as frontend does not inject destructors for temporaries. Reviewed By: jvillard Differential Revision: D13506229 fbshipit-source-id: 31b9466f7	6 years ago
Jules Villard	8d3363f677	[pulse] record simple double free test Summary: Just to make sure it is caught. Reviewed By: da319 Differential Revision: D13517263 fbshipit-source-id: 976d3a3ae	6 years ago
Jules Villard	9868f7f763	[pulse] warn on returning address of C++ temporary Summary: When a C++ temporary goes out of scope, tag its address in the heap with a new attribute `AddressOfCppTemporary` so that we can later check that we don't return it. Reviewed By: da319 Differential Revision: D13466898 fbshipit-source-id: 8808338b4	6 years ago
Jules Villard	db1814b1d1	[pulse] detect stack variable address escape Summary: When assigning to the special `return` variable, check that the result is not the address of a local variable, otherwise report. Reviewed By: ngorogiannis Differential Revision: D13466896 fbshipit-source-id: 465da7f13	6 years ago
Jules Villard	c77f22310a	[pulse] rewrite test to avoid stack variable address escape Summary: Pulse is about to be smart enough to detect that bug. Reviewed By: da319 Differential Revision: D13466895 fbshipit-source-id: 79afd2d51	6 years ago
Jules Villard	2bb9e5ad85	[pulse] rename function that was never a pulse FP Summary: Naming it `FP_` was a mistake in the original commit that copied the tests over as pulse has never reported on that method. Reviewed By: da319 Differential Revision: D13465324 fbshipit-source-id: f8b24ebda	6 years ago
Daiva Naudziuniene	e2b5a6f941	[pulse] Allow taking address of a field of an invalid object Summary: It's ok to take the address of a field / array access of an invalid object. This diff calculates the innermost dereference for an access expression starting with `&` and does not report on the dereference even if the address is invalid. Reviewed By: jvillard Differential Revision: D13450758 fbshipit-source-id: 18c038701	6 years ago
Daiva Naudziuniene	220d29766d	[pulse] Model stack as a map from addresses of variables Summary: When we create a Dereference edge, we also create a TakeAddress back edge. This causes false positives for stack variables. When we write to a stack variable and then take its address, the resulting address is the one from the back edge of the written value. See example `push_back_value_ok`. To solve this issue, this diff changes the stack to denote a map from addresses of variables rather than from variables. We still have an issue for fields, see example `FP_push_back_value_field_ok`. To solve this, we probably need to remove back edges. Reviewed By: jvillard Differential Revision: D13432415 fbshipit-source-id: 9254a1a6d	6 years ago
Jules Villard	65d031af66	[pulse] model lambda captures Summary: When a lambda gets created, record the abstract addresses it captures, then complain if we see some of them be invalidated before it is called. Add a notion of "allocator" for reporting better messages. The messages are still a bit sucky, will need to improve them more generally at some point. ``` $ infer -g --pulse-only -- clang -std=c++11 -c infer/tests/codetoanalyze/cpp/pulse/closures.cpp Logs in /home/jul/infer.fb/infer-out/logs Capturing in make/cc mode... Found 1 source file to analyze in /home/jul/infer.fb/infer-out Found 2 issues infer/tests/codetoanalyze/cpp/pulse/closures.cpp:21: error: USE_AFTER_DESTRUCTOR `&(f)` accesses address `s` captured by `&(f)` as `s` invalidated by destructor call `S_~S(s)` at line 20, column 3 past its lifetime (debug: 5). 19. f = [&s] { return s.f; }; 20. } // destructor for s called here 21. > return f(); // s used here 22. } 23. infer/tests/codetoanalyze/cpp/pulse/closures.cpp:30: error: USE_AFTER_DESTRUCTOR `&(f)` accesses address `s` captured by `&(f)` as `s` invalidated by destructor call `S_~S(s)` at line 29, column 3 past its lifetime (debug: 8). 28. f = [&] { return s.f; }; 29. } 30. > return f(); 31. } 32. Summary of the reports USE_AFTER_DESTRUCTOR: 2 ``` Reviewed By: da319 Differential Revision: D13400074 fbshipit-source-id: 3c68ff4ea	6 years ago
Daiva Naudziuniene	fcfb6cc361	[pulse] Model more std::vector functions that can invalidate references to elements Summary: Model more `std::vector` functions that can potentially invalidate references to the vector's elements (https://en.cppreference.com/w/cpp/container/vector). Reviewed By: mbouaziz Differential Revision: D13399161 fbshipit-source-id: 95cf2cae6	6 years ago
Jules Villard	95fab102bf	[pulse] do not destroy `this` even if asked to Summary: Some code calls `this->~Obj()` then proceeds to use fields in the current object, which previously we would report as invalid uses. Assume people know what they are doing and ignore destructor calls to `this`. Reviewed By: mbouaziz Differential Revision: D13401145 fbshipit-source-id: f6b0fb6ec	6 years ago
Daiva Naudziuniene	332b150be9	[pulse] Model std::vector::reserve to invalidate references to elements Summary: Similarly to `std::vector::push_back`, `std::vector::reserve` can invalidate the references to elements if the requested capacity is bigger than the existing one. More info on `std::vector::reserve`: https://en.cppreference.com/w/cpp/container/vector/reserve Reviewed By: jvillard Differential Revision: D13340324 fbshipit-source-id: bf99b6923	6 years ago
Daiva Naudziuniene	485b9c7bf5	[pulse] Abstract Location Set Summary: Instead of a variable having the value of a single location on the stack, we now allow variables to have multiple locations. Consequently, we also allow a memory location to point to a set of locations in the heap. We enforce a limit on the maximum number of locations in a set (currently 5). Reviewed By: jvillard Differential Revision: D13190876 fbshipit-source-id: 5cb5ba9a6	6 years ago
Daiva Naudziuniene	e59d9632b1	[Pulse] Improve example to illustrate FP caused by an allocation in a branch Summary: Recent improvements in join fixed `FP_allocate_in_branch_ok` because the variable was not read after the join. Reviewed By: mbouaziz Differential Revision: D13233441 fbshipit-source-id: 89b701e12	6 years ago
Jules Villard	1c668c4d41	[SIL][preanalysis] add call flag for functions treating first formal as return Summary: This helps some checkers and the liveness preanalysis. Reviewed By: da319 Differential Revision: D13102954 fbshipit-source-id: b8d3c5fe2	6 years ago
Jules Villard	f3411a2203	[HIL] Add `ExitScope` instruction Summary: It's useful for checkers to know when variables go out of scope to perform garbage collection in their domains, especially for complex domains with non-trivial joins. This makes the analyses more precise at little cost. This could have been added as a custom function call to a builtin, but I decided against it because this instruction doesn't have the semantics of any function call. It's better for each checker to explicitly not deal with the custom instruction instead. Reviewed By: jberdine Differential Revision: D13102951 fbshipit-source-id: 33be22fab	6 years ago
Jules Villard	0b2dcbf406	[pulse] add non-passing tests about join Summary: So we can see improvements in later diffs. Reviewed By: skcho Differential Revision: D13102949 fbshipit-source-id: 45494904b	6 years ago
Daiva Naudziuniene	b640d69021	[pulse] An example of false positive caused by an allocation in a branch Summary: The title Reviewed By: mbouaziz Differential Revision: D13167332 fbshipit-source-id: dd6588904	6 years ago
Daiva Naudziuniene	2c06254800	[pulse] False positive caused by multiple variables captured by value in lambda Summary: Update clang plugin which now gives names to variables captured by lambdas that were empty before. update-submodule: facebook-clang-plugins Reviewed By: jvillard Differential Revision: D12979015 fbshipit-source-id: 0b092fb24	6 years ago
Jules Villard	67ff14b4ed	[pulse] record attributes inside memory cells instead of separately Summary: It turns out keeping attributes (such as invalidation facts) separate from the memory is a bad idea and leads to loss of precision and false positives, as seen in the new test (which previously generated a report). Allow me to illustrate on this example, which is a stylised version of the issue in the added test: previously we'd have: ``` state1 = { x = 1; invalids={} } state2 = { x = 2; invalids ={1} } join(state1, state2) = { x = {1, 2}; invalids={{1, 2}} } ``` So even though none of the states said that `x` pointed to an invalid location, the join state says it does because `1` and `2` have been glommed together. The fact `x=1` from `state1` and the fact "1 is invalid" from `state2` conspire together and `x` is now invalid even though it shouldn't. Instead, if we record attributes as part of the memory we get that `x` is still valid after the join: ``` state1 = { x = (1, {}) } state2 = { x = (2, {}) } join(state1, state2) = { x = ({1, 2}, {}) } ``` Reviewed By: mbouaziz Differential Revision: D12958130 fbshipit-source-id: 53dc81cc7	6 years ago
Jules Villard	6f9028a77f	[pulse] use WTO scheduler Summary: I hear that this scheduler is better. I want the best scheduler possible. Also pulse's join is a bit complex so it might matter one day. whydididothis Reviewed By: mbouaziz Differential Revision: D12958131 fbshipit-source-id: 3bd77ccba	6 years ago
Daiva Naudziuniene	86f52e52ed	[pulse] Operator= copy assignment Summary: For a general case of `operator=` we want to create a fresh location for the first parameter as `operator=` behaves as copy assignment. Reviewed By: jvillard Differential Revision: D12940635 fbshipit-source-id: 89c6e530d	6 years ago
Jules Villard	f30e97f072	[pulse] add model for `std::vector::reserve` using additional memory attribute Summary: Whenever `vec.reserve(n)` is called, remember that the vector is "reserved". When doing `vec.push_back(x)` on a reserved vector, assume enough size has been reserved in advance and do not invalidate the underlying array. This gets rid of false positives. Reviewed By: mbouaziz Differential Revision: D12939837 fbshipit-source-id: ce6354fc5	6 years ago
Jules Villard	1c8143898e	[pulse] generalise "invalid" addresses as sets of attributes Summary: Instead of keeping at most one invalidation fact for each address, keep a set of them and call them "attributes". Keeping a set of invalidation facts is redundant since we always only want the smallest one, but makes the implementation simpler, especially once we add more kinds of attributes (used for modelling, see next diffs). Reviewed By: mbouaziz Differential Revision: D12939839 fbshipit-source-id: 4a54c2132	6 years ago
Jules Villard	637018a330	[pulse] model some early exit functions Summary: Copied from the ownership checker logic: return the initial value of the domain as return. This can probably be improved. Reviewed By: mbouaziz Differential Revision: D12888102 fbshipit-source-id: 9e2dac7fc	6 years ago
Jules Villard	9aa5582caa	[clang] leave markers of variable initialization for pulse Summary: When initialising a variable via semi-exotic means, the frontend loses the information that the variable was initialised. For instance, it translates: ``` struct Foo { int i; }; ... Foo s = {42}; ``` as: ``` s.i := 42 ``` This can be confusing for backends that need to know that `s` actually got initialised, eg pulse. The solution implemented here is to insert a dummy call to `__variable_initialization`: ``` __variable_initialization(&s); s.i := 42; ``` Then checkers can recognise that this builtin function does what its name says. Reviewed By: mbouaziz Differential Revision: D12887122 fbshipit-source-id: 6e7214438	6 years ago
Jules Villard	165cb1cf73	[pulse] back to sounder joins Summary: Now that arrays are dealt with separately (see previous diff), we can turn the join back into an over-approximation as far as invalid locations are concerned. Reviewed By: skcho Differential Revision: D12881989 fbshipit-source-id: fd85e49c0	6 years ago
Jules Villard	f400d4c5c5	[pulse] always register havoc'd variables Summary: This prevents the join from wrongly assuming that we haven't seen a variable on one side of the join. Reviewed By: skcho Differential Revision: D12881987 fbshipit-source-id: 42a776adb	6 years ago
Daiva Naudziuniene	4954d3da4b	[pulse] Model operator= Summary: For `operator=(lhs, rhs)` we want to model it as an assignment if rhs is a materialized temporary created in the constructor. Reviewed By: jvillard Differential Revision: D10462510 fbshipit-source-id: 998341e69	6 years ago
Daiva Naudziuniene	881bcb8fce	[pulse] Clean up placement new model Summary: Do not create a new location for placement new argument if it already exists. Reviewed By: jvillard Differential Revision: D12839942 fbshipit-source-id: 758b67a82	6 years ago
Jules Villard	0a2cb44667	[pulse] introduce the more precise `VECTOR_INVALIDATION` issue type Summary: Get rid of `USE_AFTER_LIFETIME`. This could be useful to deploy pulse alongside the ownership checker too. Reviewed By: da319 Differential Revision: D12857477 fbshipit-source-id: 8e2a2a37c	6 years ago
Jules Villard	f627812541	[pulse] new issue type `USE_AFTER_DESTRUCTOR` Summary: Keep `USE_AFTER_LIFETIME` for unclassified errors (for now it contains vector invalidation too because I can't think of a good name for them, and maybe it makes sense to wait until we have more types of them to decide on a name). Reviewed By: da319 Differential Revision: D12825060 fbshipit-source-id: bd75ef698	6 years ago
Jules Villard	c6b2126c3f	[pulse] forget about addresses that are invalid on only one side of a join Summary: Getting this right will be long and complex so for now the easiest is to underreport and only consider as invalid the addresses we know to be invalid on both sides of a join. In fact the condition for an address to be invalid after a join is more complex than this: it is invalid only if all the addresses in its equivalence class as discovered by the join are invalid. Reviewed By: skcho Differential Revision: D12823925 fbshipit-source-id: 2ca109356	6 years ago
Daiva Naudziuniene	8b54879b07	[pulse] Constructors Summary: As with destructors, we provide the address of an object as the first parameter to constructors. When a constructor is called we want to create a fresh location for the new object. Reviewed By: jvillard Differential Revision: D10868433 fbshipit-source-id: b60f32953	6 years ago
Daiva Naudziuniene	1094a8224c	[pulse] Invalidate object rather than address in destructor call Summary: We provide the address of an object as a parameter to the destructor. When the destructor is called the object itself is invalidated, but not the address. Reviewed By: jvillard Differential Revision: D12824032 fbshipit-source-id: 516eebcf8	6 years ago
Jules Villard	6cce767d19	[pulse] copy tests from ownership Summary: The time has come to keep track of which tests pass and which are FP/FN for pulse. Reviewed By: mbouaziz Differential Revision: D10854064 fbshipit-source-id: 60938e48f	6 years ago
Jules Villard	cf66ea0afb	[pulse] havoc vector array on push_back Summary: Turns out once a vector array became invalid it stayed that way, instead of the vector getting a new valid internal array. Reviewed By: skcho Differential Revision: D10853532 fbshipit-source-id: f6f22407f	6 years ago
Jules Villard	6d6ac1d368	[pulse] do not use access paths as they forget about &/* Summary: Now the domain can reason about `&` and `*` too. When recording `&` between two locations also record a back-edge `*`, and vice-versa. Reviewed By: mbouaziz Differential Revision: D10509335 fbshipit-source-id: 8091b6ec0	6 years ago


160 Commits (980f1101560800f7731e14d313a862133b577d4a)