infer_clone

Commit Graph

Author	SHA1	Message	Date
Daiva Naudziuniene	3d74f39102	[pulse] Improve trace for Optional Empty Access Summary: `folly::Optional::value()` returns a reference, hence an error was shown when the actual value was being accessed. Since `value()` throws an exception in case of `folly::none`, we want to show the error message at the call site of `value()`. We do this by dereferencing the result of `value()` in the model. Reviewed By: jvillard Differential Revision: D24702875 fbshipit-source-id: ca9f30349	4 years ago
Daiva Naudziuniene	b17861b1c8	[pulse] More precise model for constructing folly::Optional<Value> from Value Summary: Before we were creating a fresh internal value when we were constructing `folly::Optional`. This diff models `folly::Optional` constructor more precisely by copying the given value. There was also a missing dereference in the model of `value_or` Reviewed By: jvillard Differential Revision: D24621016 fbshipit-source-id: c86d3c157	4 years ago
Daiva Naudziuniene	059c0f24a2	[pulse] Model Optional value_or Summary: Model `folly::Optional::value_or(default)` to return value if not-empty and `default` if empty. Reviewed By: jvillard Differential Revision: D24539456 fbshipit-source-id: cc9e176cc	4 years ago
Jules Villard	7fdb33b710	[pulse] report errors only when the PRUNE nodes along the path are true Summary: Take another page from the Incorrectness Logic book and refrain from reporting issues on paths unless we know for sure that this path will be taken. Previously, we would report on paths that are merely not impossible. This goes very far in the other direction, so it's possible we'll want to go back to some sort of middle ground. Or maybe not. See the changes in the tests to get a sense of what we're missing. Reviewed By: ezgicicek Differential Revision: D24014719 fbshipit-source-id: d451faf02	4 years ago
Daiva Naudziuniene	22d317c940	[pulse] Move pulse model flags to .inferconfig for pulse tests Summary: The title Reviewed By: skcho Differential Revision: D23960402 fbshipit-source-id: edc3bc2d0	4 years ago
Daiva Naudziuniene	91a33f6edc	[frontend] Captured struct variables in cpp lambdas Summary: Structs captured either by reference or by value should have a reference in their type. A struct captured by value should first call the copy constructor. In this diff we fix the type of the captured variable to include the reference. Copy constructor injection is left for the future. Reviewed By: jvillard Differential Revision: D23688713 fbshipit-source-id: d13748b5d	4 years ago
Daiva Naudziuniene	857daf63c9	[frontend] Capture reference variables Summary: Variables captured without initialization do not have correct type inside lambda's body. This diff sets the correct type of captured reference variables inside procdesc and makes sure the translation of captured variables is correct. The translation of lambda's body will then take into account the type of captured var from procdesc. Reviewed By: jvillard Differential Revision: D23678371 fbshipit-source-id: ed16dc978	4 years ago
Daiva Naudziuniene	42abe5b277	[frontend] Fix type of captured vars in lambda's body Summary: Add missing reference to the type of variable captured by reference without initialization. Reviewed By: jvillard Differential Revision: D23567685 fbshipit-source-id: b4e2ac0b6	4 years ago
Daiva Naudziuniene	d0cb245303	[frontend] Fix capture init for cpp lambdas Summary: We were missing the assignment to captured variables with initializers. Consider the following example: ``` S* update_inside_lambda_capture_and_init(S* s) { S* object = nullptr; auto f = [& o = object](S* s) { o = s; }; f(s); return object; } ``` which was translated to ``` VARIABLE_DECLARED(o:S&); &o:S&=&object &f =(_fun...lambda..._operator(),([by ref]&o &o:S&)) ``` However, we want to capture `o` (which is an address of `object`), rather than `&o`, in the closure. After the diff ``` VARIABLE_DECLARED(o:S&); &o:S&=&object n$7=&o:S& &f =(_fun...lambda..._operator(),([by ref]n$7 &o:S&)) ``` Reviewed By: jvillard Differential Revision: D23567346 fbshipit-source-id: 20f77acc2	4 years ago
Jules Villard	03bc3f31c8	[pulse] add option to skip functions/classes Summary: This can be useful to make pulse forget about tricky parts of the code. Treat "skipped" procedures as unknown so heuristics for mutating the return value and parameters passed by reference are applied. Reviewed By: ezgicicek Differential Revision: D23729410 fbshipit-source-id: d7a4924a8	4 years ago
Daiva Naudziuniene	4401701578	[pulse] Model for std::function copy constructor Summary: Added a model for copy constructor for `std::function`. In most cases, the SIL instruction `std::function::function(&dest, &src)` gives us pointers to `dest` and `src`, hence, we model the copy constructor as a shallow copy. However, in some cases, e.g. `std::function f = lambda_literal`, SIL instruction contains the closure itself `std::function::function(&dest, (operator(), captured_vars)`, hence, we need to make sure we copy the right value. Reviewed By: ezgicicek Differential Revision: D23396568 fbshipit-source-id: 0acb8f6bc	4 years ago
Daiva Naudziuniene	0a4af7754d	[pulse] Fix std::function::operator() Summary: There was a mismatch between formals and actuals in `std::function::operator()` because we were not passing the first argument corresponding to the closure. Reviewed By: ezgicicek Differential Revision: D23372104 fbshipit-source-id: d0f9b27d6	4 years ago
Daiva Naudziuniene	29fd9e13d1	[pulse] Understand captured variables in cpp lambdas Summary: When we evaluate lambdas in pulse, we create a closure object with `fake` fields to store captured variables. However, during the function call we were not linking the captured values from the closure object. We address this missing part here. Reviewed By: jvillard Differential Revision: D23316750 fbshipit-source-id: 14751aa58	4 years ago
Jules Villard	5278cb7374	[pulse] `delete nullptr` is a no-op Summary: `delete` works exactly like `free` so merge both models together. Also move the `free(0)` test to nullptr.cpp as it seems more appropriate. Reviewed By: da319 Differential Revision: D23241297 fbshipit-source-id: 20a32ac54	4 years ago
Daiva Naudziuniene	69e0dce0ed	[pulse] fix end() iterator false positive Summary: Before, we were modelling `vector.end()` as returning a fresh pointer every time it was called. It is common to check that an iterator is not the `end()` iterator and then dereference the iterator in that case. In such a code pattern `vector.end()` is called twice and returns different fresh values, which causes false positives. To fix this, we add a special internal field `__infer_model_backing_array_pointer_to_last_element` to a vector to denote its end. Now, every time we call `vector.end()` we return the value of this field. We introduce a new attribute `EndOfCollection` to mark the `end` iterator, as the existing `EndIterator` invalidation is not suitable when we need to read the same value multiple times. Reviewed By: jvillard Differential Revision: D23101185 fbshipit-source-id: fa8a33b58	4 years ago
Jules Villard	97fcc3b0ad	[pulse] apply equality relation to terms to be added to the equality relation Summary: Extra normalization gives extra precision. This doesn't seem to negatively impact perf. Reviewed By: skcho Differential Revision: D22867109 fbshipit-source-id: 5b82ec377	4 years ago
Jules Villard	5a39c158c5	[pulse] arithmetic domain: take 4! Summary: This time it's personal. Roll out pulse's own arithmetic domain to be fast and be able to add precision as needed. Formulas are precise representations of the path condition to allow for good inter-procedural precision. Reasoning on these is somewhat ad-hoc (except for equalities, but even these aren't quite properly saturated in general), so expect lots of holes. Skipping dead code in the interest of readability as this (at least temporarily) doesn't use pudge anymore. This may make a come-back as pudge has/will have better precision: the proposed implementation of `PulseFormula` is very cheap so can be used any time we could want to prune paths (see following commits), but this comes at the price of some precision. Calling into pudge at reporting time still sounds like a good idea to reduce false positives due to infeasible paths. #skipdeadcode Reviewed By: skcho Differential Revision: D22576004 fbshipit-source-id: c91793256	4 years ago
Daiva Naudziuniene	221d0b62ab	[pulse] Model builtin __new as returning non-null Summary: We model internal builtin `__new` function to return a non-null value. This fixes nullptr_dereference false positives where we explicitly check the result of a function call for nullptr when the function returns a newly created object. Reviewed By: jvillard Differential Revision: D22772217 fbshipit-source-id: 37d209697	4 years ago
Jules Villard	9690dba871	[pulse] a slow example for pudge Summary: Add a test to the repo to try and detect perf regressions in pulse. Currently analyzed in ~0.1s. With `--pudge`, takes ~10s. Sledge does eager normalization and canonicalization when incorporating new facts into formula contexts and the algorithm is polynomial in the number of equalities. This example generates one equality per location in the array => boom. This bypasses the recency model of arrays because the formula needs to be constructed before it can be simplified to get rid of dead variables. The new arithmetic is not as complete as sledge's algorithm but linear in time. We could use it to simplify the formula before passing it to sledge. In fact, that was the original motivation. Reviewed By: skcho Differential Revision: D22574366 fbshipit-source-id: e9044ae09	4 years ago
Jules Villard	ae57f217d2	[pulse] don't always mistake equality for aliasing Summary: When applying function summaries, we are careful not to violate the summary's assumptions about non-aliasing. For example, the summary we generate for `foo(x,y) { x = y; }` will have `x` and `y` be allocated to two different `AbstractValue.t` in the heap, representing disjointness. However, the current logic is too coarse and also rejects passing the same pure value to functions that made no assumption about them being equal or different, eg `goo(int x,int y) { int z = x + y; }`. This is because the corresponding `AbstractValue.t` are different in the callee's summary, but are represented by only one value in callers such as `goo(i,i)`. This diff restricts the "don't violate aliasing" condition to only consider heap-allocated values. This is consistent with separation logic by the way: we use the implication `x|->- * y|->- |- x≠y`, which is valid only when `x` and `y` are both allocated in the heap as in the left-hand-side of `|-`. Reviewed By: skcho Differential Revision: D22574297 fbshipit-source-id: 206a18499	4 years ago
Daiva Naudziuniene	35011757dc	[pulse] Add a flag to pass functions that we want to model as returning non-null Summary: To avoid NULLPTR_DEREFERENCE false positives we want to model some functions as returning non-null. A new flag --pulse-model-return-nonnull allows us to provide a list of such functions. Reviewed By: ezgicicek Differential Revision: D22431564 fbshipit-source-id: 9944c7382	4 years ago
Daiva Naudziuniene	0ab3689f1f	[infer] NULLPTR_DEREFERENCE false positive caused by thread_local variable Summary: Keyword `thread_local` in cpp allows us to create a variable with thread storage duration, meaning that the object's lifetime begins when the thread begins and ends when the thread ends. We get `NULLPTR_DEREFERENCE` false positive for `thread_local` variable since we reallocate it in the `VariableLifetimeBegins` metadata instruction and we do not see further updates to the variable. To solve the issue we special case `VariableLifetimeBegins` instruction for global variables. Reviewed By: jvillard Differential Revision: D22284135 fbshipit-source-id: 13c14ef90	5 years ago
Daiva Naudziuniene	2c48e61031	[pulse] A new issue type OPTIONAL_EMPTY_ACCESS for trying to access folly::Optional when it is folly::none Summary: We need to check if `folly::Optional` is not `folly::none` if we want to retrieve the value, otherwise a runtime exception is thrown: ``` folly::Optional<int> foo{folly::none}; return foo.value(); // bad ``` ``` folly::Optional<int> foo{folly::none}; if (foo) { return foo.value(); // ok } ``` This diff adds a new issue type that reports if we try to access `folly::Optional` value when it is known to be `folly::none`. Reviewed By: ezgicicek Differential Revision: D22053352 fbshipit-source-id: 32cb00a99	5 years ago
Daiva Naudziuniene	412d2777eb	[pulse] Add a flag to pass functions that we want to model as abort Summary: To avoid NULLPTR_DEREFERENCE false positives we want to treat some functions as `abort`. A new flag `--pulse-model-abort` allows us to provide a list of such functions. Reviewed By: ezgicicek Differential Revision: D21962555 fbshipit-source-id: d46b93c99	5 years ago
Daiva Naudziuniene	98092481d4	[pulse] Special case for std::function:operator=( nullptr ) Summary: Assigning `nullptr` to `std::function` was causing `NULLPTR_DEREFERENCE` as our model was expecting to get an object in the right hand side of the assignment (`std::function::operator=`) and was dereferencing that object. Assigning `nullptr` to `std::function` removes callable object from it. We model this special case by creating a fresh value. Reviewed By: skcho Differential Revision: D21685318 fbshipit-source-id: 2d4af1933	5 years ago
Daiva Naudziuniene	ca2ec281c7	[pulse] Model for iterator operator-- Summary: Currently we get false positive if we apply `operator--` to the `end()` iterator. To solve this, we model iterator `operator--` not to raise an error for the `EndIterator` invalidation, but to create a fresh element in the underlying array. Reviewed By: ezgicicek Differential Revision: D21476353 fbshipit-source-id: 5c722372e	5 years ago
Daiva Naudziuniene	eaf95951f5	[pulse] Modeling std::vector::end() Summary: It is undefined behavior to dereference end iterator. To catch end iterator dereferencing issues we change iterator model: instead of having `internal pointer` storing the current index, we model it as a pointer to a current index. This allows us to model `end()` iterator as having an invalid pointer and there is no need to create an invalidated element in the vector itself. Reviewed By: ezgicicek Differential Revision: D21178441 fbshipit-source-id: fd6a94b0b	5 years ago
Jules Villard	385b6fa914	[pulse] revamp arithmetic, put everything in the path condition Summary: List of things happening in this unreviewable diff: - moved PulsePathCondition to PulseSledge - renamed --pulse-path-conditions to --pudge - PulsePathCondition now contains all the arithmetic of pulse (inferbo+concrete intervals+pudge). In particular, moved arithmetic attributes into PulsePathCondition.t. PulsePathCondition plays the role of PulseArithmetic (combining all domains). - added tests for a false positive involving free() - PulseArithmetic is now just a thin wrapper around PulsePathCondition to operate on states directly (instead of on path conditions). - The rest is mostly moving code into PulsePathCondition (eg, from PulseInterproc) and adjusting it. Reviewed By: jberdine Differential Revision: D21332073 fbshipit-source-id: 184c8e0a9	5 years ago
Jules Villard	2d8debc562	[pulse] invalidate vector backing array correctly Summary: We were invalidating "*(vec.__infer_backing_array)" instead of the address of the field itself. Reviewed By: ezgicicek Differential Revision: D21280357 fbshipit-source-id: 48b984800	5 years ago
Jules Villard	b1e35a728d	[biabd] rename test directories from {biabduction,errors,infer} to {biabduction} Summary: The directory names had some interesting variety due to historical reasons. - {c,cpp,objc,objcpp}/errors/ date from the time when infer was only biabduction - java/infer/ dates from the time when we had an "--analyzer" option and "infer" was one of them (sic), and eg another was "eradicate". - c/biabduction/ dates from the time when the biabduction analysis was being migrated to the "checkers" (AI) framework. For some reasons the tests there are not a subset of c/infer/ but seem to be entirely new tests. The convention now dictates that we should name all of these */biabduction/. This diff moves the existing tests from c/biabduction/ into c/biabduction/misc/. Reviewed By: mityal Differential Revision: D21300147 fbshipit-source-id: 516d1cb15	5 years ago
Daiva Naudziuniene	247ecb813d	[pulse] Fix traces for iterator invalidation errors Summary: Iterator invalidation traces were based on vector rather than iterator itself. Reviewed By: ezgicicek Differential Revision: D21202047 fbshipit-source-id: 62ce8a488	5 years ago
Ezgi Çiçek	269cdb80d9	[pulse] Model `StdVector` allocator Summary: We ignored allocator models for vectors, and were not able to initialize vectors properly. This diff fixes this issue. It also adds a test which was a FN before. Reviewed By: skcho, jvillard Differential Revision: D21089492 fbshipit-source-id: 6906cd1d1	5 years ago
Jules Villard	3332dc1a42	[AI] improve disjunctive domain Summary: Replace horrible hack with ok hack. The main difficulty in implementing the disjunctive domain is to avoid the quadratic time complexity of executing the same disjuncts over and over again when going around loops: First time around a loop, assuming for example a single disjunct `d`: ``` [d] loop body [d1' \/ d2'] ``` Second time around the same loop: the new pre will be the join of the posts of predecessor nodes, so `old_pre \/ post(loop,old_pre)`, i.e. `d \/ d1' \/ d2'`. Now we need to execute `loop body` again without running the symbolic execution of `d` again (and the time after that we'll want to not execute `d`, `d1'`, or `d2'`). Horrible hack (before): Disjuncts have a boolean "visited" attached that does its best to keep track of whether a given disjunct is old or new. When executing a single instruction, look at the flag and skip the state if it's old. Of course we have no way to know for sure, so it turns out it was often wrongly re-executing old disjuncts. This was also producing the wrong results over even simple loops: only the last iteration would make it outside the loop for some reason. Overall, the semantics were pretty intractable and shady at best. New hack (this diff): only run instructions of a given node on disjuncts that are not physically equal to the "pre" ones already in the invariant map for the current node. This gives the correct result over simple loops and a nice performance improvement in general (probably the old heuristic was hitting the quadratic bad case more often). Reviewed By: skcho Differential Revision: D21154063 fbshipit-source-id: 5ee38c68c	5 years ago
Jules Villard	3220804ddb	[pulse] add a cache of constants to equate them Summary: When encountering a constant, pulse creates an abstract value (a variable) to represent it, and remembers that it's equal to it. The problem is that pulse doesn't yet know how to deal with the fact that some variables are going to be equal to each other. This hacks around this issue in the case of constants, within the same procedure, by remembering which constants have been assigned to which place-holder variables, and serving those variables again when the same constant is translated again. Limitation: this doesn't work across procedure calls as the "constant maps" are not saved in summaries. Something to look out for: we don't want to make `if (p == NULL)` create a path where `p` is invalid (we only make null invalid when we see an assignment from 0, i.e. `p = NULL;`). Reviewed By: ezgicicek Differential Revision: D21089961 fbshipit-source-id: 5ebb85d0a	5 years ago
Daiva Naudziuniene	dae7f36339	[pulse] Vector iterator model Summary: Modeling vector iterator with two internal fields: an internal array and an internal pointer. The internal array field points to the internal array field of a vector; the internal pointer field represents the current element of the array. For now `operator++` creates a fresh element inside the array. Reviewed By: ezgicicek Differential Revision: D21043304 fbshipit-source-id: db3be49ce	5 years ago
Jules Villard	7a888170e7	[pudge] it's alive! Summary: Add a path condition to each symbolic state, represented in sledge's arithmetic domain. This gives a precise account of arithmetic constraints. In particular, it is relational and thus is more robust in the face of inter-procedural analysis. This is gated behind a flag for now as there are performance issues with the new arithmetic. Reviewed By: jberdine Differential Revision: D20393947 fbshipit-source-id: b780de22a	5 years ago
Ezgi Çiçek	e1093159b0	[pulse] Distinguish error state at top level Summary: As soon as pulse detects an error, it completely stops the analysis and loses the state where the error occurred. This makes it difficult to debug and understand the state in which the program failed. Moreover, other analyses that might build on pulse (e.g. impurity) cannot access the error state. This diff aims to restore and display the state at the time of the error in `PulseExecutionState` along with the diagnostic by extending it as follows: ``` type exec_state = | represents the state at the program point that caused an error ) ``` As a result, since we don't immediately stop the analysis as soon as we find an error, we detect both errors in conditional branches simultaneously (see test result changes for examples). NOTE: We need to extend `PulseOperations.access_result` to keep track of the failed state as follows: ``` type 'a access_result = ('a, Diagnostic.t t [denoting the exit state] ) result ``` Reviewed By: jvillard Differential Revision: D20918920 fbshipit-source-id: 432ac68d6	5 years ago
Dulma Churchill	b29d1a2f5f	[pulse] Adding new value history for allocations Reviewed By: jvillard Differential Revision: D20914622 fbshipit-source-id: f32836a95	5 years ago
Ezgi Çiçek	6105a6ab42	[pulse] Add tests for constant abstract locations Reviewed By: skcho Differential Revision: D20920029 fbshipit-source-id: b35f05ae1	5 years ago
Ezgi Çiçek	5a2b285fff	[pulse] Distinguish exit state at top level Summary: This diff lifts the `PulseAbductiveDomain.t` in `PulseExecutionState` by tracking whether the program continues the analysis normally or exits unusually (e.g. by calling `exit` or `throw`): ``` type exec_state = | ContinueProgram of PulseAbductiveDomain.t (** represents the state at the program point *) | ExitProgram of PulseAbductiveDomain.t (** represents the state originating at exit/divergence. *) ``` Now, Pulse's actual domain is tracked by `PulseExecutionState` and as soon as we try to analyze an instruction at `ExitProgram`, we simply return its state. The aim is to recover the state at the time of the exit, rather than simply ignoring them (i.e. returning empty disjuncts). This allows us to get rid of some FNs that we were not able to detect before. Moreover, it also allows the impurity analysis to be more precise since we will know how the state changed up to exit. TODO: - Impurity analysis needs to be improved to consider functions that simply exit as impure. - The next goal is to handle error state similarly so that when pulse finds an error, we recover the state at the error location (and potentially continue to analyze?). Disclaimer: currently, we handle throw statements like exit (as was the case before). However, this is not correct. Ideally, control flow from throw nodes follows catch nodes rather than exiting the program entirely. Reviewed By: jvillard Differential Revision: D20791747 fbshipit-source-id: df9e5445a	5 years ago
Jules Villard	3bf771bff4	[pulse] add model for std::vector<>::at() Summary: Kinda forgot to model this when `operator[]` was modelled. Reviewed By: skcho Differential Revision: D19433156 fbshipit-source-id: 49fbafc8a	5 years ago
Ezgi Çiçek	6f64131ae6	[pulse] Do not havoc arguments of unknown functions that are pointers to const Reviewed By: skcho Differential Revision: D19331312 fbshipit-source-id: b450a819b	5 years ago
Jules Villard	49fb5b7c85	[pulse] do arithmetic on pointers too Summary: A plus is a plus, no need to give up when +/- is about pointers. This gets rid of some false positives involving pointer arithmetic. However, the problem remains if we make things a bit more inter-procedural. This is documented in an added test. Reviewed By: ezgicicek Differential Revision: D18932877 fbshipit-source-id: 4ad1cfe72	5 years ago
Jules Villard	e06a43a677	[pulsebo] use inferbo more in summaries Summary: - Do most of the work of `solve_arithmetic_constraints` inside `subst_attribute` instead, since we need to re-use the latter function for post-conditions where the first function is not appropriate. - When substituting arithmetic constraints, we refine arithmetic information (both concrete intervals and inferbo), which can lead to inconsistent states. Instead of recording the new arithmetic facts by returning a new current state, just act as a map on attributes. This is to enable doing the point above. - All this led to a somewhat messy refactoring... - Rename `CannotApplyPre` to `Contradiction` since it's used for post-conditions as well now Reviewed By: skcho Differential Revision: D18889120 fbshipit-source-id: d81647143	5 years ago
Jules Villard	a42e15147b	[pulse] fix test for by-ref automatic initialisation Summary: Pointers are hard... The previous test had no chance of doing initialisation of the pointer by reference and was in fact a false negative (and still is, fix incoming). Renamed functions to stress the false negative and added a test that is really (potentially) doing pointer initialisation by reference. Reviewed By: skcho Differential Revision: D18888008 fbshipit-source-id: 1e72408c7	5 years ago
Jules Villard	eb52b28f91	[pulsebo] use inferbo in prunes Summary: Finally use information from the inferbo intervals in pulse's domain to make decisions about whether conditionals are feasible or not. Reviewed By: skcho Differential Revision: D18811193 fbshipit-source-id: d80a28657	5 years ago
Jules Villard	df49f318f6	[pulse] havoc formals passed by reference to unknown procedures Summary: This gets rid of false positives when something invalid (eg null) is passed by reference to an initialisation function. Havoc'ing the contents that the pointer points to results in being optimistic about said contents in the future. Also surprisingly gets rid of some FNs (which means it can also introduce FPs) in the `std::atomic` tests because a path condition becomes feasible with havoc'ing. There's a slight refinement possible where we don't havoc pointers to const but that's more involved and left as future work. Reviewed By: skcho Differential Revision: D18726203 fbshipit-source-id: 264b5daeb	5 years ago
Jules Villard	32f60f3d3c	[pulse] model the fact `free(0)` is a no-op Summary: It's a well-known fact that pulse should know too. To avoid splitting the abstract state systematically, only act if we know the pointer is exactly 0 to avoid reporting a nullptr dereference on `free(x)`. Reviewed By: ezgicicek Differential Revision: D18708575 fbshipit-source-id: 1cc3f6908	5 years ago
Jules Villard	3fbefbad34	[pulse] model some of `std::atomic` Summary: Turns out code uses atomics in important places, modelling it removes FPs. The tests are copied from biabduction and adapted and extended a bit. I didn't implement compare_exchange primitives for now (plus, giving them a sequential semantics like in biabduction is probably a bit cheeky). Reviewed By: skcho Differential Revision: D18708576 fbshipit-source-id: a3581b8a4	5 years ago
Sungkeun Cho	61ae040077	[pulse] Add bo_itv to pulse attributes Summary: This diff adds inferbo's interval values to pulse's attributes. The added values will be used to filter out infeasible passes in the following diffs. Reviewed By: jvillard Differential Revision: D18726667 fbshipit-source-id: c1125ac6e	5 years ago
Jules Villard	f81c9d56e3	[pulse] arithmetic operations Summary: Model +/- when we know the concrete interval for a value. Reviewed By: skcho Differential Revision: D18528535 fbshipit-source-id: 7c67a7a54	5 years ago
Jules Villard	6ecf4066e8	[pulse] model std::integral_constant Summary: cpp_initialization Reviewed By: skcho Differential Revision: D18528537 fbshipit-source-id: ab5f8038a	5 years ago
Jules Villard	6df4fb6a9b	[pulse] report dereference of NULL and constants Summary: Note: Disabled by default. Having some support for values, we can report when a null or constant value is being dereferenced. The particularity here is that we don't report when 0 is a possible value for the address, or even if we know that the value of the address can only be 0 in that branch! Instead, we allow ourselves to report only when the address has been set to NULL (or any constant). This is in line with how pulse deals with other issues: only report when 1. we see an address become invalid, and 2. we see the same address be used later on Reviewed By: skcho Differential Revision: D17665468 fbshipit-source-id: f1ccf94cf	5 years ago
Jules Villard	2e4fbb7fe5	[pulse] intervals! Summary: This adds a more interesting value domain to pulse: concrete intervals. There are still two main limitations: 1. arithmetic operations are all over-approximated: any assignment involving arithmetic operations is replaced by non-determinism 2. abstract values that are discovered to be equal are not merged into one Reviewed By: skcho Differential Revision: D18058972 fbshipit-source-id: 0492a590f	5 years ago
Jules Villard	b20c22a5ee	[pulse] abduce arithmetic facts Summary: This does several things because it was hard to split it more: 1. Split most of the arithmetic reasoning to PulseArithmetic.ml. This doesn't need to be reviewed thoroughly because an upcoming diff changes the domain from just `EqualTo of Const.t` to an interval domain! 2. When going through a prune node intra-procedurally, abduce arithmetic facts to the pre (instead of just propagating them). This is the "assume as assert" trick used by biabduction 1.0 too and allows to propagate arithmetic constraints to callers. 3. Use 2 when applying summaries by pruning specs whose preconditions have un-satisfiable arithmetic constraints. This changes one of the tests! Pulse now does a bit more work to find the false positive, as can be seen in the longer trace. Reviewed By: skcho Differential Revision: D18117160 fbshipit-source-id: af3b2c8c0	5 years ago
Jules Villard	16c88e282d	[pulse] some tests about values Summary: In preparation for improvements to the arithmetic reasoning. Reviewed By: dulmarod Differential Revision: D17977207 fbshipit-source-id: ee98e0772	5 years ago
Jules Villard	6a738045fd	[pulse] interprocedural histories and traces Summary: bigmacro_bender There are 3 ways pulse tracks history. This is at least one too many. So far, we have: 1. "histories": a humble list of "events" like "assigned here", "returned from call", ... 2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah" 3. "traces", which combine one history with one interproc action This diff gets rid of interproc actions and makes histories include "nested" callee histories too. This allows pulse to track and display how a value got assigned across function calls. Traces are now more powerful and interleave histories and interproc actions. This allows pulse to track how a value is fed into an action, for instance performed in callee, which itself creates some more (potentially now interprocedural) history before going to the next step of the action (either another call or the action itself). This gives much better traces, and some examples are added to showcase this. There are a lot of changes when applying summaries to keep track of histories more accurately than was done before, but also a few simplifications that give additional evidence that this is the right concept. Reviewed By: skcho Differential Revision: D17908942 fbshipit-source-id: 3b62eaf78	5 years ago
Jules Villard	669383d315	[pulse] more details about variable declaration events Summary: - add the variable being declared so we can report it back in the trace in addition to its location - distinguish between local vars and formals Reviewed By: skcho Differential Revision: D17930348 fbshipit-source-id: a5b863e64	5 years ago
Jules Villard	96c96a8dc6	[pulse] remember equalities found in branches Summary: When we make the decision to go into a branch "v = N" where some abstract value is compared to a constant, remember the corresponding equality. This allows to prune simple infeasible paths intra-procedurally. Further work is needed to make this useful interprocedurally, for instance either or both of these ideas could be explored: - abduce v=N in the precondition and do not apply summaries when the equalities in the pre are not satisfied - prune post-conditions that lead to unsat states where a value has to be equal to several different constants Reviewed By: skcho Differential Revision: D17906166 fbshipit-source-id: 5cc84abc2	5 years ago
Jules Villard	3ac8e27062	[pulse] use constant equality to prune unfeasible paths Summary: When we know "x = 3" and we have a condition "x != 3" we know we can prune the corresponding path. Reviewed By: skcho Differential Revision: D17665472 fbshipit-source-id: 988958ea6	5 years ago
Jules Villard	362e9cc622	[pulse] do not print `()` after functions Summary: Unfortunately it is very hard to predict when `Typ.Procname.describe` will add `()` after the function name, so we cannot make sure it is always there. Right now we report clowny stuff like "error while calling `foo()()`", which this change fixes. Reviewed By: ezgicicek Differential Revision: D17665470 fbshipit-source-id: ef290d9c0	5 years ago
Ezgi Çiçek	127902222d	[pulse] Filter AddressOfStackVariable from read only heuristic check Reviewed By: skcho Differential Revision: D16518259 fbshipit-source-id: 92a631a82	5 years ago
Ezgi Çiçek	09ab685c7e	[pulse] Handle stack refs escaping their scope via pointer Summary: Pulse didn't treat local variables going out of scope as invalidating the corresponding address in memory. This diff fixes that by - marking all local variables that exit the scope with the attribute `AddressOfStackVariable` - before we write the summary for the proc, we make sure to invalidate all such addresses local to the procedure as `Invalid`. If such an address is read, then we would raise a use-after-lifetime issue. Reviewed By: jvillard Differential Revision: D16458355 fbshipit-source-id: 3686524cb	5 years ago
Jules Villard	a504a67ec2	[pulse] model some of `std::basic_string` Summary: A common gotcha is the new test. Model the minimum amount of `std::basic_string` to catch it. Reviewed By: mbouaziz, ngorogiannis Differential Revision: D16121090 fbshipit-source-id: 66f06cb43	5 years ago
Jules Villard	14b9975cf3	[pulse] support modelling destructors Summary: We want to detect that variables and C++ temporaries go out of scope even when their destructor happens to be modelled. We lost a test to that because `std::function::~function` was poorly modeled as deleting the lambda itself which would now cause a double invalidation. This has to be modelled better now as something that invalidates something inside the lambda, and also model `operator()` as something that accesses that something, to recover that test. It's not a vital test though, so Do It Later©. Reviewed By: ngorogiannis Differential Revision: D16121091 fbshipit-source-id: 6b777ca18	5 years ago
Jules Villard	d9aadf5df2	[pulse] allow models in invalidation traces Summary: Be more flexible in what type of function calls are allowed in `ViaCall ...` actions to be able to include models. Also get rid of `here here` in traces /o\ As a side-effect, get more precise (=qualified) procedure names in traces (but not in messages so as not to be too verbose). Reviewed By: mbouaziz, ngorogiannis Differential Revision: D16121092 fbshipit-source-id: fb51b02f8	5 years ago
Jules Villard	ef26e8bb28	[clang] NamespaceAliasDecl is just a no-op Summary: Fixes #1123. Reviewed By: mbouaziz, ngorogiannis Differential Revision: D16163589 fbshipit-source-id: 10d2d8010	6 years ago
Jules Villard	e803a30c2d	[clang] fix translation of `initListExpr` again Summary: So it turns out we need to translate even more cases. Pulse had a FP before that this fixes. Reviewed By: ezgicicek Differential Revision: D16073629 fbshipit-source-id: c03460b5a	6 years ago
Jules Villard	14ce445f81	[pulse] run tests against C++17 Summary: This is needed to test some functionality in the next diff. Only one test changes (no longer a FN), which is now documented. Also, stop including the "header models" meant for biabduction! Maybe one day we'll need to have several test modes for different C++ versions. Seems overkill for now, so let's wait until we see some actual issues (eg FPs) that manifest in one version but not the other. Reviewed By: mbouaziz Differential Revision: D16073630 fbshipit-source-id: 1cfdfc933	6 years ago
Jules Villard	86decb83f6	[pulse] record attributes of address not edge-reachable in the post Summary: Sometimes the post of a function call has attributes on addresses that were mentioned in the pre but are no longer reachable in the post. We don't want to forget these, see added test. Reviewed By: mbouaziz Differential Revision: D16050050 fbshipit-source-id: 1ce522b97	6 years ago
Jules Villard	58b1df6bb9	[clang] fix destructor placement for temporaries in conditionals Summary: The previous code would call the destructor for the C++ temporary before the prune nodes, which then try to dereference it. Wrong. Quick fix: don't destroy temporaries in conditionals. Reviewed By: mbouaziz Differential Revision: D16030735 fbshipit-source-id: e11abad58	6 years ago
Jules Villard	3a3c93140e	[pulse] translate initListExpr in more cases Summary: We were skipping some instructions before and that was a problem for pulse. See added pulse test. Reviewed By: mbouaziz Differential Revision: D16030150 fbshipit-source-id: 9c62e6213	6 years ago
Jules Villard	d96ab2458d	[pulse] model lambda destructor Summary: Not sure if anyone uses this but there, now it's modelled. Reviewed By: mbouaziz Differential Revision: D16008162 fbshipit-source-id: f4795dcba	6 years ago
Jules Villard	91a2e2986b	[pulse] model lambda capture by value Summary: Prevent false positives about variables captured by value gone out of scope. Reviewed By: ezgicicek Differential Revision: D16008165 fbshipit-source-id: d70e47db4	6 years ago
Jules Villard	433c144840	[pulse] calling known lambdas calls the corresponding proc name Summary: We know how to do interprocedural calls so let's use that! Reviewed By: mbouaziz Differential Revision: D16008164 fbshipit-source-id: 4c34bf704	6 years ago
Jules Villard	2bf6852b95	[pulse] model `std::function::operator=` Summary: `function::operator=` is called whenever we assign a literal lambda to a variable, so it's pretty useful to be able to report anything on lambdas. Reviewed By: mbouaziz Differential Revision: D16008163 fbshipit-source-id: a9d07668d	6 years ago
Jules Villard	f15d9915a0	[pulse] better types to avoid `_fun_` prefix to proc names in bug traces Summary: Printing `Exp.Const (Cfun proc_name)` adds `_fun_` in front of the procedure name, eg `_fun_foo` instead of `foo`. This showed up in pulse traces. Reviewed By: mbouaziz Differential Revision: D16004606 fbshipit-source-id: 72ac6866f	6 years ago
Jules Villard	a3311fb751	[pulse] C++ temporaries bound to globals do not "escape" Summary: Fixes a false positive where the address of a C++ temporary is bound to a static const reference variable then returned. The fix doesn't try to establish that the variable is a const reference so could lead to false negatives but that can be addressed later. Reviewed By: ezgicicek Differential Revision: D16004538 fbshipit-source-id: e403dbefe	6 years ago
Jules Villard	7f12ced394	[pulse] move to SIL proper Summary: [apologies for the unreviewable diff...] Get rid of HIL expressions in pulse. This finishes the HIL -> SIL migration. The first step made pulse start from SIL instructions but would translate most accesses to HIL to re-use most of the existing pulse code. This diff gets rid of the intermediate translation of SIL expressions to HIL expressions. Big changes: 1. `PulseOperations` mostly rewritten, driven by using `Exp.t` instead of `HilExp.AccessExpression.t` for everything. 2. Stop trying to reverse-engineer what addresses mean in terms of access paths from program variables. Rely on the trace pointing at the right places in the code to be enough. This is because it wasn't that useful (and could even be misleading when wrong) but could be prohibitively expensive in degenerate cases (eg nodes with tens of thousands of successive array accesses...) 3. `PulseAbductiveDomain.apply_post` now returns the computed return value instead of recording it itself. 4. Change of vocabulary: `materialize` -> `eval`, `crumb` -> `event` 5. Function calls arguments are now evaluated prior to doing anything else, which saves everything else from having to (remember to) do that. In particular, this changes how models look quite a bit. Reviewed By: mbouaziz Differential Revision: D15986373 fbshipit-source-id: 1d79935de	6 years ago
Jules Villard	04233ee49b	[clang] destroy C++ temporaries Summary: Inject destructor calls to destroy a temporary when its lifetime ends. Reviewed By: mbouaziz Differential Revision: D15674209 fbshipit-source-id: 0f783a906	6 years ago
Jules Villard	0592bac25e	[pulse] explain SIL logical variables in terms of program access paths Summary: Now that HIL doesn't help us anymore we need to reconstruct its mapping "SIL logical var -> program access path". We already have everything we need in pulse: it suffices to walk the current memory graph starting from program variables until we find the value of the temporary we are interested in. This diff also builds some type machinery to make sure all accesses are explained. Reviewed By: mbouaziz Differential Revision: D15824959 fbshipit-source-id: 722c81b39	6 years ago
Jules Villard	c9f4768be7	[pulse] move to SIL Summary: It turns out HIL gets in the way of a precise heap analysis. For instance, instead of: ``` n$0 = &x.f _ = delete(&x) &y = n$0 ``` HIL tries hard to forget about intermediate variables and shows instead ``` _ = delete(&x) &y = &x.f ``` Oops, that's a use-after-delete, whereas the original code was safe. While it's easy to write SIL programs that are completely unsound for HIL, they are not generated very often from the frontends. In fact, the problem became apparent only when making the clang frontend translate C++ temporaries destructors, which produces the situation above routinely. This diff makes the minimal amount of change to make Pulse build and produce equivalent results (minus HIL bugs) starting from SIL instead of HIL. The reporting sucks for now because we need to translate SIL temporaries back into program access paths. This is done in the next diff. Reviewed By: mbouaziz Differential Revision: D15824961 fbshipit-source-id: 8e4e2a3ed	6 years ago
Jules Villard	6f5cb512db	[pulse] add example of FN in const-ref-bound temporary Summary: This one isn't caught because we don't destruct temporaries that are bound to a const reference. According to the C++ standard these should get destroyed when the const reference gets destroyed but instead we just don't destroy them for now. Reviewed By: mbouaziz Differential Revision: D15760209 fbshipit-source-id: 32c935ec0	6 years ago
Jules Villard	e14809baa8	[pulse] fix temporaries test code Summary: A test was claiming to be ok but wasn't. Reviewed By: mbouaziz Differential Revision: D15695944 fbshipit-source-id: 58772a793	6 years ago
Jules Villard	21f66dd197	[pulse] do not model `operator=` as assignment Summary: In a next diff temporaries will get destructed at the end of their lifetimes and that naive model would be causing false positives. The flipside is that we lose all reports on closures for now, will need to model them separately later. Reviewed By: mbouaziz Differential Revision: D15695943 fbshipit-source-id: c2c482c02	6 years ago
Jules Villard	db800f138b	[clang] rewrite scope computations Summary: This started as an attempt to understand how to modify the frontend to inject destructors for C++ temporaries (see next diffs). This diff rewrites the existing logic for computing the list of variables that should be destroyed at the end of each statement, either because it's the end of their syntactic scope or because control flow branches outside of their syntactic scope. The frontend translates a function from the last instructions to the first, but scope computation needs to be done in the other direction, so it's done in a separate pass before the main translation happens. That first pass creates a map from statements in the AST to the list of variables that should be destroyed at the end of these statements. This is still the case now. Before, that map would be computed in a bit of a weird way: scopes are naturally a stack but instead of that the structure maintained was a flat list + a counter to know where the current scope ended in that list. In this diff, redo the computation maintaining a stack of scopes instead, which is a bit cleaner. Also treat more instructions as introducing a new scope, eg if, for, ... Reviewed By: mbouaziz Differential Revision: D15674208 fbshipit-source-id: c92429e82	6 years ago
Jules Villard	c3d55817b1	[pulse] another test for temporaries Summary: I rewrote the test so it doesn't need any C++ headers so that: - it's easier to see what's going on - it's easier to debug: the whole AST is now somewhat readable vs before the headers made it impossibly long Reviewed By: ezgicicek Differential Revision: D15674213 fbshipit-source-id: d98941983	6 years ago
Josh Berdine	cfc1c8be36	[copyright] Remove years Reviewed By: jvillard Differential Revision: D15771884 fbshipit-source-id: e2997e3a3	6 years ago
Peter O'Hearn	9b8a908ad3	[Pulse] model folly delayed destruction Reviewed By: jvillard Differential Revision: D15508919 fbshipit-source-id: f6073ef7c	6 years ago
Jules Villard	d586630edf	[pulse] do not print templated part of function names Summary: This messes with the deduplication heuristic when templated function names show up in the error messages, since the heuristic demands that the error messages are the same. Reviewed By: mbouaziz Differential Revision: D15374333 fbshipit-source-id: 70232d254	6 years ago
Jules Villard	5de9bc29d2	[pulse] better error messages Summary: Improve the error messages, change is more or less documented in the code. Reviewed By: mbouaziz Differential Revision: D15374334 fbshipit-source-id: f1dd54180	6 years ago
Jules Villard	b700af9ffb	[hil] do not put parens around trivial expressions Summary: `(x)` -> `x` `&(x)` -> `&x` everything else unchanged Reviewed By: mbouaziz Differential Revision: D15374360 fbshipit-source-id: af5ef4e66	6 years ago
Jules Villard	6364199b94	[pulse] traces record how values were constructed Summary: Before: the trace would explain how a value was invalidated and accessed, but not how the value that was invalidated had been constructed. Now: `PulseTrace.t` records breadcrumbs of how the value was constructed in addition to the interproc "action" trace leading to the invalidation or access action. Concretely: ``` void bad(X &x) { X y = x; X z = x; delete y; access(z); } ``` will produce the trace: Invalidation part: y = x delete y Access part: z = x access(z) access to z->f inside of access(z) Before this diff the "Access part" would be missing the "z = x" part of the trace, so it might be confusing why `z` has anything to do with `y`. However, such "breadcrumbs" are not recorded in the inter-procedural part, only the sequence of calls is. This is a trade-off for simplicity, maybe it's enough for developers maybe it isn't, we'll find out later. Reviewed By: jberdine Differential Revision: D15354438 fbshipit-source-id: 8d0aed717	6 years ago
Jules Villard	b5589661ce	[pulse] improve error messages and traces Summary: Feedback from peterogithub: - mention which access path is being invalidated and accessed in the message - mention the line at which it was invalidated (the line at which it's accessed is already the line at which we report) - traces for stack variable/C++ temporary address escapes - delete double implementation of the same functionality in `PulseTrace`: `location_of_action_start` is the same as `outer_location_of_action`... Reviewed By: jberdine Differential Revision: D14800294 fbshipit-source-id: 3d9ab9b3d	6 years ago
Jules Villard	9dbbd68472	[pulse] apply summaries to globals too Summary: Similarly to function parameters (and the return value), we need to apply the pre/post of a function call to the globals mentioned in its summary. - tighten summaries further to remember only abducible variables in the post (as well as in the pre) - take globals into account when applying pre/post pairs Reviewed By: jberdine Differential Revision: D14780800 fbshipit-source-id: fc0d180bb	6 years ago
Jules Villard	3ba05b8cee	[pulse] be more careful about what to consider as a variable going out of scope Summary: The heuristic to detect variables going out of scope was to detect any access expression passed as argument to an injected destructor call. However destructor calls are also injected in destructor bodies to destruct each field of an object, so the heuristic would detect fields going out of scope, which, erm, doesn't make sense. Limit the heuristic to local program variables. Reviewed By: jberdine Differential Revision: D14771454 fbshipit-source-id: ffa3c9fe3	6 years ago
Jules Villard	31c2a39e81	[pulse] tighten up summaries Summary: Only throw values to the pre if they can be followed from "abducible" variables: formals of the current method and globals. Because figuring out if a `Pvar.t` is a formal of the current procedure is actually a giant pain, hack something not too bad instead: pre-register all formals at the start of the analysis of the procedure. Then the only other variables we care about in the precondition are globals, which we can detect easily. This is mostly an optimisation (summaries won't include irrelevant "abduced" facts about the procedure's local variables anymore), but it also fixes a bug where we would sometimes overwrite things in the pre. I think that's why the tests improved. Reviewed By: ngorogiannis Differential Revision: D14753493 fbshipit-source-id: 08e73637f	6 years ago
Jules Villard	7c90480758	[pulse] do not create `&` back-edges eagerly Summary: This mostly doesn't make sense. The only thing this would have been good for was to give the most accurate result on access paths such as `*(&(x.f))`, but these are normalised anyway (into `x.f`) so we actually never see these. That said there might be some use to some similar logic in the future, but in the meantime let's delete the current feature as it wasn't thought through. Reviewed By: ezgicicek Differential Revision: D14753492 fbshipit-source-id: 597cec027	6 years ago
Jules Villard	ada032ee2c	[pulse] improve error messages and traces Summary: The previous message formatting had regressed and produced nonsensical messages. More importantly, remove template parameters from error messages to trigger the heuristic in `InferPrint` that deduplicates errors that are on the same line with the same error type and message. Without this we get hundreds of reports that correspond to as many instantiations of the same code. Reviewed By: ngorogiannis Differential Revision: D14747979 fbshipit-source-id: 3c4aad2b1	6 years ago
Jules Villard	db4e1ea433	[pulse] reallocate variables on initialisation Summary: We see the magic function `__variable_initialization` at the point where the variable is declared, eg `int x = foo()`. It's safe to reset `&x` at that point. This circumvents an issue that pops up in some rare cases where the ternary conditional operator `?:` and variable initialization conspire to produce weird frontend results. Some test becomes a FN again, but I think it was being reported for the wrong reasons; will investigate more later. Reviewed By: ngorogiannis Differential Revision: D14747980 fbshipit-source-id: e75d6e30f	6 years ago


214 Commits (0d430efb42fffda741b59bf6399da55e1399ad0e)