infer_clone

Commit Graph

Author	SHA1	Message	Date
Ezgi Çiçek	11141cb100	[impurity] Collect all accesses Summary: Previously, impurity analysis only collected one access for a single modification but not all other modifying accesses. This diff - changes the impurity domain to collect all modifying accesses - tracks and prints all the accesses seen to reach the modification, improving readability and debugging. Recording all accesses is needed in the next diff to determine if a method modifies any immutable fields. To determine that, we need to know all modifications, not just a single one. Reviewed By: skcho Differential Revision: D25186516 fbshipit-source-id: 43ceb3cd8	4 years ago
Daiva Naudziuniene	eaf95951f5	[pulse] Modeling std::vector::end() Summary: It is undefined behavior to dereference end iterator. To catch end iterator dereferencing issues we change iterator model: instead of having `internal pointer` storing the current index, we model it as a pointer to a current index. This allows us to model `end()` iterator as having an invalid pointer and there is no need to create an invalidated element in the vector itself. Reviewed By: ezgicicek Differential Revision: D21178441 fbshipit-source-id: fd6a94b0b	5 years ago
Daiva Naudziuniene	dae7f36339	[pulse] Vector iterator model Summary: Modeling vector iterator with two internal fields: an internal array and an internal pointer. The internal array field points to the internal array field of a vector; the internal pointer field represents the current element of the array. For now `operator++` creates a fresh element inside the array. Reviewed By: ezgicicek Differential Revision: D21043304 fbshipit-source-id: db3be49ce	5 years ago
Ezgi Çiçek	e1093159b0	[pulse] Distinguish error state at top level Summary: As soon as pulse detects an error, it completely stops the analysis and loses the state where the error occurred. This makes it difficult to debug and understand the state in which the program failed. Moreover, other analyses that might build on pulse (e.g. impurity) cannot access the error state. This diff aims to restore and display the state at the time of the error in `PulseExecutionState` along with the diagnostic by extending it as follows: ``` type exec_state = \| represents the state at the program point that caused an error ) ``` As a result, since we don't immediately stop the analysis as soon as we find an error, we detect both errors in conditional branches simultaneously (see test result changes for examples). NOTE: We need to extend `PulseOperations.access_result` to keep track of the failed state as follows: ``` type 'a access_result = ('a, Diagnostic.t t [denoting the exit state] ) result ``` Reviewed By: jvillard Differential Revision: D20918920 fbshipit-source-id: 432ac68d6	5 years ago
Ezgi Çiçek	8d44265ca1	[impurity] Consider exited functions as impure Summary: Consider functions that simply exit as impure by extending the impurity domain with `AbstractDomain.BooleanOr` that signifies whether the program exited. Reviewed By: skcho Differential Revision: D20941628 fbshipit-source-id: 19bc90e66	5 years ago
Ezgi Çiçek	d97e1c8fdb	[pulse][impurity] Add model for System.exit() Summary: - Model `System.exit()` as early_exit and add a test - Tweak message of methods that are impure due to having no pulse summary (and add a test) Reviewed By: skcho Differential Revision: D20668979 fbshipit-source-id: 6b5589aae	5 years ago
Ezgi Çiçek	b64ed0bbf2	[impurity] Consider functions with empty pulse summary as impure Summary: As exemplified by added tests, pulse computes an empty summary (with 0 disjuncts) whenever it discovers a contradiction which might be caused by: - discovering aliasing in memory - widening a limited number of times in loops and concluding that loop exit conditions are never taken However, AFAIU, it is not possible to have a function with 0 disjuncts apart from such anomalies. Even a function which does nothing like `void foo(){}` has 1 disjunct: ``` Pulse: 1 pre/post(s) #0: PRE: { roots={ }; mem ={ }; attrs={ };} POST: { roots={ }; mem ={ }; attrs={ };} SKIPPED_CALLS: { } ``` The aim of this diff is to consider functions with 0 disjuncts as impure because most often such cases are impure, rather than actually pure. Reviewed By: skcho Differential Revision: D20619504 fbshipit-source-id: 3a8502c90	5 years ago
Ezgi Çiçek	cc815f5d20	[pulse] Only propagate existing WrittenTo attributes at function calls Summary: Previously, at each function call, we added a `WrittenTo` attribute for applying the address of the actuals. However, this results in mistakenly considering each function application that inspects its argument as impure. Instead, we should only propagate `WrittenTo` if the actuals already have `WrittenTo` attributes. For instance, for the following functions ``` public static boolean is_null(Byte a) { return a == null; } public static boolean call_is_null(Byte a) { return is_null(a); } ``` We used to get the following pulse summary for `call_is_null` (showing only one of the disjuncts): ``` #0: PRE: { roots={ &a=v1 }; mem ={ v1 -> { * -> v2 } }; attrs={ v1 -> { MustBeValid }, v2 -> { Arith =null, BoItv ([max(0, v2), min(0, v2)]) } };} POST: { roots={ &a=v1, &return=v8 }; mem ={ v1 -> { * -> v2 }, v8 -> { * -> v4 } }; attrs={ v2 -> { Arith =null, BoItv ([max(0, v2), min(0, v2)]), WrittenTo-----------WRONG }, v4 -> { Arith =1, BoItv (1), Invalid ConstantDereference(is the constant 1), WrittenTo-----------WRONG }, v8 -> { WrittenTo } };} SKIPPED_CALLS: { } ``` where we mistakenly recorded a `WrittenTo` for `v2` (what `a` points to). As a result, we considered `call_is_null` as impure :( This diff fixes that since the callee `is_null` doesn't have any `WrittenTo` attributes for its parameter `a`. So, we don't propagate `WrittenTo` and get the following summary ``` #0: PRE: { roots={ &a=v1 }; mem ={ v1 -> { * -> v2 } }; attrs={ v1 -> { MustBeValid }, v2 -> { Arith =null, BoItv ([max(0, v2), min(0, v2)]) } };} POST: { roots={ &a=v1, &return=v8 }; mem ={ v1 -> { * -> v2 }, v8 -> { * -> v4 } }; attrs={ v2 -> { Arith =null, BoItv ([max(0, v2), min(0, v2)]) }, v4 -> { Arith =1, BoItv (1), Invalid ConstantDereference(is the constant 1) }, v8 -> { WrittenTo } };} SKIPPED_CALLS: { } ``` Reviewed By: skcho Differential Revision: D20490102 fbshipit-source-id: 253d8ef64	5 years ago
Ezgi Çiçek	a4c3925d9a	[impurity] Track unique accesses Summary: Impurity domain was tracking all changes to variables (with a list of traces containing all write/invalid accesses). This results in having long traces with multiple access events for the same variable. For instance, ``` void swap_impure(int[] array, int i, int j) { int tmp = array[i]; array[i] = array[j]; // included in the trace array[j] = tmp; // included in the trace } ``` here we recorded both array accesses. This diff changes the domain to include accesses so that we only keep track of a single trace per access. Array accesses are only recorded once. Note that we want to record all unique accesses, not just the first one, because impurity will be used for hoisting/cost where we will invalidate impure arguments and consider all the rest as not changing. Reviewed By: jvillard Differential Revision: D20385745 fbshipit-source-id: d3647dad3	5 years ago
Ezgi Çiçek	e3c89b1f10	[impurity] Fix include_value_history Summary: D20362149 missed - to pass the optional argument `include_value_history` to the recursive call in `PulseTrace.add_to_errlog`. - to set `include_value_history=false` for skipped calls. This diff fixes these issues. Reviewed By: skcho Differential Revision: D20385604 fbshipit-source-id: 176e4d010	5 years ago
Ezgi Çiçek	b90d7c42d3	[impurity] Do not add value history in impurity traces Summary: Impurity traces are quite big due to recording value histories. Let's simplify the traces by removing pulse's value histories. Reviewed By: skcho Differential Revision: D20362149 fbshipit-source-id: 8a2a6115e	5 years ago
Ezgi Çiçek	a0fd5a0e6a	[pulse] Refactor attributes into domain Summary: Let's move attributes into Pulse's domain. Reviewed By: jvillard Differential Revision: D19533915 fbshipit-source-id: 995fd12da	5 years ago
Ezgi Çiçek	dd59a141f0	[impurity] Rely on set of skipped functions to determine impurity Summary: Currently, impurity analysis is oblivious to skipped functions which might e.g. return a non-deterministic value, write to memory or have some other side-effect. This diff fixes that by relying on Pulse's skipped functions to determine impurity. Any unknown function which is not modeled to be pure is assumed to be impure. This is a heuristic. We could have assumed them to be pure by default as well. Reviewed By: jvillard Differential Revision: D19428514 fbshipit-source-id: 82efe04f9	5 years ago
Jules Villard	df49f318f6	[pulse] havoc formals passed by reference to unknown procedures Summary: This gets rid of false positives when something invalid (eg null) is passed by reference to an initialisation function. Havoc'ing what the pointer points to results in being optimistic about said contents in the future. Also surprisingly gets rid of some FNs (which means it can also introduce FPs) in the `std::atomic` tests because a path condition becomes feasible with havoc'ing. There's a slight refinement possible where we don't havoc pointers to const but that's more involved and left as future work. Reviewed By: skcho Differential Revision: D18726203 fbshipit-source-id: 264b5daeb	5 years ago
Ezgi Çiçek	6781ba36d3	[impurity] Start checking equivalence at materialized addresses in pre Summary: Previously, we considered a function which modifies its parameters to be impure even though it might not be modifying the underlying value. This resulted in FPs like the following program in Java: ``` void fresh_pure(int[] a) { a = new int[1]; } ``` Similarly, in C++, we considered the following program as impure because it was writing to `s`: ``` Simple* reassign_pure(Simple* s) { s = new Simple{2}; return s; } ``` This diff fixes that issue by starting the check for address equivalence in pre-post not directly from the addresses of the stack variables, but from the addresses pointed to by these stack variables. That means, we only consider things to be impure if the actual values pointed to by the parameters change. Reviewed By: skcho Differential Revision: D18113846 fbshipit-source-id: 3d7c712f3	5 years ago
Jules Villard	6a738045fd	[pulse] interprocedural histories and traces Summary: bigmacro_bender There are 3 ways pulse tracks history. This is at least one too many. So far, we have: 1. "histories": a humble list of "events" like "assigned here", "returned from call", ... 2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah" 3. "traces", which combine one history with one interproc action This diff gets rid of interproc actions and makes histories include "nested" callee histories too. This allows pulse to track and display how a value got assigned across function calls. Traces are now more powerful and interleave histories and interproc actions. This allows pulse to track how a value is fed into an action, for instance performed in callee, which itself creates some more (potentially now interprocedural) history before going to the next step of the action (either another call or the action itself). This gives much better traces, and some examples are added to showcase this. There are a lot of changes when applying summaries to keep track of histories more accurately than was done before, but also a few simplifications that give additional evidence that this is the right concept. Reviewed By: skcho Differential Revision: D17908942 fbshipit-source-id: 3b62eaf78	5 years ago
Jules Villard	8182514f35	[impurity] clarify string parameter of `ImpurityDomain.add_to_errlog` Summary: Instead of a string argument named `~str` pass `Formal | Global` and let `add_to_errlog` figure out how to print it. Reviewed By: ezgicicek Differential Revision: D17907657 fbshipit-source-id: ed09aab72	5 years ago
Ezgi Çiçek	557e2bfa3f	[impurity] Consider functions with no pulse summary as impure Summary: If we have no pulse summary (most likely caused by pulse finding a legit issue with the code), let's consider the function as impure. Reviewed By: jvillard Differential Revision: D17906016 fbshipit-source-id: 671d3e0ba	5 years ago
Jules Villard	362e9cc622	[pulse] do not print `()` after functions Summary: Unfortunately it is very hard to predict when `Typ.Procname.describe` will add `()` after the function name, so we cannot make sure it is always there. Right now we report clowny stuff like "error while calling `foo()()`", which this change fixes. Reviewed By: ezgicicek Differential Revision: D17665470 fbshipit-source-id: ef290d9c0	5 years ago
Ezgi Çiçek	c5ca4db8d0	[pulse][impurity] Use pulse for detecting impurity Summary: Introduce a new experimental checker (`--impurity`) that detects impurity information, tracking which parameters and global variables of a function are modified. The checker relies on Pulse to detect how the state changes: it traverses the pre and post pairs starting from the parameter/global variable and finds where the pre and post heaps diverge. At divergence points, we expect to see WrittenTo/Invalid attributes containing a trace of how the address was modified. We use these to construct the trace of impurity. This checker is a complement to the purity checker that exists mainly for Java (and is used for cost and loop-hoisting analyses). The aim of this new experimental checker is to rely on Pulse's precise memory treatment and come up with a more precise (im)purity analysis. To distinguish the two checkers, we introduce a new issue type `IMPURE_FUNCTION` that reports when a function is impure, rather than when it is pure (as in the purity checker). TODO: - improve the analysis to rely on impurity information of external library calls. Currently, all library calls are assumed to be nops, hence pure. - de-entangle Pulse reporting from analysis. Reviewed By: skcho Differential Revision: D17051567 fbshipit-source-id: 5e10afb4f	5 years ago

20 Commits (c736015316c24b55f68ebfd62f8b5394e894c8e8)
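
The pure/impure distinctions described in several of the commit messages above (inspecting an argument, rebinding a parameter, writing through a parameter) can be illustrated with a small Java sketch. The methods mirror the snippets quoted in the messages, renamed to Java conventions. The class name `ImpurityExamples`, the `ArrayOp` interface, and the `modifiesArgument` helper are hypothetical and exist only for this illustration. The helper is a crude runtime snapshot comparison, not the static pre/post heap traversal that the checker performs, and it would miss a write that happens to restore the same values.

```java
import java.util.Arrays;

// Illustrative only: not part of Infer. Names below are assumptions.
class ImpurityExamples {
    // Only inspects its argument, so it should be considered pure.
    static boolean isNull(Byte a) {
        return a == null;
    }

    // Calls a pure inspector; nothing is written to what `a` points to.
    static boolean callIsNull(Byte a) {
        return isNull(a);
    }

    // Rebinds the local parameter only; the caller's array is untouched.
    static void freshPure(int[] a) {
        a = new int[1];
    }

    // Writes to the elements of the caller's array, hence impure.
    static void swapImpure(int[] array, int i, int j) {
        int tmp = array[i];
        array[i] = array[j];
        array[j] = tmp;
    }

    // Hypothetical functional interface for operations on an int array.
    interface ArrayOp {
        void apply(int[] a);
    }

    // Snapshot the array, run the operation, and compare contents afterwards.
    static boolean modifiesArgument(ArrayOp op, int[] input) {
        int[] pre = Arrays.copyOf(input, input.length);
        op.apply(input);
        return !Arrays.equals(pre, input);
    }

    public static void main(String[] args) {
        // Rebinding the parameter leaves the caller's array unchanged.
        System.out.println(modifiesArgument(ImpurityExamples::freshPure, new int[] {1, 2}));
        // Swapping two distinct elements changes the caller's array.
        System.out.println(modifiesArgument(a -> swapImpure(a, 0, 1), new int[] {1, 2}));
    }
}
```

Running `main` prints `false` for `freshPure` and `true` for `swapImpure`, which matches the intent of the "materialized addresses" commit: only changes to the values reachable from the parameters count, not reassignment of the parameter variables themselves.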