infer_clone

Commit Graph

Author	SHA1	Message	Date
Sungkeun Cho	740fb36f1b	[pulse] Add semantic models for C++ string length Summary: This diff adds some semantic models for C++ string length. It introduces a virtual field for string length and uses the value in the models. * basic_string constructor given a constant string * string.empty * string.length Reviewed By: jvillard Differential Revision: D29390815 fbshipit-source-id: 99d67e48e	3 years ago
Sungkeun Cho	678386acbb	[pulse] Add FP tests due to infeasible paths depending on string length Reviewed By: ezgicicek Differential Revision: D29302634 fbshipit-source-id: 259b7120d	3 years ago
Loc Le	97c9481070	[pulse][isl] support dynamic-type for subseteq-checking Reviewed By: jvillard Differential Revision: D27858465 fbshipit-source-id: 5ffa9a5ee	4 years ago
Jules Villard	f0741626a1	[clang] fix order of parameters in some inherited constructors Summary: Some funky C++ way of calling the parent's constructor triggers a function in the frontend that used to reverse the order of parameters. Reviewed By: skcho Differential Revision: D28832500 fbshipit-source-id: 1032de2ca	4 years ago
Jules Villard	d285ee900b	[pulse] functional unknown functions Summary: Unknown functions may create false positives as well as false negatives for Pulse. Let's consider that unknown functions behave "functionally", or at least that a functional behaviour is a possible behaviour for them: when called with the same parameter values, they should return the same value. This is implemented purely in the arithmetic domain by recording `v_return = f_unknown(v1, v2, ..., vN)` for each call to unknown functions `f_unknown` with values `v1`, `v2`, ..., `vN` (and return `v_return`). The hope is that this will create more false negatives than false positives, as several FPs have been observed on real code that would be suppressed with this heuristic. The other effect this has on reports is to record hypotheses made on the return values of unknown functions into the "pruned" part of formulas, which inhibits reporting on paths whose feasibility depends on the return value of unknown functions (by making these issues latent instead). This should allow us to control the amount of FPs until we model more functions. Reviewed By: skcho Differential Revision: D27798275 fbshipit-source-id: d31cfb8b6	4 years ago
Jules Villard	37a79d16b0	[pulse][2/5] do not overwrite attributes Summary: It's better to remember the first reason why an address must be valid, etc. Reviewed By: skcho Differential Revision: D28674729 fbshipit-source-id: 3b69de7ef	4 years ago
Jules Villard	8f1df1f11e	[pulse] deduplicate histories and traces for memleaks Summary: Most/all of the time we expect the history of the value to faithfully trace how it got allocated. That history was then added as a prefix of the trace leading to the same place, leading to duplicate information in the report trace. We may need to do the same for other bug types. Reviewed By: ezgicicek Differential Revision: D28536891 fbshipit-source-id: a83a2d038	4 years ago
Jules Villard	8fcd79a0b7	[pulse][models] refactor garbage-collected allocations Summary: Make it more obvious why we don't add an Allocated attribute in these models. Reviewed By: ezgicicek Differential Revision: D28536892 fbshipit-source-id: 643539ae6	4 years ago
Jules Villard	427937083b	[pulse] do not report null deref errors where the source of null is unclear Summary: As explained in the code comment, these reports are generally non-actionable at best and false positives at worst: skip reporting for constant dereference (eg null dereference) if the source of the null value is not on the path of the access, otherwise the report will probably be too confusing: the actual source of the null value can be obscured as any value equal to 0 (or the constant) can be selected as the candidate for the trace, even if it has nothing to do with the error besides being equal to the value being dereferenced Reviewed By: da319 Differential Revision: D28350193 fbshipit-source-id: 0cd76d252	4 years ago
Jules Villard	02e6d46e7f	[pulse] follow values inside function calls Summary: Turns out the mistake was pretty simple: we just forgot to keep the history of the return value in the callee and add it to the caller's. Reviewed By: skcho Differential Revision: D28385941 fbshipit-source-id: 40fe09c99	4 years ago
Jules Villard	9409685a2f	[pulse] a few textual changes in traces Summary: - Changed "passed as argument to f" to "in call to f", as these do not always correspond to passing an argument (eg could be a value returned from f) - Changed "assigned" to "returned" when appropriate - Changed the model of malloc() to not say "allocated" in the null case - Don't print "returned from f" when there was no event inside f: just print "in call to f". Reviewed By: da319 Differential Revision: D28413900 fbshipit-source-id: bc85625e3	4 years ago
Jules Villard	54228740dd	[pulse] fix typo in test Summary: Funny stuff. Reviewed By: skcho Differential Revision: D28411289 fbshipit-source-id: a9ad23551	4 years ago
Sungkeun Cho	b7b7e89159	[pulse] Address some modeled fields as pointers Summary: This diff addresses `GenericArrayBackedCollection.field` and others as pointers. The modeled fields are used as non-pointer struct fields, but their actual semantics are pointers that may have side effects. For example, `GenericArrayBackedCollection.field` is used for keeping an information that the previous vector's address could be invalid. ``` void foo(vector v) { v.push_back(0); // v's previous address may be invalid after push_back // PRE: {v -> {backing_array -> v1}} // POST: {v -> {backing_array -> v2}} // ATTR: {v1 may be invalidated} } ``` However, if we revert the modeled field values, it will return incorrect summary as follows, by reverting non-pointer parameter values. ``` // PRE: {v -> {backing_array -> v1}} // POST: {v -> {backing_array -> v1}} // ATTR: {v1 may be invalidated} ``` Reviewed By: jvillard Differential Revision: D28324161 fbshipit-source-id: 96451d4b0	4 years ago
Jules Villard	d97b82f8db	[pulse] add tests for pulse.isl Summary: There's been regressions in --pulse-isl. Without tests, everything is temporary! Note: the regressions are presumably still there, this just records the current status of pulse.isl. Also, no objective-C(++) at the moment. Should we add them too? (in another diff) Reviewed By: skcho Differential Revision: D28256703 fbshipit-source-id: 700b2cc57	4 years ago
Jules Villard	dbdf076e30	[pulse] take histories into account for all aspects of a report Summary: A previous change made pulse look into value histories for causes of invalidation in case the access trace of a value already contained the reason why that value is invalid, in order to save printing the invalidation trace in addition to the access trace. It also made reporting more accurate for null dereference as the source of null was often better identified (in cases where several values are null or zero). But, the history is also relevant to the bug type and the error message. Make these take histories into account too. Also fix a bug where we didn't look inside the sub-histories contained within function calls when looking for an invalidation along the history. Reviewed By: da319 Differential Revision: D28254334 fbshipit-source-id: 5ca00ee54	4 years ago
Jules Villard	16cb07698e	[pulse] no longer drop attributes of dead addresses Summary: When garbage-collecting addresses we would also remove their attributes. But even though the addresses are no longer allocated in the heap, they might show up in the formula and so we need to remember facts about them. This forces us to detect leaks closer to the point where addresses are deleted from the heap, in AbductiveDomain.ml. This is a nice refactoring in itself: doing so fixes some other FNs where we sometimes missed leak detection on dead addresses. This also makes it unnecessary to simplify InstanceOf eagerly when variables get out of scope. Some new {folly,std}::optionals false positives that either are similar to existing ones or involve unmodelled smart pointers. Reviewed By: da319 Differential Revision: D28126103 fbshipit-source-id: e3a903282	4 years ago
Jules Villard	186b10e4f5	[pulse] record all the invalidations we can in histories Summary: Building on the infra in the previous commits, "fix" all the call sites that introduce invalidations to make sure they also update the corresponding histories. This is only possible to do when the access leading to the invalidation can be recorded. Right now the only place that's untraceable is the model of `free`/`delete`, because it happens to be the only place where we invalidate an address without knowing where it comes from (`free(v)`: what was v's access path? we could track this in the future). Reviewed By: skcho Differential Revision: D28118764 fbshipit-source-id: de67f449e	4 years ago
Daiva Naudziuniene	713cdbf580	[pulse] Inline initializers for global constant accesses Summary: Not tracking values for global constants might cause nullptr_dereference false positives. In particular, if the code has multiple checks and uses a global constant by its name in one check and its value in another check (see added test case), we are not able to prune infeasible paths. This diff addresses such false positives by inlining initializers of global constants when they are being used. An assumption is that most of the time the initialization of global constants would not have side effects. Reviewed By: jvillard Differential Revision: D25994898 fbshipit-source-id: 26360c4de	4 years ago
Jules Villard	3bce92d804	[pulse] better traces when invalidation happens along the access trace Summary: As explained in the previous diff: when the access trace goes through the invalidation step there is no need to print the invalidation trace at all. Note: only a few sources of invalidation are handled at the moment. The following diffs gradually fix the other sources of invalidation. Reviewed By: skcho Differential Revision: D28098335 fbshipit-source-id: 5a5e6481e	4 years ago
Jules Villard	d4bdfec49a	[pulse] record invalidation events in histories Summary: The eventual goal is to stop having separate sections of the trace ("invalidation part" + "access part") when the "access part" already goes through the invalidation step. For this, it needs to record when a value is made invalid along the path. This is also important for assignments to NULL/0/nullptr/nil: right now the way we record that 0 is not a valid address is via an attribute attached to the abstract value that corresponds to 0. This makes traces inconsistent sometimes: 0 can appear in many places in the same function and we won't necessarily pick the correct one. In other words, attaching traces to values is fragile, as the same value can be produced in many ways. On the other hand, histories are stored at the point of access, eg x->f, so have a much better chance of being correct. See added test: right now its trace is completely wrong and makes the 0 in `if (utf16StringLen == 0)` the source of the NULL value instead of the return of `malloc()`! This diff makes the traces slightly more verbose for now but this is fixed in a following diff as the traces that got longer are those that don't actually need an "invalidation" trace. Reviewed By: skcho Differential Revision: D28098337 fbshipit-source-id: e17929259	4 years ago
Sungkeun Cho	2886e849da	[frontend,pulse] Avoid dereference of C struct Summary: This diff avoids dereference of C struct, in its frontend and its semantics of Pulse. In SIL, C struct is not first-class value, thus dereferencing on it does not make sense. Reviewed By: ezgicicek Differential Revision: D27953258 fbshipit-source-id: 348d56338	4 years ago
Jules Villard	e549103d75	[pulse] use term_eqs Summary: Whenever an equality "t = v" (t an arbitrary term, v a variable) is added (or "v = t"), remember the "t -> v" mapping after canonicalising t and v. Use this to detect when two variables are equal to the same term: `t = v` and `t = v'` now yields `v = v'` to be added to the equality relation of variables. This increases the precision of the arithmetic engine. Interestingly, the impact on most code I've tried is: 1. mostly same perfs as before, if a bit slower (could be within noise) 2. slightly more (latent) bugs reported in absolute numbers I would have expected it to be more expensive and yield fewer bugs (as fewer false positives), but there could be second-order effects at play here where we get more coverage. We definitely get more latent issues due to dereferencing pointers after testing nullness, as can be seen in the unit tests as well, which may alone explain (2). There's some complexity when adding term equalities where the term is linear, as we also need to add it to `linear_eqs` but `term_eqs` and `linear_eqs` are interested in slightly different normal forms. Reviewed By: skcho Differential Revision: D27331336 fbshipit-source-id: 7314e127a	4 years ago
Jules Villard	8602b709ef	[pulse][arith] change bit shifts by a constant factor into multiplications Summary: When we don't know the value being shifted it may help to translate bit-shifting into multiplication by a constant as it might surface linear terms, eg `x<<1` is `2*x`. Reviewed By: skcho Differential Revision: D27464847 fbshipit-source-id: 9b3b5f0d0	4 years ago
Jules Villard	d1b3e56574	[pulse] cap the size of literals in formulas Summary: On some pathological examples of crypto primitives like libsodium, later diffs make pulse grind to a halt due to an explosion in the size of literals. This is at least partly due to the fact the arithmetic doesn't operate modulo 2^64. Due to the fact the arithmetic is confused in any case when we reach such large numbers, cap them, currently at 2^128. This removes pathological cases for now, even now on libsodium Pulse is ~5 times faster than before! Take this opportunity to put the modified Q/Z modules in their own files. Reviewed By: jberdine Differential Revision: D27463933 fbshipit-source-id: 342d941e2	4 years ago
Jules Villard	55871dd285	[pulse][2/2] generate latent issues when null is allocated Summary: See updated tests and code comments: this changes many arithmetic operations to detect when a contradiction "p|->- * p=0" is about to be detected, and generate a latent issue instead. It's hacky but it does what we want. Many APIs change because of this so there's some code churn but the overall end result is not much worse thanks to monadic operators. Reviewed By: skcho Differential Revision: D26918553 fbshipit-source-id: da2abc652	4 years ago
Martin Trojer	18f28395e8	[clang] migrate to llvm/clang11 Summary: Update Infer to LLVM (clang) 11.1.0. Infer/clang now uses the LLVM 'monorepo' release, simplifying the download script. Some changes done to how/when ASTExporter mangles names, this to avoid the plugin hitting asserts in the clang code when mangling names. Reviewed By: jvillard Differential Revision: D27006986 fbshipit-source-id: 4d4b6ba05	4 years ago
Gabriela Cunha Sampaio	cba144b779	[pulse] Adapting error messages Summary: Adapting error messages in Pulse so that they become more intuitive for developers. Reviewed By: jvillard Differential Revision: D26887140 fbshipit-source-id: 896970ba2	4 years ago
Jules Villard	4c357e434b	[pulse] apply discovered variable equalities eagerly Summary: This resolves a few instances of false negatives; typically: ``` if (x == y) { // HERE x = 10; y = 44; // THERE } ``` We used to get ``` HERE: &x->v * &y->v' * v == v' THERE: &x->v * &y->v' * v == v' * v |-> 10 * v' |-> 44 ``` The state at THERE was thus inconsistent and detected as such (`v` and `v'` are allocated separately in the heap hence cannot be equal). Now we normalize the state more eagerly and so we get: ``` HERE: &x->v * &y->v THERE: &x->v * &y->v * v |-> 44 ``` Reviewed By: skcho Differential Revision: D26488377 fbshipit-source-id: 568e685f0	4 years ago
Jules Villard	84d1fd3b52	[pulse] add tests Summary: These tests showcase weaknesses of Pulse w.r.t. detecting issues in situations with 1) pointer aliasing, and 2) pointers null-tests Reviewed By: ezgicicek Differential Revision: D26488145 fbshipit-source-id: 3de230bd2	4 years ago
Jules Villard	a1db290c2e	[pulse] models for folly::Optional::operator{*,->}() Summary: These were present for `std::optional` but not `folly::Optional` for some reason. Reviewed By: da319 Differential Revision: D26450400 fbshipit-source-id: 45051e828	4 years ago
Gabriela Cunha Sampaio	bc49f1deb1	[pulse] Adapting --pulse-model-return-nonnull for Java Summary: The `--pulse-model-return-nonnull` config option currently works for C++. Now we will be using it also for Java. Changing type from string list to regexp to make it more general. Reviewed By: ezgicicek Differential Revision: D26367888 fbshipit-source-id: 9a06b9b32	4 years ago
Sungkeun Cho	27ab8bd253	[pulse] Uninitialized check for struct fields Reviewed By: jvillard Differential Revision: D25371929 fbshipit-source-id: 966f333e3	4 years ago
Jules Villard	f5936689a4	[pulse] case split in model of free(3) Summary: Having different behaviours inter-procedurally and intra-procedurally sounds like a bad design in retrospect. The model of free() should not depend on whether we currently know the value is not null as that means some specs are missing from the summary. Reviewed By: skcho Differential Revision: D26019712 fbshipit-source-id: 1ac4316a5	4 years ago
Sungkeun Cho	89c8e25deb	[frontend] Add tests of using single field struct Summary: When a single field struct is initialized with "type x{v}" form, the translated result is not straightforward. For example, ``` struct t { int val_; }; void foo(t x) { t y{x}; } ``` calls the copy constructor with `x`. This is good. ``` void foo(int n) { t y{n}; } ``` assigns the integer `n` to `y.val_`. This is good. ``` t get_v(); void foo() { t y{get_v()}; } ``` assigns return value of `get_v` to `y.val_`, rather than calling the copy constructor. This is not good, but doesn't matter for actual running; `&y.val_` is the same as `&y` and `t` value is the same as `int` value. Reviewed By: jvillard Differential Revision: D26146578 fbshipit-source-id: 8a81bb1db	4 years ago
Sungkeun Cho	8ed44df7f6	[frontend] Fix incorrect order of statements (negation) Summary: This diff fixes incorrect order of statements on `*p = !b;`. Reviewed By: jvillard Differential Revision: D26125069 fbshipit-source-id: 9dcefbd34	4 years ago
Sungkeun Cho	051473394b	[frontend] Fix incorrect order of statements Summary: This diff fixes incorrect order of statements on assignments. In the translation of `LHS=RHS;`, if `RHS` is a complicated expression that introduced new nodes, eg a conditional expression, some load statements for `LHS` came after its usage. To avoid the issue, this diff forces it to introduce new nodes for `LHS`. Reviewed By: jvillard Differential Revision: D26099782 fbshipit-source-id: 27417cd99	4 years ago
Daiva Naudziuniene	16718384b3	[pulse] Optional Empty Access false positives we want to address Reviewed By: skcho Differential Revision: D25846520 fbshipit-source-id: ae60a8c51	4 years ago
Daiva Naudziuniene	0c6eedc835	[pulse] Model std::__optional_storage_base::has_value Summary: Model ` std::__optional_storage_base::has_value` as this is what we see in clang AST when translating `std::optional::has_value` for libc++. For libstdc++, we get `std::optional::has_value` as expected. Reviewed By: skcho, jvillard Differential Revision: D25585543 fbshipit-source-id: b8d9d2902	4 years ago
Daiva Naudziuniene	b5df1be318	[pulse] Model std::vector::empty() Summary: Skipping the analysis of `std::vector::empty()` caused false positives: in the case where `std::vector::empty()` was called several times ("returning" different values each time), we were not able to prune infeasible paths. Model `std::vector::empty()` as returning the same value every time it is called. Reviewed By: ezgicicek Differential Revision: D23904704 fbshipit-source-id: 52e8a2451	4 years ago
Sungkeun Cho	0cbe2f9b08	[pulse] Uninitialized value check in pulse Summary: This diff adds uninitialized value check in pulse. For now, it supports only simple cases, - declared variables with a type of integer, float, void, and pointer - malloced pointer variables that point to integer, float, void, and pointer TODOs: I will add more cases in the following diffs. - declared/malloced array - declared/malloced struct - inter-procedural checking Reviewed By: jvillard Differential Revision: D25269073 fbshipit-source-id: 317df9a85	4 years ago
Daiva Naudziuniene	0343f5c7d9	[pulse] Remove duplicate `by` from a trace Reviewed By: ezgicicek Differential Revision: D25196418 fbshipit-source-id: c1e504099	4 years ago
Daiva Naudziuniene	4e658903ae	[pulse] Check the validity of the addresses captured by lambda only for captures by reference Summary: To look for captured variable address escape we should only check the validity of the addresses captured by reference. Checking the validity of the address captured by value can cause nullptr dereference false positives. Reviewed By: jvillard Differential Revision: D25219347 fbshipit-source-id: faf6f2b00	4 years ago
Jules Villard	29f3941600	[clang] deal with conditionally-destroyed temporaries Summary: This was left as a TODO before: where to place calls to destructors for C++ temporaries that are only conditionally created when evaluating an expression. This can happen inside the branches of a conditional operation `b?e:f` or in potentially-short-circuited conditions on the right-hand side of `&&` and `||` operators. Following the compilation scheme of clang (observed by looking at the generated LLVM bitcode), we instrument the program with "marker" variables, so that for instance `X x = true?X():y;` becomes (following the execution on the true branch): ``` marker1 = 0; // initialize all markers to 0 PRUNE(true) // entering true branch X::X(&temporary); // create temporary... marker1 = 1; // ...triggers setting its marker to 1 X::X(&x, &temporary); // finish expression if (marker1) { X::~X(&temporary); // conditionally destroy the temporary } ``` In this diff, you'll find code for: - associating markers to temporaries that need them - code to initialize markers to 0 before full-expressions - code to conditionally destroy temporaries based on the values of the markers once the full-expression has finished evaluating Reviewed By: da319 Differential Revision: D24954070 fbshipit-source-id: cf15df7f7	4 years ago
Daiva Naudziuniene	019adf7e78	[pulse] Model for folly::Optional::get_pointer Summary: Model `folly::Optional::get_pointer` which returns an address to a value if exists or `nullptr` if empty. Reviewed By: jvillard Differential Revision: D24935677 fbshipit-source-id: 9d990fe07	4 years ago
Jules Villard	f411c7d131	[pulse] do not stop at the first error in function calls Summary: We deliberately stopped as soon as an error was detected when applying a function call. This is not good as other pre/posts of the function may apply cleanly, which would allow us to cover more behaviours of the code. Went on a bit of a refactoring tangent while fixing this, to clarify the `Ok None`/`Ok Some _`/`Error _` datatype returned by PulseInterproc. Now we report errors as soon as we find them during function calls but continue accumulating specs afterwards. Reviewed By: da319 Differential Revision: D24888768 fbshipit-source-id: d5f2c29d7	4 years ago
Jules Villard	578583f2ab	[pulse] check that new arithmetic facts are consistent with the heap Summary: Communicate new facts from the arithmetic domain to the memory domain to detect contradictions between the two. Reviewed By: jberdine Differential Revision: D24832079 fbshipit-source-id: 2caf8e9af	4 years ago
Jules Villard	e32f6ca360	[clang] fix bad interaction between ConditionalOperator and initializers Summary: This is several inter-connected changes together to keep the tests happy. The ConditionalOperator `b?t:e` is translated by first creating a placeholder variable to temporarily store the result of the evaluation in each branch, then the real thing we want to assign to reads that variable. But, there are situations where that changes the semantics of the expression, namely when the value created is a struct on the stack (eg, a C++ temporary). This is because in SIL we cannot assign the address of a program variable, only its contents, so by the time we're out of the conditional operator we cannot set the struct value correctly anymore: we can only set its content, which we did, but that results in a "shifted" struct value that is one dereference away from where it should be. So a batch of changes concern `conditionalOperator_trans`: - instead of systematically creating a temporary for the conditional, use the `trans_state.var_exp_typ` provided from above if available when translating `ConditionalOperator` - don't even set anything if that variable was already initialized by merely translating the branch expression, eg when it's a constructor - fix long-standing TODO to propagate these initialization facts accurately for ConditionalOperator (used by `init_expr_trans` to also figure out if it should insert a store to the variable being initialised or not) The rest of the changes adapt some relevant other constructs to deal with conditionalOperator properly now that it can set the current variable itself, instead of storing stuff inside a temp variable. This change was a problem because some constructs, eg a variable declaration, will insert nodes that set up the variable before calling its initialization, and now the initialization happens before that setup, in the translation of the inner conditional operator, which naturally creates nodes above the current one. - add a generic helper to force a sequential order between two translation results, forcing node creation if necessary - use that in `init_expr_trans` and `cxxNewExpr_trans` - adjust many places where `var_exp_typ` was incorrectly not reset when translating sub-expressions The sequentiality business creates more nodes when used, and the conditionalOperator business uses fewer temporary variables, so the frontend results change quite a bit. Note that biabduction tests were invaluable in debugging this. There could be other constructs to adjust similarly to cxxNewExpr that were not covered by the tests though. Added tests in pulse that exercises the previous bug. Reviewed By: da319 Differential Revision: D24796282 fbshipit-source-id: 0790c8d17	4 years ago
Daiva Naudziuniene	58f1fd8b32	[pulse] Optional Empty Access for std::optional Reviewed By: jvillard Differential Revision: D24760820 fbshipit-source-id: bedf6aee3	4 years ago
Daiva Naudziuniene	eb4684f6d5	[pulse] Less precise model for constructing optional from value Summary: We recently introduced a more precise model for constructing an optional from a value by making a shallow copy. However, this introduced Use After Delete false positives. For now, we go back to a less precise model by creating a fresh value. A proper model would be to either make a deep copy or call the copy constructor for a value. We will address this in the following diff. Reviewed By: jvillard Differential Revision: D24826749 fbshipit-source-id: 3e5e4edeb	4 years ago
Daiva Naudziuniene	a4241eeb43	[pulse] Refactor Optional models Summary: Refactor `folly::Optional` models to make them easier to reuse for `std::optional` Reviewed By: jvillard Differential Revision: D24760053 fbshipit-source-id: f665e84c8	4 years ago
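
Illustration for d285ee900b ("functional unknown functions"): that commit describes recording `v_return = f_unknown(v1, v2, ..., vN)` so that an unknown function called with the same parameter values returns the same value. The following is a minimal C++ sketch of that idea only, not Infer's actual implementation (which is in OCaml and lives in the arithmetic domain). The names `UnknownCallTable`, `AbstractValue`, and `call` are hypothetical, and abstract values are assumed to be plain integers.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an abstract value; assumed to be an integer id.
using AbstractValue = int;

// Hypothetical table remembering "v_return = f_unknown(v1, ..., vN)" facts.
class UnknownCallTable {
 public:
  // Returns the abstract value standing for fn(args). The same
  // (function name, argument values) pair always yields the same value.
  AbstractValue call(const std::string& fn,
                     const std::vector<AbstractValue>& args) {
    assert(!fn.empty());
    auto key = std::make_pair(fn, args);
    auto it = results_.find(key);
    if (it != results_.end()) {
      return it->second;  // seen before: reuse the recorded return value
    }
    AbstractValue fresh = next_fresh_++;  // first call: allocate a fresh value
    results_.emplace(std::move(key), fresh);
    return fresh;
  }

 private:
  std::map<std::pair<std::string, std::vector<AbstractValue>>, AbstractValue>
      results_;
  AbstractValue next_fresh_ = 1000;  // arbitrary starting id for fresh values
};
```

In this sketch, two calls `call("f_unknown", {1, 2})` return the same abstract value, while a call with different arguments or a different function name gets a fresh one. That is the property that lets a later check on the return value agree with an earlier one.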


214 Commits (0d430efb42fffda741b59bf6399da55e1399ad0e)
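
Illustration for e549103d75 ("use term_eqs"): that commit describes remembering a "t -> v" mapping for each equality `t = v`, so that `t = v` and `t = v'` yield `v = v'`. The following is a minimal C++ sketch under simplifying assumptions: terms are treated as opaque strings that are already canonical, variables are integers, and variable equalities are kept in a small union-find. The names `TermEqs`, `add_term_eq`, `find`, and `equal` are hypothetical and do not come from Infer's code.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical sketch: a term -> variable map plus a union-find over variables.
class TermEqs {
 public:
  // Representative of variable v in the variable-equality relation.
  int find(int v) {
    auto it = parent_.find(v);
    if (it == parent_.end() || it->second == v) {
      return v;  // v is its own representative
    }
    int root = find(it->second);
    parent_[v] = root;  // path compression
    return root;
  }

  // Record "term = v". If the term is already mapped to some v',
  // also record v = v' in the variable-equality relation.
  void add_term_eq(const std::string& term, int v) {
    assert(!term.empty());
    int rep = find(v);
    auto it = term_to_var_.find(term);
    if (it == term_to_var_.end()) {
      term_to_var_[term] = rep;  // first time this term is seen
    } else {
      int other = find(it->second);
      if (other != rep) {
        parent_[rep] = other;  // same term, two variables: merge them
      }
    }
  }

  // True if a and b are known to be equal.
  bool equal(int a, int b) { return find(a) == find(b); }

 private:
  std::map<std::string, int> term_to_var_;  // canonical term -> variable
  std::map<int, int> parent_;               // union-find parent links
};
```

In this sketch, after `add_term_eq("x*y", 1)` and `add_term_eq("x*y", 2)`, `equal(1, 2)` holds. A fuller version would also re-canonicalise stored terms whenever two variables are merged, which this sketch leaves out.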