From 64ba97a3e313f2a4782bb0c0e7ffd06372aea2ea Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Ezgi=20=C3=87i=C3=A7ek?= Date: Thu, 18 Jun 2020 03:16:44 -0700 Subject: [PATCH] [doc] Add documentation for (im)purity Reviewed By: jvillard Differential Revision: D22090433 fbshipit-source-id: b84d6dfb4 --- infer/documentation/checkers/Impurity.md | 1 + infer/documentation/checkers/Purity.md | 46 +++++++++++++++++++ infer/documentation/issues/IMPURE_FUNCTION.md | 10 ++++ infer/documentation/issues/PURE_FUNCTION.md | 33 +++++++++++++ infer/src/base/Checker.ml | 8 +++- infer/src/base/IssueType.ml | 10 +++- 6 files changed, 104 insertions(+), 4 deletions(-) create mode 100644 infer/documentation/checkers/Impurity.md create mode 100644 infer/documentation/checkers/Purity.md create mode 100644 infer/documentation/issues/IMPURE_FUNCTION.md create mode 100644 infer/documentation/issues/PURE_FUNCTION.md diff --git a/infer/documentation/checkers/Impurity.md b/infer/documentation/checkers/Impurity.md new file mode 100644 index 000000000..35539b40e --- /dev/null +++ b/infer/documentation/checkers/Impurity.md @@ -0,0 +1 @@ +This is an experimental inter-procedural analysis that detects impure functions. It is meant to be an improvement over the [purity](/docs/next/checker-purity) analysis with a negation on the issue types. For each function, impurity analysis keeps track of not only the impurity of the function but also some additional information such as which parameters/globals the function modifies. It models functions with no summary/model as impure. The analysis relies on [Pulse](/docs/next/checker-pulse) summaries to determine impurity. diff --git a/infer/documentation/checkers/Purity.md b/infer/documentation/checkers/Purity.md new file mode 100644 index 000000000..2313160cd --- /dev/null +++ b/infer/documentation/checkers/Purity.md @@ -0,0 +1,46 @@ +This is an experimental inter-procedural analysis that detects pure (side-effect free) functions. For each function, purity analysis keeps track of not only the purity of the function but also some additional information such as whether the function modifies a global variable or which of the parameters are modified. It models functions with no summary/model as modifying the global state (hence impure). + +If the function is pure (i.e. doesn't modify any global state or its parameters and doesn't call any unknown functions), then it reports an [`PURE_FUNCTION`](/docs/next/all-issue-types#pure_function) issue. + + +## Weaknesses + +There are two issues with the existing purity analysis: +- In order to detect which parameters are modified, we need an alias analysis which is difficult to get right. +- Just keeping track of modified arguments doesn't suffice. + +Too see the issue with the first point, consider the following simple program: + +```java +void foo(Foo a){ + Foo b = a; + b.x = 10; +} +``` + +in order to determine that `foo` is impure, we need to know that the write to `b`'s field is actually changing the function parameter `a`, i.e. we need to check if `b` is aliasing `a`. This is known as alias analysis and is hard to get right in a scalable manner. When this analysis was being developed, Infer didn't have a unified alias analysis and using biabduction seemed like a too daunting task at the time. Hence, we relied on [InferBo](/docs/next/checker-bufferoverrun)'s aliasing mechanism which was easy to invoke and integrate with. However, InferBo's aliasing analysis is far from perfect and causes issues for purity. +To see the issue with the second point, consider the following program: + +```java +boolean contains(Integer i, ArrayList list){ + Iterator listIterator = list.iterator(); + while(listIterator.hasNext()) { + Integer el = listIterator.next(); + if (i.equals(el)){ + return true; + } + } + return false; + } +``` + +The existing purity analysis concludes that the above function `contains` is impure because it calls an impure function `next()` which modifies the iterator (hence it thinks it also modifies the `list`). However, notice that `contains` doesn't have an observable side-effect: `list.iterator()` returns a new object, `hasNext()` and `equals()` are pure, and `next()` only modifies the fields of the fresh object `listIterator`. Therefore, `contains` should be considered as pure. + + +To alleviate this problem, we have developed an [Impurity](/docs/next/checker-impurity) analysis which uses [pulse](/docs/next/checker-pulse) which can successfully analyze this program as pure \o/ + + +The analysis is used by: + +- [Loop-hoisting](/docs/next/checker-loop-hoisting) analysis which identifies loop-invariant function calls, i.e. functions that are pure and have loop-invariant arguments. +- [Cost](/docs/next/checker-cost) analysis which identifies control variables in the loop that affect how many times a loop is executed. In this computation, we need to prune control variables that do not affect how many times a loop is executed. In this pruning step, we need to compute loop-invariant variables (which requires the above analysis). diff --git a/infer/documentation/issues/IMPURE_FUNCTION.md b/infer/documentation/issues/IMPURE_FUNCTION.md new file mode 100644 index 000000000..647e0bdd7 --- /dev/null +++ b/infer/documentation/issues/IMPURE_FUNCTION.md @@ -0,0 +1,10 @@ +This issue type indicates impure functions. For instance, below functions would be marked as impure: +```java +void makeAllZero_impure(ArrayList list) { + Iterator listIterator = list.iterator(); + while (listIterator.hasNext()) { + Foo foo = listIterator.next(); + foo.x = 0; + } +} +``` diff --git a/infer/documentation/issues/PURE_FUNCTION.md b/infer/documentation/issues/PURE_FUNCTION.md new file mode 100644 index 000000000..b73d9e36a --- /dev/null +++ b/infer/documentation/issues/PURE_FUNCTION.md @@ -0,0 +1,33 @@ +This issue type indicates pure functions. For instance, below functions would be marked as pure: + +```java +int local_write_pure(int x, int y) { + int k = x + y; + k++; + return k; +} + +// no change to outside state, the local allocation is ok. +int local_alloc_pure(ArrayList list) { + ArrayList list_new = new ArrayList(); + for (Integer el : list) { + list_new.add(el); + } + return list_new.size(); +} +``` + +However, the following ones would not be pure: + +```java +void swap_impure(int[] array, int i, int j) { + int tmp = array[i]; + array[i] = array[j]; // modifying the input array + array[j] = tmp; +} + +int a = 0; +void set_impure(int x, int y) { + a = x + y; //modifying a global variable +} +``` diff --git a/infer/src/base/Checker.ml b/infer/src/base/Checker.ml index 0f1d4dfbc..a802d6b45 100644 --- a/infer/src/base/Checker.ml +++ b/infer/src/base/Checker.ml @@ -186,7 +186,9 @@ let config_unsafe checker = ; activates= [] } | Impurity -> { id= "impurity" - ; kind= Internal + ; kind= + UserFacing + {title= "Impurity"; markdown_body= [%blob "../../documentation/checkers/Impurity.md"]} ; support= supports_clang_and_java_experimental ; short_documentation= "Detects functions with potential side-effects. Same as \"purity\", but implemented on \ @@ -278,7 +280,9 @@ let config_unsafe checker = ; activates= [] } | Purity -> { id= "purity" - ; kind= Internal + ; kind= + UserFacing + {title= "Purity"; markdown_body= [%blob "../../documentation/checkers/Purity.md"]} ; support= supports_clang_and_java_experimental ; short_documentation= "Detects pure (side-effect-free) functions. A different implementation of \"impurity\"." diff --git a/infer/src/base/IssueType.ml b/infer/src/base/IssueType.ml index e88067e2c..77273dc35 100644 --- a/infer/src/base/IssueType.ml +++ b/infer/src/base/IssueType.ml @@ -613,7 +613,10 @@ let guardedby_violation_racerd = ~user_documentation:[%blob "../../documentation/issues/GUARDEDBY_VIOLATION.md"] -let impure_function = register_from_string ~id:"IMPURE_FUNCTION" Error Impurity +let impure_function = + register_from_string ~id:"IMPURE_FUNCTION" Error Impurity + ~user_documentation:[%blob "../../documentation/issues/IMPURE_FUNCTION.md"] + let inefficient_keyset_iterator = register_from_string ~id:"INEFFICIENT_KEYSET_ITERATOR" Error InefficientKeysetIterator @@ -811,7 +814,10 @@ let pulse_memory_leak = ~user_documentation:[%blob "../../documentation/issues/MEMORY_LEAK.md"] -let pure_function = register_from_string ~id:"PURE_FUNCTION" Error Purity +let pure_function = + register_from_string ~id:"PURE_FUNCTION" Error Purity + ~user_documentation:[%blob "../../documentation/issues/PURE_FUNCTION.md"] + let quandary_taint_error = register_from_string ~hum:"Taint Error" ~id:"QUANDARY_TAINT_ERROR" Error Quandary