[cost][doc] Add documentation for cost analysis

Summary: Finally, we add some preliminary documentation for cost analysis. Reviewed By: jvillard Differential Revision: D22043585 fbshipit-source-id: 1d3896a4e
5 years ago · 753b909bfa
parent 415824ac0e
commit 753b909bfa
8 changed files with 160 additions and 8 deletions
--- a/infer/documentation/checkers/Cost.md
+++ b/infer/documentation/checkers/Cost.md
@ -0,0 +1,74 @@
+Cost analysis statically estimates an upper bound on the worst-case execution cost of a program (WCET). This page gives an overview of how the analysis works for *Java* code. The analyser also has limited support for C/C++ and Objective-C.
+
+To run the analysis, you can use run `infer --cost` (which will run cost analysis along with other
+analyses that are run by default) or `infer --cost-only` (which will only run cost analysis).
+
+For example, the command `infer --cost-only -- javac File.java` will run
+cost analysis on File.java.
+
+
+## How the analysis works
+
+
+Most ideas behind this analysis is based on Stefan Bydge's PhD thesis [Static WCET Analysis based on Abstract Interpretation and Counting of Elements](https://www.semanticscholar.org/paper/Static-WCET-Analysis-Based-on-Abstract-and-Counting-Bygde/ee5157164d497725c1f42dc6c475a59a87c99957).
+
+The analysis computes two things for each node in the CFG:
+- the cost of its instructions, i.e. how much one execution of this node costs,
+- how many times it can be executed.
+
+The total cost of the node is the scalar product of these two vectors. Then, these are passed to a constraint solver that computes the execution cost of the procedure based on the incoming/outgoing edges.
+
+
+At a high level, the analysis has three steps:
+- Choose control variables that allude to "how many times a loop may iterate".
+- Get abstract ranges of the control variables from [InferBO](checker-bufferoverrun) (a numerical analysis that infers symbolic intervals)
+- Construct complexity polynomials for loops and functions by via a constraint solving algorithm.
+
+
+
+## Examples 
+
+Infer’s cost analysis statically estimates the execution cost of a
+program without running the code. For instance, assume that we had the
+following program:
+
+```java
+void loop(ArrayList<Integer> list){
+  for (int i = 0; i <= list.size(); i++){
+  }
+}
+```
+
+For this program, Infer statically infers a polynomial (e.g. `8|list|+16`) for the execution cost of this program by giving each instruction in Infer's intermediate language a symbolic cost (where `|.|` refers to the length of a list). Here---overlooking the actual constants---the analysis infers that this program’s asymptotic complexity is `O(|list|)`, that is loop is linear in the size of its input list. Then, at diff time, if a developer modifies this code to,
+
+```java
+void loop(ArrayList<Integer> list){
+  for (int i = 0; i <= list.size(); i++){
+    foo(i); // newly added function call
+  }
+}
+```
+
+where `foo` has a linear cost in its parameter, then Infer automatically detects that the complexity of loop has increased from `O(|list|)` to `O(|list|^2)` and then reports an [`EXECUTION_TIME_COMPLEXITY_INCREASE`](execution_time_complexity_increase) issue.    
+
+
+
+Unlike other Infer analyses (which report found issues/bugs when running infer once), cost analysis only reports an issue for differential analysis (i.e. when comparing the analysis results on the original and the modified files). Instead, infer writes the execution cost of the program into `infer-out/costs-report.json` file. For each procedure, `costs-report.json` includes the actual polynomial (for the exection cost) along with the degree of the polynomial, the procedure name, line number etc.
+
+Differential cost analysis in action:
+- first run infer's cost analysis on `File.java` and rename `costs-report.json` (which is in `/infer-out`) to `previous-costs-report.json`
+- modify the function as shown above
+- re-run infer on `File.java` and rename  `costs-report.json` to `current-costs-report.json`
+- run `infer reportdiff --costs-current current-costs-report.json  --costs-previous current-costs-report`. 
+- Inspect `infer-out/differential/introduced.json` to see the newly found complexity increase issue(s).
+
+
+## Limitations
+
+There are a number of known limitations to the design of the static cost analysis: 
+
+- InferBo's intervals are limited to affine expressions, not full-blown polynomials. Hence, we can automatically infer bounds involving square roots.
+
+- We do not handle recursion.
+
+- If the execution cost of a program depends on an unknown call (e.g. an unmodeled library calls), we can't compute a static upper bound and return T (unknown cost).
--- a/infer/documentation/issues/EXECUTION_TIME_COMPLEXITY_INCREASE.md
+++ b/infer/documentation/issues/EXECUTION_TIME_COMPLEXITY_INCREASE.md
@ -0,0 +1,7 @@
+Infer reports this issue when the execution time complexity of a
+program increases in degree: e.g. from constant to linear or from
+logarithmic to quadratic. This issue type is only reported in
+differential mode: i.e when we are comparing the analysis results of
+two runs of infer on a file.
+
+
--- a/infer/documentation/issues/EXECUTION_TIME_COMPLEXITY_INCREASE_UI_THREAD.md
+++ b/infer/documentation/issues/EXECUTION_TIME_COMPLEXITY_INCREASE_UI_THREAD.md
@ -0,0 +1,11 @@
+Infer reports this issue when the execution time complexity of the procedure increases in degree **and** the procedure runs on the UI (main) thread.
+
+Infer considers a method as running on the UI thread whenever:
+
+- The method, one of its overrides, its class, or an ancestral class, is
+  annotated with `@UiThread`.
+- The method, or one of its overrides is annotated with `@OnEvent`, `@OnClick`,
+  etc.
+- The method or its callees call a `Litho.ThreadUtils` method such as
+  `assertMainThread`.
+
--- a/infer/documentation/issues/EXECUTION_TIME_UNREACHABLE_AT_EXIT.md
+++ b/infer/documentation/issues/EXECUTION_TIME_UNREACHABLE_AT_EXIT.md
@ -0,0 +1,16 @@
+This issue type indicates that the program's execution doesn't reach
+the exit node. Hence, we cannot compute a static bound for the
+procedure.
+
+
+Examples:
+```java
+void exit_unreachable() {
+  exit(0); // modeled as unreachable
+}
+
+
+void infeasible_path_unreachable() {
+    Preconditions.checkState(false); // like assert false, state pruned to bottom
+}
+```
--- a/infer/documentation/issues/INFINITE_EXECUTION_TIME.md
+++ b/infer/documentation/issues/INFINITE_EXECUTION_TIME.md
@ -0,0 +1,19 @@
+This warning indicates that Infer was not able to determine a static
+upper bound on the execution cost of the procedure. By default, this
+issue type is disabled.
+
+
+For instance, Inferbo's interval analysis is limited to affine
+expressions. Hence, we can't statically estimate an upper bound on the
+below example and obtain T(unknown) cost:
+```java
+// Expected: square root(x), got T
+void square_root_FP(int x) {
+ int i = 0;
+ while (i * i < x) {
+   i++;
+ }
+}
+```
+
+Consequently, we report an `INFINITE_EXECUTION_TIME`, corresponding to the biggest bound T.
--- a/infer/src/base/Checker.ml
+++ b/infer/src/base/Checker.ml
@ -121,7 +121,10 @@ let config_unsafe checker =
      ; activates= [BufferOverrunAnalysis] }
  | Cost ->
      { id= "cost"
-      ; kind= UserFacing {title= "Cost: Runtime Complexity Analysis"; markdown_body= ""}
+      ; kind=
+          UserFacing
+            { title= "Cost: Runtime Complexity Analysis"
+            ; markdown_body= [%blob "../../documentation/checkers/Cost.md"] }
      ; support= supports_clang_and_java
      ; short_documentation=
          "Computes the time complexity of functions and methods. Can be used to detect changes in \
--- a/infer/src/base/IssueType.ml
+++ b/infer/src/base/IssueType.ml
@ -51,6 +51,7 @@ module Unsafe : sig

  val register_from_string :
       ?enabled:bool
+    -> ?is_cost_issue:bool
    -> ?hum:string
    -> ?doc_url:string
    -> ?linters_def_file:string
@ -121,8 +122,9 @@ end = struct
        but issues of type 2. have not yet been defined. Thus, we record only there [enabled] status
        definitely. The [hum]an-readable description can be updated when we encounter the definition
        of the issue type, eg in AL. *)
-  let register_from_string ?(enabled = true) ?hum:hum0 ?doc_url ?linters_def_file ~id:unique_id
-      ?(visibility = User) ?user_documentation default_severity checker =
+  let register_from_string ?(enabled = true) ?(is_cost_issue = false) ?hum:hum0 ?doc_url
+      ?linters_def_file ~id:unique_id ?(visibility = User) ?user_documentation default_severity
+      checker =
    match find_from_string ~id:unique_id with
    | ((Some
         ( { unique_id= _ (* we know it has to be the same *)
@ -150,11 +152,11 @@ end = struct
            ~old:(string_of_visibility visibility_old)
            ~new_:(string_of_visibility visibility) ;
        ( match user_documentation with
-        | None ->
-            ()
-        | Some user_documentation ->
+        | Some user_documentation when not is_cost_issue ->
            L.die InternalError "Unexpected user documentation for issue type %s:@\n@\n%s@\n"
-              unique_id user_documentation ) ;
+              unique_id user_documentation
+        | _ ->
+            () ) ;
        issue.default_severity <- default_severity ;
        Option.iter hum0 ~f:(fun hum -> issue.hum <- hum) ;
        if Option.is_some doc_url then issue.doc_url <- doc_url ;
@ -177,12 +179,31 @@ end = struct
        issue


+  let cost_issue_doc_list =
+    [ ( "EXECUTION_TIME_COMPLEXITY_INCREASE"
+      , [%blob "../../documentation/issues/EXECUTION_TIME_COMPLEXITY_INCREASE.md"] )
+    ; ( "EXECUTION_TIME_COMPLEXITY_INCREASE_UI_THREAD"
+      , [%blob "../../documentation/issues/EXECUTION_TIME_COMPLEXITY_INCREASE_UI_THREAD.md"] )
+    ; ( "EXECUTION_TIME_UNREACHABLE_AT_EXIT"
+      , [%blob "../../documentation/issues/EXECUTION_TIME_UNREACHABLE_AT_EXIT.md"] )
+    ; ("INFINITE_EXECUTION_TIME", [%blob "../../documentation/issues/INFINITE_EXECUTION_TIME.md"])
+    ]
+
+
  (** cost issues are already registered below.*)
  let register_from_cost_string ?(enabled = true) ?(is_on_ui_thread = false) ~(kind : CostKind.t) s
      =
    let issue_type_base = F.asprintf s (CostKind.to_issue_string kind) in
    let issue_type = if is_on_ui_thread then issue_type_base ^ "_UI_THREAD" else issue_type_base in
-    register_from_string ~enabled ~id:issue_type Error Cost
+    let user_documentation =
+      match List.find cost_issue_doc_list ~f:(fun (s, _doc) -> String.equal s issue_type) with
+      | Some (_, doc) ->
+          doc
+      | None ->
+          L.die InternalError
+            "Unexpected cost issue %s: either the issue is not enabled or unknown." issue_type
+    in
+    register_from_string ~is_cost_issue:true ~enabled ~id:issue_type Error Cost ~user_documentation


  let all_issues () = IssueSet.elements !all_issues
--- a/infer/src/base/IssueType.mli
+++ b/infer/src/base/IssueType.mli
@ -47,6 +47,7 @@ val find_from_string : id:string -> t option

 val register_from_string :
     ?enabled:bool
+  -> ?is_cost_issue:bool
  -> ?hum:string
  -> ?doc_url:string
  -> ?linters_def_file:string