[labs] clean up lab document

Reviewed By: jvillard

Differential Revision: D5292526

fbshipit-source-id: 7dac1af
master
Sam Blackshear 8 years ago committed by Facebook Github Bot
parent e60ff6f370
commit cb95bde659

@@ -14,7 +14,7 @@ module L = Logging
(* Extremely simple abstraction of resources: count the number of acquired resources. If there's
not a corresponding number of releases, there may be a leak. *)
-type astate = int (* 3(a) *)
+type astate = int (* 2(a) *)
(* For now, type of abstract state and summary are the same *)
type summary = astate (* 4(a) *)
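To make the counting abstraction concrete, here is a small illustration (a sketch, assuming the checker counts constructor calls on closeable resources such as `FileInputStream` as acquires and `close()` calls as releases; the comments show the abstract state after each instruction):

```java
import java.io.FileInputStream;
import java.io.IOException;

class CountingIllustration {
  void twoAcquiresOneReleaseBad() throws IOException {
    FileInputStream f = new FileInputStream("a.txt"); // astate = 1
    FileInputStream g = new FileInputStream("b.txt"); // astate = 2
    f.close();                                        // astate = 1
  } // astate = 1 > 0 at the exit node, so a leak should be reported
}
```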

@@ -55,12 +55,12 @@ module TransferFunctions (CFG : ProcCfg.S) = struct
match instr with
| Call (_return_opt, Direct callee_procname, _actuals, _, _loc) ->
(* function call [return_opt] := invoke [callee_procname]([actuals]) *)
-(* (1)(b) *)
+(* 1(e) *)
let astate' =
if acquires_resource callee_procname tenv
-then astate + 1 (* 3(a) *)
+then astate + 1 (* 2(a) *)
else if releases_resource callee_procname tenv
-then astate - 1 (* 3(a) *)
+then astate - 1 (* 2(a) *)
else astate in
begin
match Summary.read_summary pdesc callee_procname with
@@ -96,7 +96,7 @@ module Analyzer =
let checker { Callbacks.summary; proc_desc; tenv; } : Specs.summary =
(* Report an error when we have acquired more resources than we have released *)
let report leak_count (proc_data : extras ProcData.t) =
-if leak_count > 0 (* 3(a) *)
+if leak_count > 0 (* 2(a) *)
then
let last_loc = Procdesc.Node.get_loc (Procdesc.get_exit_node proc_data.pdesc) in
let issue_kind = Localise.to_issue_id Localise.resource_leak in
@@ -113,7 +113,7 @@ let checker { Callbacks.summary; proc_desc; tenv; } : Specs.summary =
let initial = ResourceLeakDomain.initial, IdAccessPathMapDomain.empty in
match Analyzer.compute_post proc_data ~initial ~debug:false with
| Some (post, _) ->
-(* 1(c) *)
+(* 1(f) *)
report post proc_data;
Summary.update_summary (convert_to_summary post) summary
| None ->

@@ -1,40 +1,48 @@
-(1) Warm up: running and debugging Infer
-This step assumes you have already downloaded Infer, compiled it successfully, and set up your OCaml dev environment.
-(a) Run the analyzer on the tests: `cd infer/tests/codetoanalyze/java/checkers; make test`. The tests should pass.
-(b) Add a new test method with a resource leak to `Leaks.java`, then run the tests again. The tests should now fail. As the failure message indicates, run `make replace` to update the expected output. The tests should now pass again.
-(c) Run the analyzer on a single test file to produce the debug HTML: `infer --debug -a checkers --no-default-checkers --resource-leak -- javac Leaks.java`. Then, open the debug HTML: `open infer-out/captured/*.html`. This helpful artifact shows the Infer warnings alongside the code they are complaining about. It also displays the CFG node(s) associated with each line of code. Clicking a CFG node shows the Infer IR associated with the node, and the pre/post state computed while analyzing the instruction. Come back to the debug HTML early and often when you can't understand what your analysis is doing--it will help!
-(c) The `Logging.d_str`/`Logging.d_strln` functions print to the debug HTML. The logging is contextual: it will print to the log for the CFG node that's being analyzed at the time the logging statement execute. Add a `Logging.d_strln` inside of the case for `Call`, recompile/re-run, and try to find the text you printed inside the appropriate CFG node log in the debug HTML.
-(d) The `L.stdout`/`L.stderr` and functions print to the command line. This can be useful for printing information that doesn't happen in the context of a particular CFG node (e.g., performing post-processing on a summary). Add a `Logging.stderr` to the `compute_post` function, recompile/re-run and see your text printed on the command line.
-(2) Reasoning about abstract domains
+This is a lab exercise designed to take the participant through the basics of using the Infer.AI framework for building compositional abstract interpreters. We provide the skeleton for a simple intraprocedural resource leak analysis. During this exercise, you will identify limitations in the existing approach and work on extending it to create a more powerful interprocedural analysis.
+This document assumes that you have already [installed](https://github.com/facebook/infer/blob/master/INSTALL.md) Infer, successfully compiled it, and [set up](https://github.com/facebook/infer/blob/master/CONTRIBUTING.md#hacking-on-the-code) your OCaml dev environment (if desired).
+The `ResourceLeaks.ml` and `ResourceLeaksDomain.ml` files have comments containing hints for which parts of the code you need to change in order to accomplish a particular task (e.g., 1(e) in `ResourceLeaks.ml`).
+## (1) Warm up: running, testing, and debugging Infer
+(a) Change to the test directory (`cd infer/tests/codetoanalyze/java/checkers`) and run the checker on a single file: `infer -a checkers --resource-leak -- javac Leaks.java`. Infer should report N bugs.
+(b) Run the tests with `make test`. The tests work by producing an output file `issues.exp.test`, which is compared against the expected output file `issues.exp`. Tests pass when the two files are identical. You should see no output, which indicates that the tests have passed.
+(c) Add a new test method containing a resource leak to `Leaks.java`, then run the tests again. The tests should now fail, and the error message should indicate that the failure is due to the new leak that you added. As the message suggests, run `make replace` to update the expected output (this simply copies the actual output `issues.exp.test` to the expected output `issues.exp`). Re-run the tests; they should now pass again. (A possible test method is sketched after step (f) below.)
+(d) Run the analyzer on a single test file to produce the debug HTML: `infer --debug -a checkers --no-default-checkers --resource-leak -- javac Leaks.java`. Then, open the debug HTML: `open infer-out/captured/*.html`. This helpful artifact shows the Infer warnings alongside the code they are complaining about. It also displays the CFG node(s) associated with each line of code. Clicking a CFG node shows the Infer IR associated with the node, and the pre/post state computed while analyzing the instruction. Come back to the debug HTML early and often when you can't understand what your analysis is doing--it will help!
+(e) The `Logging.d_str`/`Logging.d_strln` functions print to the debug HTML. The logging is contextual: it will print to the log for the CFG node that's being analyzed at the time the logging statement executes. Try adding `Logging.d_strln "Hi Infer";` inside of the case for `Call`, then recompile and re-run. You should see the text you printed inside the appropriate CFG node log in the debug HTML.
+(f) The `Logging.progress` function prints to stderr. This can be useful for printing information that doesn't happen in the context of a particular CFG node (e.g., performing post-processing on a summary). Try adding `Logging.progress "Hi Infer";` to the `compute_post` function, recompile/re-run, and see your text printed on the command line.
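For step (c), the new test method can be very small. A possible sketch (the method name is illustrative; it assumes `Leaks.java` already imports `java.io.FileInputStream` and declares the appropriate `throws` clauses, as the existing tests do):

```java
void simpleLeakBad() throws IOException {
  FileInputStream stream = new FileInputStream("file.txt");
  // never closed: the checker should report a resource leak for this method
}
```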
+## (2) Reasoning about abstract domains
(a) Look at `ResourceLeakDomain.ml`. There's a termination bug in the abstract domain--do you see it?
-- Write a test method that makes the analyzer diverge.
+- Write a test method that makes the analyzer diverge (Note: this may manifest as a stack overflow that crashes Infer rather than genuine divergence). (See the sketch after this list.)
- Fix the domain to make your test pass and ensure that the analyzer will always terminate. Make sure your fix is sufficiently general (remember: the allocation count can be negative :)).
- Hint: `AbstractDomain.TopLifted` may be useful for this.
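One way to provoke the problem in the first bullet (a sketch; it assumes the unfixed counting domain has unbounded height, so a loop that keeps acquiring makes the count grow on every iteration and the fixpoint computation never stabilizes):

```java
void divergeBad() throws IOException {
  while (true) {
    FileInputStream stream = new FileInputStream("file.txt"); // count grows on every loop iteration
  }
}
```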
(b) Think about the concretization of the resource count. What does a resource count of zero mean? Is there a concrete state in the concretization of "Resource count zero" that leaks a resource? Write a simple test method `FN_leakBad` that will produce this concrete state (that is, a false negative test where the program leaks a resource, but the analyzer doesn't catch it).
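A possible shape for `FN_leakBad` (a sketch; it exploits the fact that the counter only tracks how many releases happened, not which resource was released):

```java
void FN_leakBad() throws IOException {
  FileInputStream f = new FileInputStream("a.txt"); // count = 1
  FileInputStream g = new FileInputStream("b.txt"); // count = 2
  f.close();                                        // count = 1
  f.close();                                        // count = 0, yet g is never closed
} // exit count is 0, so no leak is reported even though g leaks
```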
-(c) In addition, there are programs that do not leak resources that the analyzer will flag. Write a simple test method `FP_noleakOk` that exhibits this problem (that is, a false positive test that demonstrates the imprecision of the analyzer)?
+(c) In addition, there are programs that do not leak resources that the analyzer will flag. Write a simple test method `FP_noLeakOk` that exhibits this problem (that is, a false positive test that demonstrates the imprecision of the analyzer).
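A possible shape for `FP_noLeakOk` (a sketch; it assumes the counting domain joins the two branches of each conditional with an upper bound and cannot see that the two tests of `b` are correlated):

```java
void FP_noLeakOk(boolean b) throws IOException {
  FileInputStream f = null;
  if (b) {
    f = new FileInputStream("a.txt"); // count = 1 on this branch, 0 on the other
  }
  if (b) {
    f.close();                        // count only decreases on this branch
  }
  // the join keeps a positive count, so a leak is reported,
  // even though every stream that is opened is also closed
}
```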
(d) Do your false negative and false positive examples have a common underlying cause? Can the domain be improved to address them?
-(3) Improving the domain
-(a) Change the simple counting domain to a domain that overapproximates the set of storage locations that hold a resource. As a concrete goal, the new domain should allow you to print the name of the resource(s) that leak in the error message (rather than just the number of resources). The new domain should also allow your analysis to get the correct answer on your false negative and false positive tests from (2)(b) and (2)(c). Think about the following questions when designing your new domain:
+## (3) Improving the domain
+(a) Change the simple counting domain to a domain that overapproximates the set of storage locations that hold a resource. As a concrete goal, the new domain should allow you to print the name of the resource(s) that leak in the error message (rather than just the number of resources). The new domain should also allow your analysis to get the correct answer on your false negative and false positive tests from 2(b) and 2(c). Think about the following questions when designing your new domain:
- How should we abstract storage locations? Is an abstraction of the stack program variables (`Var.t`'s) enough, or do we need an abstraction of the heap as well?
- How will we handle aliasing (storage locations that store the same address)? (See the sketch after this list.)
-- Will it be easy to extend the domain to incorporate information from the callee summaries/represent state that will be instantiated by callers (see (3))?
-- Some code that might be useful in creating your new domain (depending on what approach you choose): `AbstractDomain.FiniteSet`, `AbstractDomain.Map`, `AccessPath`, `Var`.
+- Will it be easy to extend the domain to incorporate information from the callee summaries/represent state that will be instantiated by callers (see 4)?
+- Some modules that might be useful in creating your new domain (depending on what approach you choose): `AbstractDomain.FiniteSet`, `AbstractDomain.Map`, `AccessPath`, `Var`.
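To make the aliasing bullet concrete, a location-based domain has to handle tests like the following correctly (a sketch; closing the stream through the alias `g` releases the resource that `f` also names, so no leak should be reported):

```java
void aliasNoLeakOk() throws IOException {
  FileInputStream f = new FileInputStream("a.txt");
  FileInputStream g = f; // f and g point to the same stream
  g.close();             // releases the resource held by both names
}
```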
(b) Write some tests that demonstrate the limitations of your new domain: both false positives (names prefixed with `FP_`) and false negatives (names prefixed with `FN_`).
-(4) Interprocedural analysis
-At this point, you likely have a fairly advanced intraprocedural analysis that is capable of finding resource leaks in real programs. Feel free to skip to (5) if you are eager to get started with bugfinding; many real resource leak bugs are intraprocedural. Alternatively, continue on with (4) to make your analysis interprocedural.
+## (4) Interprocedural analysis
+At this point, you likely have a fairly advanced intraprocedural analysis that is capable of finding resource leaks in real programs. Feel free to skip to 5 if you are eager to get started with bugfinding; many real resource leak bugs are intraprocedural. Alternatively, continue on with 4 to make your analysis interprocedural.
(a) Augment the summary type with state to indicate when the current procedure returns a resource. Allowing a resource to escape to the caller should not be considered a leak. This improvement should allow your analysis to pass the following tests:
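The tests referred to here are elided from this hunk; they are along the lines of the following sketch (the method names are illustrative, not the actual test methods):

```java
FileInputStream returnResourceOk() throws IOException {
  // the resource escapes to the caller, so this method alone should not be flagged
  return new FileInputStream("a.txt");
}

void callerLeakBad() throws IOException {
  FileInputStream stream = returnResourceOk();
  // never closed: with summaries, the leak should be reported here instead
}
```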
@@ -86,7 +94,7 @@ void closeOnlyOneBad() {
Hint: you might find the `FormalMap.t` stored in `proc_data.extras` useful for this. This module lets you go back and forth between the index of a formal and its name. This utility module is also used in the `ThreadSafety` and `TaintAnalysis` modules.
-(5) Making it practical
+## (5) Making it practical
(a) Real resource leaks frequently involve failing to close resources along exceptional control-flow paths. For simplicity, the initial version of the current analysis uses a filtered view of the CFG that skips exceptional edges (`ProcCfg.Normal`). To find more bugs, you might want to switch to using `ProcCfg.Exceptional` and make sure that your analysis gets the right answer on some realistic exception examples like:
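The example that this sentence introduces is cut off in this hunk; a representative case might look like the following sketch (the helper `mayThrow` is hypothetical):

```java
void exceptionalLeakBad() throws IOException {
  FileInputStream stream = new FileInputStream("a.txt");
  mayThrow();     // if this throws, the exceptional edge skips the close below
  stream.close(); // only reached on the normal path
}

void mayThrow() throws IOException {
  throw new IOException();
}
```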
