infer_clone

13 KiB

Raw Blame History

id	title
absint-framework	Building checkers with the Infer.AI framework

Infer.AI is a framework for quickly developing abstract interpretation-based checkers (intraprocedural or interprocedural). You define only:

(1) An abstract domain (type of abstract state plus <=, join, and widen operations)

(2) Transfer functions (a transformer that takes an abstract state as input and produces an abstract state as output)

and then you have an analysis that can run on all of the languages Infer supports (C, Obj-C, C++, and Java)!

This guide covers how to use the framework. For background on why we built the framework and how it works, check out these slides from a PLDI 2017 tutorial and this talk from @Scale2016.

If you feel like coding instead of reading, a great way to get started with Infer.AI is to go through the lab exercise here.

By example: intraprocedural analysis

This section helps you get started ASAP if you already understand abstract interpretation (or don't, but are feeling bold).

Take a look at liveness.ml. This code is performing a compilers-101 style liveness analysis over SIL, Infer's intermediate language. Since this code is fairly small and you should already understand what it's trying to do, it's a fairly good place to look in order to understand both how to use the abstract interpretation framework and what SIL is.

There are basically three important bits here: defining the domain, defining the transfer functions, and then passing the pieces to the framework to create an an analysis. Let's break down the third bit:

module Analyzer =
  AbstractInterpreter.Make
    (ProcCfg.Backward(ProcCfg.Exceptional))
    (TransferFunctions)

The ProcCfg.Backward(ProcCfg.Exceptional) part says: "I want the direction of iteration to be backward" (since liveness is a backward analysis), and "I want to the analysis to follow exceptional edges". For a forward analysis that ignores exceptional edges, you would do ProcCfg.Normal instead (and many other combinations are possible; take a look at ProcCfg.mli for more). And finally, the TransferFunctions part says "Use the transfer functions I defined above".

Now you have an Analyzer module that exposes useful functions like compute_post (take a procedure as input and compute a postcondition) and exec_pdesc (take a procedure and compute an invariant map from node id's to the pre/post at each node). The next step is to hook your checker up to the Infer CLI. For the liveness analysis, you would do this by exposing a function for running the checker on a single procedure:

let checker { Callbacks.proc_desc; tenv; } =
  match Analyzer.compute_post (ProcData.make_default proc_desc tenv) with
  | Some post -> Logging.progress "Computed post %a for %a" Analyzer.Domain.pp post Typ.Procname.pp (Procdesc.get_proc_name proc_desc);
  | None -> ()

and then adding Liveness.checker, checkers_enabled to the list of registered checkers here.

you can then run infer run -a checkers -- <your_build_command> to run your checker on real code. See here for more details on the build systems supported by Infer.

Other examples of simple intraprocedural checkers are addressTaken.ml and copyPropagation.ml.

Basic error reporting

Useful analyses have output. Basic printing to stderr or stderr is good for debugging, but to report a programmer-readable error that is tied to a source code location, you'll want to use Reporting.log_error. Some examples of error-logging code: 1, 2, or 3.

By example: interprocedural analysis

Let's assume you have already read and understood the "intraprocedural analysis" section and have an intraprocedural checker. The abstract interpretation framework makes it easy to convert your intraprocedural analysis into a modular interprocedural analysis. Let me emphasize the modular point once more; global analyses cannot be expressed in this framework.

To make your checker interprocedural, you need to:

(1) Define the type of procedure summaries for your analysis and add some boilerplate for storing your data alongside the summaries for other analyses

(2) Add logic for (a) using summaries in your transfer functions and (b) converting your intraprocedural abstract state to a summary.

A good example to look at here is siof.ml. Step (1) is just:

module Summary = Summary.Make (struct
    type summary = SiofDomain.astate

    let update_payload astate payload =
      { payload with Specs.siof = Some astate }

    let read_from_payload payload =
      payload.Specs.siof
  end)

along with adding the Specs.siof field to the Specs.payload record type. Here, the type of the abstract state and the type of the summary are the same, which makes things easier for us (no logic to convert an abstract state to a summary).

Part (2a) is here:

match Summary.read_summary pdesc callee_pname with

This says: "read the summary for callee_pname from procedure pdesc with type environment tenv". You must then add logic for applying the summary to the current abstract state (often, this is as simple as doing a join).

Because our summary type is the same as the abstract state, part (2b) can be done for us by making use of the convenient AbstractInterpreter.Interprocedural functor (for an example of what to do when the types are different, take a look at Quandary):

module Interprocedural = Analyzer.Interprocedural (Summary)

This Interprocedural module will automatically do the work of computing and storing the summary for us. All we need to do is change the exposed checker function registered in registerCheckers.ml to call Interprocedural.checker instead:

let checker callback =
  ignore(Interprocedural.checker callback ProcData.empty_extras in)

That's it! We now have an interprocedural analysis.

One very important note here: a current (and soon-to-be-lifted) limitation prevents us from running multiple interprocedural checkers at the same time. If you register an interprocedural checker, be sure to unregister the other other ones. Otherwise, there's a risk that the checkers will clobber each other's results.

Relevant code

Some pointers to useful code for building new analyses, and to the implementation of the framework for the interested:

Domain combinators:

AbstractDomain.BottomLifted, AbstractDomain.FiniteSet, AbstractDomain.Map, AbstractDomain.Pair (all in AbstractDomain)

Domains and domain building blocks:

Reporting errors with interprocedural traces:

Examples: SiofTrace.ml, JavaTrace.ml, CppTrace.ml.
Implementation: Trace

Implementation:

IR basics: SIL, CFG's, `tenv`'s, `procdesc`'s, and `procname`'s

All of the languages analyzed by Infer are converted into a common intermediate representation. A program is represented as a control-flow graph (CFG) whose nodes contain lists of instructions in the SIL language. SIL is a small low-level language that has some similarities with C, LLVM IR, and Boogie.

Expressions are literals, program variables (Pvar's), temporary variables (Ident's), a field offset from a struct (OO features like objects are lowered into struct's), or an index offset from an array.

There are four interesting kinds of instructions: Load for reading into a temporary variable, Store for writing to a program variable, field of a struct, or an array, Prune e (often called assume in other PL formalisms) blocks execution unless the expression e evaluates to true, and Call represents function calls.

Instructions and expressions have types. A Tstruct (think: object) type has a Typename, and it is often useful to look up metadata about the type (what fields does it have, what methods does it declare, what is its superclass, etc.) in the type environment, or tenv.

A procedure description or procdesc (sometimes abbreviated pdesc) is an abstraction of a procedure declaration: it stores the CFG of the procedure, its signature, its annotations, and so on.

A procedure name or procname (sometimes abbreviated pname) is an abstraction of a called procedure name. One procname may correspond to multiple (or zero) procdesc's after resolution.

Framework-specific IR: `ProcCFG`, `ProcData`, and `extras`

The abstract interpretation framework has a few additional constructs that are worth explaining.

A ProcCfG represents the CFG of a single procedure whereas (perhaps confusingly) a Cfg is the CFG for an entire file. A ProcCfg is really a customizable view of the underlying procedure CFG; we can get a view the CFG with its edges backward (ProcCfg.Backward), with or without exceptional edges (Normal/Exceptional, respectively), or with each node holding at most one instruction (OneInstrPerNode).

ProcData is a container that holds all of the read-only information required to analyze a single procedure: procedure description, and extras. The extras are custom read-only data that are computed before analysis begins, and can be accessed from the transfer functions. Most often, no extras are required for analysis (ProcData.empty_extras), but it can be useful to stash information like a map from a formal to its index or an invariant map from a prior analysis in the extras.

How it works

Coming soon.

Intro: abstract interpretation

Coming soon.

How do I make an analysis compositional?

Coming soon.

13 KiB Raw Blame History

By example: intraprocedural analysis

Basic error reporting

By example: interprocedural analysis

Relevant code

IR basics: SIL, CFG's, tenv's, procdesc's, and procname's

Framework-specific IR: ProcCFG, ProcData, and extras

How it works

Intro: abstract interpretation

How do I make an analysis compositional?

13 KiB

Raw Blame History

IR basics: SIL, CFG's, `tenv`'s, `procdesc`'s, and `procname`'s

Framework-specific IR: `ProcCFG`, `ProcData`, and `extras`