You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

298 lines
13 KiB

---
id: absint-framework
title: Building checkers with the Infer.AI framework
---
Infer.AI is a framework for quickly developing abstract interpretation-based
checkers (intraprocedural or interprocedural). You define only:
(1) An abstract domain (type of abstract state plus `<=`, `join`, and `widen`
operations)
(2) Transfer functions (a transformer that takes an abstract state as input and
produces an abstract state as output)
and then you have an analysis that can run on all of the languages Infer
supports (C, Obj-C, C++, and Java)!
This guide covers how to use the framework. For background on why we built the
framework and how it works, check out these
[slides](http://fbinfer.com/downloads/pldi17-infer-ai-tutorial.pdf) from a PLDI
2017 tutorial and this
[talk](https://atscaleconference.com/videos/getting-the-most-out-of-static-analyzers)
from @Scale2016.
**If you feel like coding instead of reading, a great way to get started with
Infer.AI is to go through the lab exercise
[here](https://github.com/facebook/infer/blob/master/infer/src/labs/lab.md).**
## By example: intraprocedural analysis
This section helps you get started ASAP if you already understand
[abstract interpretation](http://www.di.ens.fr/~cousot/AI/IntroAbsInt.html) (or
don't, but are feeling bold).
Take a look at
[liveness.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/liveness.ml).
This code is performing a compilers-101 style liveness analysis over
[SIL](#ir-basics-sil-cfgs-tenvs-procdescs-and-procnames), Infer's intermediate
language. Since this code is fairly small and you should already understand what
it's trying to do, it's a fairly good place to look in order to understand both
how to use the abstract interpretation framework and what SIL is.
There are basically three important bits here: defining the domain, defining the
transfer functions, and then passing the pieces to the framework to create an an
analysis. Let's break down the third bit:
```
module Analyzer =
AbstractInterpreter.Make
(ProcCfg.Backward(ProcCfg.Exceptional))
(TransferFunctions)
```
The `ProcCfg.Backward(ProcCfg.Exceptional)` part says: "I want the direction of
iteration to be backward" (since liveness is a backward analysis), and "I want
to the analysis to follow exceptional edges". For a forward analysis that
ignores exceptional edges, you would do `ProcCfg.Normal` instead (and many other
combinations are possible; take a look at
[ProcCfg.mli](https://github.com/facebook/infer/blob/master/infer/src/absint/ProcCfg.mli)
for more). And finally, the `TransferFunctions` part says "Use the transfer
functions I defined above".
Now you have an `Analyzer` module that exposes useful functions like
[`compute_post`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli#L30)
(take a procedure as input and compute a postcondition) and
[`exec_pdesc`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli#L36)
(take a procedure and compute an invariant map from node id's to the pre/post at
each node). The next step is to hook your checker up to the Infer CLI. For the
liveness analysis, you would do this by exposing a function for running the
checker on a single procedure:
```
let checker { Callbacks.proc_desc; tenv; } =
match Analyzer.compute_post (ProcData.make_default proc_desc tenv) with
| Some post -> Logging.progress "Computed post %a for %a" Analyzer.Domain.pp post Typ.Procname.pp (Procdesc.get_proc_name proc_desc);
| None -> ()
```
and then adding `Liveness.checker, checkers_enabled` to the list of registered
checkers
[here](https://github.com/facebook/infer/blob/master/infer/src/checkers/registerCheckers.ml#L42).
you can then run `infer run -a checkers -- <your_build_command>` to run your
checker on real code. See
[here](/docs/analyzing-apps-or-projects) for more details
on the build systems supported by Infer.
Other examples of simple intraprocedural checkers are
[addressTaken.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/addressTaken.ml)
and
[copyPropagation.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/copyPropagation.ml).
## Basic error reporting
Useful analyses have output. Basic printing to stderr or stderr is good for
debugging, but to report a programmer-readable error that is tied to a source
code location, you'll want to use `Reporting.log_error`. Some examples of
error-logging code:
[1](https://github.com/facebook/infer/blob/master/infer/src/concurrency/RacerD.ml#L166),
[2](https://github.com/facebook/infer/blob/master/infer/src/checkers/annotationReachability.ml#L224),
or
[3](https://github.com/facebook/infer/blob/master/infer/src/quandary/TaintAnalysis.ml#L186).
## By example: interprocedural analysis
Let's assume you have already read and understood the "intraprocedural analysis"
section and have an intraprocedural checker. The abstract interpretation
framework makes it easy to convert your intraprocedural analysis into a
_modular_ interprocedural analysis. Let me emphasize the _modular_ point once
more; global analyses cannot be expressed in this framework.
To make your checker interprocedural, you need to:
(1) Define the type of procedure summaries for your analysis and add some
boilerplate for storing your data alongside the summaries for other analyses
(2) Add logic for (a) using summaries in your transfer functions and (b)
converting your intraprocedural abstract state to a summary.
A good example to look at here is
[siof.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/Siof.ml).
Step (1) is just:
```
module Summary = Summary.Make (struct
type summary = SiofDomain.astate
let update_payload astate payload =
{ payload with Specs.siof = Some astate }
let read_from_payload payload =
payload.Specs.siof
end)
```
along with adding the `Specs.siof`
[field](https://github.com/facebook/infer/blob/master/infer/src/backend/specs.ml#L329)
to the `Specs.payload` record
[type](https://github.com/facebook/infer/blob/master/infer/src/backend/specs.ml#L321).
Here, the type of the abstract state and the type of the summary are the same,
which makes things easier for us (no logic to convert an abstract state to a
summary).
Part (2a) is
[here](https://github.com/facebook/infer/blob/master/infer/src/checkers/Siof.ml#L65):
```
match Summary.read_summary pdesc callee_pname with
```
This says: "read the summary for `callee_pname` from procedure `pdesc` with type
environment `tenv`". You must then add logic for applying the summary to the
current abstract state (often, this is as simple as doing a join).
Because our summary type is the same as the abstract state, part (2b) can be
done for us by making use of the convenient
`AbstractInterpreter.Interprocedural`
[functor](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli#L19)
(for an example of what to do when the types are different, take a look at
[Quandary](https://github.com/facebook/infer/blob/master/infer/src/quandary/TaintAnalysis.ml#L540)):
```
module Interprocedural = Analyzer.Interprocedural (Summary)
```
This `Interprocedural` module will automatically do the work of computing and
storing the summary for us. All we need to do is change the exposed `checker`
function registered in `registerCheckers.ml` to call `Interprocedural.checker`
instead:
```
let checker callback =
ignore(Interprocedural.checker callback ProcData.empty_extras in)
```
That's it! We now have an interprocedural analysis.
One very important note here: a current (and soon-to-be-lifted) limitation
prevents us from running multiple interprocedural checkers at the same time. If
you register an interprocedural checker, be sure to unregister the other other
ones. Otherwise, there's a risk that the checkers will clobber each other's
results.
## Relevant code
Some pointers to useful code for building new analyses, and to the
implementation of the framework for the interested:
Domain combinators:
- `AbstractDomain.BottomLifted`, `AbstractDomain.FiniteSet`,
`AbstractDomain.Map`, `AbstractDomain.Pair` (all in
[AbstractDomain](https://github.com/facebook/infer/blob/master/infer/src/checkers/AbstractDomain.mli))
Domains and domain building blocks:
- [AccessPath](https://github.com/facebook/infer/blob/master/infer/src/checkers/accessPath.mli)
- [AccessPathDomains](https://github.com/facebook/infer/blob/master/infer/src/checkers/accessPathDomains.mli)
- [AccessTree](https://github.com/facebook/infer/blob/master/infer/src/checkers/accessTree.ml)
Reporting errors with interprocedural traces:
- Examples:
[`SiofTrace.ml`](https://github.com/facebook/infer/blob/master/infer/src/checkers/SiofTrace.ml),
[`JavaTrace.ml`](https://github.com/facebook/infer/blob/master/infer/src/quandary/JavaTrace.ml),
[`CppTrace.ml`](https://github.com/facebook/infer/blob/master/infer/src/quandary/CppTrace.ml).
- Implementation:
[`Trace`](https://github.com/facebook/infer/blob/master/infer/src/checkers/Trace.mli)
Implementation:
- [`AbstractDomain`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractDomain.ml)
- [`TransferFunctions`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli)
- [`AbstractInterpreter`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli)
- [`ProcCFG`](https://github.com/facebook/infer/blob/master/infer/src/absint/ProcCfg.mli)
- [`Summary`](https://github.com/facebook/infer/blob/master/infer/src/absint/Summary.ml)
- [`Scheduler`](https://github.com/facebook/infer/blob/master/infer/src/absint/Scheduler.ml)
## IR basics: SIL, CFG's, `tenv`'s, `procdesc`'s, and `procname`'s
All of the languages analyzed by Infer are converted into a common intermediate
representation. A program is represented as a control-flow graph
([CFG](https://github.com/facebook/infer/blob/master/infer/src/IR/Cfg.rei))
whose nodes contain lists of instructions in the SIL language. SIL is a small
low-level language that has some similarities with C, LLVM
[IR](http://llvm.org/docs/LangRef.html), and
[Boogie](https://research.microsoft.com/en-us/um/people/leino/papers/krml178.pdf).
[Expressions](https://github.com/facebook/infer/blob/master/infer/src/IR/Exp.rei#L25)
are literals, program variables (`Pvar`'s), temporary variables (`Ident`'s), a
field offset from a struct (OO features like objects are lowered into struct's),
or an index offset from an array.
There are four interesting kinds of
[instructions](https://github.com/facebook/infer/blob/master/infer/src/IR/Sil.rei#L38):
`Load` for reading into a temporary variable, `Store` for writing to a program
variable, field of a struct, or an array, `Prune e` (often called `assume` in
other PL formalisms) blocks execution unless the expression `e` evaluates to
true, and `Call` represents function calls.
Instructions and expressions have
[types](https://github.com/facebook/infer/blob/master/infer/src/IR/Typ.rei#L76).
A `Tstruct` (think: object) type has a
[`Typename`](https://github.com/facebook/infer/blob/master/infer/src/IR/Typename.rei#L13),
and it is often useful to look up metadata about the type (what fields does it
have, what methods does it declare, what is its superclass, etc.) in the type
environment, or
[`tenv`](https://github.com/facebook/infer/blob/master/infer/src/IR/Tenv.rei#L37).
A procedure description or
[`procdesc`](https://github.com/facebook/infer/blob/master/infer/src/IR/Procdesc.rei)
(sometimes abbreviated `pdesc`) is an abstraction of a procedure declaration: it
stores the CFG of the procedure, its signature, its annotations, and so on.
A procedure name or
[`procname`](https://github.com/facebook/infer/blob/master/infer/src/IR/Procname.rei)
(sometimes abbreviated `pname`) is an abstraction of a called procedure name.
One procname may correspond to multiple (or zero) `procdesc`'s after resolution.
## Framework-specific IR: `ProcCFG`, `ProcData`, and `extras`
The abstract interpretation framework has a few additional constructs that are
worth explaining.
A
[`ProcCfG`](https://github.com/facebook/infer/blob/master/infer/src/absint/procCfg.mli)
represents the CFG of a _single_ procedure whereas (perhaps confusingly) a
[`Cfg`](https://github.com/facebook/infer/blob/master/infer/src/IR/Cfg.rei) is
the CFG for an entire file. A `ProcCfg` is really a customizable view of the
underlying procedure CFG; we can get a view the CFG with its edges backward
(`ProcCfg.Backward`), with or without exceptional edges (`Normal`/`Exceptional`,
respectively), or with each node holding at most one instruction
(`OneInstrPerNode`).
[`ProcData`](https://github.com/facebook/infer/blob/master/infer/src/absint/procData.mli)
is a container that holds all of the read-only information required to analyze a
single procedure: procedure description, and `extras`. The `extras` are custom
read-only data that are computed before analysis begins, and can be accessed
from the transfer functions. Most often, no extras are required for analysis
(`ProcData.empty_extras`), but it can be useful to stash information like a map
from a formal to its
[index](https://github.com/facebook/infer/blob/master/infer/src/quandary/TaintAnalysis.ml#L88)
or an invariant
[map](https://github.com/facebook/infer/blob/master/infer/src/backend/preanal.ml#L115)
from a prior analysis in the extras.
## How it works
Coming soon.
## Intro: abstract interpretation
Coming soon.
## How do I make an analysis compositional?
Coming soon.