|
|
|
---
|
|
|
|
id: absint-framework
|
|
|
|
title: Building checkers with the Infer.AI framework
|
|
|
|
---
|
|
|
|
|
|
|
|
Infer.AI is a framework for quickly developing abstract interpretation-based
|
|
|
|
checkers (intraprocedural or interprocedural). You define only:
|
|
|
|
|
|
|
|
(1) An abstract domain (type of abstract state plus `<=`, `join`, and `widen`
|
|
|
|
operations)
|
|
|
|
|
|
|
|
(2) Transfer functions (a transformer that takes an abstract state as input and
|
|
|
|
produces an abstract state as output)
|
|
|
|
|
|
|
|
and then you have an analysis that can run on all of the languages Infer
|
|
|
|
supports (C, Obj-C, C++, and Java)!
|
|
|
|
|
|
|
|
This guide covers how to use the framework. For background on why we built the
|
|
|
|
framework and how it works, check out these
|
|
|
|
[slides](http://fbinfer.com/downloads/pldi17-infer-ai-tutorial.pdf) from a PLDI
|
|
|
|
2017 tutorial and this
|
|
|
|
[talk](https://atscaleconference.com/videos/getting-the-most-out-of-static-analyzers)
|
|
|
|
from @Scale2016.
|
|
|
|
|
|
|
|
**If you feel like coding instead of reading, a great way to get started with
|
|
|
|
Infer.AI is to go through the lab exercise
|
|
|
|
[here](https://github.com/facebook/infer/blob/master/infer/src/labs/lab.md).**
|
|
|
|
|
|
|
|
## By example: intraprocedural analysis
|
|
|
|
|
|
|
|
This section helps you get started ASAP if you already understand
|
|
|
|
[abstract interpretation](http://www.di.ens.fr/~cousot/AI/IntroAbsInt.html) (or
|
|
|
|
don't, but are feeling bold).
|
|
|
|
|
|
|
|
Take a look at
|
|
|
|
[liveness.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/liveness.ml).
|
|
|
|
This code is performing a compilers-101 style liveness analysis over
|
|
|
|
[SIL](#ir-basics-sil-cfgs-tenvs-procdescs-and-procnames), Infer's intermediate
|
|
|
|
language. Since this code is fairly small and you should already understand what
|
|
|
|
it's trying to do, it's a fairly good place to look in order to understand both
|
|
|
|
how to use the abstract interpretation framework and what SIL is.
|
|
|
|
|
|
|
|
There are basically three important bits here: defining the domain, defining the
|
|
|
|
transfer functions, and then passing the pieces to the framework to create an an
|
|
|
|
analysis. Let's break down the third bit:
|
|
|
|
|
|
|
|
```
|
|
|
|
module Analyzer =
|
|
|
|
AbstractInterpreter.Make
|
|
|
|
(ProcCfg.Backward(ProcCfg.Exceptional))
|
|
|
|
(TransferFunctions)
|
|
|
|
```
|
|
|
|
|
|
|
|
The `ProcCfg.Backward(ProcCfg.Exceptional)` part says: "I want the direction of
|
|
|
|
iteration to be backward" (since liveness is a backward analysis), and "I want
|
|
|
|
to the analysis to follow exceptional edges". For a forward analysis that
|
|
|
|
ignores exceptional edges, you would do `ProcCfg.Normal` instead (and many other
|
|
|
|
combinations are possible; take a look at
|
|
|
|
[ProcCfg.mli](https://github.com/facebook/infer/blob/master/infer/src/absint/ProcCfg.mli)
|
|
|
|
for more). And finally, the `TransferFunctions` part says "Use the transfer
|
|
|
|
functions I defined above".
|
|
|
|
|
|
|
|
Now you have an `Analyzer` module that exposes useful functions like
|
|
|
|
[`compute_post`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli#L30)
|
|
|
|
(take a procedure as input and compute a postcondition) and
|
|
|
|
[`exec_pdesc`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli#L36)
|
|
|
|
(take a procedure and compute an invariant map from node id's to the pre/post at
|
|
|
|
each node). The next step is to hook your checker up to the Infer CLI. For the
|
|
|
|
liveness analysis, you would do this by exposing a function for running the
|
|
|
|
checker on a single procedure:
|
|
|
|
|
|
|
|
```
|
|
|
|
let checker { Callbacks.proc_desc; tenv; } =
|
|
|
|
match Analyzer.compute_post (ProcData.make_default proc_desc tenv) with
|
|
|
|
| Some post -> Logging.progress "Computed post %a for %a" Analyzer.Domain.pp post Typ.Procname.pp (Procdesc.get_proc_name proc_desc);
|
|
|
|
| None -> ()
|
|
|
|
```
|
|
|
|
|
|
|
|
and then adding `Liveness.checker, checkers_enabled` to the list of registered
|
|
|
|
checkers
|
|
|
|
[here](https://github.com/facebook/infer/blob/master/infer/src/checkers/registerCheckers.ml#L42).
|
|
|
|
|
|
|
|
you can then run `infer run -a checkers -- <your_build_command>` to run your
|
|
|
|
checker on real code. See [here](/docs/analyzing-apps-or-projects) for more
|
|
|
|
details on the build systems supported by Infer.
|
|
|
|
|
|
|
|
Other examples of simple intraprocedural checkers are
|
|
|
|
[addressTaken.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/addressTaken.ml)
|
|
|
|
and
|
|
|
|
[copyPropagation.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/copyPropagation.ml).
|
|
|
|
|
|
|
|
## Basic error reporting
|
|
|
|
|
|
|
|
Useful analyses have output. Basic printing to stderr or stderr is good for
|
|
|
|
debugging, but to report a programmer-readable error that is tied to a source
|
|
|
|
code location, you'll want to use `Reporting.log_error`. Some examples of
|
|
|
|
error-logging code:
|
|
|
|
[1](https://github.com/facebook/infer/blob/master/infer/src/concurrency/RacerD.ml#L166),
|
|
|
|
[2](https://github.com/facebook/infer/blob/master/infer/src/checkers/annotationReachability.ml#L224),
|
|
|
|
or
|
|
|
|
[3](https://github.com/facebook/infer/blob/master/infer/src/quandary/TaintAnalysis.ml#L186).
|
|
|
|
|
|
|
|
## By example: interprocedural analysis
|
|
|
|
|
|
|
|
Let's assume you have already read and understood the "intraprocedural analysis"
|
|
|
|
section and have an intraprocedural checker. The abstract interpretation
|
|
|
|
framework makes it easy to convert your intraprocedural analysis into a
|
|
|
|
_modular_ interprocedural analysis. Let me emphasize the _modular_ point once
|
|
|
|
more; global analyses cannot be expressed in this framework.
|
|
|
|
|
|
|
|
To make your checker interprocedural, you need to:
|
|
|
|
|
|
|
|
(1) Define the type of procedure summaries for your analysis and add some
|
|
|
|
boilerplate for storing your data alongside the summaries for other analyses
|
|
|
|
|
|
|
|
(2) Add logic for (a) using summaries in your transfer functions and (b)
|
|
|
|
converting your intraprocedural abstract state to a summary.
|
|
|
|
|
|
|
|
A good example to look at here is
|
|
|
|
[siof.ml](https://github.com/facebook/infer/blob/master/infer/src/checkers/Siof.ml).
|
|
|
|
Step (1) is just:
|
|
|
|
|
|
|
|
```
|
|
|
|
module Summary = Summary.Make (struct
|
|
|
|
type summary = SiofDomain.astate
|
|
|
|
|
|
|
|
let update_payload astate payload =
|
|
|
|
{ payload with Specs.siof = Some astate }
|
|
|
|
|
|
|
|
let read_from_payload payload =
|
|
|
|
payload.Specs.siof
|
|
|
|
end)
|
|
|
|
```
|
|
|
|
|
|
|
|
along with adding the `Specs.siof`
|
|
|
|
[field](https://github.com/facebook/infer/blob/master/infer/src/backend/specs.ml#L329)
|
|
|
|
to the `Specs.payload` record
|
|
|
|
[type](https://github.com/facebook/infer/blob/master/infer/src/backend/specs.ml#L321).
|
|
|
|
Here, the type of the abstract state and the type of the summary are the same,
|
|
|
|
which makes things easier for us (no logic to convert an abstract state to a
|
|
|
|
summary).
|
|
|
|
|
|
|
|
Part (2a) is
|
|
|
|
[here](https://github.com/facebook/infer/blob/master/infer/src/checkers/Siof.ml#L65):
|
|
|
|
|
|
|
|
```
|
|
|
|
match Summary.read_summary pdesc callee_pname with
|
|
|
|
```
|
|
|
|
|
|
|
|
This says: "read the summary for `callee_pname` from procedure `pdesc` with type
|
|
|
|
environment `tenv`". You must then add logic for applying the summary to the
|
|
|
|
current abstract state (often, this is as simple as doing a join).
|
|
|
|
|
|
|
|
Because our summary type is the same as the abstract state, part (2b) can be
|
|
|
|
done for us by making use of the convenient
|
|
|
|
`AbstractInterpreter.Interprocedural`
|
|
|
|
[functor](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli#L19)
|
|
|
|
(for an example of what to do when the types are different, take a look at
|
|
|
|
[Quandary](https://github.com/facebook/infer/blob/master/infer/src/quandary/TaintAnalysis.ml#L540)):
|
|
|
|
|
|
|
|
```
|
|
|
|
module Interprocedural = Analyzer.Interprocedural (Summary)
|
|
|
|
```
|
|
|
|
|
|
|
|
This `Interprocedural` module will automatically do the work of computing and
|
|
|
|
storing the summary for us. All we need to do is change the exposed `checker`
|
|
|
|
function registered in `registerCheckers.ml` to call `Interprocedural.checker`
|
|
|
|
instead:
|
|
|
|
|
|
|
|
```
|
|
|
|
let checker callback =
|
|
|
|
ignore(Interprocedural.checker callback ProcData.empty_extras in)
|
|
|
|
```
|
|
|
|
|
|
|
|
That's it! We now have an interprocedural analysis.
|
|
|
|
|
|
|
|
One very important note here: a current (and soon-to-be-lifted) limitation
|
|
|
|
prevents us from running multiple interprocedural checkers at the same time. If
|
|
|
|
you register an interprocedural checker, be sure to unregister the other other
|
|
|
|
ones. Otherwise, there's a risk that the checkers will clobber each other's
|
|
|
|
results.
|
|
|
|
|
|
|
|
## Relevant code
|
|
|
|
|
|
|
|
Some pointers to useful code for building new analyses, and to the
|
|
|
|
implementation of the framework for the interested:
|
|
|
|
|
|
|
|
Domain combinators:
|
|
|
|
|
|
|
|
- `AbstractDomain.BottomLifted`, `AbstractDomain.FiniteSet`,
|
|
|
|
`AbstractDomain.Map`, `AbstractDomain.Pair` (all in
|
|
|
|
[AbstractDomain](https://github.com/facebook/infer/blob/master/infer/src/checkers/AbstractDomain.mli))
|
|
|
|
|
|
|
|
Domains and domain building blocks:
|
|
|
|
|
|
|
|
- [AccessPath](https://github.com/facebook/infer/blob/master/infer/src/checkers/accessPath.mli)
|
|
|
|
- [AccessPathDomains](https://github.com/facebook/infer/blob/master/infer/src/checkers/accessPathDomains.mli)
|
|
|
|
- [AccessTree](https://github.com/facebook/infer/blob/master/infer/src/checkers/accessTree.ml)
|
|
|
|
|
|
|
|
Reporting errors with interprocedural traces:
|
|
|
|
|
|
|
|
- Examples:
|
|
|
|
[`SiofTrace.ml`](https://github.com/facebook/infer/blob/master/infer/src/checkers/SiofTrace.ml),
|
|
|
|
[`JavaTrace.ml`](https://github.com/facebook/infer/blob/master/infer/src/quandary/JavaTrace.ml),
|
|
|
|
[`CppTrace.ml`](https://github.com/facebook/infer/blob/master/infer/src/quandary/CppTrace.ml).
|
|
|
|
- Implementation:
|
|
|
|
[`Trace`](https://github.com/facebook/infer/blob/master/infer/src/checkers/Trace.mli)
|
|
|
|
|
|
|
|
Implementation:
|
|
|
|
|
|
|
|
- [`AbstractDomain`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractDomain.ml)
|
|
|
|
- [`TransferFunctions`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli)
|
|
|
|
- [`AbstractInterpreter`](https://github.com/facebook/infer/blob/master/infer/src/absint/AbstractInterpreter.mli)
|
|
|
|
- [`ProcCFG`](https://github.com/facebook/infer/blob/master/infer/src/absint/ProcCfg.mli)
|
|
|
|
- [`Summary`](https://github.com/facebook/infer/blob/master/infer/src/absint/Summary.ml)
|
|
|
|
- [`Scheduler`](https://github.com/facebook/infer/blob/master/infer/src/absint/Scheduler.ml)
|
|
|
|
|
|
|
|
## IR basics: SIL, CFG's, `tenv`'s, `procdesc`'s, and `procname`'s
|
|
|
|
|
|
|
|
All of the languages analyzed by Infer are converted into a common intermediate
|
|
|
|
representation. A program is represented as a control-flow graph
|
|
|
|
([CFG](https://github.com/facebook/infer/blob/master/infer/src/IR/Cfg.rei))
|
|
|
|
whose nodes contain lists of instructions in the SIL language. SIL is a small
|
|
|
|
low-level language that has some similarities with C, LLVM
|
|
|
|
[IR](http://llvm.org/docs/LangRef.html), and
|
|
|
|
[Boogie](https://research.microsoft.com/en-us/um/people/leino/papers/krml178.pdf).
|
|
|
|
|
|
|
|
[Expressions](https://github.com/facebook/infer/blob/master/infer/src/IR/Exp.rei#L25)
|
|
|
|
are literals, program variables (`Pvar`'s), temporary variables (`Ident`'s), a
|
|
|
|
field offset from a struct (OO features like objects are lowered into struct's),
|
|
|
|
or an index offset from an array.
|
|
|
|
|
|
|
|
There are four interesting kinds of
|
|
|
|
[instructions](https://github.com/facebook/infer/blob/master/infer/src/IR/Sil.rei#L38):
|
|
|
|
`Load` for reading into a temporary variable, `Store` for writing to a program
|
|
|
|
variable, field of a struct, or an array, `Prune e` (often called `assume` in
|
|
|
|
other PL formalisms) blocks execution unless the expression `e` evaluates to
|
|
|
|
true, and `Call` represents function calls.
|
|
|
|
|
|
|
|
Instructions and expressions have
|
|
|
|
[types](https://github.com/facebook/infer/blob/master/infer/src/IR/Typ.rei#L76).
|
|
|
|
A `Tstruct` (think: object) type has a
|
|
|
|
[`Typename`](https://github.com/facebook/infer/blob/master/infer/src/IR/Typename.rei#L13),
|
|
|
|
and it is often useful to look up metadata about the type (what fields does it
|
|
|
|
have, what methods does it declare, what is its superclass, etc.) in the type
|
|
|
|
environment, or
|
|
|
|
[`tenv`](https://github.com/facebook/infer/blob/master/infer/src/IR/Tenv.rei#L37).
|
|
|
|
|
|
|
|
A procedure description or
|
|
|
|
[`procdesc`](https://github.com/facebook/infer/blob/master/infer/src/IR/Procdesc.rei)
|
|
|
|
(sometimes abbreviated `pdesc`) is an abstraction of a procedure declaration: it
|
|
|
|
stores the CFG of the procedure, its signature, its annotations, and so on.
|
|
|
|
|
|
|
|
A procedure name or
|
|
|
|
[`procname`](https://github.com/facebook/infer/blob/master/infer/src/IR/Procname.rei)
|
|
|
|
(sometimes abbreviated `pname`) is an abstraction of a called procedure name.
|
|
|
|
One procname may correspond to multiple (or zero) `procdesc`'s after resolution.
|
|
|
|
|
|
|
|
## Framework-specific IR: `ProcCFG`, `ProcData`, and `extras`
|
|
|
|
|
|
|
|
The abstract interpretation framework has a few additional constructs that are
|
|
|
|
worth explaining.
|
|
|
|
|
|
|
|
A
|
|
|
|
[`ProcCfG`](https://github.com/facebook/infer/blob/master/infer/src/absint/procCfg.mli)
|
|
|
|
represents the CFG of a _single_ procedure whereas (perhaps confusingly) a
|
|
|
|
[`Cfg`](https://github.com/facebook/infer/blob/master/infer/src/IR/Cfg.rei) is
|
|
|
|
the CFG for an entire file. A `ProcCfg` is really a customizable view of the
|
|
|
|
underlying procedure CFG; we can get a view the CFG with its edges backward
|
|
|
|
(`ProcCfg.Backward`), with or without exceptional edges (`Normal`/`Exceptional`,
|
|
|
|
respectively), or with each node holding at most one instruction
|
|
|
|
(`OneInstrPerNode`).
|
|
|
|
|
|
|
|
[`ProcData`](https://github.com/facebook/infer/blob/master/infer/src/absint/procData.mli)
|
|
|
|
is a container that holds all of the read-only information required to analyze a
|
|
|
|
single procedure: procedure description, and `extras`. The `extras` are custom
|
|
|
|
read-only data that are computed before analysis begins, and can be accessed
|
|
|
|
from the transfer functions. Most often, no extras are required for analysis
|
|
|
|
(`ProcData.empty_extras`), but it can be useful to stash information like a map
|
|
|
|
from a formal to its
|
|
|
|
[index](https://github.com/facebook/infer/blob/master/infer/src/quandary/TaintAnalysis.ml#L88)
|
|
|
|
or an invariant
|
|
|
|
[map](https://github.com/facebook/infer/blob/master/infer/src/backend/preanal.ml#L115)
|
|
|
|
from a prior analysis in the extras.
|
|
|
|
|
|
|
|
## How it works
|
|
|
|
|
|
|
|
Coming soon.
|
|
|
|
|
|
|
|
## Intro: abstract interpretation
|
|
|
|
|
|
|
|
Coming soon.
|
|
|
|
|
|
|
|
## How do I make an analysis compositional?
|
|
|
|
|
|
|
|
Coming soon.
|