[website] delete outdated or duplicated pages

Summary:
Delete the following outdated pages. Some of them have better
replacements now but some of them are just too out of date to easily
salvage.

- 01-adding-models.md: this is biabduction-specific and hard to make
  that distinction. Most checkers have no way to add models. Even
  suggesting that people add their own biabduction models was a bit
  awkward as it requires recompiling infer to use.
- 01-checkers.md: outdated document that describes "new" checkers like
  Quandary :)
- 01-experimental-checkers.md: very outdated!
- 02-limitations.md: also outdated: it roughly says "we don't do concurrency or
  arithmetic, the former is basically impossible but the latter should
  be easy"... well, 5 years later this hasn't aged too well ;)
- 01-eradicate.md: content has moved to auto-generated documentation
- 01-linters.md: content has moved to auto-generated documentation
- 01-racerd.md: content has moved to auto-generated documentation

Reviewed By: mityal

Differential Revision: D21934748

fbshipit-source-id: a933e5022
master
Jules Villard 5 years ago committed by Facebook GitHub Bot
parent f195161742
commit f8a510fe2d

@ -1,283 +0,0 @@
---
id: adding-models
title: Adding models
---
## Why do we need models
When analyzing projects with call dependencies between functions, Infer follows
the call graph to decide in which order to analyze these functions. The main
goal is to use the analysis summary of a function wherever this function is
called. On the following example:
```c
int foo(int x) {
if (x < 42) {
return x;
} else {
return 0;
}
}
int bar() {
return foo(24);
}
int baz() {
return foo(54);
}
```
Infer starts with the analysis on `foo` and detect that this function either
returns `0` if the argument is greater than or equal to `42`, or returns the
value of the argument otherwise. With this information, Infer detects that `bar`
always returns `24` and `baz` always returns `0`.
Now, it may happen that the code of some function is not available during the
analysis. For example, this happens when a project uses pre-compiled libraries.
The most typical case is the use of the standard library like in the following
example:
```c
#include <stdlib.h>
int* create() {
int *p = malloc(sizeof(int));
if (p == NULL) exit(1);
return p;
}
void assign(int x, int *p) {
*p = x;
}
int* my_function() {
int *p = create();
assign(42, p);
return p;
}
```
Here, Infer will start with the analysis of `create` but will not find the
source code for `malloc`. To deal with this situation, Infer relies on models of
the missing functions to proceed with the analysis. The function `malloc` is
internally modeled as either returning `NULL`, or returning a valid and
allocated pointer. Similarly, the function `exit` is modeled as terminating the
execution. Using these two models, Infer detects that `create` always returns an
allocated pointer and that `my_function` is safe.
At this point, it is important to note that missing source code and missing
models do not make the analysis fail. Missing functions are treated as having no
effect. However, even though skipping these missing functions is fine in most
cases, there can be cases where it affects the quality of the analysis. For
example, missing models can lead to incorrect bug reports.
Consider the case of a function `lib_exit` having the same semantics as `exit`
but defined in an pre-compiled library not part of the project being analyzed:
```c
void lib_exit(int);
int* create() {
int *p = malloc(sizeof(int));
if (p == NULL) lib_exit(1);
return p;
}
```
In this case, Infer will not be able to know that the return statement is only
possible in the case where `p` is not null. When analyzing `my_function`, Infer
will consider the null case and report a null dereference error in the call to
`assign(42, p)`.
Similarly, considering a function `lib_alloc` equivalent to `malloc`, and the
function `create` now defined as:
```c
int* lib_alloc(int);
int* create() {
int *p = lib_alloc(sizeof(int));
return p;
}
```
Then Infer will not report any null dereference in `my_function`.
## Examples of models
### Some models for C
Adding new models is easy. The models for C can be found in
[`infer/models/c/src/`](https://github.com/facebook/infer/tree/master/infer/models/c/src).
The file
[`libc_basic.c`](https://github.com/facebook/infer/blob/master/infer/models/c/src/libc_basic.c)
contains models for some of the most commonly encountered functions from the C
standard library. For example, the function `xmalloc`, which is essentially the
same function as `create` defined above, is modeled by:
```c
void *xmalloc(size_t size) {
void *ret = malloc(size);
INFER_EXCLUDE_CONDITION(ret == NULL);
return ret;
}
```
The function `xmalloc` is modeled using `malloc` to create an allocated object
and the macro `INFER_EXCLUDE_CONDITION` used to eliminate the case where
`malloc` can return null. The list of helper functions and macros for writing
models can be found in
[`infer_builtins.c`](https://github.com/facebook/infer/blob/master/infer/models/c/src/infer_builtins.c).
For a slightly more complex example, `realloc` is modeled as:
```c
void *realloc(void *ptr, size_t size) {
if(ptr==0) { // if ptr in NULL, behave as malloc
return malloc(size);
}
int old_size;
int can_enlarge;
old_size = __get_array_size(ptr); // force ptr to be an array
can_enlarge = __infer_nondet_int(); // nondeterministically choose whether the current block can be enlarged
if(can_enlarge) {
__set_array_size(ptr, size); // enlarge the block
return ptr;
}
int *newblock = malloc(size);
if(newblock) {
free(ptr);
return newblock;
}
else { // if new allocation fails, do not free the old block
return newblock;
}
}
```
This model is based on existing models for `malloc` and `free` and three helper
functions:
- `__get_array_size(ptr)` which allows to manipulate with a model what Infer
knows about the size of the allocated memory
- `__set_array_size(ptr, size)` to modify the information about the size of the
allocated memory
- `__infer_nondet_int()` to create a variable which can have any possible
integer value
### For Java
The models for Java are following the same approach and the list of helper
functions is in:
[`infer/models/java/src/com/facebook/infer/models/InferBuiltins.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/com/facebook/infer/models/InferBuiltins.java)
[`infer/models/java/src/com/facebook/infer/models/InferUndefined.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/com/facebook/infer/models/InferUndefined.java)
For example, Infer treats Java hash maps using a recency abstraction model:
Infer remembers the last two keys being added by `put` and checked by
`containsKey`, which can be used to make sure that no null pointer exceptions
are coming from the fact that `get(key)` returns null when `key` is not in the
map. This behavior can just be implemented via a model written in Java with the
help of few helper functions understood by Infer. These models can be found in:
[`infer/models/java/src/java/util/HashMap.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/java/util/HashMap.java)
and just rely on these two methods:
- `InferUndefined.boolean_undefined()` to create a non-deterministic choice
- `(V)InferUndefined.object_undefined()` to create a non null undefined object
of type `V`
## How to add new models
Let's look at a toy example in Java. As explained above, models for C,
Objective-C and Java are all following the same approach.
```java
import lib.Server;
public class Test {
enum Status {
SUCCESS, FAILURE, PING_FAILURE, CONNECTION_FAILURE
}
Status convertStatus(Server s) {
switch (s.getStatus()) {
case 0:
return Status.SUCCESS;
case 1:
return Status.FAILURE;
case 2:
return Status.FAILURE;
default: // should not happen
return null;
}
}
String statusName(Server s) {
Status status = convertStatus(s);
return status.name();
}
}
```
Assuming that the class `lib.Server` is part of a pre-compiled library, Infer
will report a null pointer exception in `statusName`. This happens whenever
`s.getStatus()` returns a value greater that `3`, in which case the default
branch of the switch statement is taken and `convertStatus` returns `null`.
However, we know from the documentation that the method `lib.Server.getStatus`
can only return `0`, `1`, or `2`. A possible approach would be to use an
assertion like the Guava `Preconditions.checkState` to inform Infer about the
invariant:
```java
Status convertStatus(Server s) {
int serverStatus = s.getStatus();
Preconditions.checkState(serverStatus >= 0 && serverStatus < 3);
switch (s.getStatus()) {
...
}
}
```
However, in the case where adding preconditions is not possible, we can then
write a model for `getStatus()` in order to make the analysis more precise.
To create a model for `getStatus()`, we need to add a class with the name and
the same package as for the original method. In this example:
- create a file `infer/models/java/src/infer/models/Server.java` with the
following content:
```java
package infer.models;
import com.facebook.infer.models.InferBuiltins;
import com.facebook.infer.models.InferUndefined;
public class Server {
public int getStatus() {
int status = InferUndefined.int_undefined();
InferBuiltins.assume(status >= 0 && status < 3);
return status;
}
}
```
- recompile infer:
```bash
make -C infer
```
- run the analysis again:
```bash
infer run -- javac Test.java
```
Now it should no longer report a null pointer exception.

@ -1,26 +0,0 @@
---
id: checkers
title: "Infer : AI"
---
Infer.AI is a collection of program analyses which range from simple checks to
sophisticated inter-procedural analysis. Infer.AI is so named because it is
based on Abstract Interpretation.
Current Infer.AI's which are in production include ThreadSafety,
AnnotationReachability (e.g., can an allocation be reached from a
@PerformanceCritical method), and
[immutable cast](checkers-bug-types#CHECKERS_IMMUTABLE_CAST) for Java, as well
as Static Initialization Order Fiasco for C++.
The current checkers can be run by adding the option `-a checkers` to the
analysis command as in this example:
```bash
infer run -a checkers -- javac Test.java
```
In addition, we are working on experimental AI's which target security
properties (Quandary) and buffer overruns (Inferbo). The infer commandline man
page (`infer --help`) explains how to run experimental AI's, or how to select
certain AI's and not others.

@ -1,82 +0,0 @@
---
id: eradicate
title: "Infer : Eradicate"
---
> "I call it my billion-dollar mistake. It was the invention of the null
> reference in 1965."
>
> [Tony Hoare](http://en.wikipedia.org/wiki/Tony_Hoare)
### What is Infer:Eradicate?
Infer:Eradicate is a type checker for @Nullable annotations for Java. It is part
of the Infer static analysis suite of tools. The goal is to eradicate null
pointer exceptions.
<a href="https://developer.android.com/reference/android/support/annotation/Nullable.html">@Nullable</a>
annotations denote that a parameter, field or the return value of a method can
be null. When decorating a parameter, this denotes that the parameter can
legitimately be null and the method will need to deal with it. When decorating a
method, this denotes the method might legitimately return null.
Starting from @Nullable-annotated programs, the checker performs a flow
sensitive analysis to propagate the nullability through assignments and calls,
and flags errors for unprotected accesses to nullable values or
inconsistent/missing annotations. It can also be used to add annotations to a
previously un-annotated program.
### What is the @Nullable convention?
If you say nothing, you're saying that the value cannot be null. This is the
recommended option when possible:
Program safely, annotate nothing!
When this cannot be done, add a @Nullable annotation before the type to indicate
that the value can be null.
### What is annotated?
Annotations are placed at the interface of method calls and field accesses:
- Parameters and return type of a method declaration.
- Field declarations.
Local variable declarations are not annotated: their nullability is inferred.
### How is Infer:Eradicate invoked?
Eradicate can be invoked by adding the option `--eradicate` to the checkers mode
as in this example:
```bash
infer run -a checkers --eradicate -- javac Test.java
```
The checker will report an error on the following program that accesses a
nullable value without null check:
```java
class C {
int getLength(@Nullable String s) {
return s.length();
}
}
```
But it will not report an error on this guarded dereference:
```java
class C {
int getLength(@Nullable String s) {
if (s != null) {
return s.length();
} else {
return -1;
}
}
}
```
Eradicate reports the following [warnings](/docs/eradicate-warnings).

@ -1,38 +0,0 @@
---
id: experimental-checkers
title: "Infer : Experimental Checkers"
---
Infer contains a number of experimental checkers that can be run using just like
the normal infer analysis
`infer -a checkers --<checker_name> -- <your build command>`. `checker_name` can
be `bufferoverrun`, `siof`, or `quandary`. We'll explain the capabilities of
each experimental checker, its level of maturity (on a scale including "in
development", "medium", and "probably deployable"), and the language(s) it
targets.
# Inferbo
- Languages: C (but should be easy to adapt to Objective-C/C++, and possibly
Java.)
- Maturity: Medium
Inferbo is a detector for out-of-bounds array accesses. You can read all about
it in this blog
[post](https://research.fb.com/inferbo-infer-based-buffer-overrun-analyzer/). It
has been tuned for C, but we are planning to adapt it to other languages in the
near future.
# Quandary
- Languages: Java, C/C++
- Maturity: Medium
Quandary is a static taint analyzer that identifies a variety of unsafe
information flows. It has a small list of built-in
[sources](https://github.com/facebook/infer/blob/master/infer/src/quandary/JavaTrace.ml#L36)
and
[sinks](https://github.com/facebook/infer/blob/master/infer/src/quandary/JavaTrace.ml#L178),
and you can define custom sources and sinks in your `.inferconfig` file (see
example
[here](https://github.com/facebook/infer/blob/master/infer/tests/codetoanalyze/java/quandary/.inferconfig)).

@ -1,697 +0,0 @@
---
id: linters
title: "Infer : AL"
---
For C/C++ and Objective-C languages, we provide a linters framework. These are
checks about the syntax of the program; it could be about a property, or about
code inside one method, or that a class or method have certain properties. We
provide [a few checks](/docs/linters-bug-types) and we have developed a domain
specific language (DSL) to make it easier to write checks.
## AL: A declarative language for writing linters in Infer
One of the major advantage of Infer when compared with other static analyzers is
the fact it performs sophisticated inter-procedural/inter-file analysis. That
is, Infer can detect bugs which involve tracking values through many procedure
calls and the procedures may live in different files. These may be very subtle
bugs and designing static analyses to do that is quite involved and normally
requires deep static analysis expertise.
However, there are many important software bugs that are confined in the code of
a single procedure (called intra-procedural). To detect these bugs simpler
analyses may suffice which do not require deep technical expertise in static
analysis. Often these bugs can be expressed by referring to the syntax of the
program, or the types of certain expressions. We have defined a new language to
easily design checkers which identify these kind of bugs. The language is called
AL (AST Language) and its main feature is the ability to reason about the
Abstract Syntax Tree of a program in a concise declarative way. AL's checkers
are interpreted by Infer to analyze programs. Thus, to detect new kind of bugs
in Infer one can just write a check in AL. We will see in more detail later,
that for writing AL formulas we also need predicates: simple functions that
check a property of the AST. Predicates are written in OCaml inside Infer, thus
it requires a bit of OCaml knowledge and getting familiar with the OCaml data
structure for the clang AST.
## Getting the clang AST
When you write a linter that traverses the AST of some programs to check some
property, you probably need to understand what the AST looks like. You can get
the AST of programs using clang directly, or using Infer.
If you have a clang command `clang <clang arguments> File.m` then you can get
the AST with
```bash
clang <clang arguments> -Xclang -ast-dump -fsyntax-only File.m
```
You can also get the AST using Infer. One advantage of this is that you don't
need to know the speicifc clang command, just the general build command.
Moreover, what you get here is exactly the form of the AST that Infer has as
input.
For this you need to install an OCaml package `biniou` with
`opam install biniou`. See [the opam website](https://opam.ocaml.org/) for
instructions on how to install opam.
Then, the AST can be created by Infer in debug mode. Call Infer with
```bash
infer --debug -- <build command>
```
This will, among other things, generate a file `/path/to/File.m.ast.sh` for
every file `/path/to/File.m` that is being analyzed. Run this script with
`bash File.m.ast.sh` and a file `/path/to/File.m.ast.bdump` will be generated,
that contains the AST of the program in `bdump` format (similar to json). If you
get an error about `bdump` not being found you may need to run
`eval $(opam env)` to get the `bdump` executable (provided by the biniou opam
package) into your `PATH`.
For general info on the clang AST, you can check out
[clang's website](http://clang.llvm.org/docs/IntroductionToTheClangAST.html).
## Using AL to write linters
Let's start with an example. Suppose we want to write the following
Objective-C's linter:
_"a property containing the word 'delegate', but not containing the word 'queue'
should not be declared strong"_.
We can write this property in the following way:
```bash
DEFINE-CHECKER STRONG_DELEGATE_WARNING = {
LET name_contains_delegate =
declaration_has_name(REGEXP("[dD]elegate"));
LET name_does_not_contain_queue =
NOT declaration_has_name(REGEXP("[qQ]ueue"));
SET report_when =
WHEN
name_contains_delegate
AND name_does_not_contain_queue
AND is_strong_property()
HOLDS-IN-NODE ObjCPropertyDecl;
SET message = "Property or ivar %decl_name% declared strong";
SET suggestion = "In general delegates should be declared weak or assign";
SET severity = "WARNING"
};
```
The linter definition starts with the keyword `DEFINE-CHECKER` followed by the
checker's name. The first `LET` clause defines the _formula variable_
`name_contains_delegate` using the predicate `declaration_has_name` which return
true/false depending whether the property's name contains a word in the language
of the regular expression `[dD]elegate`. In general a predicate is a simple
atomic formula evaluated on an AST node. The list of available predicates is in
the module
[`cPredicates.mli`](https://github.com/facebook/infer/blob/master/infer/src/clang/cPredicates.mli)
(this list is continuously growing and if you need a new predicate you can add
it in ocaml). Formula variables can be used to simplify other definitions. The
`SET report_when` is mandatory and defines a formula that, when evaluates to
true, will tell Infer to report an error. In the case above, the formula is
saying that we should report when visiting an `ObjCPropertyDecl` (that is the
AST node declaring a property in Objective-C) where it holds that: the name
contains "delegate/Delegate" (`name_contains_delegate`) and the name doesn't
contain "queue/Queue" (`name_does_not_contain_queue`) and the node is defining a
"strong" property (`is_strong_property()`).
The `SET message` clause defines the error message that will be displayed to the
user. Notice that the message can include placeholders like `%decl_name%`.
Placeholders are evaluated by Infer and substituted by their current value when
the error message is reported. In this case the name of the declaration. The
`SET suggestion` clause define an optional hint to give to programmer on how to
fix the problem.
The general structure of a checker is the following:
```bash
DEFINE-CHECKER id_of_the_checker = {
LET formula = <formula definition>;
LET ….
SET report_when = <formula definition>;
SET name = <optional name>;
SET message = <error message to show the user>;
SET suggestion = <optional suggestion to the user>;
SET doc_url = <optional URL to documentation of the issue>;
SET severity = INFO | LIKE | ADVICE | WARNING | ERROR;
SET mode = ON | OFF
SET whitelist_path = {path1, path2, ..., pathn };
SET blacklist_path = {path1, path2, ..., pathn };
};
```
The default severity is `WARNING` and the default mode is `ON`, so these are
optional. If the check is `OFF` it will only be available in debug mode (flags
`--debug` or `--linters-developer-mode`). `INFOs` are generally also not
reported, except with some specialzed flags. `name` and `doc_url` are used only
for CI comments at the moment (in Phabricator).
## Defining Paths
`whitelist_path` and `blacklist_path` are optional, by default the rule is
enabled everywhere. For specifying paths, one can use either string constants
(`"File.m"`) or regexes (`REGEXP("path/to/.*")`) or variables. The variables
stand for a list of paths, and are defined in a separate block:
```bash
GLOBAL-PATHS {
path1 = {"A.m", REGEXP("path/to/.*")};
};
```
## Defining Macros
It is possible to define macros that can be used in several checkers. This is
done in the following way:
```bash
GLOBAL-MACROS {
LET is_subclass_of(x) =
is_class(x) HOLDS-IN-SOME-SUPERCLASS-OF ObjCInterfaceDecl;
};
```
`GLOBAL-MACROS` is the section of an AL specification where one can define a
list of global macros. In the example we are defining the macro `is_subclass(x)`
which can now be used in checkers instead of its complex definition.
It is possible to import a library of macros and paths with the following
command:
```
#IMPORT <library.al>
```
In an AL file, the command above import and make available all the macros and
paths defined in the `library.al` file.
## AL Predicates
The simplest formulas we can write are predicates. They are defined inside
Infer. We provide a
[library](https://github.com/facebook/infer/blob/master/infer/src/clang/cPredicates.mli),
but if the predicate that you require is not available, you will need to extend
the library. Here are the some of the currently defined predicates:
```
call_class_method ("class_name", "method_name")
call_function ("method_name")
call_instance_method ("class_name", "method_name")
call_method ("method_name")
captures_cxx_references ()
context_in_synchronized_block ()
declaration_has_name ("decl_name")
declaration_ref_name ("decl_ref_name")
decl_unavailable_in_supported_ios_sdk ()
has_cast_kind("cast_kind") // useful in a cast node
has_type ("type") // only builtin types, pointers and Objective-C classes available at the moment
isa ("class_name")
is_assign_property ()
is_binop_with_kind ("kind")
is_class ("class_name")
is_const_var ()
is_global_var ()
is_ivar_atomic ()
is_method_property_accessor_of_ivar ()
is_node ("node_name")
is_objc_constructor ()
is_objc_dealloc ()
is_objc_extension ()
is_objc_interface_named ("name")
is_property_pointer_type ()
is_strong_property ()
is_weak_property ()
is_unop_with_kind ("kind")
method_return_type ("type") // only builtin type, pointers, and Objective-C classes available at the moment
objc_method_has_nth_parameter_of_type("type")
using_namespace("namespace")
within_responds_to_selector_block ()
```
In general, the parameters of predicates can be constants, or variables, or
regular expressions. Variables are used in macros, see below. The syntax for
using regexes is `REGEX("your_reg_exp_here")`.
**NOTE:** The predicates that expect types, such as `has_type` or
`method_return_type` or `objc_method_has_nth_parameter_of_type` also accept
regexes, but the syntax is a bit different: `REGEX('your_reg_exp_here')`, and
this regex can be embedded inside another string, for example:
`has_type("REGEXP('NS.+')*" )` which stands for pointer to a class of name
starting with NS.
If you need to add a new predicate, write the predicate in
[cPredicates.ml](https://github.com/facebook/infer/blob/master/infer/src/clang/cPredicates.ml)
and then register it in
[CTL.ml](https://github.com/facebook/infer/blob/master/infer/src/clang/cTL.ml#L728).
## AL Formulas
Formulas are defined using a variation of the
[_CTL temporal logic_](https://en.wikipedia.org/wiki/Computation_tree_logic).
CTL is a logic expressing properties of a tree model. In the case of AL, the
tree is the AST of the program. Formulas are defined according to the following
grammar:
```
formula ::= predicate
| NOT formula
| formula1 OR formula2
| formula1 AND formula2
| formula1 IMPLIES formula2
| formula1 HOLDS-UNTIL formula2
| formula1 HOLDS-EVERYWHERE-UNTIL formula2
| formula HOLDS-EVENTUALLY
| formula HOLDS-EVERYWHERE-EVENTUALLY
| formula HOLDS-NEXT
| formula HOLDS-EVERYWHERE-NEXT
| formula HOLDS-ALWAYS
| formula HOLDS-EVERYWHERE-ALWAYS
| WHEN formula HOLDS-IN-NODE node-name-list
| IN-NODE node-name-list WITH-TRANSITION transition-name
formula HOLDS-EVENTUALLY
```
The first four cases (`NOT`, `OR`, `AND`, `IMPLIES`) are classic boolean
operators with the usual semantics. The others are temporal operators describing
how the truth-value of a formula is evaluated in a tree. Let's consider case by
case.
| Formula | Semantic meaning |
| ------------------- | :-------------------------------------------------------------------------------------------: |
| F1 _HOLDS-UNTIL_ F2 | from the current node, there exists a path where F1 holds at every node until F2 becomes true |
An example is depicted in the following tree. When `F1` or `F2` hold in a node
this is indicated between square brackets. The formula `F1 HOLDS-UNTIL F2` holds
in the green nodes.
![](/img/AL/holds_until.jpeg)
---
| Formula | Semantic meaning |
| -------------------- | :--------------------------------------------------------------------------: |
| F _HOLDS-EVENTUALLY_ | from the current node there exists a path where at some point F becomes true |
In the picture below, as `F` holds in `n10`, then `F HOLDS-EVENTUALLY` holds in
the green nodes `n1`, `n7`, `n10`. This is because from these nodes there is a
path reaching `n10` where `F` holds. Note that it holds for `n10` as well
because there exists a trivial path of length 0 from `n1` to itself.
![](/img/AL/holds_eventually.jpeg)
---
| Formula | Semantic meaning |
| ----------------------------- | :-----------------------------------------------------------------------: |
| F HOLDS-EVERYWHERE-EVENTUALLY | in every path starting from the current node at some point F becomes true |
For example, in the tree below, the formula holds in every green node because
every paths starting from each of them eventually reaches a node where F holds.
![](/img/AL/holds_everywhere_eventually.jpeg)
---
| Formula | Semantic meaning |
| ------------ | :--------------------------------------------------------------------------: |
| F HOLDS-NEXT | from the current node (we are visiting) there exists a child where F is true |
In the tree below, the formula `F HOLDS-NEXT` it is true only in n1 as it's the
only node with a child where `F` holds (node n3). In AL, `NEXT` is synonym of
child as, in terms of a path in the tree, a child is the next node.
![](/img/AL/holds_next.jpeg)
---
| Formula | Semantic meaning |
| ----------------------- | :-----------------------------------------------------: |
| F HOLDS-EVERYWHERE-NEXT | from the current node in every existing child F is true |
In the tree below, the formula `F HOLDS-EVERYWHERE-NEXT` it is true in n1 as
it's the only node for which in every child `F` holds (node n2, n3, and n7).
![](/img/AL/holds_everywhere_next.jpeg)
---
| Formula | Semantic meaning |
| -------------- | :-------------------------------------------------------------------: |
| F HOLDS-ALWAYS | from the current node there exists a path where F holds at every node |
In the tree below `F HOLDS-ALWAYS` holds in `n1`, `n2`, `n8` because for each of
these nodes there exists a path where `F` holds at each node in the path.
![](/img/AL/always_holds.jpeg)
---
| Formula | Semantic meaning |
| ------------------------- | :--------------------------------------------------------: |
| F HOLDS-EVERYWHERE-ALWAYS | from the current node, in every path F holds at every node |
`F HOLDS-EVERYWHERE-ALWAYS` holds in `n2`, `n4`, `n5`, and `n8` because when we
visit those nodes in every path that start from them `F` holds in every node.
![](/img/AL/always_holds_everywhere.jpeg)
---
| Formula | Semantic meaning |
| ---------------------------------- | :----------------------------------------------: |
| WHEN F HOLDS-IN-NODE node1,…,nodeK | we are in a node among node1,…,nodeK and F holds |
`WHEN F HOLDS-IN-NODE` `n2`, `n7`, `n6` holds only in node `n2` as it is the
only node in the list `n2`, `n7`, `n6` where F holds.
![](/img/AL/holds_in_node.jpeg)
Let's consider an example of checker using formula
`WHEN F HOLDS-IN-NODE node1,…,nodeK` for checking that a property with pointer
type should not be declared _"assign"_:
```
DEFINE-CHECKER ASSIGN_POINTER_WARNING = {
SET report_when =
WHEN
is_assign_property() AND is_property_pointer_type()
HOLDS-IN-NODE ObjCPropertyDecl;
SET message = "Property `%decl_name%` is a pointer type marked with the `assign` attribute";
SET suggestion = "Use a different attribute like `strong` or `weak`.";
SET severity = "WARNING";
};
```
The checker uses two predefined predicates `is_assign_property()` and
`is_property_pointer_type()` which are true if the property being declared is
assign and has a pointer type respectively. We want to check both conditions
only on nodes declaring properties, i.e., `ObjCPropertyDecl`.
---
| Formula | Semantic meaning |
| ----------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------: |
| IN-NODE node1,…, nodeK WITH-TRANSITION t F HOLDS-EVENTUALLY | from the current node there exists a path which eventually reaches a node among “node1,…,nodeK” with a transition t reaching a child where F holds |
The following tree explain the concept:
![](/img/AL/in_node_with_transition.jpeg)
The concept of transition is needed because of the special structure of the
clang AST. Certain kind of nodes, for example statements, have a list of
children that are statements as well. In this case there is no special tag
attached to the edge between the node and the children. Other nodes have
records, where some of the fields point to other nodes. For example a node
representing a function declaration will have a record where one of the fields
is body. This is pointing to a statement representing the function's body. For
records, sometimes we need to specify that we need a particular node reachable
via a particular field (i.e., a transition).
**Hint** A good way to learn how to write checkers is looking at existing
checkers in the file
[linters.al](https://github.com/facebook/infer/blob/master/infer/lib/linter_rules/linters.al).
## Example checks
In the following we show a few examples of simple checks you may wish to write
and the corresponding formulas:
- A check for flagging a Objective-C class that inherits from a class that
shouldn't be subclassed.
```
DEFINE-CHECKER SUBCLASSING_TEST_EXAMPLE = {
SET report_when = is_class("A") HOLDS-IN-SOME-SUPERCLASS-OF ObjCInterfaceDecl;
SET message = "This is subclassing A. Class A should not be subclassed.";
};
```
- A check for flagging an Objective-C instance method call:
```
DEFINE-CHECKER CALL_INSTANCE_METHOD = {
SET report_when = call_instance_method("A", "foo:");
SET message = "Do not call this method";
};
```
- A check for flagging an Objective-C instance method call of any method of a
class:
```
DEFINE-CHECKER CALL_ANY_INSTANCE_METHODS = {
SET report_when = call_instance_method(A, REGEXP("*"));
SET message = "Do not call any method of class A";
};
```
- A check for flagging an Objective-C class method call:
```
DEFINE-CHECKER CALL_CLASS_METHOD = {
SET report_when = call_class_method("A", "foo:");
SET message = "Do not call this method";
};
```
- A check for flagging an Objective-C method call of a method with int return
type:
```
DEFINE-CHECKER TEST_RETURN_METHOD = {
SET report_when = WHEN method_return_type("int")
HOLDS-IN-NODE ObjCMethodDecl;
SET message = "Method return int";
};
```
- A check for flagging a variable declaration with type long
```
DEFINE-CHECKER TEST_VAR_TYPE_CHECK = {
SET report_when = WHEN has_type("long")
HOLDS-IN-NODE VarDecl;
SET message = "Var %name% has type long";
};
```
- A check for flagging a method that has a parameter of type A\*
```
DEFINE-CHECKER TEST_PARAM_TYPE_CHECK = {
LET method_has_a_parameter_with_type(x) =
WHEN HOLDS-NEXT WITH-TRANSITION Parameters (has_type(x))
HOLDS-IN-NODE ObjCMethodDecl;
SET report_when =
method_has_a_parameter_with_type("A*" );
SET message = "Found a method with a parameter of type A";
};
```
- A check for flagging a method that has all the parameters of type A\* (and at
least one)
```
DEFINE-CHECKER TEST_PARAM_TYPE_CHECK2 = {
LET method_has_at_least_a_parameter =
WHEN HOLDS-NEXT WITH-TRANSITION Parameters (TRUE)
HOLDS-IN-NODE ObjCMethodDecl;
LET method_has_all_parameter_with_type(x) =
WHEN HOLDS-EVERYWHERE-NEXT WITH-TRANSITION Parameters (has_type(x))
HOLDS-IN-NODE ObjCMethodDecl;
SET report_when = method_has_at_least_a_parameter AND
method_has_all_parameter_with_type("int");
SET message = "All the parameters of the method have type int";
};
```
- A check for flagging a method that has the 2nd parameter of type A\*
```
DEFINE-CHECKER TEST_NTH_PARAM_TYPE_CHECK = {
SET report_when =
WHEN objc_method_has_nth_parameter_of_type("2", "A*")
HOLDS-IN-NODE ObjCMethodDecl;
SET message = "Found a method with the 2nd parameter of type A*";
SET severity = "LIKE";
};
```
- A check for flagging a protocol that inherits from a given protocol.
`HOLDS-EVENTUALLY WITH-TRANSITION Protocol` means follow the `Protocol` branch
in the AST until the condition holds.
```
DEFINE-CHECKER TEST_PROTOCOL_DEF_INHERITANCE = {
LET is_subprotocol_of(x) = declaration_has_name(x) HOLDS-EVENTUALLY WITH-TRANSITION Protocol;
SET report_when =
WHEN is_subprotocol_of("P")
HOLDS-IN-NODE ObjCProtocolDecl;
SET message = "Do not inherit from Protocol P";
};
```
- A check for flagging when a constructor is defined with a parameter of a type
that implements a given protocol (or that inherits from it).
`HOLDS-NEXT WITH-TRANSITION Parameters` means, starting in the
`ObjCMethodDecl` node, follow the `Parameters` branch in the AST and check
that the condition holds there.
```
DEFINE-CHECKER TEST_PROTOCOL_TYPE_INHERITANCE = {
LET method_has_parameter_subprotocol_of(x) =
WHEN
HOLDS-NEXT WITH-TRANSITION Parameters
(has_type_subprotocol_of(x))
HOLDS-IN-NODE ObjCMethodDecl;
SET report_when =
WHEN
declaration_has_name(REGEXP("^newWith.*:$")) AND
method_has_parameter_subprotocol_of("P")
HOLDS-IN-NODE ObjCMethodDecl;
SET message = "Do not define parameters of type P.";
};
```
- A check for flagging a variable declaration of type NSArray applied to A.
```
DEFINE-CHECKER TEST_GENERICS_TYPE = {
SET report_when =
WHEN has_type("NSArray<A>*")
HOLDS-IN-NODE VarDecl;
SET message = "Do not create arrays of type A";
};
```
- A check for flagging using a property or variable that is not available in the
supported API. decl_unavailable_in_supported_ios_sdk is a predicate that works
on a declaration, checks the available attribute from the declaration and
compares it with the supported iOS SDK. Notice that we flag the occurrence of
the variable or property, but the attribute is in the declaration, so we need
the transition `PointerToDecl` that follows the pointer from the usage to the
declaration.
```
DEFINE-CHECKER UNAVAILABLE_API_IN_SUPPORTED_IOS_SDK = {
SET report_when =
WHEN HOLDS-NEXT WITH-TRANSITION PointerToDecl
(decl_unavailable_in_supported_ios_sdk() AND
HOLDS-IN-NODE DeclRefExpr;
SET message = "%name% is not available in the required iOS SDK version";
};
```
- A check for flagging using a given namespace
```
DEFINE-CHECKER TEST_USING_NAMESPACE = {
SET report_when = using_namespace("N");
SET message = "Do not use namespace N";
};
```
- A check for flagging the use of given enum constants
```
DEFINE-CHECKER ENUM_CONSTANTS = {
SET report_when = is_enum_constant(REGEXP("MyName.*"));
SET message = "Do not use the enum MyName";
};
```
## AST info in messages
When you write the message of your rule, you may want to specify which
particular AST items were involved in the issue, such as a type or a variable
name. We have a mechanism for that, we specified a few placeholders that can be
used in rules with the syntax `%placeholder%` and it will be substituted by the
correct AST info. At the moment we have `%type%`, `%child_type%` and `%name%`
that print the type of the node, the type of the node's child, and a string
representation of the node, respectively. As with predicates, we can add more as
needed.
## Testing your rule
To test your rule you need to run it with Infer. If you are adding a new linter
you can test it in a separate al file that you can pass to Infer with the option
`--linters-def-file file.al`. Pass the option
`--linters-developer-mode --linter <LINTER_NAME>` to Infer to print debug
information and only run the linter you are developing, so it will be faster and
the debug info will be only about your linter.
To test your code, write a small example that triggers the rule. Then, run your
code with
```
infer --linters-developer-mode --linters-def-file file.al -- clang -c Test.m
```
the bug should be printed in the screen, like, for instance:
```
infer/tests/codetoanalyze/objcpp/linters/global-var/B.mm:34: warning: GLOBAL_VARIABLE_INITIALIZED_WITH_FUNCTION_OR_METHOD_CALL
Global variable kLineSize is initialized using a function or method call at line 34, column 1. If the function/method call is expensive,
it can affect the starting time of the app.
32. static float kPadding = [A bar] ? 10.0 : 11.0; // Error
33.
34. > static const float kLineSize = 1 / [A scale]; // Error
35.
36. static const float ok = 37;
37.
```
Moreover, the bug can be found in the file `infer-out/report.json` where
`infer-out` is the results directory where Infer operates, that is created in
the current directory. You can specify a different directory with the option
`-o`.
## Debugging
If there are syntax errors or other parsing errors with your al file, you will
get an error message when testing the rule, remember to use
`linters-developer-mode` when you are developing a rule. If the rule gets parsed
but still doesn't behave as you expect, you can debug it, by adding the
following line to a test source file in the line where you want to debug the
rule: `//INFER_BREAKPOINT`. Then run infer again in linters developer mode, and
it will stop the execution of the linter on the line of the breakpoint. Then you
can follow the execution step by step. It shows the current formula that is
being evaluated, and the current part of the AST that is being checked. A red
node means that the formula failed, a green node means that it succeeded.
## Demo
<iframe src="https://www.facebook.com/plugins/video.php?href=https%3A%2F%2Fwww.facebook.com%2Finferstaticanalyzer%2Fvideos%2F810308939133850%2F&show_text=0&width=400" width="500" height="500" scrolling="no" frameborder="0" allowTransparency="true" allowFullScreen="true"></iframe>
## Command line options for linters
The linters are run by default when you run Infer. However, there is a way of
running only the linters, which is faster than also running Infer. This is by
adding the option `-a linters` to the analysis command as in this example:
```bash
infer run -a linters -- clang -c Test.m
```
There are a few other command line options that are useful for using or
developing new linters in Infer. You can get those options with the command
`infer-capture --help`:
![](/img/AL/linters_help.png)

@ -1,494 +0,0 @@
---
id: racerd
title: "Infer : RacerD"
---
RacerD finds data races in your Java code. This page gives a more in-depth
explanation of how the analysis works, but may be less complete than the
[Thread Safety Violation bug description page](/docs/checkers-bug-types#thread-safety-violation).
To run the analysis, you can use plain `infer` (to run RacerD along with other
analyses that are run by default) or `infer --racerd-only` (to run only RacerD).
For example, the command `infer --racerd-only -- javac File.java` will run
RacerD on File.java.
## Background
RacerD statically analyzes Java code to detect potential concurrency bugs. This
analysis does not attempt to prove the absence of concurrency issues, rather, it
searches for a high-confidence class of data races. At the moment RacerD
concentrates on race conditions between methods in a class that is itself
intended to be thread safe. A race condition occurs when there are two
concurrent accesses to a class member variable that are not separated by mutual
exclusion, and at least one of the accesses is a write. Mutual exclusion can be
ensured by synchronization primitives such as locks, or by knowledge that both
accesses occur on the same thread.
## Triggering the analysis
RacerD doesn't try to check _all_ code for concurrency issues; it only looks at
code that it believes can run in a concurrent context. There are two signals
that RacerD looks for: (1) Explicitly annotating a class/method with
`@ThreadSafe` and (2) using a lock via the `synchronized` keyword. In both
cases, RacerD will look for concurrency issues in the code containing the signal
and all of its dependencies. In particular, it will report races between any
non-`private` methods of the same class that can peform conflicting accesses.
Annotating a class/interface with `@ThreadSafe` also triggers checking for all
of the subclasses of the class/implementations of the interface.
## Warnings
Let's take a look at the different types of concurrency issues that RacerD
flags. Two of the warning types are data races (`Unprotected write` and
`Read/write race`), and the third warning type encourages adding `@ThreadSafe`
annotations to interfaces to trigger additional checking.
### Unprotected write
RacerD will report an unprotected write when one or more writes can run in
parallel without synchronization. These come in two flavors: (1) a self-race (a
write-write race that occurs due to a method running in parallel with itself)
and (2) two conflicting writes to the same location. Here's an example of the
self-race flavor:
```
@ThreadSafe
public class Dinner {
private int mTemperature;
public void makeDinner() {
boilWater();
}
private void boilWater() {
mTemperature = 100; // unprotected write.
}
}
```
The class `Dinner` will generate the following report on the public method
`makeDinner()`:
`There may be a Thread Safety Violation: makeDinner() indirectly writes to mTemperature outside of synchronization.`
This warning can be fixed by synchronizing the access to `mTemperature`, making
`mTemperature` `volatile`, marking `makeDinner` as `@VisibleForTesting`, or
suppressing the warning by annotating the `Dinner` class or `makeDinner` method
with `@ThreadSafe(enableChecks = false)`.
### Read/Write Race
We sometimes need to protect read accesses as well as writes. Consider the
following class with unsynchronized methods.
```
@ThreadSafe
public class Account {
int mBalance = 0;
public void deposit(int amount) {
if (amount > 0) {
mBalance += amount;
}
}
public int withdraw(int amount){
if (amount >= 0 && mBalance - amount >= 0) {
mBalance -= amount;
return mBalance;
} else {
return 0;
}
}
}
```
If you run the `withdraw()` method in parallel with itself or with `deposit()`
you can get unexpected results here. For instance, if the stored balance is 11
and you run `withdraw(10)` in parallel with itself you can get a negative
balance. Furthermore, if you synchronize only the write statement
`mBalance -= amount`, then you can still get this bad result. The reason is that
there is a read/write race between the boolean condition
`mBalance - amount >= 0` and the writes. RacerD will duly warn
`Read/Write race. Public method int Account.withdraw(int) reads from field Account.mBalance. Potentially races with writes in methods void Account.deposit(int), int Account.withdraw(int)`
on the line with this boolean condition.
A solution to the threading problem here is to make both methods `synchronized`
to wrap both read and write accesses, or to use an `AtomicInteger` for
`mBalance` rather than an ordinary `int`.
### Interface not thread-safe
In the following code, RacerD will report an `Interface not thread-safe` warning
on the call to `i.bar()`:
```
interface I {
void bar();
}
@ThreadSafe
class C {
void foo(I i) {
i.bar(); // RacerD warns here
}
}
```
The way to fix this warning is to add a `@ThreadSafe` annotation to the
interface `I`, which will enforce the thread-safety of each of the
implementations of `I`.
You might wonder why it's necessary to annotate `I` -- can't RacerD just look at
all the implementations of `i` at the call site for `bar`? Although this is a
fine idea idea in principle, it's a bad idea in practice due to a (a) separate
compilation and (b) our diff-based deployment model. In the example above, the
compiler doesn't have to know about all implementations (or indeed, any
implementations) of `I` at the time it compiles this code, so there's no
guarantee that RacerD will know about or be able to check all implementations of
`I`. That's (a). For (b), say that we check that all implementations of `I` are
thread-safe at the time this code is written, but we don't add the annotation.
If someone else comes along and adds a new implementation of `I` that is not
thread-safe, RacerD will have no way of knowing that this will cause a potential
bug in `foo`. But if `I` is annotated, RacerD will enforce that all new
implementations of `I` are thread-safe, and `foo` will remain bug-free.
## Annotations to help RacerD understand your code
Getting started with RacerD doesn't require any annotations at all -- RacerD
will look at your usage of locks and figure out what data is not guarded
consistently. But increasing the coverage and signal-to-noise ratio may require
adding `@ThreadSafe` annotations along with some of the other annotations
described below. Most of annotations described below can be used via the Maven
Central package available
[here](https://maven-repository.com/artifact/com.facebook.infer.annotation/infer-annotation).
### `@ThreadConfined`
The intuitive idea of thread-safety is that a class is impervious to concurrency
issues for all concurrent contexts, even those that have not been written yet
(it is future-proof). RacerD implements this by naively assuming that any method
can potentially be called on any thread. You may determine, however, that an
object, method, or field is only ever accessed on a single thread during program
execution. Annotating such elements with `@ThreadConfined` informs RacerD of
this restriction. Note that a thread-confined method cannot race with itself but
it can still race with other methods.
```
List mCache;
@ThreadConfined(UI)
void prepareCache() {
// populate the cache
mCache.add(...);
// post cache cleanup task to run later
mUIExecutor.execute(new Runnable() {
@ThreadConfined(UI)
public void run() {
mCache.clear();
}
});
}
```
In this example, both `prepareCache` and `run` touch `mCache`. But there's no
possibility of a race between the two methods because both of them will run
sequentially on the UI thread. Adding a `@ThreadConfined(UI)` or `@UiThread`
annotation to these methods will stop it from warning that there is a race on
`mCache`. We could also choose to add a `@ThreadConfined` annotation to `mCache`
itself.
### `@Functional`
Not all races are bugs; a race can be benign. Consider the following:
```
@Functional Boolean askNetworkIfShouldShowFeature();
private Boolean mShouldShowFeature;
@ThreadSafe boolean shouldShowFeature() {
if (mShouldShowFeature == null) {
mShouldShowFeature = askNetworkIfShouldShowFeature();
}
return mShouldShowFeature;
}
```
This code caches the result of an expensive network call that checks whether the
current user should be shown an experimental feature. This code looks racy, and
indeed it is: if two threads execute `shouldShowFeature()` at the same time, one
may read `mShouldShowFeature` at the same time the other is writing it.
However, this is actually a _benign_ race that the programmer intentionally
allows for performance reasons. The reason this code is safe is that the
programmer knows that `askNetworkIfShouldShowFeature()` will always return the
same value in the same run of the app. Adding synchronization would remove the
race, but acquiring/releasing locks and lock contention would potentially slow
down every call to `shouldShowFeature()`. The benign race approach makes every
call after the first fast without changing the safety of the code.
RacerD will report a race on this code by default, but adding the
`@Functional annotation to askNetworkIfShouldShowFeature()` informs RacerD that
the function is always expected to return the same value. This assumption allows
RacerD to understand that this particular code is safe, though it will still
(correctly) warn if `mShouldShowFeature` is read/written elsewhere.
Be sure not to use the `@Functional` pattern for _singleton instantiation_, as
it's possible the "singleton" can be constructed more than once.
```
public class MySingleton {
private static sInstance;
// Not @Functional
public MySingleton getInstance() {
if (sInstance == null) {
// Different threads may construct their own instances.
sInstance == new MySingleton();
}
return sInstance;
}
}
```
### `@ReturnsOwnership`
RacerD does not warn on unprotected writes to _owned_ objects. An object is
owned if it has been freshly allocated in the current thread and has not escaped
to another thread. RacerDf automatically tracks ownership in most cases, but it
needs help with `abstract` and `interface` methods that return ownership:
```
@ThreadSafe
public interface Car {
@ReturnsOwnership abstract Car buyCar();
void carsStuff() {
Car myCar = new Car();
myCar.wheels = 4; // RacerD won't warn here because it knows myCar is owned
Car otherCar = buyCar();
otherCar.wheels = 3; // RacerD would normally warn here, but won't because of the `@ReturnsOwnership` annotation
}
}
```
### `@VisibleForTesting`
RacerD reports races between any two non`-private` methods of a class that may
run in a concurrent context. Sometimes, a RacerD report may be false because one
of the methods cannot actually be called from outside the current class. One fix
is making the method `private` to enforce this, but this might break unit tests
that need to call the method in order to test it. In this case, the
`@VisibleForTesting` annotation will allow RacerD to consider the method as
effectively `private` will still allowing it to be called from the unit test:
```
@VisibleForTesting void setF() {
this.f = ...; // RacerD would normally warn here, but @VisibleForTesting will silence the warning
}
synchronized void setFWithLock() {
setF();
}
```
Unlike the other annotations shown here, this one lives in
[Android](https://developer.android.com/reference/android/support/annotation/VisibleForTesting.html).
## <a name="interprocedural"></a> Interprocedural Reasoning
An important feature of RacerD is that it finds races by analyzing not just one
file or class, but by looking at memory accesses that occur after going through
several procedure calls. It handles this even between classes and between files.
Here is a very basic example
```
@ThreadSafe
class A{
void m1(B bb) {
bb.meth_write();
}
}
class B{
Integer x;
void meth_write() {
x = 88;
}
}
```
Class `B` is not annotated `@ThreadSafe` and does not have any locks, so RacerD
does not directly look for threading issues there. However, method `m1()` in
class `A` has a potential self-race, if it is run in parallel with itself and
the same argument for each call. RacerD discovers this.
```
InterProc.java:17: error: THREAD_SAFETY_VIOLATION
Unprotected write. Non-private method `A.m1` indirectly writes to field `&this.B.x` outside of synchronization.
Reporting because the current class is annotated `@ThreadSafe`, so we assume that this method can run in
parallel with other non-private methods in the class (incuding itself).
15.
16. void m1(B bb) {
17. > bb.meth_write();
18. }
19. }
```
RacerD does this sort of reasoning using what is known as a _compositional
inteprocedural analysis_. There, each method is analyzed independently of its
context to produce a summary of the behaviour of the procedure. In this case the
summaries for `m1()' and`meth()' include information as follows.
```
Procedure: void A.m1(B)
Accesses: { Unprotected({ 1 }) -> { Write to &bb.B.x at void B.meth_write() at line 17 } }
Procedure: void B.meth_write()
Accesses { Unprotected({ 0 }) -> { Write to &this.B.x at at line 25 } }
```
The descriptions here are cryptic and do not include all the information in the
summaries, but the main point is that you can use RacerD to look for races in
codebases where the mutations done by threads might occur only after a chain of
procedure calls.
## <a name="context"></a> Context and Selected Related Work
Reasoning about concurrency divides into bug detection and proving absence of
bugs. RacerD is on the detection side of reasoning.
The rapid growth in the number of interleavings is problematic for tools that
attempt exhaustive exploration. With just 150 instructions for two threads, the
number 10^88 of interleavings is more that the estimated number of atoms in the
known universe.
[There has been important work which uses various techniques to attempt to reduce the number of interleavings](https://en.wikipedia.org/wiki/Partial_order_reduction)
while still in principle covering all possibilities, but scale is still a
challenge. Note that RacerD is not exhaustive: it has false negatives (missed
bugs). But in compensation it is fast, and effective (it finds bugs in
practice).
Static analysis for concurrency has attracted a lot of attention from
researchers, but difficulties with scalability and precision have meant that
previous techniques have had little industrial impact. Automatic static race
detection itself has seen significant work. The most advanced approaches,
exemplified by the [Chord](http://www.cis.upenn.edu/~mhnaik/pubs/pldi06.pdf)
tool, often use a whole-program analysis paired with a sophisticated alias
analysis, two features we have consciously avoided. Generally speaking, the
leading research tools can be more precise, but RacerD is faster and can operate
without the whole program: we have opted to go for speed in a way that enables
industrial deployment on a large, rapidly changing codebase, while trying to use
as simple techniques as possible to cover many (not all) of the patterns covered
by slower but precise research tools.
An industrial static analysis tool from
[Contemplate](http://homepages.inf.ed.ac.uk/dts/pub/avocs2015.pdf) also targets
@ThreadSafe annotations, but limits the amount of inter-procedural reasoning:
“This analysis is interprocedural, but to keep the overall analysis scalable,
only calls to private and protected methods on the same class are followed”.
RacerD does deep, cross-file and cross-class inter-procedural reasoning, and yet
still scales; the inter-class capability was one of the first requests from
Facebook engineers.
[A separate blog post looked at 100 recent data race fixes](https://code.facebook.com/posts/1537144479682247/finding-inter-procedural-bugs-at-scale-with-infer-static-analyzer/)
in Infer's deployment in various bug categories, and for data races observed
that 53 of them were inter-file (and thus involving multiple classes).
[See above](racerd#interprocedural) for an example of RacerD's interprocedural
capabilities.
One reaction to the challenge of developing effective static race detectors has
been to ask the programmer to do more work to help the analyzer. Examples of
this approach include the
[Clang Thread Safety Analyzer](https://clang.llvm.org/docs/ThreadSafetyAnalysis.html),
the typing of [locks](https://doc.rust-lang.org/std/sync/struct.Mutex.html) in
Rust, and the use/checking of @GuardedBy annotations in
[Java](https://homes.cs.washington.edu/~mernst/pubs/locking-semantics-nfm2016.pdf)
including in
[Google's Error Prone analyzer](https://github.com/google/error-prone/blob/master/docs/bugpattern/GuardedBy.md).
When lock annotations are present they make the analyzer's life easier, and we
have
[GuardedBy checking as part of Infer](checkers-bug-types#UNSAFE_GUARDEDBY_ACCESS)
(though separate from the race detector). Our GuardedBy checker can find some
bugs that RacerD does not (see
[this example on anonymous inner classes](checkers-bug-types#anonymous_inner)),
but the race detector finds a greater number because it can work on un-annotated
code. It is possible to have a very effective race analysis without decreeing
that such annotations must be present. This was essential for our deployment,
since _requiring_ lock annotations would have been a show stopper for converting
many thousands of lines of code to a concurrent context. We believe that this
finding should be transportable to new type systems and language designs, as
well as to other analyses for existing languages.
Another reaction to difficulties in static race detection has been to instead
develop dynamic analyses, automatic testing tools which work by running a
program to attempt to find flaws. Google's Thread Sanitizer is a widely used and
mature tool in this area, which has been used in production to find many bugs in
C-family languages.
[The Thread Sanitizer authors explicitly call out limitations with static race analyzers](http://www.cs.columbia.edu/~junfeng/11fa-e6121/papers/thread-sanitizer.pdf)
as part of their motivation: “It seems unlikely that static detectors will work
effectively in our environment: Googles code is large and complex enough that
it would be expensive to add the annotations required by a typical static
detector”.
We have worked to limit the annotations that RacerD needs, for reasons similar
those expressed by the Thread Sanitizer authors. And we have sought to bring the
complementary benefits of static analysis — possibility of cheaper analysis and
fast reporting, and ability to analyze code before it is placed in a context to
run — to race detection. But we are interested as well in the future in
leveraging ideas in the dynamic techniques to improve or add to our analysis for
race detection.
## Limitations
There are a number of known limitations to the design of the race detector.
- It looks for races involving syntactically identical access paths, and misses
races due to aliasing
- It misses races that arise from a locally declared object escaping its scope
- It uses a boolean locks abstraction, and so misses races where two accesses
are mistakenly protected by different locks
- It assumes a deep ownership model, which misses races where local objects
refer to or contain non-owned objects.
- It avoids reasoning about weak memory and Java's volatile keyword
Most of these limitations are consistent with the design goal of reducing false
positives, even if they lead to false negatives. They also allow technical
tradeoffs which are different than if we were to favour reduction of false
negatives over false positives.
A different kind of limitation concerns the bugs searched for: Data races are
the most basic form of concurrency error, but there are many types of
concurrency issues out there that RacerD does not check for (but might in the
future). Examples include deadlock, atomicity, and check-then-act bugs (shown
below). You must look for these bugs yourself!
```
@ThreadSafe
public class SynchronizedList<T> {
synchronized boolean isEmpty() { ... }
synchronized T add(T item) { ... }
// Not thread safe!!!
public class ListUtil<T> {
public void addIfEmpty(SynchronizedList<T> list, T item) {
if (list.isEmpty()) {
// In a race, another thread can add to the list here.
list.add(item);
}
}
}
```
Finally, using `synchronized` blindly as a means to fix every unprotected write
or read is not always safe. Even with RacerD, finding, understanding, and fixing
concurrency issues is difficult. If you would like to learn more about best
practices, [Java Concurrency in Practice](http://jcip.net/) is an excellent
resource.

@ -1,83 +0,0 @@
---
id: limitations
title: Limitations, etc
---
## Expectations <a name="expectations"></a>
We want to be clear that if you run Infer on your project you might get very
good results, but it is also possible that you don't. Although we have had good
fix rates working with Facebook mobile codebases, we are not making strong
claims about rates of false alarms or similar when applied to arbitrary
codebases. For example, we have had some success
[getting bugs fixed in the DuckDuckGo Android App](blog/2015/05/22/Infer-on-open-source-android-apps),
but we encountered many false alarms when running Infer on GNU coreutils. It is
typical of program verification and static analysis tools that their results
vary, and that is to be expected, e.g., because they are tackling undecidable
problems and because different codebases they are applied to will have been
coded differently.
The good thing, though, is that you might get useful results! And, where the
results are imperfect, this can be taken as input for improvement.
Apart from these general remarks, Infer has a number of specific technical
limitations, which we describe in terms of bug types and language features.
## Bug types <a name="bugtypes"></a>
At present Infer is reporting on a restricted collection of
[bug types](/docs/checkers-bug-types), typically involving null pointers and
memory or resource leaks. The initial set of bug types Infer has focused on was
driven by the most pressing needs for serving the Facebook mobile developers.
Our approach has been to report less initially, to iterate with developers and
provide value to them, and gradually expand what we can do while still providing
value.
Some bug types we don't report as of yet include
- Array bounds errors
- Cast exceptions
- Leaking of tainted data
- Concurrency race conditions
and more. In the first three cases we have partial treatments inside of Infer,
but we have not surfaced these capabilities yet; the reports are not of
sufficient quality to present to developers. For example, Infer can find some
potential array bounds errors, but many of its reports are false alarms and it
misses still more.
Put another way: there is more work to do!
## Language Features <a name="languagefeatures"></a>
A different dimension in which Infer is limited concerns language features.
Infer either does not understand or has a weak treatment of
- Concurrency, including Java's Concurrency Utilities and iOS's Grand Central
Dispatch
- Dynamic dispatch
- Reflection
- Android lifecycles
- Arithmetic
- and more
Some of these problems are fundamental, largely open, problems in program
analysis (especially concurrency), while for others there is much prior and
successful work to draw upon (e.g., arithmetic) and are simply on our todo list
awaiting work.
Thus, Infer's core algorithms can be understood as being sound with respect to
an idealized model (that is all soundness can ever be), but this idealized model
is some distance from real execution models for programs where Infer is
deployed. One consequence of this is that we cannot claim that Infer reasons
about <i> all </i> flows through an application, but only <i> some </i> flows.
In approaching these limitations going forward we must consider solutions that
take into account our use case: to comment in minutes on modifications to large
codebases. Methods based on whole program analysis are challenging to consider
when approaching these problems for our deployment model.
These limitations can be seen positively as opportunities for improvement, to do
more static analysis and program verification for the benefit of programmers
everywhere! We will be delighted if people from the static analysis and program
verification communities join us in working on these problems.

@ -14,21 +14,14 @@ module.exports = {
"infer-workflow",
"analyzing-apps-or-projects",
"steps-for-ci",
"checkers",
"eradicate",
"linters",
"racerd",
"experimental-checkers",
"advanced-features",
"adding-models",
"man-pages",
],
"Analyses and Issue Types": checkers.doc_entries,
Foundations: [
"about-Infer",
"separation-logic-and-bi-abduction",
"limitations",
],
"Analyses and Issue Types": checkers.doc_entries,
Contribute: ["absint-framework", "adding-checkers", "internal-API"],
Versions: ["versions"],
},

Loading…
Cancel
Save