diff --git a/website/docs/01-adding-models.md b/website/docs/01-adding-models.md deleted file mode 100644 index 8294fd0ba..000000000 --- a/website/docs/01-adding-models.md +++ /dev/null @@ -1,283 +0,0 @@ ---- -id: adding-models -title: Adding models ---- - -## Why do we need models - -When analyzing projects with call dependencies between functions, Infer follows -the call graph to decide in which order to analyze these functions. The main -goal is to use the analysis summary of a function wherever this function is -called. On the following example: - -```c -int foo(int x) { - if (x < 42) { - return x; - } else { - return 0; - } -} - -int bar() { - return foo(24); -} - -int baz() { - return foo(54); -} -``` - -Infer starts with the analysis on `foo` and detect that this function either -returns `0` if the argument is greater than or equal to `42`, or returns the -value of the argument otherwise. With this information, Infer detects that `bar` -always returns `24` and `baz` always returns `0`. - -Now, it may happen that the code of some function is not available during the -analysis. For example, this happens when a project uses pre-compiled libraries. -The most typical case is the use of the standard library like in the following -example: - -```c -#include - -int* create() { - int *p = malloc(sizeof(int)); - if (p == NULL) exit(1); - return p; -} - -void assign(int x, int *p) { - *p = x; -} - -int* my_function() { - int *p = create(); - assign(42, p); - return p; -} -``` - -Here, Infer will start with the analysis of `create` but will not find the -source code for `malloc`. To deal with this situation, Infer relies on models of -the missing functions to proceed with the analysis. The function `malloc` is -internally modeled as either returning `NULL`, or returning a valid and -allocated pointer. Similarly, the function `exit` is modeled as terminating the -execution. Using these two models, Infer detects that `create` always returns an -allocated pointer and that `my_function` is safe. - -At this point, it is important to note that missing source code and missing -models do not make the analysis fail. Missing functions are treated as having no -effect. However, even though skipping these missing functions is fine in most -cases, there can be cases where it affects the quality of the analysis. For -example, missing models can lead to incorrect bug reports. - -Consider the case of a function `lib_exit` having the same semantics as `exit` -but defined in an pre-compiled library not part of the project being analyzed: - -```c -void lib_exit(int); - -int* create() { - int *p = malloc(sizeof(int)); - if (p == NULL) lib_exit(1); - return p; -} -``` - -In this case, Infer will not be able to know that the return statement is only -possible in the case where `p` is not null. When analyzing `my_function`, Infer -will consider the null case and report a null dereference error in the call to -`assign(42, p)`. - -Similarly, considering a function `lib_alloc` equivalent to `malloc`, and the -function `create` now defined as: - -```c -int* lib_alloc(int); - -int* create() { - int *p = lib_alloc(sizeof(int)); - return p; -} -``` - -Then Infer will not report any null dereference in `my_function`. - -## Examples of models - -### Some models for C - -Adding new models is easy. The models for C can be found in -[`infer/models/c/src/`](https://github.com/facebook/infer/tree/master/infer/models/c/src). -The file -[`libc_basic.c`](https://github.com/facebook/infer/blob/master/infer/models/c/src/libc_basic.c) -contains models for some of the most commonly encountered functions from the C -standard library. For example, the function `xmalloc`, which is essentially the -same function as `create` defined above, is modeled by: - -```c -void *xmalloc(size_t size) { - void *ret = malloc(size); - INFER_EXCLUDE_CONDITION(ret == NULL); - return ret; -} -``` - -The function `xmalloc` is modeled using `malloc` to create an allocated object -and the macro `INFER_EXCLUDE_CONDITION` used to eliminate the case where -`malloc` can return null. The list of helper functions and macros for writing -models can be found in -[`infer_builtins.c`](https://github.com/facebook/infer/blob/master/infer/models/c/src/infer_builtins.c). - -For a slightly more complex example, `realloc` is modeled as: - -```c -void *realloc(void *ptr, size_t size) { - if(ptr==0) { // if ptr in NULL, behave as malloc - return malloc(size); - } - int old_size; - int can_enlarge; - old_size = __get_array_size(ptr); // force ptr to be an array - can_enlarge = __infer_nondet_int(); // nondeterministically choose whether the current block can be enlarged - if(can_enlarge) { - __set_array_size(ptr, size); // enlarge the block - return ptr; - } - int *newblock = malloc(size); - if(newblock) { - free(ptr); - return newblock; - } - else { // if new allocation fails, do not free the old block - return newblock; - } -} -``` - -This model is based on existing models for `malloc` and `free` and three helper -functions: - -- `__get_array_size(ptr)` which allows to manipulate with a model what Infer - knows about the size of the allocated memory -- `__set_array_size(ptr, size)` to modify the information about the size of the - allocated memory -- `__infer_nondet_int()` to create a variable which can have any possible - integer value - -### For Java - -The models for Java are following the same approach and the list of helper -functions is in: - -[`infer/models/java/src/com/facebook/infer/models/InferBuiltins.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/com/facebook/infer/models/InferBuiltins.java) -[`infer/models/java/src/com/facebook/infer/models/InferUndefined.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/com/facebook/infer/models/InferUndefined.java) - -For example, Infer treats Java hash maps using a recency abstraction model: -Infer remembers the last two keys being added by `put` and checked by -`containsKey`, which can be used to make sure that no null pointer exceptions -are coming from the fact that `get(key)` returns null when `key` is not in the -map. This behavior can just be implemented via a model written in Java with the -help of few helper functions understood by Infer. These models can be found in: - -[`infer/models/java/src/java/util/HashMap.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/java/util/HashMap.java) - -and just rely on these two methods: - -- `InferUndefined.boolean_undefined()` to create a non-deterministic choice -- `(V)InferUndefined.object_undefined()` to create a non null undefined object - of type `V` - -## How to add new models - -Let's look at a toy example in Java. As explained above, models for C, -Objective-C and Java are all following the same approach. - -```java -import lib.Server; - -public class Test { - - enum Status { - SUCCESS, FAILURE, PING_FAILURE, CONNECTION_FAILURE - } - - Status convertStatus(Server s) { - switch (s.getStatus()) { - case 0: - return Status.SUCCESS; - case 1: - return Status.FAILURE; - case 2: - return Status.FAILURE; - default: // should not happen - return null; - } - } - - String statusName(Server s) { - Status status = convertStatus(s); - return status.name(); - } - -} -``` - -Assuming that the class `lib.Server` is part of a pre-compiled library, Infer -will report a null pointer exception in `statusName`. This happens whenever -`s.getStatus()` returns a value greater that `3`, in which case the default -branch of the switch statement is taken and `convertStatus` returns `null`. -However, we know from the documentation that the method `lib.Server.getStatus` -can only return `0`, `1`, or `2`. A possible approach would be to use an -assertion like the Guava `Preconditions.checkState` to inform Infer about the -invariant: - -```java -Status convertStatus(Server s) { - int serverStatus = s.getStatus(); - Preconditions.checkState(serverStatus >= 0 && serverStatus < 3); - switch (s.getStatus()) { - ... - } -} -``` - -However, in the case where adding preconditions is not possible, we can then -write a model for `getStatus()` in order to make the analysis more precise. - -To create a model for `getStatus()`, we need to add a class with the name and -the same package as for the original method. In this example: - -- create a file `infer/models/java/src/infer/models/Server.java` with the - following content: - -```java -package infer.models; - -import com.facebook.infer.models.InferBuiltins; -import com.facebook.infer.models.InferUndefined; - -public class Server { - - public int getStatus() { - int status = InferUndefined.int_undefined(); - InferBuiltins.assume(status >= 0 && status < 3); - return status; - } -} -``` - -- recompile infer: - - ```bash - make -C infer - ``` - -- run the analysis again: - - ```bash - infer run -- javac Test.java - ``` - -Now it should no longer report a null pointer exception. diff --git a/website/docs/01-checkers.md b/website/docs/01-checkers.md deleted file mode 100644 index 337971b7b..000000000 --- a/website/docs/01-checkers.md +++ /dev/null @@ -1,26 +0,0 @@ ---- -id: checkers -title: "Infer : AI" ---- - -Infer.AI is a collection of program analyses which range from simple checks to -sophisticated inter-procedural analysis. Infer.AI is so named because it is -based on Abstract Interpretation. - -Current Infer.AI's which are in production include ThreadSafety, -AnnotationReachability (e.g., can an allocation be reached from a -@PerformanceCritical method), and -[immutable cast](checkers-bug-types#CHECKERS_IMMUTABLE_CAST) for Java, as well -as Static Initialization Order Fiasco for C++. - -The current checkers can be run by adding the option `-a checkers` to the -analysis command as in this example: - -```bash -infer run -a checkers -- javac Test.java -``` - -In addition, we are working on experimental AI's which target security -properties (Quandary) and buffer overruns (Inferbo). The infer commandline man -page (`infer --help`) explains how to run experimental AI's, or how to select -certain AI's and not others. diff --git a/website/docs/01-eradicate.md b/website/docs/01-eradicate.md deleted file mode 100644 index 9a9a5bc1b..000000000 --- a/website/docs/01-eradicate.md +++ /dev/null @@ -1,82 +0,0 @@ ---- -id: eradicate -title: "Infer : Eradicate" ---- - -> "I call it my billion-dollar mistake. It was the invention of the null -> reference in 1965." -> -> [Tony Hoare](http://en.wikipedia.org/wiki/Tony_Hoare) - -### What is Infer:Eradicate? - -Infer:Eradicate is a type checker for @Nullable annotations for Java. It is part -of the Infer static analysis suite of tools. The goal is to eradicate null -pointer exceptions. - -@Nullable -annotations denote that a parameter, field or the return value of a method can -be null. When decorating a parameter, this denotes that the parameter can -legitimately be null and the method will need to deal with it. When decorating a -method, this denotes the method might legitimately return null. - -Starting from @Nullable-annotated programs, the checker performs a flow -sensitive analysis to propagate the nullability through assignments and calls, -and flags errors for unprotected accesses to nullable values or -inconsistent/missing annotations. It can also be used to add annotations to a -previously un-annotated program. - -### What is the @Nullable convention? - -If you say nothing, you're saying that the value cannot be null. This is the -recommended option when possible: - -Program safely, annotate nothing! - -When this cannot be done, add a @Nullable annotation before the type to indicate -that the value can be null. - -### What is annotated? - -Annotations are placed at the interface of method calls and field accesses: - -- Parameters and return type of a method declaration. -- Field declarations. - -Local variable declarations are not annotated: their nullability is inferred. - -### How is Infer:Eradicate invoked? - -Eradicate can be invoked by adding the option `--eradicate` to the checkers mode -as in this example: - -```bash -infer run -a checkers --eradicate -- javac Test.java -``` - -The checker will report an error on the following program that accesses a -nullable value without null check: - -```java -class C { - int getLength(@Nullable String s) { - return s.length(); - } -} -``` - -But it will not report an error on this guarded dereference: - -```java -class C { - int getLength(@Nullable String s) { - if (s != null) { - return s.length(); - } else { - return -1; - } - } -} -``` - -Eradicate reports the following [warnings](/docs/eradicate-warnings). diff --git a/website/docs/01-experimental-checkers.md b/website/docs/01-experimental-checkers.md deleted file mode 100644 index 5668d3d15..000000000 --- a/website/docs/01-experimental-checkers.md +++ /dev/null @@ -1,38 +0,0 @@ ---- -id: experimental-checkers -title: "Infer : Experimental Checkers" ---- - -Infer contains a number of experimental checkers that can be run using just like -the normal infer analysis -`infer -a checkers -- -- `. `checker_name` can -be `bufferoverrun`, `siof`, or `quandary`. We'll explain the capabilities of -each experimental checker, its level of maturity (on a scale including "in -development", "medium", and "probably deployable"), and the language(s) it -targets. - -# Inferbo - -- Languages: C (but should be easy to adapt to Objective-C/C++, and possibly - Java.) -- Maturity: Medium - -Inferbo is a detector for out-of-bounds array accesses. You can read all about -it in this blog -[post](https://research.fb.com/inferbo-infer-based-buffer-overrun-analyzer/). It -has been tuned for C, but we are planning to adapt it to other languages in the -near future. - -# Quandary - -- Languages: Java, C/C++ -- Maturity: Medium - -Quandary is a static taint analyzer that identifies a variety of unsafe -information flows. It has a small list of built-in -[sources](https://github.com/facebook/infer/blob/master/infer/src/quandary/JavaTrace.ml#L36) -and -[sinks](https://github.com/facebook/infer/blob/master/infer/src/quandary/JavaTrace.ml#L178), -and you can define custom sources and sinks in your `.inferconfig` file (see -example -[here](https://github.com/facebook/infer/blob/master/infer/tests/codetoanalyze/java/quandary/.inferconfig)). diff --git a/website/docs/01-linters.md b/website/docs/01-linters.md deleted file mode 100644 index 01f25f301..000000000 --- a/website/docs/01-linters.md +++ /dev/null @@ -1,697 +0,0 @@ ---- -id: linters -title: "Infer : AL" ---- - -For C/C++ and Objective-C languages, we provide a linters framework. These are -checks about the syntax of the program; it could be about a property, or about -code inside one method, or that a class or method have certain properties. We -provide [a few checks](/docs/linters-bug-types) and we have developed a domain -specific language (DSL) to make it easier to write checks. - -## AL: A declarative language for writing linters in Infer - -One of the major advantage of Infer when compared with other static analyzers is -the fact it performs sophisticated inter-procedural/inter-file analysis. That -is, Infer can detect bugs which involve tracking values through many procedure -calls and the procedures may live in different files. These may be very subtle -bugs and designing static analyses to do that is quite involved and normally -requires deep static analysis expertise. - -However, there are many important software bugs that are confined in the code of -a single procedure (called intra-procedural). To detect these bugs simpler -analyses may suffice which do not require deep technical expertise in static -analysis. Often these bugs can be expressed by referring to the syntax of the -program, or the types of certain expressions. We have defined a new language to -easily design checkers which identify these kind of bugs. The language is called -AL (AST Language) and its main feature is the ability to reason about the -Abstract Syntax Tree of a program in a concise declarative way. AL's checkers -are interpreted by Infer to analyze programs. Thus, to detect new kind of bugs -in Infer one can just write a check in AL. We will see in more detail later, -that for writing AL formulas we also need predicates: simple functions that -check a property of the AST. Predicates are written in OCaml inside Infer, thus -it requires a bit of OCaml knowledge and getting familiar with the OCaml data -structure for the clang AST. - -## Getting the clang AST - -When you write a linter that traverses the AST of some programs to check some -property, you probably need to understand what the AST looks like. You can get -the AST of programs using clang directly, or using Infer. - -If you have a clang command `clang File.m` then you can get -the AST with - -```bash -clang -Xclang -ast-dump -fsyntax-only File.m -``` - -You can also get the AST using Infer. One advantage of this is that you don't -need to know the speicifc clang command, just the general build command. -Moreover, what you get here is exactly the form of the AST that Infer has as -input. - -For this you need to install an OCaml package `biniou` with -`opam install biniou`. See [the opam website](https://opam.ocaml.org/) for -instructions on how to install opam. - -Then, the AST can be created by Infer in debug mode. Call Infer with - -```bash -infer --debug -- -``` - -This will, among other things, generate a file `/path/to/File.m.ast.sh` for -every file `/path/to/File.m` that is being analyzed. Run this script with -`bash File.m.ast.sh` and a file `/path/to/File.m.ast.bdump` will be generated, -that contains the AST of the program in `bdump` format (similar to json). If you -get an error about `bdump` not being found you may need to run -`eval $(opam env)` to get the `bdump` executable (provided by the biniou opam -package) into your `PATH`. - -For general info on the clang AST, you can check out -[clang's website](http://clang.llvm.org/docs/IntroductionToTheClangAST.html). - -## Using AL to write linters - -Let's start with an example. Suppose we want to write the following -Objective-C's linter: - -_"a property containing the word 'delegate', but not containing the word 'queue' -should not be declared strong"_. - -We can write this property in the following way: - -```bash -DEFINE-CHECKER STRONG_DELEGATE_WARNING = { - - LET name_contains_delegate = - declaration_has_name(REGEXP("[dD]elegate")); - - LET name_does_not_contain_queue = - NOT declaration_has_name(REGEXP("[qQ]ueue")); - - SET report_when = - WHEN - name_contains_delegate - AND name_does_not_contain_queue - AND is_strong_property() - HOLDS-IN-NODE ObjCPropertyDecl; - - SET message = "Property or ivar %decl_name% declared strong"; - SET suggestion = "In general delegates should be declared weak or assign"; - SET severity = "WARNING" - }; -``` - -The linter definition starts with the keyword `DEFINE-CHECKER` followed by the -checker's name. The first `LET` clause defines the _formula variable_ -`name_contains_delegate` using the predicate `declaration_has_name` which return -true/false depending whether the property's name contains a word in the language -of the regular expression `[dD]elegate`. In general a predicate is a simple -atomic formula evaluated on an AST node. The list of available predicates is in -the module -[`cPredicates.mli`](https://github.com/facebook/infer/blob/master/infer/src/clang/cPredicates.mli) -(this list is continuously growing and if you need a new predicate you can add -it in ocaml). Formula variables can be used to simplify other definitions. The -`SET report_when` is mandatory and defines a formula that, when evaluates to -true, will tell Infer to report an error. In the case above, the formula is -saying that we should report when visiting an `ObjCPropertyDecl` (that is the -AST node declaring a property in Objective-C) where it holds that: the name -contains "delegate/Delegate" (`name_contains_delegate`) and the name doesn't -contain "queue/Queue" (`name_does_not_contain_queue`) and the node is defining a -"strong" property (`is_strong_property()`). - -The `SET message` clause defines the error message that will be displayed to the -user. Notice that the message can include placeholders like `%decl_name%`. -Placeholders are evaluated by Infer and substituted by their current value when -the error message is reported. In this case the name of the declaration. The -`SET suggestion` clause define an optional hint to give to programmer on how to -fix the problem. - -The general structure of a checker is the following: - -```bash -DEFINE-CHECKER id_of_the_checker = { - - LET formula = ; - LET …. - - SET report_when = ; - - SET name = ; - SET message = ; - SET suggestion = ; - SET doc_url = ; - SET severity = INFO | LIKE | ADVICE | WARNING | ERROR; - SET mode = ON | OFF - SET whitelist_path = {path1, path2, ..., pathn }; - SET blacklist_path = {path1, path2, ..., pathn }; - }; -``` - -The default severity is `WARNING` and the default mode is `ON`, so these are -optional. If the check is `OFF` it will only be available in debug mode (flags -`--debug` or `--linters-developer-mode`). `INFOs` are generally also not -reported, except with some specialzed flags. `name` and `doc_url` are used only -for CI comments at the moment (in Phabricator). - -## Defining Paths - -`whitelist_path` and `blacklist_path` are optional, by default the rule is -enabled everywhere. For specifying paths, one can use either string constants -(`"File.m"`) or regexes (`REGEXP("path/to/.*")`) or variables. The variables -stand for a list of paths, and are defined in a separate block: - -```bash - GLOBAL-PATHS { - path1 = {"A.m", REGEXP("path/to/.*")}; - }; -``` - -## Defining Macros - -It is possible to define macros that can be used in several checkers. This is -done in the following way: - -```bash -GLOBAL-MACROS { - LET is_subclass_of(x) = - is_class(x) HOLDS-IN-SOME-SUPERCLASS-OF ObjCInterfaceDecl; - }; -``` - -`GLOBAL-MACROS` is the section of an AL specification where one can define a -list of global macros. In the example we are defining the macro `is_subclass(x)` -which can now be used in checkers instead of its complex definition. - -It is possible to import a library of macros and paths with the following -command: - -``` -#IMPORT -``` - -In an AL file, the command above import and make available all the macros and -paths defined in the `library.al` file. - -## AL Predicates - -The simplest formulas we can write are predicates. They are defined inside -Infer. We provide a -[library](https://github.com/facebook/infer/blob/master/infer/src/clang/cPredicates.mli), -but if the predicate that you require is not available, you will need to extend -the library. Here are the some of the currently defined predicates: - -``` -call_class_method ("class_name", "method_name") -call_function ("method_name") -call_instance_method ("class_name", "method_name") -call_method ("method_name") -captures_cxx_references () -context_in_synchronized_block () -declaration_has_name ("decl_name") -declaration_ref_name ("decl_ref_name") -decl_unavailable_in_supported_ios_sdk () -has_cast_kind("cast_kind") // useful in a cast node -has_type ("type") // only builtin types, pointers and Objective-C classes available at the moment -isa ("class_name") -is_assign_property () -is_binop_with_kind ("kind") -is_class ("class_name") -is_const_var () -is_global_var () -is_ivar_atomic () -is_method_property_accessor_of_ivar () -is_node ("node_name") -is_objc_constructor () -is_objc_dealloc () -is_objc_extension () -is_objc_interface_named ("name") -is_property_pointer_type () -is_strong_property () -is_weak_property () -is_unop_with_kind ("kind") -method_return_type ("type") // only builtin type, pointers, and Objective-C classes available at the moment -objc_method_has_nth_parameter_of_type("type") -using_namespace("namespace") -within_responds_to_selector_block () -``` - -In general, the parameters of predicates can be constants, or variables, or -regular expressions. Variables are used in macros, see below. The syntax for -using regexes is `REGEX("your_reg_exp_here")`. - -**NOTE:** The predicates that expect types, such as `has_type` or -`method_return_type` or `objc_method_has_nth_parameter_of_type` also accept -regexes, but the syntax is a bit different: `REGEX('your_reg_exp_here')`, and -this regex can be embedded inside another string, for example: -`has_type("REGEXP('NS.+')*" )` which stands for pointer to a class of name -starting with NS. - -If you need to add a new predicate, write the predicate in -[cPredicates.ml](https://github.com/facebook/infer/blob/master/infer/src/clang/cPredicates.ml) -and then register it in -[CTL.ml](https://github.com/facebook/infer/blob/master/infer/src/clang/cTL.ml#L728). - -## AL Formulas - -Formulas are defined using a variation of the -[_CTL temporal logic_](https://en.wikipedia.org/wiki/Computation_tree_logic). -CTL is a logic expressing properties of a tree model. In the case of AL, the -tree is the AST of the program. Formulas are defined according to the following -grammar: - -``` -formula ::= predicate - | NOT formula - | formula1 OR formula2 - | formula1 AND formula2 - | formula1 IMPLIES formula2 - | formula1 HOLDS-UNTIL formula2 - | formula1 HOLDS-EVERYWHERE-UNTIL formula2 - | formula HOLDS-EVENTUALLY - | formula HOLDS-EVERYWHERE-EVENTUALLY - | formula HOLDS-NEXT - | formula HOLDS-EVERYWHERE-NEXT - | formula HOLDS-ALWAYS - | formula HOLDS-EVERYWHERE-ALWAYS - | WHEN formula HOLDS-IN-NODE node-name-list - | IN-NODE node-name-list WITH-TRANSITION transition-name - formula HOLDS-EVENTUALLY -``` - -The first four cases (`NOT`, `OR`, `AND`, `IMPLIES`) are classic boolean -operators with the usual semantics. The others are temporal operators describing -how the truth-value of a formula is evaluated in a tree. Let's consider case by -case. - -| Formula | Semantic meaning | -| ------------------- | :-------------------------------------------------------------------------------------------: | -| F1 _HOLDS-UNTIL_ F2 | from the current node, there exists a path where F1 holds at every node until F2 becomes true | - -An example is depicted in the following tree. When `F1` or `F2` hold in a node -this is indicated between square brackets. The formula `F1 HOLDS-UNTIL F2` holds -in the green nodes. - -![](/img/AL/holds_until.jpeg) - ---- - -| Formula | Semantic meaning | -| -------------------- | :--------------------------------------------------------------------------: | -| F _HOLDS-EVENTUALLY_ | from the current node there exists a path where at some point F becomes true | - -In the picture below, as `F` holds in `n10`, then `F HOLDS-EVENTUALLY` holds in -the green nodes `n1`, `n7`, `n10`. This is because from these nodes there is a -path reaching `n10` where `F` holds. Note that it holds for `n10` as well -because there exists a trivial path of length 0 from `n1` to itself. - -![](/img/AL/holds_eventually.jpeg) - ---- - -| Formula | Semantic meaning | -| ----------------------------- | :-----------------------------------------------------------------------: | -| F HOLDS-EVERYWHERE-EVENTUALLY | in every path starting from the current node at some point F becomes true | - -For example, in the tree below, the formula holds in every green node because -every paths starting from each of them eventually reaches a node where F holds. - -![](/img/AL/holds_everywhere_eventually.jpeg) - ---- - -| Formula | Semantic meaning | -| ------------ | :--------------------------------------------------------------------------: | -| F HOLDS-NEXT | from the current node (we are visiting) there exists a child where F is true | - -In the tree below, the formula `F HOLDS-NEXT` it is true only in n1 as it's the -only node with a child where `F` holds (node n3). In AL, `NEXT` is synonym of -child as, in terms of a path in the tree, a child is the next node. - -![](/img/AL/holds_next.jpeg) - ---- - -| Formula | Semantic meaning | -| ----------------------- | :-----------------------------------------------------: | -| F HOLDS-EVERYWHERE-NEXT | from the current node in every existing child F is true | - -In the tree below, the formula `F HOLDS-EVERYWHERE-NEXT` it is true in n1 as -it's the only node for which in every child `F` holds (node n2, n3, and n7). - -![](/img/AL/holds_everywhere_next.jpeg) - ---- - -| Formula | Semantic meaning | -| -------------- | :-------------------------------------------------------------------: | -| F HOLDS-ALWAYS | from the current node there exists a path where F holds at every node | - -In the tree below `F HOLDS-ALWAYS` holds in `n1`, `n2`, `n8` because for each of -these nodes there exists a path where `F` holds at each node in the path. - -![](/img/AL/always_holds.jpeg) - ---- - -| Formula | Semantic meaning | -| ------------------------- | :--------------------------------------------------------: | -| F HOLDS-EVERYWHERE-ALWAYS | from the current node, in every path F holds at every node | - -`F HOLDS-EVERYWHERE-ALWAYS` holds in `n2`, `n4`, `n5`, and `n8` because when we -visit those nodes in every path that start from them `F` holds in every node. - -![](/img/AL/always_holds_everywhere.jpeg) - ---- - -| Formula | Semantic meaning | -| ---------------------------------- | :----------------------------------------------: | -| WHEN F HOLDS-IN-NODE node1,…,nodeK | we are in a node among node1,…,nodeK and F holds | - -`WHEN F HOLDS-IN-NODE` `n2`, `n7`, `n6` holds only in node `n2` as it is the -only node in the list `n2`, `n7`, `n6` where F holds. - -![](/img/AL/holds_in_node.jpeg) - -Let's consider an example of checker using formula -`WHEN F HOLDS-IN-NODE node1,…,nodeK` for checking that a property with pointer -type should not be declared _"assign"_: - -``` -DEFINE-CHECKER ASSIGN_POINTER_WARNING = { - - SET report_when = - WHEN - is_assign_property() AND is_property_pointer_type() - HOLDS-IN-NODE ObjCPropertyDecl; - - SET message = "Property `%decl_name%` is a pointer type marked with the `assign` attribute"; - SET suggestion = "Use a different attribute like `strong` or `weak`."; - SET severity = "WARNING"; - }; -``` - -The checker uses two predefined predicates `is_assign_property()` and -`is_property_pointer_type()` which are true if the property being declared is -assign and has a pointer type respectively. We want to check both conditions -only on nodes declaring properties, i.e., `ObjCPropertyDecl`. - ---- - -| Formula | Semantic meaning | -| ----------------------------------------------------------- | :------------------------------------------------------------------------------------------------------------------------------------------------: | -| IN-NODE node1,…, nodeK WITH-TRANSITION t F HOLDS-EVENTUALLY | from the current node there exists a path which eventually reaches a node among “node1,…,nodeK” with a transition t reaching a child where F holds | - -The following tree explain the concept: - -![](/img/AL/in_node_with_transition.jpeg) - -The concept of transition is needed because of the special structure of the -clang AST. Certain kind of nodes, for example statements, have a list of -children that are statements as well. In this case there is no special tag -attached to the edge between the node and the children. Other nodes have -records, where some of the fields point to other nodes. For example a node -representing a function declaration will have a record where one of the fields -is body. This is pointing to a statement representing the function's body. For -records, sometimes we need to specify that we need a particular node reachable -via a particular field (i.e., a transition). - -**Hint** A good way to learn how to write checkers is looking at existing -checkers in the file -[linters.al](https://github.com/facebook/infer/blob/master/infer/lib/linter_rules/linters.al). - -## Example checks - -In the following we show a few examples of simple checks you may wish to write -and the corresponding formulas: - -- A check for flagging a Objective-C class that inherits from a class that - shouldn't be subclassed. - -``` -DEFINE-CHECKER SUBCLASSING_TEST_EXAMPLE = { - SET report_when = is_class("A") HOLDS-IN-SOME-SUPERCLASS-OF ObjCInterfaceDecl; - SET message = "This is subclassing A. Class A should not be subclassed."; -}; -``` - -- A check for flagging an Objective-C instance method call: - -``` -DEFINE-CHECKER CALL_INSTANCE_METHOD = { - SET report_when = call_instance_method("A", "foo:"); - SET message = "Do not call this method"; -}; -``` - -- A check for flagging an Objective-C instance method call of any method of a - class: - -``` -DEFINE-CHECKER CALL_ANY_INSTANCE_METHODS = { - SET report_when = call_instance_method(A, REGEXP("*")); - SET message = "Do not call any method of class A"; -}; -``` - -- A check for flagging an Objective-C class method call: - -``` -DEFINE-CHECKER CALL_CLASS_METHOD = { - SET report_when = call_class_method("A", "foo:"); - SET message = "Do not call this method"; -}; -``` - -- A check for flagging an Objective-C method call of a method with int return - type: - -``` -DEFINE-CHECKER TEST_RETURN_METHOD = { - SET report_when = WHEN method_return_type("int") - HOLDS-IN-NODE ObjCMethodDecl; - SET message = "Method return int"; -}; -``` - -- A check for flagging a variable declaration with type long - -``` -DEFINE-CHECKER TEST_VAR_TYPE_CHECK = { - SET report_when = WHEN has_type("long") - HOLDS-IN-NODE VarDecl; - SET message = "Var %name% has type long"; -}; -``` - -- A check for flagging a method that has a parameter of type A\* - -``` -DEFINE-CHECKER TEST_PARAM_TYPE_CHECK = { - LET method_has_a_parameter_with_type(x) = - WHEN HOLDS-NEXT WITH-TRANSITION Parameters (has_type(x)) - HOLDS-IN-NODE ObjCMethodDecl; - SET report_when = - method_has_a_parameter_with_type("A*" ); - SET message = "Found a method with a parameter of type A"; -}; -``` - -- A check for flagging a method that has all the parameters of type A\* (and at - least one) - -``` -DEFINE-CHECKER TEST_PARAM_TYPE_CHECK2 = { - LET method_has_at_least_a_parameter = - WHEN HOLDS-NEXT WITH-TRANSITION Parameters (TRUE) - HOLDS-IN-NODE ObjCMethodDecl; - - LET method_has_all_parameter_with_type(x) = - WHEN HOLDS-EVERYWHERE-NEXT WITH-TRANSITION Parameters (has_type(x)) - HOLDS-IN-NODE ObjCMethodDecl; - - SET report_when = method_has_at_least_a_parameter AND - method_has_all_parameter_with_type("int"); - SET message = "All the parameters of the method have type int"; -}; -``` - -- A check for flagging a method that has the 2nd parameter of type A\* - -``` -DEFINE-CHECKER TEST_NTH_PARAM_TYPE_CHECK = { - SET report_when = - WHEN objc_method_has_nth_parameter_of_type("2", "A*") - HOLDS-IN-NODE ObjCMethodDecl; - SET message = "Found a method with the 2nd parameter of type A*"; - SET severity = "LIKE"; -}; -``` - -- A check for flagging a protocol that inherits from a given protocol. - `HOLDS-EVENTUALLY WITH-TRANSITION Protocol` means follow the `Protocol` branch - in the AST until the condition holds. - -``` -DEFINE-CHECKER TEST_PROTOCOL_DEF_INHERITANCE = { - LET is_subprotocol_of(x) = declaration_has_name(x) HOLDS-EVENTUALLY WITH-TRANSITION Protocol; - SET report_when = - WHEN is_subprotocol_of("P") - HOLDS-IN-NODE ObjCProtocolDecl; - SET message = "Do not inherit from Protocol P"; -}; -``` - -- A check for flagging when a constructor is defined with a parameter of a type - that implements a given protocol (or that inherits from it). - `HOLDS-NEXT WITH-TRANSITION Parameters` means, starting in the - `ObjCMethodDecl` node, follow the `Parameters` branch in the AST and check - that the condition holds there. - -``` -DEFINE-CHECKER TEST_PROTOCOL_TYPE_INHERITANCE = { - - LET method_has_parameter_subprotocol_of(x) = - WHEN - HOLDS-NEXT WITH-TRANSITION Parameters - (has_type_subprotocol_of(x)) - HOLDS-IN-NODE ObjCMethodDecl; - - SET report_when = - WHEN - declaration_has_name(REGEXP("^newWith.*:$")) AND - method_has_parameter_subprotocol_of("P") - HOLDS-IN-NODE ObjCMethodDecl; - - SET message = "Do not define parameters of type P."; -}; -``` - -- A check for flagging a variable declaration of type NSArray applied to A. - -``` -DEFINE-CHECKER TEST_GENERICS_TYPE = { - SET report_when = - WHEN has_type("NSArray*") - HOLDS-IN-NODE VarDecl; - SET message = "Do not create arrays of type A"; -}; -``` - -- A check for flagging using a property or variable that is not available in the - supported API. decl_unavailable_in_supported_ios_sdk is a predicate that works - on a declaration, checks the available attribute from the declaration and - compares it with the supported iOS SDK. Notice that we flag the occurrence of - the variable or property, but the attribute is in the declaration, so we need - the transition `PointerToDecl` that follows the pointer from the usage to the - declaration. - -``` -DEFINE-CHECKER UNAVAILABLE_API_IN_SUPPORTED_IOS_SDK = { - SET report_when = - WHEN HOLDS-NEXT WITH-TRANSITION PointerToDecl - (decl_unavailable_in_supported_ios_sdk() AND - HOLDS-IN-NODE DeclRefExpr; - SET message = "%name% is not available in the required iOS SDK version"; -}; -``` - -- A check for flagging using a given namespace - -``` -DEFINE-CHECKER TEST_USING_NAMESPACE = { - SET report_when = using_namespace("N"); - SET message = "Do not use namespace N"; -}; -``` - -- A check for flagging the use of given enum constants - -``` -DEFINE-CHECKER ENUM_CONSTANTS = { - SET report_when = is_enum_constant(REGEXP("MyName.*")); - SET message = "Do not use the enum MyName"; -}; -``` - -## AST info in messages - -When you write the message of your rule, you may want to specify which -particular AST items were involved in the issue, such as a type or a variable -name. We have a mechanism for that, we specified a few placeholders that can be -used in rules with the syntax `%placeholder%` and it will be substituted by the -correct AST info. At the moment we have `%type%`, `%child_type%` and `%name%` -that print the type of the node, the type of the node's child, and a string -representation of the node, respectively. As with predicates, we can add more as -needed. - -## Testing your rule - -To test your rule you need to run it with Infer. If you are adding a new linter -you can test it in a separate al file that you can pass to Infer with the option -`--linters-def-file file.al`. Pass the option -`--linters-developer-mode --linter ` to Infer to print debug -information and only run the linter you are developing, so it will be faster and -the debug info will be only about your linter. - -To test your code, write a small example that triggers the rule. Then, run your -code with - -``` -infer --linters-developer-mode --linters-def-file file.al -- clang -c Test.m -``` - -the bug should be printed in the screen, like, for instance: - -``` -infer/tests/codetoanalyze/objcpp/linters/global-var/B.mm:34: warning: GLOBAL_VARIABLE_INITIALIZED_WITH_FUNCTION_OR_METHOD_CALL - Global variable kLineSize is initialized using a function or method call at line 34, column 1. If the function/method call is expensive, - it can affect the starting time of the app. - 32. static float kPadding = [A bar] ? 10.0 : 11.0; // Error - 33. - 34. > static const float kLineSize = 1 / [A scale]; // Error - 35. - 36. static const float ok = 37; - 37. -``` - -Moreover, the bug can be found in the file `infer-out/report.json` where -`infer-out` is the results directory where Infer operates, that is created in -the current directory. You can specify a different directory with the option -`-o`. - -## Debugging - -If there are syntax errors or other parsing errors with your al file, you will -get an error message when testing the rule, remember to use -`linters-developer-mode` when you are developing a rule. If the rule gets parsed -but still doesn't behave as you expect, you can debug it, by adding the -following line to a test source file in the line where you want to debug the -rule: `//INFER_BREAKPOINT`. Then run infer again in linters developer mode, and -it will stop the execution of the linter on the line of the breakpoint. Then you -can follow the execution step by step. It shows the current formula that is -being evaluated, and the current part of the AST that is being checked. A red -node means that the formula failed, a green node means that it succeeded. - -## Demo - - - -## Command line options for linters - -The linters are run by default when you run Infer. However, there is a way of -running only the linters, which is faster than also running Infer. This is by -adding the option `-a linters` to the analysis command as in this example: - -```bash -infer run -a linters -- clang -c Test.m -``` - -There are a few other command line options that are useful for using or -developing new linters in Infer. You can get those options with the command -`infer-capture --help`: - -![](/img/AL/linters_help.png) diff --git a/website/docs/01-racerd.md b/website/docs/01-racerd.md deleted file mode 100644 index 4b18cc546..000000000 --- a/website/docs/01-racerd.md +++ /dev/null @@ -1,494 +0,0 @@ ---- -id: racerd -title: "Infer : RacerD" ---- - -RacerD finds data races in your Java code. This page gives a more in-depth -explanation of how the analysis works, but may be less complete than the -[Thread Safety Violation bug description page](/docs/checkers-bug-types#thread-safety-violation). - -To run the analysis, you can use plain `infer` (to run RacerD along with other -analyses that are run by default) or `infer --racerd-only` (to run only RacerD). - -For example, the command `infer --racerd-only -- javac File.java` will run -RacerD on File.java. - -## Background - -RacerD statically analyzes Java code to detect potential concurrency bugs. This -analysis does not attempt to prove the absence of concurrency issues, rather, it -searches for a high-confidence class of data races. At the moment RacerD -concentrates on race conditions between methods in a class that is itself -intended to be thread safe. A race condition occurs when there are two -concurrent accesses to a class member variable that are not separated by mutual -exclusion, and at least one of the accesses is a write. Mutual exclusion can be -ensured by synchronization primitives such as locks, or by knowledge that both -accesses occur on the same thread. - -## Triggering the analysis - -RacerD doesn't try to check _all_ code for concurrency issues; it only looks at -code that it believes can run in a concurrent context. There are two signals -that RacerD looks for: (1) Explicitly annotating a class/method with -`@ThreadSafe` and (2) using a lock via the `synchronized` keyword. In both -cases, RacerD will look for concurrency issues in the code containing the signal -and all of its dependencies. In particular, it will report races between any -non-`private` methods of the same class that can peform conflicting accesses. -Annotating a class/interface with `@ThreadSafe` also triggers checking for all -of the subclasses of the class/implementations of the interface. - -## Warnings - -Let's take a look at the different types of concurrency issues that RacerD -flags. Two of the warning types are data races (`Unprotected write` and -`Read/write race`), and the third warning type encourages adding `@ThreadSafe` -annotations to interfaces to trigger additional checking. - -### Unprotected write - -RacerD will report an unprotected write when one or more writes can run in -parallel without synchronization. These come in two flavors: (1) a self-race (a -write-write race that occurs due to a method running in parallel with itself) -and (2) two conflicting writes to the same location. Here's an example of the -self-race flavor: - -``` -@ThreadSafe -public class Dinner { - private int mTemperature; - - public void makeDinner() { - boilWater(); - } - - private void boilWater() { - mTemperature = 100; // unprotected write. - } -} -``` - -The class `Dinner` will generate the following report on the public method -`makeDinner()`: - -`There may be a Thread Safety Violation: makeDinner() indirectly writes to mTemperature outside of synchronization.` - -This warning can be fixed by synchronizing the access to `mTemperature`, making -`mTemperature` `volatile`, marking `makeDinner` as `@VisibleForTesting`, or -suppressing the warning by annotating the `Dinner` class or `makeDinner` method -with `@ThreadSafe(enableChecks = false)`. - -### Read/Write Race - -We sometimes need to protect read accesses as well as writes. Consider the -following class with unsynchronized methods. - -``` -@ThreadSafe -public class Account { - - int mBalance = 0; - - public void deposit(int amount) { - if (amount > 0) { - mBalance += amount; - } - } - - public int withdraw(int amount){ - if (amount >= 0 && mBalance - amount >= 0) { - mBalance -= amount; - return mBalance; - } else { - return 0; - } - } -} -``` - -If you run the `withdraw()` method in parallel with itself or with `deposit()` -you can get unexpected results here. For instance, if the stored balance is 11 -and you run `withdraw(10)` in parallel with itself you can get a negative -balance. Furthermore, if you synchronize only the write statement -`mBalance -= amount`, then you can still get this bad result. The reason is that -there is a read/write race between the boolean condition -`mBalance - amount >= 0` and the writes. RacerD will duly warn - -`Read/Write race. Public method int Account.withdraw(int) reads from field Account.mBalance. Potentially races with writes in methods void Account.deposit(int), int Account.withdraw(int)` - -on the line with this boolean condition. - -A solution to the threading problem here is to make both methods `synchronized` -to wrap both read and write accesses, or to use an `AtomicInteger` for -`mBalance` rather than an ordinary `int`. - -### Interface not thread-safe - -In the following code, RacerD will report an `Interface not thread-safe` warning -on the call to `i.bar()`: - -``` -interface I { - void bar(); -} - -@ThreadSafe -class C { - void foo(I i) { - i.bar(); // RacerD warns here - } -} -``` - -The way to fix this warning is to add a `@ThreadSafe` annotation to the -interface `I`, which will enforce the thread-safety of each of the -implementations of `I`. - -You might wonder why it's necessary to annotate `I` -- can't RacerD just look at -all the implementations of `i` at the call site for `bar`? Although this is a -fine idea idea in principle, it's a bad idea in practice due to a (a) separate -compilation and (b) our diff-based deployment model. In the example above, the -compiler doesn't have to know about all implementations (or indeed, any -implementations) of `I` at the time it compiles this code, so there's no -guarantee that RacerD will know about or be able to check all implementations of -`I`. That's (a). For (b), say that we check that all implementations of `I` are -thread-safe at the time this code is written, but we don't add the annotation. -If someone else comes along and adds a new implementation of `I` that is not -thread-safe, RacerD will have no way of knowing that this will cause a potential -bug in `foo`. But if `I` is annotated, RacerD will enforce that all new -implementations of `I` are thread-safe, and `foo` will remain bug-free. - -## Annotations to help RacerD understand your code - -Getting started with RacerD doesn't require any annotations at all -- RacerD -will look at your usage of locks and figure out what data is not guarded -consistently. But increasing the coverage and signal-to-noise ratio may require -adding `@ThreadSafe` annotations along with some of the other annotations -described below. Most of annotations described below can be used via the Maven -Central package available -[here](https://maven-repository.com/artifact/com.facebook.infer.annotation/infer-annotation). - -### `@ThreadConfined` - -The intuitive idea of thread-safety is that a class is impervious to concurrency -issues for all concurrent contexts, even those that have not been written yet -(it is future-proof). RacerD implements this by naively assuming that any method -can potentially be called on any thread. You may determine, however, that an -object, method, or field is only ever accessed on a single thread during program -execution. Annotating such elements with `@ThreadConfined` informs RacerD of -this restriction. Note that a thread-confined method cannot race with itself but -it can still race with other methods. - -``` -List mCache; - -@ThreadConfined(UI) -void prepareCache() { - // populate the cache - mCache.add(...); - // post cache cleanup task to run later - mUIExecutor.execute(new Runnable() { - @ThreadConfined(UI) - public void run() { - mCache.clear(); - } - }); -} -``` - -In this example, both `prepareCache` and `run` touch `mCache`. But there's no -possibility of a race between the two methods because both of them will run -sequentially on the UI thread. Adding a `@ThreadConfined(UI)` or `@UiThread` -annotation to these methods will stop it from warning that there is a race on -`mCache`. We could also choose to add a `@ThreadConfined` annotation to `mCache` -itself. - -### `@Functional` - -Not all races are bugs; a race can be benign. Consider the following: - -``` -@Functional Boolean askNetworkIfShouldShowFeature(); - -private Boolean mShouldShowFeature; - -@ThreadSafe boolean shouldShowFeature() { - if (mShouldShowFeature == null) { - mShouldShowFeature = askNetworkIfShouldShowFeature(); - } - return mShouldShowFeature; -} -``` - -This code caches the result of an expensive network call that checks whether the -current user should be shown an experimental feature. This code looks racy, and -indeed it is: if two threads execute `shouldShowFeature()` at the same time, one -may read `mShouldShowFeature` at the same time the other is writing it. - -However, this is actually a _benign_ race that the programmer intentionally -allows for performance reasons. The reason this code is safe is that the -programmer knows that `askNetworkIfShouldShowFeature()` will always return the -same value in the same run of the app. Adding synchronization would remove the -race, but acquiring/releasing locks and lock contention would potentially slow -down every call to `shouldShowFeature()`. The benign race approach makes every -call after the first fast without changing the safety of the code. - -RacerD will report a race on this code by default, but adding the -`@Functional annotation to askNetworkIfShouldShowFeature()` informs RacerD that -the function is always expected to return the same value. This assumption allows -RacerD to understand that this particular code is safe, though it will still -(correctly) warn if `mShouldShowFeature` is read/written elsewhere. - -Be sure not to use the `@Functional` pattern for _singleton instantiation_, as -it's possible the "singleton" can be constructed more than once. - -``` -public class MySingleton { - private static sInstance; - - // Not @Functional - public MySingleton getInstance() { - if (sInstance == null) { - // Different threads may construct their own instances. - sInstance == new MySingleton(); - } - return sInstance; - } -} -``` - -### `@ReturnsOwnership` - -RacerD does not warn on unprotected writes to _owned_ objects. An object is -owned if it has been freshly allocated in the current thread and has not escaped -to another thread. RacerDf automatically tracks ownership in most cases, but it -needs help with `abstract` and `interface` methods that return ownership: - -``` -@ThreadSafe -public interface Car { - @ReturnsOwnership abstract Car buyCar(); - - void carsStuff() { - Car myCar = new Car(); - myCar.wheels = 4; // RacerD won't warn here because it knows myCar is owned - Car otherCar = buyCar(); - otherCar.wheels = 3; // RacerD would normally warn here, but won't because of the `@ReturnsOwnership` annotation - } -} -``` - -### `@VisibleForTesting` - -RacerD reports races between any two non`-private` methods of a class that may -run in a concurrent context. Sometimes, a RacerD report may be false because one -of the methods cannot actually be called from outside the current class. One fix -is making the method `private` to enforce this, but this might break unit tests -that need to call the method in order to test it. In this case, the -`@VisibleForTesting` annotation will allow RacerD to consider the method as -effectively `private` will still allowing it to be called from the unit test: - -``` -@VisibleForTesting void setF() { - this.f = ...; // RacerD would normally warn here, but @VisibleForTesting will silence the warning -} - -synchronized void setFWithLock() { - setF(); -} -``` - -Unlike the other annotations shown here, this one lives in -[Android](https://developer.android.com/reference/android/support/annotation/VisibleForTesting.html). - -## Interprocedural Reasoning - -An important feature of RacerD is that it finds races by analyzing not just one -file or class, but by looking at memory accesses that occur after going through -several procedure calls. It handles this even between classes and between files. - -Here is a very basic example - -``` -@ThreadSafe -class A{ - - void m1(B bb) { - bb.meth_write(); - } -} - -class B{ - Integer x; - - void meth_write() { - x = 88; - } - -} -``` - -Class `B` is not annotated `@ThreadSafe` and does not have any locks, so RacerD -does not directly look for threading issues there. However, method `m1()` in -class `A` has a potential self-race, if it is run in parallel with itself and -the same argument for each call. RacerD discovers this. - -``` -InterProc.java:17: error: THREAD_SAFETY_VIOLATION - Unprotected write. Non-private method `A.m1` indirectly writes to field `&this.B.x` outside of synchronization. - Reporting because the current class is annotated `@ThreadSafe`, so we assume that this method can run in - parallel with other non-private methods in the class (incuding itself). - 15. - 16. void m1(B bb) { - 17. > bb.meth_write(); - 18. } - 19. } -``` - -RacerD does this sort of reasoning using what is known as a _compositional -inteprocedural analysis_. There, each method is analyzed independently of its -context to produce a summary of the behaviour of the procedure. In this case the -summaries for `m1()' and`meth()' include information as follows. - -``` -Procedure: void A.m1(B) -Accesses: { Unprotected({ 1 }) -> { Write to &bb.B.x at void B.meth_write() at line 17 } } - -Procedure: void B.meth_write() -Accesses { Unprotected({ 0 }) -> { Write to &this.B.x at at line 25 } } -``` - -The descriptions here are cryptic and do not include all the information in the -summaries, but the main point is that you can use RacerD to look for races in -codebases where the mutations done by threads might occur only after a chain of -procedure calls. - -## Context and Selected Related Work - -Reasoning about concurrency divides into bug detection and proving absence of -bugs. RacerD is on the detection side of reasoning. - -The rapid growth in the number of interleavings is problematic for tools that -attempt exhaustive exploration. With just 150 instructions for two threads, the -number 10^88 of interleavings is more that the estimated number of atoms in the -known universe. -[There has been important work which uses various techniques to attempt to reduce the number of interleavings](https://en.wikipedia.org/wiki/Partial_order_reduction) -while still in principle covering all possibilities, but scale is still a -challenge. Note that RacerD is not exhaustive: it has false negatives (missed -bugs). But in compensation it is fast, and effective (it finds bugs in -practice). - -Static analysis for concurrency has attracted a lot of attention from -researchers, but difficulties with scalability and precision have meant that -previous techniques have had little industrial impact. Automatic static race -detection itself has seen significant work. The most advanced approaches, -exemplified by the [Chord](http://www.cis.upenn.edu/~mhnaik/pubs/pldi06.pdf) -tool, often use a whole-program analysis paired with a sophisticated alias -analysis, two features we have consciously avoided. Generally speaking, the -leading research tools can be more precise, but RacerD is faster and can operate -without the whole program: we have opted to go for speed in a way that enables -industrial deployment on a large, rapidly changing codebase, while trying to use -as simple techniques as possible to cover many (not all) of the patterns covered -by slower but precise research tools. - -An industrial static analysis tool from -[Contemplate](http://homepages.inf.ed.ac.uk/dts/pub/avocs2015.pdf) also targets -@ThreadSafe annotations, but limits the amount of inter-procedural reasoning: -“This analysis is interprocedural, but to keep the overall analysis scalable, -only calls to private and protected methods on the same class are followed”. -RacerD does deep, cross-file and cross-class inter-procedural reasoning, and yet -still scales; the inter-class capability was one of the first requests from -Facebook engineers. -[A separate blog post looked at 100 recent data race fixes](https://code.facebook.com/posts/1537144479682247/finding-inter-procedural-bugs-at-scale-with-infer-static-analyzer/) -in Infer's deployment in various bug categories, and for data races observed -that 53 of them were inter-file (and thus involving multiple classes). -[See above](racerd#interprocedural) for an example of RacerD's interprocedural -capabilities. - -One reaction to the challenge of developing effective static race detectors has -been to ask the programmer to do more work to help the analyzer. Examples of -this approach include the -[Clang Thread Safety Analyzer](https://clang.llvm.org/docs/ThreadSafetyAnalysis.html), -the typing of [locks](https://doc.rust-lang.org/std/sync/struct.Mutex.html) in -Rust, and the use/checking of @GuardedBy annotations in -[Java](https://homes.cs.washington.edu/~mernst/pubs/locking-semantics-nfm2016.pdf) -including in -[Google's Error Prone analyzer](https://github.com/google/error-prone/blob/master/docs/bugpattern/GuardedBy.md). -When lock annotations are present they make the analyzer's life easier, and we -have -[GuardedBy checking as part of Infer](checkers-bug-types#UNSAFE_GUARDEDBY_ACCESS) -(though separate from the race detector). Our GuardedBy checker can find some -bugs that RacerD does not (see -[this example on anonymous inner classes](checkers-bug-types#anonymous_inner)), -but the race detector finds a greater number because it can work on un-annotated -code. It is possible to have a very effective race analysis without decreeing -that such annotations must be present. This was essential for our deployment, -since _requiring_ lock annotations would have been a show stopper for converting -many thousands of lines of code to a concurrent context. We believe that this -finding should be transportable to new type systems and language designs, as -well as to other analyses for existing languages. - -Another reaction to difficulties in static race detection has been to instead -develop dynamic analyses, automatic testing tools which work by running a -program to attempt to find flaws. Google's Thread Sanitizer is a widely used and -mature tool in this area, which has been used in production to find many bugs in -C-family languages. -[The Thread Sanitizer authors explicitly call out limitations with static race analyzers](http://www.cs.columbia.edu/~junfeng/11fa-e6121/papers/thread-sanitizer.pdf) -as part of their motivation: “It seems unlikely that static detectors will work -effectively in our environment: Google’s code is large and complex enough that -it would be expensive to add the annotations required by a typical static -detector”. - -We have worked to limit the annotations that RacerD needs, for reasons similar -those expressed by the Thread Sanitizer authors. And we have sought to bring the -complementary benefits of static analysis — possibility of cheaper analysis and -fast reporting, and ability to analyze code before it is placed in a context to -run — to race detection. But we are interested as well in the future in -leveraging ideas in the dynamic techniques to improve or add to our analysis for -race detection. - -## Limitations - -There are a number of known limitations to the design of the race detector. - -- It looks for races involving syntactically identical access paths, and misses - races due to aliasing -- It misses races that arise from a locally declared object escaping its scope -- It uses a boolean locks abstraction, and so misses races where two accesses - are mistakenly protected by different locks -- It assumes a deep ownership model, which misses races where local objects - refer to or contain non-owned objects. -- It avoids reasoning about weak memory and Java's volatile keyword - -Most of these limitations are consistent with the design goal of reducing false -positives, even if they lead to false negatives. They also allow technical -tradeoffs which are different than if we were to favour reduction of false -negatives over false positives. - -A different kind of limitation concerns the bugs searched for: Data races are -the most basic form of concurrency error, but there are many types of -concurrency issues out there that RacerD does not check for (but might in the -future). Examples include deadlock, atomicity, and check-then-act bugs (shown -below). You must look for these bugs yourself! - -``` -@ThreadSafe -public class SynchronizedList { - synchronized boolean isEmpty() { ... } - synchronized T add(T item) { ... } - -// Not thread safe!!! -public class ListUtil { - public void addIfEmpty(SynchronizedList list, T item) { - if (list.isEmpty()) { - // In a race, another thread can add to the list here. - list.add(item); - } - } -} -``` - -Finally, using `synchronized` blindly as a means to fix every unprotected write -or read is not always safe. Even with RacerD, finding, understanding, and fixing -concurrency issues is difficult. If you would like to learn more about best -practices, [Java Concurrency in Practice](http://jcip.net/) is an excellent -resource. diff --git a/website/docs/02-limitations.md b/website/docs/02-limitations.md deleted file mode 100644 index cbd3e5c9c..000000000 --- a/website/docs/02-limitations.md +++ /dev/null @@ -1,83 +0,0 @@ ---- -id: limitations -title: Limitations, etc ---- - -## Expectations - -We want to be clear that if you run Infer on your project you might get very -good results, but it is also possible that you don't. Although we have had good -fix rates working with Facebook mobile codebases, we are not making strong -claims about rates of false alarms or similar when applied to arbitrary -codebases. For example, we have had some success -[getting bugs fixed in the DuckDuckGo Android App](blog/2015/05/22/Infer-on-open-source-android-apps), -but we encountered many false alarms when running Infer on GNU coreutils. It is -typical of program verification and static analysis tools that their results -vary, and that is to be expected, e.g., because they are tackling undecidable -problems and because different codebases they are applied to will have been -coded differently. - -The good thing, though, is that you might get useful results! And, where the -results are imperfect, this can be taken as input for improvement. - -Apart from these general remarks, Infer has a number of specific technical -limitations, which we describe in terms of bug types and language features. - -## Bug types - -At present Infer is reporting on a restricted collection of -[bug types](/docs/checkers-bug-types), typically involving null pointers and -memory or resource leaks. The initial set of bug types Infer has focused on was -driven by the most pressing needs for serving the Facebook mobile developers. -Our approach has been to report less initially, to iterate with developers and -provide value to them, and gradually expand what we can do while still providing -value. - -Some bug types we don't report as of yet include - -- Array bounds errors -- Cast exceptions -- Leaking of tainted data -- Concurrency race conditions - -and more. In the first three cases we have partial treatments inside of Infer, -but we have not surfaced these capabilities yet; the reports are not of -sufficient quality to present to developers. For example, Infer can find some -potential array bounds errors, but many of its reports are false alarms and it -misses still more. - -Put another way: there is more work to do! - -## Language Features - -A different dimension in which Infer is limited concerns language features. -Infer either does not understand or has a weak treatment of - -- Concurrency, including Java's Concurrency Utilities and iOS's Grand Central - Dispatch -- Dynamic dispatch -- Reflection -- Android lifecycles -- Arithmetic -- and more - -Some of these problems are fundamental, largely open, problems in program -analysis (especially concurrency), while for others there is much prior and -successful work to draw upon (e.g., arithmetic) and are simply on our todo list -awaiting work. - -Thus, Infer's core algorithms can be understood as being sound with respect to -an idealized model (that is all soundness can ever be), but this idealized model -is some distance from real execution models for programs where Infer is -deployed. One consequence of this is that we cannot claim that Infer reasons -about all flows through an application, but only some flows. - -In approaching these limitations going forward we must consider solutions that -take into account our use case: to comment in minutes on modifications to large -codebases. Methods based on whole program analysis are challenging to consider -when approaching these problems for our deployment model. - -These limitations can be seen positively as opportunities for improvement, to do -more static analysis and program verification for the benefit of programmers -everywhere! We will be delighted if people from the static analysis and program -verification communities join us in working on these problems. diff --git a/website/sidebars.js b/website/sidebars.js index 6f878c029..59bf4c51a 100644 --- a/website/sidebars.js +++ b/website/sidebars.js @@ -14,21 +14,14 @@ module.exports = { "infer-workflow", "analyzing-apps-or-projects", "steps-for-ci", - "checkers", - "eradicate", - "linters", - "racerd", - "experimental-checkers", "advanced-features", - "adding-models", "man-pages", ], + "Analyses and Issue Types": checkers.doc_entries, Foundations: [ "about-Infer", "separation-logic-and-bi-abduction", - "limitations", ], - "Analyses and Issue Types": checkers.doc_entries, Contribute: ["absint-framework", "adding-checkers", "internal-API"], Versions: ["versions"], },