You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

284 lines
8.4 KiB

---
id: adding-models
title: Adding models
---
## Why do we need models
When analyzing projects with call dependencies between functions, Infer follows
the call graph to decide in which order to analyze these functions. The main
goal is to use the analysis summary of a function wherever this function is
called. On the following example:
```c
int foo(int x) {
if (x < 42) {
return x;
} else {
return 0;
}
}
int bar() {
return foo(24);
}
int baz() {
return foo(54);
}
```
Infer starts with the analysis on `foo` and detect that this function either
returns `0` if the argument is greater than or equal to `42`, or returns the
value of the argument otherwise. With this information, Infer detects that `bar`
always returns `24` and `baz` always returns `0`.
Now, it may happen that the code of some function is not available during the
analysis. For example, this happens when a project uses pre-compiled libraries.
The most typical case is the use of the standard library like in the following
example:
```c
#include <stdlib.h>
int* create() {
int *p = malloc(sizeof(int));
if (p == NULL) exit(1);
return p;
}
void assign(int x, int *p) {
*p = x;
}
int* my_function() {
int *p = create();
assign(42, p);
return p;
}
```
Here, Infer will start with the analysis of `create` but will not find the
source code for `malloc`. To deal with this situation, Infer relies on models of
the missing functions to proceed with the analysis. The function `malloc` is
internally modeled as either returning `NULL`, or returning a valid and
allocated pointer. Similarly, the function `exit` is modeled as terminating the
execution. Using these two models, Infer detects that `create` always returns an
allocated pointer and that `my_function` is safe.
At this point, it is important to note that missing source code and missing
models do not make the analysis fail. Missing functions are treated as having no
effect. However, even though skipping these missing functions is fine in most
cases, there can be cases where it affects the quality of the analysis. For
example, missing models can lead to incorrect bug reports.
Consider the case of a function `lib_exit` having the same semantics as `exit`
but defined in an pre-compiled library not part of the project being analyzed:
```c
void lib_exit(int);
int* create() {
int *p = malloc(sizeof(int));
if (p == NULL) lib_exit(1);
return p;
}
```
In this case, Infer will not be able to know that the return statement is only
possible in the case where `p` is not null. When analyzing `my_function`, Infer
will consider the null case and report a null dereference error in the call to
`assign(42, p)`.
Similarly, considering a function `lib_alloc` equivalent to `malloc`, and the
function `create` now defined as:
```c
int* lib_alloc(int);
int* create() {
int *p = lib_alloc(sizeof(int));
return p;
}
```
Then Infer will not report any null dereference in `my_function`.
## Examples of models
### Some models for C
Adding new models is easy. The models for C can be found in
[`infer/models/c/src/`](https://github.com/facebook/infer/tree/master/infer/models/c/src).
The file
[`libc_basic.c`](https://github.com/facebook/infer/blob/master/infer/models/c/src/libc_basic.c)
contains models for some of the most commonly encountered functions from the C
standard library. For example, the function `xmalloc`, which is essentially the
same function as `create` defined above, is modeled by:
```c
void *xmalloc(size_t size) {
void *ret = malloc(size);
INFER_EXCLUDE_CONDITION(ret == NULL);
return ret;
}
```
The function `xmalloc` is modeled using `malloc` to create an allocated object
and the macro `INFER_EXCLUDE_CONDITION` used to eliminate the case where
`malloc` can return null. The list of helper functions and macros for writing
models can be found in
[`infer_builtins.c`](https://github.com/facebook/infer/blob/master/infer/models/c/src/infer_builtins.c).
For a slightly more complex example, `realloc` is modeled as:
```c
void *realloc(void *ptr, size_t size) {
if(ptr==0) { // if ptr in NULL, behave as malloc
return malloc(size);
}
int old_size;
int can_enlarge;
old_size = __get_array_size(ptr); // force ptr to be an array
can_enlarge = __infer_nondet_int(); // nondeterministically choose whether the current block can be enlarged
if(can_enlarge) {
__set_array_size(ptr, size); // enlarge the block
return ptr;
}
int *newblock = malloc(size);
if(newblock) {
free(ptr);
return newblock;
}
else { // if new allocation fails, do not free the old block
return newblock;
}
}
```
This model is based on existing models for `malloc` and `free` and three helper
functions:
- `__get_array_size(ptr)` which allows to manipulate with a model what Infer
knows about the size of the allocated memory
- `__set_array_size(ptr, size)` to modify the information about the size of the
allocated memory
- `__infer_nondet_int()` to create a variable which can have any possible
integer value
### For Java
The models for Java are following the same approach and the list of helper
functions is in:
[`infer/models/java/src/com/facebook/infer/models/InferBuiltins.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/com/facebook/infer/models/InferBuiltins.java)
[`infer/models/java/src/com/facebook/infer/models/InferUndefined.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/com/facebook/infer/models/InferUndefined.java)
For example, Infer treats Java hash maps using a recency abstraction model:
Infer remembers the last two keys being added by `put` and checked by
`containsKey`, which can be used to make sure that no null pointer exceptions
are coming from the fact that `get(key)` returns null when `key` is not in the
map. This behavior can just be implemented via a model written in Java with the
help of few helper functions understood by Infer. These models can be found in:
[`infer/models/java/src/java/util/HashMap.java`](https://github.com/facebook/infer/blob/master/infer/models/java/src/java/util/HashMap.java)
and just rely on these two methods:
- `InferUndefined.boolean_undefined()` to create a non-deterministic choice
- `(V)InferUndefined.object_undefined()` to create a non null undefined object
of type `V`
## How to add new models
Let's look at a toy example in Java. As explained above, models for C,
Objective-C and Java are all following the same approach.
```java
import lib.Server;
public class Test {
enum Status {
SUCCESS, FAILURE, PING_FAILURE, CONNECTION_FAILURE
}
Status convertStatus(Server s) {
switch (s.getStatus()) {
case 0:
return Status.SUCCESS;
case 1:
return Status.FAILURE;
case 2:
return Status.FAILURE;
default: // should not happen
return null;
}
}
String statusName(Server s) {
Status status = convertStatus(s);
return status.name();
}
}
```
Assuming that the class `lib.Server` is part of a pre-compiled library, Infer
will report a null pointer exception in `statusName`. This happens whenever
`s.getStatus()` returns a value greater that `3`, in which case the default
branch of the switch statement is taken and `convertStatus` returns `null`.
However, we know from the documentation that the method `lib.Server.getStatus`
can only return `0`, `1`, or `2`. A possible approach would be to use an
assertion like the Guava `Preconditions.checkState` to inform Infer about the
invariant:
```java
Status convertStatus(Server s) {
int serverStatus = s.getStatus();
Preconditions.checkState(serverStatus >= 0 && serverStatus < 3);
switch (s.getStatus()) {
...
}
}
```
However, in the case where adding preconditions is not possible, we can then
write a model for `getStatus()` in order to make the analysis more precise.
To create a model for `getStatus()`, we need to add a class with the name and
the same package as for the original method. In this example:
- create a file `infer/models/java/src/infer/models/Server.java` with the
following content:
```java
package infer.models;
import com.facebook.infer.models.InferBuiltins;
import com.facebook.infer.models.InferUndefined;
public class Server {
public int getStatus() {
int status = InferUndefined.int_undefined();
InferBuiltins.assume(status >= 0 && status < 3);
return status;
}
}
```
- recompile infer:
```bash
make -C infer
```
- run the analysis again:
```bash
infer run -- javac Test.java
```
Now it should no longer report a null pointer exception.