Summary: Using a dedicated abstract domain, like Quandary does, is more suitable for taint analysis.
Reviewed By: sblackshear
Differential Revision: D5473794
fbshipit-source-id: c917417
Summary:
Both `stringWithUTF8String` and `stringWithString` implements copy semantics that copies the content of their parameter into a newly allocated buffer. We modeled this as pointer assignment in the past, which means that once we write
```
NSString* foo() {
char buf[...];
...
return [NSString stringWithUTF8String:buf];
}
```
We are going to get a spurious stack variable address escape report because local pointer `buf` is assigned to the newly created string and the string gets returned.
This diff tries to address the issue by heap-allocating a buffer and `memcpy` the contents in `stringWithUTF8String` and `stringWithString`. But this change will create another problem: the allocated buffer will be reported as leaked by the backend, while in reality those buffers won't actually be leaked as they are allocated in a region that will be periodically autoreleased. To suppress spurious memory leak FPs, I added another attribute `Awont_leak` that will suppress the leakage report on any expressions that get tagged with it.
Reviewed By: jeremydubreil
Differential Revision: D5403084
fbshipit-source-id: df6de7f
Summary:
Pretty basic: warn when we see an assignment instruction `x = ...` and `x` is not live in the post of the instruction.
Only enabled for Clang at the moment because linters already warn on this for Java. But we can enable it later if we want to (should be fully generic).
Reviewed By: jeremydubreil
Differential Revision: D5450439
fbshipit-source-id: 693514c
Summary: We get the wrong answer on most of them for now, but that is expected
Reviewed By: ngorogiannis
Differential Revision: D5429242
fbshipit-source-id: 4899079
Summary:
This commit avoids precision loss on pruning.
// x -> [s$1, s$2]
if(x) { ... }
// x -> ?
before: x -> [min(0, s$1), max(0, s$2)]
because two x values, [0, 0] (true case) and [s$1, s$2] (false case), were joined after the if branch.
after: x -> [s$1, s$2]
Reviewed By: mbouaziz
Differential Revision: D5431009
fbshipit-source-id: 14a9efe
Summary:
This just makes the warnings silent for now. We may improve the analysis to check if the null check on the captured fields are consistent with the annotation on the corresponding parameters.
Eradicate also has the same issue. I added a test to outline this. The biabduction analysis will also probably fail on the same of annotation lookup. We may want implement the proper fix at the level of `Annotation.field_has_annot`.
Reviewed By: sblackshear
Differential Revision: D5419243
fbshipit-source-id: 6460de8
Summary: CFG nodes were not connected and some instructions ended up in wrong place. Fix those issues
Reviewed By: dulmarod
Differential Revision: D5406720
fbshipit-source-id: 2a70e1a
Summary:
Problem: The analyzer did not know that the value of `v.size()` is an alias of `v.infer_size`, so `v.infer_size` is not pruned by the if condition. As a result it raises a false alarm.
void safe_access(std::vector<int> v) {
if (v.size() >= 10) {
v[9] = 1; // error: BUFFER_OVERRUN Offset: [9, 9] Size: [5, 5]
}
}
void call_safe_access_Good() {
std::vector<int> v(5, 0);
safe_access(v);
}
Solution: Adding alias for return value to the abstract domain.
Now Inferbo can prune `v.infer_size` because it knows that the value of `v.size()` is an alias of `v.infer_size`. There is already an alias domain in Inferbo, so we added a specific room for the retrun value.
Reviewed By: jvillard, mbouaziz
Differential Revision: D5396988
fbshipit-source-id: 4a4702c
Summary: Adding to the Quandary tests, the list of tests that are already working for the bi-abduction based taint analysis.
Reviewed By: sblackshear
Differential Revision: D5395734
fbshipit-source-id: c4f2e79
Summary:
:
because otherwise people would believe they can use the internal representation of these std lib but it fails for our models.
Reviewed By: jvillard
Differential Revision: D5368671
fbshipit-source-id: 4e53d5a
Summary:
:
Get rid of model location in reports.
The goal is to avoid changing `issues.exp` whenever a model is updated.
Reviewed By: jvillard
Differential Revision: D5356608
fbshipit-source-id: 88ecaba
Summary:
Indexing into a string literal expression would generate a fresh
variable on every application of a transformer. This violated
finiteness of the domain, and caused divergence.
Reviewed By: da319
Differential Revision: D5342951
fbshipit-source-id: e95e84e
Summary:
It instantiates fields of structures when a pointer to which is given
as a function parameter, e.g., `foo(&s);`.
Reviewed By: mbouaziz, jvillard
Differential Revision: D5337645
fbshipit-source-id: c06da29
Summary:
We keep track of both `beginPtr` and `endPtr` but the modelling was mostly
about `beginPtr` as some sort of approximation I guess. This shouldn't change
much but will be useful later when doing more iterator stuff.
Reviewed By: mbouaziz
Differential Revision: D5255772
fbshipit-source-id: 0f6e3e8
Summary: This seems to move in the right direction. Also, `const operator[]` did not do an `access_at`, which I fixed.
Reviewed By: mbouaziz
Differential Revision: D5320427
fbshipit-source-id: c31c5ea
Summary: Unknown library returns the unknown pointer as well as the top interval.
Reviewed By: mbouaziz, jvillard
Differential Revision: D5282669
fbshipit-source-id: 34c7e18
Summary:
This diff tries to achieve the followings: if we have the following C++ codes:
```
bool foo(int x, int y) {
return &x == &y;
}
```
We want the C++ frontend to emit Sil as if the input is written as
```
bool foo(int x, int y) {
if (&x == &y) return 1; else return 0;
}
```
This matches the behavior of our Java frontend.
The reason why we prefer an explicit branch is that it will force the backend to eagerly produce two different specs for `foo`. Without the explicit branch, for the above example the backend would produce one spec with `return = (&x == &y)` as the post condition, which is not ideal because (1) we don't want local variables to escape to the function summary, and (2) with the knowledge that no two local variables may alias each other, the backend could actually determines that `&x == &y` is always false, emitting a more precise postcondition `return = 0`. This is not possible if we do not eagerly resolve the comparison expression.
Reviewed By: akotulski
Differential Revision: D5260745
fbshipit-source-id: 6bbbf99
Summary:
:
There are throw wrapper functions like `std::__throw_bad_alloc()` defined in both libstdc++ (https://github.com/gcc-mirror/gcc/blob/master/libstdc%2B%2B-v3/include/bits/functexcept.h) and libc++ (e.g. 907c1196a7/include/new (L145)). Folly actually exports some of them as well (diffusion/FBS/browse/master/fbcode/folly/portability/BitsFunctexcept.h). The function body of those wrappers merely throws the corresponding exception. My understanding is that the primary purpose of the wrappers is to throw the exception if everything goes well and to fall back to something reasonable when exception is disabled (e.g. when `-fno-exceptions` is passed to the compiler).
The problem is that infer doesn't really understand what those functions do, and I've seen some false positives get reported as a result of it. So to remove those FPs we need to either model them or handle them specially. Modeling those wrappers by either whitelisting them or overriding the include files turns out to be difficult, as those wrappers are only declared but not defined in the STL headers. Their implementations are not available to Infer so whitelisting them does nothing, and if I provide custom implementations in the headers then normal compilation process will be disrupted because the linker would complain about duplicated implementation.
What I did here is to replace functions whose name matches one of the throw wrapper's name with a `BuiltinDecls.exit`. I have to admit that this is a bit hacky: initially I was trying to do something more general: replacing functions with `noreturn` attribute with `BuitinDecls.exit`. That did not work because, CMIIW, the current frontend only exports function attributes for functions with actual bodies, not declaration-only functions. I'd love to be informed if there are better ways to handle those wrappers.
Reviewed By: jeremydubreil
Differential Revision: D5266030
fbshipit-source-id: 4580227
Summary: We had a model for `Pools.SimplePool`, but were missing models for `Pools.Pool`. Since `SimplePool` and `SynchronizedPool` both extend `Pool`, modeling it should cover all of the cases.
Reviewed By: ngorogiannis
Differential Revision: D5236280
fbshipit-source-id: 9bbdb25
Summary:
:
No longer use deprecated reporting function for the suggest nullable checker
Depends on D5205009
Reviewed By: grievejia
Differential Revision: D5205843
fbshipit-source-id: f6dd059
Summary:
Read/write race errors should always show one trace for a read and one trace for a write.
We forget to pass the conflicting writes to the reporting function in one case, which prevented us from showing a well-formed trace.
Fixed it by making the `conflicts` parameter non-optional
Reviewed By: jberdine
Differential Revision: D5209332
fbshipit-source-id: 05da01a
Summary: We were almost always using `~report_reachable:true`, and in the cases where we weren't it is fine to do so. In general, a sink could read any state from its parameters, so it makes sense to complain if anything reachable from them is tainted.
Reviewed By: mbouaziz
Differential Revision: D5169067
fbshipit-source-id: ea7d659
Summary:
For now, we just support clearing the taint on a return value.
Ideally, we would associate a kind with the sanitizer and only clear taint that matches that kind.
However, it's fairly complicated to make that work properly with footprint sources.
I have some ideas about how to do it with passthroughs instead, but let's just do the simple thing for now.
Reviewed By: jeremydubreil
Differential Revision: D5141906
fbshipit-source-id: a5b8b5e
Summary:
- model `exit` as `Bottom`
- model `fgetc` as returning `[-1; 255]` rather than `[-1; +oo]`
- reduced the number of model functions for simple models
Reviewed By: KihongHeo
Differential Revision: D5137485
fbshipit-source-id: 943eeeb
Summary:
This is a minimal change to (poorly) recognize and model std::mutex
lock and unlock methods, and to surface all thread safety issues for
C++ based on the computed summaries with no filtering.
This ignores much of the Java analysis, including everything about the
Threads domain. The S/N is comically low at this point.
Reviewed By: sblackshear
Differential Revision: D5120485
fbshipit-source-id: 0f08caa
Summary:
This diff fixes unintentional bottoms in pointer arithmetic of inferbo.
The pointer arithmetic on addresses of variables (not array) just returns
the operand.
Reviewed By: jvillard
Differential Revision: D5060424
fbshipit-source-id: 495d8b8
Summary: Using Conjunction for thread join has known false negatives. Finer grained recording of threading information fixes this.
Reviewed By: sblackshear
Differential Revision: D5111161
fbshipit-source-id: aab483c
Summary: This fixes a couple of false positives as objects of BufferedReader don't need to be closed if the wrapped reader resource gets closed correctly.
Reviewed By: sblackshear
Differential Revision: D5106596
fbshipit-source-id: 725fb80
Summary: Gflags is a popular library used to create command line arguments. Flags shouldn't flow directly to `exec` etc.
Reviewed By: jvillard, mbouaziz
Differential Revision: D5058393
fbshipit-source-id: ab062f8
Summary: Useful to have Eradicate and Biabduction agree on how to inform that the analysis that some objects are not null.
Reviewed By: sblackshear
Differential Revision: D5075127
fbshipit-source-id: 9e56981
Summary: String are very important for taint analysis, have to make sure that we have the right models/the right behaviors for unknown code.
Reviewed By: jvillard
Differential Revision: D5054832
fbshipit-source-id: 7e7ee07
Summary:
Previously all knowledge of the dynamic length of such arrays was lost to infer:
```
void foo(int len) {
int a[len];
}
```
The translation of this program would make no reference to `len` (except as a
param of `foo`).
Translate this "initialization" using the existing `__set_array_length` infer
builtin, as:
```
# Declare local a[_]
n$0 = len;
__set_array_length(a, len);
```
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D4969446
fbshipit-source-id: dff860f
Summary:
This commit fixes a problem that the buffer overrun checker incorrectly
stops when a global variable (bottom) is involved in control flow.
In the new version, abstract memories return Top for unanalyzed abstract
variables.
Reviewed By: mbouaziz
Differential Revision: D5016447
fbshipit-source-id: 5132448
Summary:
The code was pretty fragile, it's less fragile now. We should probably just
delete that and start outputting proper class/package info directly in the
report.json instead of reverse-engineering them.
Fixes#640
Reviewed By: jeremydubreil
Differential Revision: D4905973
fbshipit-source-id: 1590067
Summary:
An array has a static or dynamic length (number of elements), but it also has a
stride, determined by the type of the element: `sizeof(element_type)`. We don't
have a good `sizeof()` function available on SIL types, so record that stride
in the array type.
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D4969697
fbshipit-source-id: 98e0670
Summary: In particular, the heuristics for propagating taint via unknown code needs to be aware of the frontend's trick of introducing dummy return variables.
Reviewed By: mbouaziz
Differential Revision: D5046345
fbshipit-source-id: da87665
Summary:
HIL had only been tested in Java, and it made some assumptions about what array expressions look like (the LHS is always resolvable to an access path) and assignments (the LHS is always an access path) that aren't true in C.
Fixed the code so we won't crash in this case.
Thanks to jeremydubreil for catching this.
Reviewed By: jeremydubreil
Differential Revision: D5047649
fbshipit-source-id: e8484f4
Summary:
There are two pointer-related operations you can do in C++ but not Java that we need to support in taint analysis:
(1) `*formal_ptr = ...` when `formal_ptr` is a formal that's a pointer type. Java doesn't have raw pointers, so we didn't need to handle this case.
(2) Passing by reference, which Java also doesn't have (everything is pass-by-value).
Reviewed By: mbouaziz
Differential Revision: D5041246
fbshipit-source-id: 4e8f962
Summary:
Introduce `infer-<command>` for each command, except for internal commands
(only `infer-clang` for now) which are not exported. Install these executables
(which are just symlinks to `infer`) on `make install`. The main executable
looks at the name it was invoked with to figure out if it should behave as a
particular command.
Get rid of `InferClang`, `InferAnalyze`, and `InferPrint`. As a bonus, we now
only need to build one executable: `infer`, which should be a few seconds
faster (less link time).
`InferAnalyze` is now `infer-analyze` and `InferPrint` is `infer-print`. To run
`InferClang`, use a symlink named `clang`, `clang++`, etc. to `infer`. There
are such symlinks available in "infer/lib/wrappers/" already.
I also noticed that the scripts in xcodebuild_wrappers/ don't seem useful
anymore, so use wrappers/ instead, as for `make`.
Reviewed By: mbouaziz
Differential Revision: D5036495
fbshipit-source-id: 4a90030
Summary: Same as D5026082, but allowing specification in JSON rather than harcoded in Infer.
Reviewed By: jeremydubreil
Differential Revision: D5030042
fbshipit-source-id: 8a6cfee
Summary: A lot of C++ library functions look like this, so it's important to have.
Reviewed By: mbouaziz
Differential Revision: D5026082
fbshipit-source-id: 6f421b6
Summary: Making sure simple passthroughs like the identity function work in C++.
Reviewed By: mbouaziz
Differential Revision: D5024031
fbshipit-source-id: ce48ead
Summary:
Now,
infer -a infer -- ...
and
infer -a checkers --biabduction -- ...
will return the same list of errors
Reviewed By: sblackshear
Differential Revision: D5023223
fbshipit-source-id: f52ce5d
Summary:
Needed because this is how the Clang frontend translates returns of non-POD, non pointer values (I think)?
Will handle the more general case of pass by reference soon.
Reviewed By: jvillard
Differential Revision: D5017653
fbshipit-source-id: 1fbcea5
Summary:
`[a, b] < [a, _]` and `[_, a] < [b, a]` are most probably false (it comes from size < size)
Mark definitely unsatisfied conditions as B1, others as B2+
Reviewed By: KihongHeo, jvillard
Differential Revision: D4962107
fbshipit-source-id: ba8f469
Summary:
The bufferoverrun checkers can now be run with:
infer -a checkers --bufferoverrun -- ...
Reviewed By: jeremydubreil
Differential Revision: D5010689
fbshipit-source-id: 2eaa396