Summary:
Make sl_file field strongly typed in the AST - store SourceFile.t instead of string. This will make it harder
to access raw string which shouldn't really be accessed before going through `SourceFile` module
Reviewed By: jvillard
Differential Revision: D4299468
fbshipit-source-id: e8ff87e
Summary:
The Java frontend translates exceptions by assigning them to the return value.
This leads to weird behavior when the return type of the function is void.
Already handled one case of this in Quandary (ignoring assignments of exceptions to return value), but was missing the case where null is assigned to the return value.
The frontend does this to "clear" the value of previously assigned exceptions.
Reviewed By: jeremydubreil
Differential Revision: D4294060
fbshipit-source-id: 6bef5ef
Summary:
We previously used `Procname.java_get_parameters` to compute the indices of parameters to taint, but this doesn't always work.
`java_get_parameters` omits the `this` param, which we may sometimes want to taint.
Use the actuals (already passed to `Sink.get`) instead
Reviewed By: jvillard
Differential Revision: D4285164
fbshipit-source-id: d462a0b
Summary: Globals that are constexpr-initializable do not participate in SIOF.
Reviewed By: sblackshear
Differential Revision: D4277216
fbshipit-source-id: fd601c8
Summary:
Functions related to source files were already namespaced by `source_file_` prefix. Make separate module for them.
In high level it replaces all `source_file_` with `SourceFile.` and then fixes all remaining compilation errors
Reviewed By: jvillard
Differential Revision: D4299053
fbshipit-source-id: 20b1d39
Summary: This is very useful to debug issues that have to do with types, for example the cast errors
Reviewed By: sblackshear
Differential Revision: D4289790
fbshipit-source-id: ef5a8bf
Summary: Enabling `--cxx` flag makes C++ analysis much better. It's time to turn it on by default
Reviewed By: dulmarod, jvillard
Differential Revision: D4285327
fbshipit-source-id: 261359a
Summary:
This is legacy code that dates back from when we needed the fields in the models to match exactly the fields in the classes being modeled. We no longer need to this. Besides, it seems that this `android.jar` stopped being part of the release for a while. So this change will not affect the results in prod
I check in details the difference on the code of Guava and it seems to remove (weird) false positives only.
Reviewed By: sblackshear
Differential Revision: D4242444
fbshipit-source-id: 84dd782
Summary:
Although the Builder pattern is not actually thread-safe, Builder's are not expected to be shared between threads.
Handle this by ignoring all unprotected accesses in classes the end with "Builder".
We might be able to soften this heuristic in the future by ensuring rather than assuming that Builder are not shared between methods (or, ideally, between threads).
Reviewed By: peterogithub
Differential Revision: D4280761
fbshipit-source-id: a4e6738
Summary:
When calling function g_realloc(gpointer mem,gsize n_bytes) one of the spec considers the case
whereby n_bytes is zero. In that case g_realloc would return null.
If we call with sizeof(int), infer would compare sizeof(int) with zero. But the prover would fail to
understand that sizeof(int) != 0.
This diff fix this. We try to convert expression to constant when they can be converted (eg in case of sizeof).
The method currently make a partial set of conversion. This could be extended.
Reviewed By: jberdine
Differential Revision: D4166944
fbshipit-source-id: 3ec4fd7
Summary:
Remember which globals are static locals.
It's useful to distinguish those from global variables in objc and in the SIOF
checker. Previously in ObjC we would accomplish that by looking at the name of
the variable, but that wouldn't work reliably in C++. Keep the old method around for
now as the way we deal with static locals in ObjC needs some fixing.
Reviewed By: akotulski
Differential Revision: D4198993
fbshipit-source-id: 357dd11
Summary:
Whenever header file is in changed-files-index, it should be captured and analyzed on demand.
It was already being captured, but ondemand analysis wasn't triggered for code in header file. This diff does it.
Use hacky header->source mapping to go from header to source (cluster) and then analyze everything in that cluster (inlucing code coming from header)
Reviewed By: jberdine
Differential Revision: D4265495
fbshipit-source-id: 61606f4
Summary: When infer runs on preprocessed source, original files may not be around anymore. Don't crash infer when that happens.
Reviewed By: jvillard, jberdine
Differential Revision: D4258285
fbshipit-source-id: a19569c
Summary: `ReentrantReadWriteLock.ReadLock` and `ReentrantReadWriteLock.WriteLock` are commonly used lock types that were not previously modeled.
Reviewed By: peterogithub
Differential Revision: D4262032
fbshipit-source-id: 4ff81a7
Summary:
`o.<init>` cannot be called in parallel with other methods of `o` from outside, so it's less likely to have thread safety violations in `o.<init>`.
This diff suppresses reporting of thread safety violations for fields touched (transitively) by a constructor.
We can do better than this in the future (t14842325).
Reviewed By: peterogithub
Differential Revision: D4259719
fbshipit-source-id: 20db71f
Summary:
Trying to stop other users of the trace domain from making the mistake that Quandary made before D4234766.
This should also improve the performance of Quandary, since the filtering of FP's is now done before building up the full interprocedural trace (which requires disk reads).
Reviewed By: jeremydubreil
Differential Revision: D4234770
fbshipit-source-id: e7e9291
Summary:
source_file_[to|from]_string were dangerous. While removing source_file_to_string is hard/impossible, source file should never be created from string.
Also include many random changes related to `source_file`:
- improve comments in DB.mli
- define behavior of changed-files-index and improve its description
- move some of the "dangerous" code inline to discourage its reuse
This mostly concludes cleanup of DB.source_file, the last bit is to unify filtering by filename (we have duplicated logic in `InferConfig`, `CLocation` and `JMain`)
Reviewed By: jvillard
Differential Revision: D4258795
fbshipit-source-id: 36735a8
Summary:
`DB.source_file_to_string` is very easy to misuse and it shouldn't even exist.
In preparation for that day, replace most of `source_file_to_string` with `source_file_pp`
Reviewed By: jvillard
Differential Revision: D4258390
fbshipit-source-id: 447cf5a
Summary:
We only ought to report a source-sink flow at the call site where the sink is introduced.
Otherwise, we will report silly false positives.
Reviewed By: jeremydubreil
Differential Revision: D4234766
fbshipit-source-id: 118051f
Summary: This should make it easier to understand complex error reports.
Reviewed By: peterogithub
Differential Revision: D4254341
fbshipit-source-id: fb32d73
Summary: We'll eventually want fancy interprocedural traces. This diff adds the required boilerplate for this and adds the line number of each access to the error message. Real traces will come in a follow-up
Reviewed By: peterogithub
Differential Revision: D4251985
fbshipit-source-id: c9d9823
Summary: Noticed this when I was writing the documentation for the abstract interpretation framework and was curious about why `Ondemand.analyze_proc` needs the type environment. It turns out that the type environment is only used to transform/normalize Infer bi-abduction specs before storing them to disk, but this can be done elsewhere. Doing this normalization elsewhere simplifies the on-demand API, which is a win for all of its clients.
Reviewed By: cristianoc
Differential Revision: D4241279
fbshipit-source-id: 957b243
Summary: Adding this so we can test interprocedural trace-based reporting in a subsequent diff.
Reviewed By: peterogithub
Differential Revision: D4243046
fbshipit-source-id: 7d07f20
Summary: We're at risk for some silly false positives without these models.
Reviewed By: peterogithub
Differential Revision: D4244795
fbshipit-source-id: b0367e6
Summary:
Currently cfg nodes are written into dot files in whatever order they
appear in a hash table. This seems unnecessarily sensitive, so this
diff sorts the nodes.
Reviewed By: dulmarod
Differential Revision: D4232377
fbshipit-source-id: a907cc6
Summary: Add some basic command line API to run Infer using Buck genrules. Remains to fix issues with absolute vs relative paths and to see how to create these genrules on the fly for a given java or android library.
Reviewed By: sblackshear
Differential Revision: D4245622
fbshipit-source-id: 1cda4ee
Summary:
Clean up code related to --changed-files-index option:
1. Store DB.SourceFileSet.t in DB.changed_source_files_set
2. Refactor rest of the code to use it
3. Bunch of minor changes to make code more consise
Reviewed By: jberdine
Differential Revision: D4238736
fbshipit-source-id: 51e5684
Summary:
Implement heuristic to capture more of the user code:
In C++ there is a lot of interesting code in header files. On the other hand,
that code gets included in multiple places and we don't want to capture it by default (for performance reasons).
Right now we capture everything from source file + all symbols from headers that source file needs.
New heuristic will extend "capturing everything" to matching header files (ie. capture everything in X.h if source file is X.cpp)
Reviewed By: jberdine
Differential Revision: D4238008
fbshipit-source-id: 0528250
Summary:
Dealing with symbolic links in project root is tricky. To avoid it, always normalize all paths to sources with `realpath`.
Changes to tests are expected - infer started to resolve symbolic links which screws up with our testing mechanism.
Reviewed By: jberdine
Differential Revision: D4237587
fbshipit-source-id: fe1cb01
Summary:
Before, we were using a set domain of strings to model a boolean domain.
An explicit boolean domain makes it a bit clear what's going on.
There are two things to note here:
(1) This actually changed the semantics from the old set domain. The set domain wouldn't warn if the lock is held on only one side of a branch, which isn't what we want.
(2) We can't actually test this because the modeling for `Lock.lock()` etc doesn't work :(.
The reason is that the models (which do things like adding attributes for `Lock.lock`) are analyzed for Infer, but not for the checkers.
We'll have to add separate models for thread safety.
Reviewed By: peterogithub
Differential Revision: D4242487
fbshipit-source-id: 9fc599d
Summary:
In Java, we handle unknown code by propagating behavior from the parameters of the unknown function call to the return value (or constructed object, in the case of a constructor). But we do this in a somewhat silly way--generating a new summary with these semantics at each unknown call site. Instead, this diff introduces these two options as predefined behaviors and adds specialized code for them.
As a side effect of this approach, unknown functions are no longer counted as passthroughs. This is ok; the original behavior was less of a reasoned decision and more of an unintended consequence of the way we decided to handle unknown code.
This new approach ought to be more efficient than the old one, and as a virtuous side effect it will be easier to specify how to handle unknown code in other languages like C++.
Reviewed By: jeremydubreil
Differential Revision: D4205624
fbshipit-source-id: bf97445
Summary:
Let's introduce some concepts. A "known unknown" function is one for which no Java code exists (e.g., `native`, `abstract`, and `interface methods`). An "unknown unknown" function is one for which Java code may or may not exist, but we don't have the code or we choose not to analyze it (e.g., non-modeled methods from the core Java or Android libraries).
Previously, Quandary handled both known unknowns and unknown unknowns by propagating taint from the parameters of the unknown function to its return value. It turns out that it is really expensive to do this for known unknown functions. D4142697 was the diff that starting handling known unknown functions in this way, and bisecting shows that it was the start of the recent performance problems for Quandary.
This diff essentially reverts D4142697 by handling known unknowns as skips instead. Pragmatically, doing the propagation trick for Java/Android library functions (e.g., `String` functions!) matters much more, so i'm not too worried about the missed behaviors from this. Ideally, we will go back to the old handling once performance has improved (have lots of ideas there). But I need this to unblock me in the meantime.
Reviewed By: jeremydubreil
Differential Revision: D4205507
fbshipit-source-id: 79cb9c8
Summary:
Useful for refactoring purposes, to provide a list of modules in
dependency order.
Reviewed By: jeremydubreil
Differential Revision: D4232363
fbshipit-source-id: 2adaaf5
Summary:
Implement heuristics to get from corresponding source files for header files.
We already had initial implementation for CaptureCompilationDatabase - moved it and extended it
to handle C/C++/objC/objC++
Reviewed By: jberdine
Differential Revision: D4231864
fbshipit-source-id: 4516287
Summary:
introduce `AttributesTable.load_defined_attributes` which will return proc attributes only if the procedure is defined. In order to not mess up
with existing caching, create another hashmap to store those procdescs.
We need to do that because with reactive capture we no longer can assume that all proc attributes are final before analysis starts
Reviewed By: jberdine
Differential Revision: D4231575
fbshipit-source-id: e795bcb