Summary:
Also, make it explicit when we load the global tenv instead of the per-file tenv.
This allows for some nice simplifications in some places, notably:
- `tenv_file` is gone from `Exe_env.file_data`
- `DB.global_tenv_fname` is no more
This will help moving the tenv from the capture/source_file/ directories on the
filesystem to the database, as keys for the relevant table are `SourceFile.t`.
Reviewed By: mbouaziz
Differential Revision: D6796594
fbshipit-source-id: 1ffd5b0
Summary:
They were constructed for each source file, and then joined into a global call
graph, only to get per-file lists of procedures. A tad wasteful.
Get this list from cfgs instead. Still record them in `exe_env` for now as
changing that code is a whole other beast.
One test falls victim of the flakiness of the analysis of recursive functions.
Reviewed By: jeremydubreil, mbouaziz
Differential Revision: D6324268
fbshipit-source-id: d5ff58f
Summary:
Last piece of significant logic relying on the call graph: the execution
environment. Mostly carve out the logic to detect duplicate symbols to not rely
on the call graph.
Also make the keys of the `file_map` be source files, because not having the
"cg filename" makes it harder otherwise.
Reviewed By: sblackshear
Differential Revision: D6621196
fbshipit-source-id: cab50ba
Summary:
In preparation for getting rid of call graphs, we need to find another way to
get the list of defined procedures (which is the only place where we use the
globally-computed call graph for now).
The natural way to get the list of procedures defined in a file is to load the
cfg for that file and look at the proc names that are the keys of the cfg. This
is way too expensive, as the CFG is big. Thus, we cache this list of proc names
as another column in the SQLite database of cfgs. This gives good performance
in benchmarks.
Reviewed By: jeremydubreil
Differential Revision: D6621142
fbshipit-source-id: ed265fe
Summary:
Record the db schema, infer version, and run dates into
infer-out/.infer_runstate.json. This allows us to check on startup whether the
results directory was generated using a compatible version of infer or not, and
give a better error message in the latter case than some SQLite error about
mismatching tables.
This will be used in a follow-up diff to record capture phases too, and avoid
relying on filesystem timestamps of the infer-out/capture/foo/ directories for
reactive analysis.
Had to change some tests Makefiles to make sure they do not attempt to re-use
stale infer-out directories, which would now fail the run.
The stale infer-out directory gets deleted if `--force-delete-results-dir` is
passed (but a warning still gets printed).
Reviewed By: mbouaziz
Differential Revision: D6760787
fbshipit-source-id: f36f7df
Summary:
This declutters `CommandLineOption` a little bit, and will be useful in a
follow-up diff where `InferCommand.t` will be used from an atd-generated file.
Reviewed By: mbouaziz
Differential Revision: D6772990
fbshipit-source-id: 3d32d00
Summary:
Was trying to decide where to add a new Java utility function and realized that things are a bit disorganized.
Some operations on `Typ.Name.t`'s live in `Typ.Procname`, and some live inside an inner `Java` module whereas some are outside of the module with a `java_` prefix.
Let's move toward putting all Java/C/Objc/C++-specific functions in dedicated modules.
This diff does some of the work for Java.
There are Java-specific functions that operate on `Typ.Procname.t`'s that will have to be converted to work on `Typ.Procname.Java.t`'s, but changing those clients will be more involved.
Will also move C/Objc/C++ functions in a follow-up.
Reviewed By: jeremydubreil
Differential Revision: D6737724
fbshipit-source-id: cdd6e68
Summary: Use the Hashtbl functions directly as `Cfg` knows that a cfg is a hashtbl.
Reviewed By: sblackshear, jeremydubreil
Differential Revision: D6727732
fbshipit-source-id: 2cdda91
Summary:
Found the dead code with the script in the next commit, iteratively until no
warnings remained.
Methodology:
1. I kept pretty-printers for values, which can be useful to use from infer's REPL (or
when printf-debugging infer in general)
2. I kept functions that formed some consistent API (but not often, so YMMV), for instance if it looked like `Set.S`, or if it provides utility functions for stuff in development (mostly the procname dispatcher functions)
3. I tried not to lose comments associated with values no longer exported: if the value is commented in the .mli and not the .ml, I moved the comment
4. Some comments needed updating (not claiming I caught all of those)
5. Sometimes I rewrote the comments a bit when I noticed mis-attached comments
Reviewed By: mbouaziz
Differential Revision: D6723482
fbshipit-source-id: eabaafd
Summary:
Almost all the files are supposed to do this. When they don't, the build adds
it automatically anyway, but it's better to always include it to avoid
confusion (and help other automated tools).
Reviewed By: mbouaziz
Differential Revision: D6723285
fbshipit-source-id: 0fe8a16
Summary: This should make no difference as the `Ondemand` would already only run the analysis when the procedure description is found, and naturally skip the analysis otherwise.
Reviewed By: sblackshear, jvillard
Differential Revision: D6705813
fbshipit-source-id: 44bffee
Summary:
This is a more accurate type since an execution environment is always a single
file, and allows us to make one of the fields of `Exe_env.t` immutable.
Reviewed By: mbouaziz
Differential Revision: D6620477
fbshipit-source-id: 9553516
Summary:
This avoids relying on the directories in infer-out/captured/ being created,
and instead gets the list of captured source files from the DB. This gives a
better type to clusters: `SourceFile.t` instead of `DB.source_dir`, which makes
the code a bit nicer too.
Reviewed By: jeremydubreil
Differential Revision: D6620460
fbshipit-source-id: c0edbf6
Summary: This should avoid making copies of procedure descriptions which are mutable data-stuctures.
Reviewed By: sblackshear
Differential Revision: D6658527
fbshipit-source-id: 688a142
Summary: There was several implementations of the same function accross the codebase
Reviewed By: sblackshear
Differential Revision: D6658266
fbshipit-source-id: e12507b
Summary: Filtering the defined procedure at this level is not necessary. This check already happens when running the analysis in Ondemand. This could also cause flakiness if the "definedness" here does not agree with the check done in Ondemand. The fact that the analysis of a procedure is triggered from the top-level iteration or on-demand when analyzing another procedure is not deterministic.
Reviewed By: sblackshear
Differential Revision: D6575057
fbshipit-source-id: ff0bc2d
Summary:
Upgrade ocamlformat to 0.3, and (necessarily) base to v0.10.0.
- Fix accumulated mis-formatting
- Update opam.lock to unbreak clean build
- Update to base v0.10.0
- Update opam.lock for base
- Update offline opam repo
- Everyone should already have removed their ocamlformat pin
- ocamlformat 0.3 supports output to stdout natively
- bump version of ocamlformat
Reviewed By: jeremydubreil
Differential Revision: D6636741
fbshipit-source-id: 41a56a8
Summary: This is to allow the bi-abduction analysis and the nullable checker for Clang languages to run together without stepping on each other toes.
Reviewed By: sblackshear
Differential Revision: D6567934
fbshipit-source-id: a318c33
Summary:
Instead of storing the cfgs of source files inside their own individual files,
put them in results.db, in their own table. (that table may change in the
future to map source files to more than just their cfgs, eg their tenv as well)
Reviewed By: jberdine
Differential Revision: D6297201
fbshipit-source-id: 7fa891d
Summary:
This diff adds a layer of report deduplication logic in addition to
the existing scheme.
Suppose issue 1 with trace1a and trace1b, and issue 2 with trace2a and
trace2b. If trace1a ends at the same location as trace2a (resp.,
trace2b) and trace1b ends at the same location as trace2b (resp.,
trace2a), then consider issues 1 and 2 to be duplicates.
This chooses to report the issue with the smaller sum of trace
lengths, breaking ties using the issue hashes, and eventually the
entire issue. Therefore there is a potential for flakiness with
respect the the choice of which report to make within a
hash-equivalence class.
Reviewed By: sblackshear
Differential Revision: D6519607
fbshipit-source-id: 63210ab
Summary: As Dulma pointed out, adding or removing paramters in a method in Objective C is changing the name of the method. Such changes should not make pre-exisiting issues reported as introduced. This diff is to prevent this by only keeping in the bug hash the part of the name that is before the first colon.
Reviewed By: dulmarod
Differential Revision: D6491215
fbshipit-source-id: 3c00fae
Summary: For the Buck integration for Java, caching when a summary is not found avoids going through the whole classpath every time a summary for the same method is not found.
Reviewed By: mbouaziz, jvillard
Differential Revision: D6402833
fbshipit-source-id: 4feb422
Summary: This will avoid collisions when the inner classes are implementing the same methods. For example, the previous version of the bug hash could conflate the issues when several annonymous inner classes are implementing the same method, e.g. a annonymous subclass of `Runnable` implementing `run()`.
Reviewed By: sblackshear
Differential Revision: D6461594
fbshipit-source-id: 2bb8545
Summary:
Simpler bug hash that is more independent of the underlying analysis. This now computes the hash based on:
- the bug type.
- the base filename: i.e for my/source/File.java, just keep File.java. So the hash will not change when moving files around.
- the simple method name: i.e. without package information and list of parameters. So changing the list of parameters will not affect the bug hash.
- the error message were the line numbers have been removed. So moving code or reformatting will not affect the hash.
Reviewed By: jberdine
Differential Revision: D6445639
fbshipit-source-id: 82e3cbe
Summary:
Summaries can be big, and they can always be printed via `infer report` if we want to see them.
There's no reason to log them eagerly, even in debug mode.
Reviewed By: jeremydubreil
Differential Revision: D6451815
fbshipit-source-id: 643cd47
Summary: In every place this was used except one, `debug_mode` is also used as a gate.
Reviewed By: jeremydubreil
Differential Revision: D6450913
fbshipit-source-id: 6a5716d
Summary:
It seems that the abstraction instructions were not previously added the the CFG.
This is a functional changes to make sure that the abstraction state is always added. We can simplify the code later and just run this step before storing the CFG instead of after loading them.
Reviewed By: sblackshear, jvillard
Differential Revision: D6383672
fbshipit-source-id: cedcb8a
Summary:
Deduping issues when generating a single report and then diffing the
reports can lead to introduced issues being considered duplicates of
existing issues.
Reviewed By: sblackshear
Differential Revision: D6414673
fbshipit-source-id: bba81fd
Summary: I accidentally deleted the support for `infer report file.specs` which was printing the summary to standard output.
Reviewed By: sblackshear
Differential Revision: D6416690
fbshipit-source-id: 62246f3
Summary:
The diff is very big but it's mostly removing code. It was inspired by the fact that we were getting Dead Store FPs because we were modeling some functions from CoreFoundation and CoreGraphics directly as alloc in the frontend, which caused the parameters of the function to be seen as dead. See the new test.
To deal with this, if we are going to skip the function, we model it as malloc instead. Given how many models we had for those "model as malloc" functions, I removed them to rely solely on the new mechanism.
The modeling of malloc and release was still based on the old retain count implementation, even though all we do here is a malloc/free kind of analysis. I also changed
that to be actually malloc/free which removed many Assert false in the tests. CFRelease is not exactly free though, and it's possible to use the variable afterwards. So used a custom free builtin that only cares about removing the Memory attribute and focuses on minimizing Memory Leaks FPs.
Otherwise we were translating CFBridgingRelease as a special cast, and this wasn't working. To simplify this as well, I removed all the code for the special cast, and just modeled CFBridgingRelease and CFAutorelease also as free_cf, to avoid Memory Leak false positives. I also treated the cast __bridge_transfer as a free_cf model. This means we stopped trying to report Memory Leaks on those objects.
The modeling of CoreGraph release functions was done in the frontend, but seemed simpler to also simplify that code and model all the relevant functions.
Reviewed By: sblackshear
Differential Revision: D6397150
fbshipit-source-id: b1dc636
Summary:
This is a good moment to close Sqlite's DB handles, and in general can be used to postpone some actions right before infer terminates.
Since exiting is done via uncaught exception handling, the `late_epilogue` callback will run at the very end, even after all the `at_exit` callbacks have been invoked. The only exception is made in case of signalling, in which case the `late_epilogue` is still invoked, but before any of the `at_exit` callbacks.
Reviewed By: jvillard
Differential Revision: D6404961
fbshipit-source-id: 8ff7a05
Summary: There is a lot of code to create LaTeX output of the Infer datastructures, but this does not seem to be used anymore.
Reviewed By: jvillard
Differential Revision: D6355686
fbshipit-source-id: 55de8e9