Summary:
One "interesting" feature of the approach of merging the captured targets in Java, is that we union their type environments, as opposed to store partial tenvs together with each source file, which is the case for Clang.
This means
- the final global type environment is potentially huge because it contains all the types in all targets.
- all analysis workers start by loading that tenv in memory, meaning we consume `|size of tenv| x #cpus` memory, which can tip the balance towards OOMs
This diff attempts to economise on global tenv size. This is done by increasing sharing which is then preserved by marshalling. It's done in a brute force way, with hashtables for each struct component, and is not fully effective due to the recursion amongst types and types names, as well types appearing inside other constructs such as procnames.
This is done when calling `Tenv.store` so that
- the computation can be parallelised somewhat (capture is parallel, merging is not)
- buck caching will benefit from smaller tenvs.
This saves about 24% of total memory devoted to the type environment.
Reviewed By: mbouaziz
Differential Revision: D15840054
fbshipit-source-id: 6f03be1a4
Summary:
This adds an option `--trace-events` that generates a Chrome trace event[1] to
quickly visualise the performance of infer.
Reviewed By: mbouaziz
Differential Revision: D9831599
fbshipit-source-id: 96a33c627
Summary:
Use `ignore` instead, as this will warn if the argument is an arrow type,
unlike `let _ = ...`. This makes the code more future-proof: if an argument is
added to a function called in `let _ = f x` then the compiler will complain
instead of silently turning a value into a partial evaluation.
Also got rid of particularly irksome `let _ = <stuff returning unit> in` where I could.
Reviewed By: mbouaziz
Differential Revision: D9217176
fbshipit-source-id: 3be463405
Summary:
Change the license of the source code from BSD + PATENTS to MIT.
Change `checkCopyright` to reflect the new license and learn some new file
types.
Generated with:
```
git grep BSD | xargs -n 1 ./scripts/checkCopyright -i
```
Reviewed By: jeremydubreil, mbouaziz, jberdine
Differential Revision: D8071249
fbshipit-source-id: 97ca23a
Summary:
Add warning 60 (unused module) to the list of fatal warnings. Whitelisting
modules at toplevel is tricky (see inline comments) but doable.
Reviewed By: mbouaziz
Differential Revision: D7790073
fbshipit-source-id: 6f591c4
Summary:
Limit the scope of what gets included into IStd.ml to only values that we want
to shadow. New values go into other files.
Also, build istd/ with `Core` open.
Reviewed By: mbouaziz
Differential Revision: D7382111
fbshipit-source-id: 969f0e8
Summary:
Record "capture phases" in the runstate and in the source files table of the
database. Use this instead of filesystem timestamps to decide which files need
re-analyzing in the reactive analysis.
Reviewed By: jeremydubreil
Differential Revision: D6760833
fbshipit-source-id: 7955621
Summary:
Upgrade ocamlformat to 0.3, and (necessarily) base to v0.10.0.
- Fix accumulated mis-formatting
- Update opam.lock to unbreak clean build
- Update to base v0.10.0
- Update opam.lock for base
- Update offline opam repo
- Everyone should already have removed their ocamlformat pin
- ocamlformat 0.3 supports output to stdout natively
- bump version of ocamlformat
Reviewed By: jeremydubreil
Differential Revision: D6636741
fbshipit-source-id: 41a56a8
Summary:
Install ocamlformat from github as part of `make devsetup`, and use it
for formatting OCaml (and jbuild) code.
Reviewed By: jvillard
Differential Revision: D6092464
fbshipit-source-id: 4ba0845
Summary:
Use a monotonic time source instead.
Also, sleep between retries in the Serialization code.
Reviewed By: jberdine
Differential Revision: D5941697
fbshipit-source-id: 05efbe1
Summary:
Use an SQLite database to store proc attributes, instead of files on disk.
Wrap SQLite operations in two layers:
1. `SqliteUtils` provides helper functions to make sure DB operations succeed
2. `KeyValue` provides a functor to expose a simple and type-safe key/value store backed by the SQLite DB.
Reviewed By: jberdine
Differential Revision: D5640053
fbshipit-source-id: 31050e5
Summary:
This can be a long-running step and it's useful to know how long it took. We
already dump some statistics on stderr after merging is done, this just adds
one more line.
Reviewed By: mbouaziz
Differential Revision: D5833580
fbshipit-source-id: 70e19ab
Summary:
The only path to the `MergeCaptured.slink` function is when we should merge, so
it doesn't make sense to check `Config.merge` again. In the case of `infer run
--flavors -- buck ...`, this would create regular symlinks instead of the much
faster multilinks.
Reviewed By: jberdine
Differential Revision: D5574020
fbshipit-source-id: df710ca
Summary:
Do not use the deprecated (and slower) `#infer` flavor. Instead, `infer-run`
runs capture with the `#infer-capture-all` flavor, followed by merging targets,
followed by the analysis.
Move the call to `MergeCapture` around to make this change easier.
Reviewed By: mbouaziz
Differential Revision: D5547199
fbshipit-source-id: 53c9996
Summary:
Conversion and reformat of infer source using ocamlformat
auto-formatting tool.
Current status:
- Because Reason does not handle docstrings, the output of the
conversion is not 'Warning 50'-clean, meaning that there are
docstrings with ambiguous placement. I'll need to manually fix
them just before landing.
Reviewed By: jvillard
Differential Revision: D5225546
fbshipit-source-id: 3bd2786
Summary:
Thanks to the logging introduced in D5293334 (or because of, depending on your liking of spam), I noticed that there lots of errors being logged of the form
Couldn't read multilink file '/home/jul/infer/infer/tests/codetoanalyze/cpp/errors/infer-out/attributes/d0/multilink.txt': /home/jul/infer/infer/tests/codetoanalyze/cpp/errors/infer-out/attributes/d0/multilink.txt: No such file or directory
I don't think it makes sense to care about multilink files except when `--merge` is specified.
Also introduced a .mli and self-documented a function.
Reviewed By: akotulski
Differential Revision: D5310129
fbshipit-source-id: c3a6276
Summary:
Change the API of `Logging` wrt to writing to files and to the console (see
changes in logging.mli).
Write only to one log file: infer-out/log. Prefix each line with the kind of
warning and the PID of the process emitting it. Writing with `O_APPEND` is
atomic so the file should not get garbled by concurrent writes. To get the
output of a single process, find out which one interests you by looking at
infer-out/log, then `grep ^[<PID>] infer-out/log`.
Introduce 3 log levels for debug output and command-line options to set them
for various categories individually.
Change tons of `"\n"` to `"@\n"` so the `Format` module is aware of newlines
without us having to look through every character of every logged string for
`\n` characters.
Reviewed By: mbouaziz
Differential Revision: D5165317
fbshipit-source-id: 93c922f
Summary:
Try and enforce the following rules:
- stderr is for updating the user about progress or errors
- Introduce Logging.progress that outputs to stderr, but honours --quiet
- Logging.stderr is as before
- Logging.out now prints to stderr (or to log files as before if set up) and
not stdout. If some information should go on stdout then the user should be
able to rely on it (ie, it's not just some progress message). For now only
the summary of the errors is printed on stdout by default.
- Logging.err* functions are gone. If the error is user-visible, it should be
Logging.stderr, or `failwith`. If not, go to the same log file as other
output, which personally I find much more convenient than having to dig through
2 log files every time I'm looking for some output.
Reviewed By: jberdine
Differential Revision: D5095720
fbshipit-source-id: 68999c9
Summary:
Use In_channel and Out_channel operations instead of those in Pervasives. Don't
use physical equality on values that aren't heap-allocated since it doesn't help
the compiler generate faster code and the semantics is unspecified. Also use
phys_equal for physical equality.
Reviewed By: sblackshear
Differential Revision: D4232459
fbshipit-source-id: 36fcfa8
Summary:
Utils contains definitions intended to be in the global namespace for
all of the infer code-base, as well as pretty-printing functions, and
assorted utility functions mostly for dealing with files and processes.
This diff changes the module opened into the global namespace to
IStd (Std conflict with extlib), and moves the pretty-printing
definitions from Utils to Pp.
Reviewed By: jvillard
Differential Revision: D4232457
fbshipit-source-id: 1e070e0
Summary:
Merging the results directories of targets on buck projects involved creating symbolic links into buck-out.
The bulk of files are .attr files: one per procedure. Creating these links can be a bottleneck, and the merge phase can be slower than the analysis phases on projects with many procedures.
This diff introduces multilinks to speed up merge.
A multilink is a file `multilink.txt` containing a sequence of paths
```
path/to/file1.ext
path/to/file2.ext
...
```
A multilink file is a compact way to represent a link for each entry.
This diff creates a multilink file for each `attributes/dir` directory, instead of one symbolic link for each file.
Reviewed By: jberdine
Differential Revision: D4067428
fbshipit-source-id: 911f8a9
Summary:
This change introduces a new binary, called `InferStatsAggregator`, that once invoked, aggregates
together all the stats generated by the single invocations of frontend/backend/reporting, that can
be used for performance measurements.
Reviewed By: jvillard
Differential Revision: D3317000
fbshipit-source-id: 61ec615
Summary:
Reimplement command line options in preparation for uniformly passing
options from the top-level infer driver that invokes a build command
through the build system to the descendant infer processes.
All command line options of all executables are collected into Config,
and declared using a new CommandLineOption module that supports
maintining backward compatibility with the current command line
interface. Very few values representing command line options are
mutable now, as they are set once during parsing but are constant
thereafter. All ordering dependencies are contained within the
implementation of Config, and the implementation of Config is careful to
avoid unintended interactions and ordering dependencies between options.
Reviewed By: jvillard
Differential Revision: D3273345
fbshipit-source-id: 8e8c6fa
Summary:public
Eliminate the use of the -open Utils command line option passed to the compiler in favor of `open! Utils` in each source file. While slightly convenient, this option causes more headaches than it is worth with other tools e.g. merlin.
Reviewed By: jvillard
Differential Revision: D3168193
fb-gh-sync-id: 4285ef6
fbshipit-source-id: 4285ef6
Summary:public
The merge option -m would link every file in a captured results dir.
This includes the .start file, which would compromise the subsequent checks for modified files during reactive analysis.
Now only link files inside directories inside the results dir (don't link any files in the first 2 levels).
Reviewed By: jberdine
Differential Revision: D3155819
fb-gh-sync-id: 8ad180f
fbshipit-source-id: 8ad180f
Summary:public
It could happen that the modification times of symbolic links and
original files checked by reactive mode could be exactly the same. In
this case, some files would perpetually be re-analyzed.
This diff creates symlinks and sets their accessed and modified times
to 1 second in the future so that strict timestamp checking is robust.
Reviewed By: cristianoc
Differential Revision: D3098451
fb-gh-sync-id: 3724468
fbshipit-source-id: 3724468