Summary: Experiments suggest that infer does not take advantage of hardware threads and using more CPUs than the number of physical cores actually hurts performance. On Linux, `ProcessPool` now by default uses only the same number of workers as physical CPUs and pins them evenly.
Reviewed By: ngorogiannis
Differential Revision: D20193171
fbshipit-source-id: f8b9f55bf
Summary: The refactor from using 1 pipe for all worker-to-master communication to `n` (one per worker) introduced the possibility of starving workers because the master process read all the messages from one pipe (refreshing the file descriptors to read from with `Unix.select`) before moving to the next one. These changes aim to prevent that by reading one message from all available pipes before refreshing the file descriptors to read from.
Reviewed By: ngorogiannis
Differential Revision: D20194924
fbshipit-source-id: 91a0fbc47
Summary: Make the `ProcessPool` use one pipe per worker for worker-to-master communication.
Reviewed By: ngorogiannis
Differential Revision: D20158845
fbshipit-source-id: dc15607f8
Summary: When the progress bar was off (`--no-progress-bar`) the `TaskBar` didn't log updates but the workers still sent them to the master process. Now, the workers no longer send updates to the master process when the progress bar is off.
Reviewed By: ngorogiannis
Differential Revision: D20140577
fbshipit-source-id: 560d56991
Summary: When a worker fails because it can't a get the lock of a `Procname` it will include it in the exception that it throws so the `RestartScheduler` can record it as a dependency. Then when scheduling a new work item from `RestartScheduler.next` it will check if this dependency is already met, if it isn't it will not schedule the `Procname` yet.
Reviewed By: ngorogiannis
Differential Revision: D19820331
fbshipit-source-id: b48cacc9a
Summary: The main job the schedulers do is building their work queues. That's being performed before the workers are forked which means they get copied into all of them. These changes push the initialization of the schedulers just after the forking takes place.
Reviewed By: ngorogiannis
Differential Revision: D19769741
fbshipit-source-id: 0b20ddd5c
Summary:
The RestartScheduler needs to know if the worker finished it's task
because:
1. there was no more work to do or
2. found that a needed Procname was already taken (this part is not yet implemented)
This need was addressed by (i) making the functions that the workers execute
return a value of task_result.t intead of unit and (ii) adding a
constructor to the worker_message.t (FinishedTask).
Reviewed By: ngorogiannis
Differential Revision: D19467783
fbshipit-source-id: a76b02b6c
Summary:
Infer would crash when started in a context where it already had
children it didn't know about.
Reviewed By: martintrojer
Differential Revision: D19177589
fbshipit-source-id: 0c8831597
Summary:
This diff enables parsing and auto-formatting documentation
comments (aka docstrings).
I have looked at this entire diff and manually made some changes to
improve the formatting. In some cases it looked like it would take too
much time, or benefit from someone more familiar with the code doing
it, and I instead disabled auto-formatting docstrings in those files.
Also, there are some source files where the docstrings are invalid,
and some where the structure detected by the parser appears not to
match what was intended. Auto-formatting has been disabled for these
files.
Reviewed By: ezgicicek
Differential Revision: D18755888
fbshipit-source-id: 68d72465d
Summary:
- Convert `task_generator` into a module of `ProcessPool` and collect inside the two combinators which were in semi-random places.
- Make `SyntacticCallGraph` export a `task_generator` as opposed to a call-graph builder.
- Separate `target` type and put it in its own module to avoid dependency cycles.
Reviewed By: skcho
Differential Revision: D18425718
fbshipit-source-id: 7957edac8
Summary:
This diff fixes a data race in ProcessPool: out channel flush was
outside of the critical section.
Reviewed By: ezgicicek
Differential Revision: D17853991
fbshipit-source-id: ac0fd2a69
Summary: This diff fixes a potential race in writing to pipe channel.
Reviewed By: ngorogiannis
Differential Revision: D17710123
fbshipit-source-id: 477492750
Summary:
The code was already trying to do that but failing. Now it works.
This revealed a slight bug where the progress bar would always stop at
N-1/N 99% jobs. Fixed by moving the progress bar updates *after* the
operation that might decrease the number of jobs left.
Reviewed By: mityal
Differential Revision: D17423978
fbshipit-source-id: fc32db5f3
Summary:
Modify the scheduler to collect results from children at the end of the
parallel execution. Use this to collect backend stats and log their
aggregated sum.
Reviewed By: ezgicicek
Differential Revision: D16358867
fbshipit-source-id: 775792ef7
Summary:
Move control of the number of remaining task from the taskbar [1] to each task generator [2]. This means that the call graph scheduler can count all procedures in mutually-recursive cycles as dealt with when only those procedures are left.
[1] : `infer/src/base/TaskBar.ml`
[2] : type defined in `/infer/src/base/ProcessPool.ml`
Reviewed By: ngorogiannis
Differential Revision: D16071497
fbshipit-source-id: aa9436638
Summary:
A more dynamic scheduling scheme will potentially run into the situation where no new work packets can be scheduled, but more work will be possible to schedule in the future, perhaps when some dependent work packet finishes being analysed.
The current implementation prevents that, as it expects that if a worker goes idle, it stays idle.
The changes here address this in two parts:
- the `select` call is always given a finite timeout. If given an infinite timeout, we will not be able to poll the task generator for more work, where none were previously possible.
- when the `select` call times out without updates, check if there is an idle child, and if so if the task generator has more work right now.
See also ProcessPool.mli for comments.
Reviewed By: mbouaziz
Differential Revision: D15197749
fbshipit-source-id: babe5da8e
Summary:
Before moving to any kind of non-trivial scheduling, we need to change the Tasks interface.
In particular, it's too restrictive to expect that the tasks to be scheduled are provided as a list before starting execution. For example, dynamic scheduling does not fit the bill here. Also, the list expectation means all scheduling work has to be done up front.
The solution here is to move to a `Sequence`-like interface with one difference:
- The function returning the next task expects a task option argument. That argument is the task that was just finished (if any) by the worker expecting new work. This will be useful for things like task dependencies (for instance, a procedure has been analysed, and can be marked so).
Reviewed By: mbouaziz
Differential Revision: D15181613
fbshipit-source-id: 21f3ba825
Summary:
We can fill the gaps in the trace now: they correspond to processes waiting on
pipes. This suggests a more efficient protocol would help perf, at least on the
small example I tried. Anyhow, it shows it's useful to trace pipe operations.
Some small gaps remain but they look like they could be explained by rounding errors.
Reviewed By: mbouaziz
Differential Revision: D9934437
fbshipit-source-id: 1d5f53a6d
Summary:
This adds an option `--trace-events` that generates a Chrome trace event[1] to
quickly visualise the performance of infer.
Reviewed By: mbouaziz
Differential Revision: D9831599
fbshipit-source-id: 96a33c627
Summary:
For some unexplained reason, some of the functions registered in the Epilogues would sometimes be executed several times. I could not figure out why.
This diff fixes that, but also has more explainable benefits:
- Do not run epilogues registered in the parent in the children. Previously it
would do so, but probably only if the children registered some epilogue given
that `at_exit` must be called again once on the child (but the value of the ref
in `Pervasives` would not have been reset).
- Unified behaviour for early and late epilogues given that we now handle both of these directly
We already have all the control needed to run epilogues when needed: we know
when infer exits, and we know when children processes exit.
Reviewed By: mbouaziz
Differential Revision: D9752046
fbshipit-source-id: 13af40081
Summary:
Print the following for each source file to analyse in non-interactive mode:
```
path/to/source_file.c starting
[...]
path/to/source_file.c DONE in <time>
```
This should help diagnose when infer is stuck. It also logs this information to
the log file regardless of the form of the progress bar.
Also add a `--progress-bar-style` option to allow the user to force a
particular rendering: plain (as above), multiline (The Glorious One), or auto
(selection depends on whether infer is connected to a TTY on stdin *and*
stderr).
Reviewed By: mbouaziz
Differential Revision: D9120509
fbshipit-source-id: 4b43b7464
Summary:
Previously we would block indefinitely waiting for that child to send us an
update. Now errors in the child are caught and the main process dies, taking
everyone down with it.
When a child dies, it sends a "Crash" message to the parent, unless it died
receiving a signal (like a segmentation fault, or an external signal). In that
case, the OCaml runtime won't get a chance to notice its death and send the
"Crash" message. Thus, always check if some child has died from under us as
well.
Reviewed By: mbouaziz
Differential Revision: D8577095
fbshipit-source-id: 519992b
Summary:
Replace the previous outputting of "." and "F" with an actual progress bar and
a multiline display of what procedure each process is currently busy analysing.
Observe:
```lang=text
Found 19 source files to analyze in /home/jul/code/openssl-1.1.0d/infer-out
7/19 [######################......................................] 36%
⊢ [ 1.14s] crypto/mem.c: CRYPTO_malloc
⊢ [ 1.68s] crypto/o_time.c: julian_adj
⊢ [ 0.50s] crypto/mem.c: CRYPTO_zalloc
⊢ [ 1.80s] crypto/o_str.c: OPENSSL_strlcpy
```
This works by setting up a worker pool (as before) that waits to receive jobs
(not as before: we used to fork for each new job). Unix pipes are used for
communication.
The new worker pool can be used to experiment with other concurrency models,
such as reviving per-procedure-parallelism, or making sure each procedure is
analysed only once.
Perf tests indicate that this version is no slower than the previous one,
either on laptops or devserver: about 3% worse user time but ~40% better system time.
This new version forks <jobs> processes whereas the previous version would
fork `O(number of source files)` times.
`infer -j 1` shows a progress bar that doesn't update timing info (because it
would need a second process to do that).
Reviewed By: mbouaziz
Differential Revision: D8517507
fbshipit-source-id: c8ca104
Summary:
Change the license of the source code from BSD + PATENTS to MIT.
Change `checkCopyright` to reflect the new license and learn some new file
types.
Generated with:
```
git grep BSD | xargs -n 1 ./scripts/checkCopyright -i
```
Reviewed By: jeremydubreil, mbouaziz, jberdine
Differential Revision: D8071249
fbshipit-source-id: 97ca23a
Summary:
Also, always log failures.
This also shows that the dead code detection does not detect dead exceptions :/
Reviewed By: jeremydubreil
Differential Revision: D6796843
fbshipit-source-id: 3d0ff5c
Summary:
This commit augments codes to clean up temporary files generated by the clang frontend.
Currently clang frontend leaves large number of temporary files int the tmp directory, and it could be seen for example by running these command:
```
git clean -xdf infer/models/
mkdir /tmp/infer
env TMPDIR=/tmp/infer make infer_models
ls -l /tmp/infer
```
P.S. Analyzing real project could easily cause each infer capture to leave hundreds of files in /tmp
P.P.S. There are 11 total references to Filename.temp_file however, other 9 seems don't leak temporary files in such large scale (at least not when using the clang frontend).
Closes https://github.com/facebook/infer/pull/816
Reviewed By: jvillard
Differential Revision: D6385311
Pulled By: dulmarod
fbshipit-source-id: f7956b0
Summary:
Upgrade ocamlformat to 0.3, and (necessarily) base to v0.10.0.
- Fix accumulated mis-formatting
- Update opam.lock to unbreak clean build
- Update to base v0.10.0
- Update opam.lock for base
- Update offline opam repo
- Everyone should already have removed their ocamlformat pin
- ocamlformat 0.3 supports output to stdout natively
- bump version of ocamlformat
Reviewed By: jeremydubreil
Differential Revision: D6636741
fbshipit-source-id: 41a56a8
Summary:
Install ocamlformat from github as part of `make devsetup`, and use it
for formatting OCaml (and jbuild) code.
Reviewed By: jvillard
Differential Revision: D6092464
fbshipit-source-id: 4ba0845
Summary: With Logging.exit you have more control of the code that invokes exit, for example when forking and running certain functions that may in turn invoke exit, and you want to handle the execution flow differently - like invoking certain callbacks before exiting, or not exiting at all.
Reviewed By: jvillard
Differential Revision: D5746914
fbshipit-source-id: 596fba1
Summary:
Conversion and reformat of infer source using ocamlformat
auto-formatting tool.
Current status:
- Because Reason does not handle docstrings, the output of the
conversion is not 'Warning 50'-clean, meaning that there are
docstrings with ambiguous placement. I'll need to manually fix
them just before landing.
Reviewed By: jvillard
Differential Revision: D5225546
fbshipit-source-id: 3bd2786
Summary:
Change the API of `Logging` wrt to writing to files and to the console (see
changes in logging.mli).
Write only to one log file: infer-out/log. Prefix each line with the kind of
warning and the PID of the process emitting it. Writing with `O_APPEND` is
atomic so the file should not get garbled by concurrent writes. To get the
output of a single process, find out which one interests you by looking at
infer-out/log, then `grep ^[<PID>] infer-out/log`.
Introduce 3 log levels for debug output and command-line options to set them
for various categories individually.
Change tons of `"\n"` to `"@\n"` so the `Format` module is aware of newlines
without us having to look through every character of every logged string for
`\n` characters.
Reviewed By: mbouaziz
Differential Revision: D5165317
fbshipit-source-id: 93c922f
Summary:
This will be needed higher up in the stack because the new `ProcessPool` module
will need to call into `Logging` to refresh the logging formatters to get the
right PID when writing to the log file.
+remove dead code `iter_parallel`
Reviewed By: jberdine
Differential Revision: D5165130
fbshipit-source-id: 95c949b