Summary:
We cannot necessarily know if the previous capture phase was java or
clang, hence to be on the safe side we should always merge the global
tenvs when merging Buck targets.
Reviewed By: ngorogiannis
Differential Revision: D19175685
fbshipit-source-id: 8e7492e14
Summary:
Old versions of sawja/javalib got the line numbers slightly wrong. The workaround was to do a regexp search in the source file for the right line.
My understanding is that this is no longer necessary. This diff removes it.
Reviewed By: jvillard
Differential Revision: D19033415
fbshipit-source-id: 2da19d66d
Summary:
Move most of IR/Sil.ml into a new file biabduction/Predicates.ml to
reflect the fact that they are only useful for the biabduction analysis.
Unfortunately this is a huge change.
I tried to keep the change to a minimum, it's mostly about doing
s/Sil/Predicates/ in lots of places but sometimes I used the trick of
specifying parameters or return value types to avoid specifying the
module altogether. This isn't done consistently because there were just
too many places to change for poor me.
Reviewed By: ngorogiannis
Differential Revision: D19158530
fbshipit-source-id: d6dbcfe72
Summary:
Part of making Sil.ml about SIL only.
In order to not introduce a dependency istd/Pp -> base/Config, the
utilities in Pp don't know when to introduce "diff" colours. Fix it by
wrapping them in Sil using the Config option. (we may want to just kill
that option at some point).
Similarly, move stuff from Io_infer to Pp.
Reviewed By: ngorogiannis
Differential Revision: D19158534
fbshipit-source-id: 8110cb7f9
Summary:
- Remove `to_flat_string` as there is `get_field_name` that unambiguously does the same thing.
- Make `pp` print only the field in all languages.
- Fix `to_full_string` so that it has unified behaviour across java/clang and so that it doesn't print `class Foo.x`, but rather `Foo.x`.
Reviewed By: ezgicicek
Differential Revision: D18963033
fbshipit-source-id: e2c803c7d
Summary:
Remove Clang and Java submodules of Typ.Fieldname. They are unnecessary and they reflect a fake dichotomy: there is only one fieldname type. To distinguish between fields of Java classes and other C constructs, there is a helper function provided, but the idea is simple: obtain the class type the field belongs to, and check if it's a Java class.
This diff still preserves behaviour, but removes as many functions as possible from the interface, to leave a small surface.
Reviewed By: mityal
Differential Revision: D18962423
fbshipit-source-id: ffe6933ee
Summary:
In order to handle the example added:
changed domain of `MethodCalled`
from `CreatedLocation -> (IsBuildCalled X IsChecked X Set(MethodCall))`
to `(CreatedLocation X IsBuildCalled) -> (IsChecked X Set(MethodCall))`
This avoids joining of two method calls where one is build-called and the other is not, e.g.,
```
if(b) {
o.build();
} else {
// no build call
}
```
changed domain of `NewDomain`
from `Created X MethodCalled`
to `(Created X MethodCalled) X (Created X MethodCalled)`
One is for no returned memory and the other is returned memory. This keeps precision some join
points of branches, e.g.,
```
if(b) {
return;
} else {
// no return
}
```
Reviewed By: ezgicicek
Differential Revision: D18909768
fbshipit-source-id: c39d1a1ef
Summary: It is not used anywhere and there are no plans to revive it. Kill it!
Reviewed By: skcho
Differential Revision: D18934719
fbshipit-source-id: b9b069b96
Summary:
This diff enables parsing and auto-formatting documentation
comments (aka docstrings).
I have looked at this entire diff and manually made some changes to
improve the formatting. In some cases it looked like it would take too
much time, or benefit from someone more familiar with the code doing
it, and I instead disabled auto-formatting docstrings in those files.
Also, there are some source files where the docstrings are invalid,
and some where the structure detected by the parser appears not to
match what was intended. Auto-formatting has been disabled for these
files.
Reviewed By: ezgicicek
Differential Revision: D18755888
fbshipit-source-id: 68d72465d
Summary:
This was never set to true except in a wrong way in the Java frontend
(see previous diff).
Reviewed By: dulmarod
Differential Revision: D18573927
fbshipit-source-id: 4c9d1a855
Summary:
A plugin update allows infer to know when a function doesn't return
according to its attributes. This propagates this info all the way to
the attributes of each function, and then use this information in a new
pre-analysis that cuts the links to successor nodes of each `Call`
instruction to a function that does not return.
NOTE: The "no_return" `CallFlag.t` was dead code, following diffs deal
with that (by removing it).
Reviewed By: dulmarod
Differential Revision: D18573922
fbshipit-source-id: 85ec64eca
Summary:
This also prints the CFGs *after* pre-analysis for individual procedures
in infer-out/captured/<filename>/<proc>.dot. One can also look up the
CFGs before pre-analysis in infer-out/captured/proc_cfgs_frontend.dot.
Context: I want to add a pre-analysis that needs to look at proc
attributes inter-procedurally. For this to make sense it has to happen
*after* all of capture, and before analysis.
Thus, this diff brings back the lazy running of the pre-analysis like in
D15803492, except that we still make sure to run the pre-analyses
systematically regardless of the checkers being run by running the
pre-analysis from ondemand.ml. Also we don't need to re-introduce the
"did_preanalysis" proc attribute for the same reason that the
pre-analysis is now run once and for all by ondemand.ml (instead of each
individual checker back in the days).
This has the benefit of running the pre-analysis only when needed, and
the drawback that several concurrent processes analysing the same proc
descs will duplicate work. Since pre-analyses are supposed to be very
fast I assume that neither is a big deal. If they become more expensive
then the benefit gets bigger and the drawback is just the same as with
regular analyses.
Reviewed By: skcho
Differential Revision: D18573920
fbshipit-source-id: de350eaef
Summary:
This allows us to move the CFG rendering to IR/.
The parts of that file concerning CFGs and those concerning Biabduction
specs were entirely disjoint, it turns out, so that was easy.
Reviewed By: jberdine
Differential Revision: D18573924
fbshipit-source-id: 0a5ab6478
Summary: Let's make UI/Cold start parts of the message bold to get more attention.
Reviewed By: jvillard
Differential Revision: D18504857
fbshipit-source-id: b0f199f55
Summary:
We consider Java collections to be like c++ std::vectors and add models for
- `Collections.get(..)`
- `__cast`
Reviewed By: skcho
Differential Revision: D18449607
fbshipit-source-id: 448206c84
Summary:
Let's introduce a set of new cost analysis issue types that are raised when the function is statically determined to run on the UI thread. For this, we rely on the existing `runs_on_ui_thread` check that is developed for RacerD. We also update the cost summary and `jsonbug.cost_item` to include whether a method is on the ui thread so that we don't repeatedly compute this at diff time for complexity increase issues.
Note that `*_UI_THREAD` cost issues are assumed to be more strict than `*_COLD_START` reports at the moment. Next, we can also consider adding a new issue type that combines both such as `*_UI_THREAD_AND_COLD_START` (i.e. for methods that are both on cold start and run on ui thread).
Reviewed By: ngorogiannis
Differential Revision: D18428408
fbshipit-source-id: f18805716
Summary:
The order was wrong: the map from procnames to node-ids was cleaned first, but to clean the map from node-ids to nodes, we need the id and we have already removed it.
The symptom was that effectively no new leaves are created by "removing" a node, so the call graph scheduler quickly devolves to the file-based one.
Reviewed By: skcho
Differential Revision: D18448209
fbshipit-source-id: f272a8112
Summary:
There was some over-general treatment of reachability, in anticipation of changes that didn't happen.
In particular, we only need to flag/remove single nodes, as they must be leaves to be scheduled,
therefore we never need to traverse their successors, because there aren't any.
Reviewed By: jvillard
Differential Revision: D18425905
fbshipit-source-id: b86490542
Summary:
- Convert `task_generator` into a module of `ProcessPool` and collect inside the two combinators which were in semi-random places.
- Make `SyntacticCallGraph` export a `task_generator` as opposed to a call-graph builder.
- Separate `target` type and put it in its own module to avoid dependency cycles.
Reviewed By: skcho
Differential Revision: D18425718
fbshipit-source-id: 7957edac8
Summary: `Str.regexp_string` should be used to find a method name instead of `Str.regexp`.
Reviewed By: ezgicicek
Differential Revision: D18136598
fbshipit-source-id: c4b56dd64
Summary:
The wrong function was used when we tried to see if the class is
annotated with NullsafeStrict. This made it work only for non-static
methods.
Now we use the proper way.
Reviewed By: ngorogiannis
Differential Revision: D18113848
fbshipit-source-id: 02b7555be
Summary:
See explanations in D17955104.
This renames `AbstractAddress` to `AbstractValue` since they are not
necessarily addresses.
Reviewed By: ezgicicek
Differential Revision: D17955290
fbshipit-source-id: 8bb4c61f2
Summary:
- Putting test determinator in own directory
- Putting Java procname creation stuff in its own module
Reviewed By: skcho
Differential Revision: D17929885
fbshipit-source-id: 4f2578566
Summary:
By some unfortunate logic, OCaml often decides to use
`sexp_list`/`sexp_option` instead of just `list`/`option`. Sometimes
these get copy/pasted in interface files.
It would be good to tell OCaml not to do that in the first place but in
the meantime: this diff.
Reviewed By: ngorogiannis
Differential Revision: D17907938
fbshipit-source-id: 7546834a2
Summary:
Describe what the --report-*-* options actually do instead of their
outdated documentation from the time where this was
`--checkers-blacklist-regex`, `--infer-blacklist-regex` and the like.
Reviewed By: dulmarod, mityal
Differential Revision: D17906015
fbshipit-source-id: 204349e9e
Summary: Not used feature and with no obvious roadmap anymore.
Reviewed By: jberdine, jvillard
Differential Revision: D17855860
fbshipit-source-id: fd75b9d62
Summary: `litho` checker contained two checkers: required-props and graphQL field accesses. Although they use the same domain, their reporting conditions and analysis details are different. However, they were bundled into the same analysis by adding disjunctions to `exec_instr` to handle both cases. Let's separate them into two different checkers, keeping a modular transfer function and analyzer that is reused by these two checkers.
Reviewed By: skcho
Differential Revision: D17788834
fbshipit-source-id: 47d77063b
Summary:
The documentation and uses of filtering disagree. One typical usage is deduplication.
Split that where obvious, add comments where not obvious, and leave alone when obviously unrelated to deduplication.
Reviewed By: mityal
Differential Revision: D17715329
fbshipit-source-id: ec757927b
Summary: The analysis is not intra-procedural, hence we don't really read the payload. Let's remove it.
Reviewed By: ngorogiannis
Differential Revision: D17603911
fbshipit-source-id: c92b5c602
Summary: When we have an annotation like `Prop(varArg = X)` or ` ThreadSafe(enableChecks = true)`, we were not able to pick up the names of the parameters like `varArg` or `enableChecks`. This diff fixes that.
Reviewed By: skcho, ngorogiannis
Differential Revision: D17571377
fbshipit-source-id: 5293b5810
Summary:
Introduce a new experimental checker (`--impurity`) that detects
impurity information, tracking which parameters and global variables
of a function are modified. The checker relies on Pulse to detect how
the state changes: it traverses the pre and post pairs starting from
the parameter/global variable and finds where the pre and post heaps
diverge. At diversion points, we expect to see WrittenTo/Invalid attributes
containing a trace of how the address was modified. We use these to
construct the trace of impurity.
This checker is a complement to the purity checker that exists mainly
for Java (and used for cost and loop-hoisting analyses). The aim of
this new experimental checker is to rely on Pulse's precise
memory treatment and come up with a more precise im(purity)
analysis. To distinguish the two checkers, we introduce a new issue
type `IMPURE_FUNCTION` that reports when a function is impure, rather
than when it is pure (as in the purity checker).
TODO:
- improve the analysis to rely on impurity information of external
library calls. Currently, all library calls are assumed to be nops,
hence pure.
- de-entangle Pulse reporting from analysis.
Reviewed By: skcho
Differential Revision: D17051567
fbshipit-source-id: 5e10afb4f
Summary:
The code was already trying to do that but failing. Now it works.
This revealed a slight bug where the progress bar would always stop at
N-1/N 99% jobs. Fixed by moving the progress bar updates *after* the
operation that might decrease the number of jobs left.
Reviewed By: mityal
Differential Revision: D17423978
fbshipit-source-id: fc32db5f3
Summary:
Previously we would incorrectly report the time for the whole process
and this could include capture time too.
Reviewed By: mityal
Differential Revision: D17423977
fbshipit-source-id: b3ed754b3
Summary: We should be able to run this processing ast steps without running linters or capture. This also adds a new module ProcessAST to do the processing, Capture.ml should not know anything else than calling the respective modules for capture, linting or processing.
Reviewed By: ngorogiannis
Differential Revision: D17501453
fbshipit-source-id: 30adba5b1
Summary:
- Instead of merging one target DB into the main DB at a time, merge all target DBs into an in-memory DB (thus, no writing) and then dump it into the main DB at the end. This makes merging faster.
- When using the sqlite write daemon, there is no reason to drive the merge process from the master, sending each individual target to merge down the socket and doing one DB merge at a time. Here we move all the DB merging logic in the daemon, and expose a single function that does it all.
- Refactor some common functionality (notably the `iter_infer_deps` function is now in `Utils`) and remove dead files.
This can be also done using a temporary DB (which is not limited to memory) but this showed worse perf in tests than the in-memory solution as well as the current state of things (! possibly Sqlite-version related?).
Reviewed By: skcho
Differential Revision: D17182862
fbshipit-source-id: a6f81937d
Summary: This calls the method `delete_capture_and_analysis_data` introduced in D17184424 once the appropriate specs files for incremental analysis have been deleted. This fixes two bugs that I observed in incremental analysis that were arising because of stale state left in the results directory.
Reviewed By: ngorogiannis
Differential Revision: D17184424
fbshipit-source-id: d63f59db9
Summary: Use_after_free was used both for biabduction and pulse, and the biabduction version is blacklisted by default. As a result, the Pulse version was also disabled unintentionally. This changes the name of the old use_after_free so that now we can get use_after_free bugs whenever pulse is enabled.
Reviewed By: skcho
Differential Revision: D17182687
fbshipit-source-id: 539ca69de
Summary:
It uses inline record for Sil.Load and Sil.Store for preparing the
following extention.
Reviewed By: dulmarod
Differential Revision: D17161288
fbshipit-source-id: 637ea7bfa
Summary:
Implementation of write-serializer for Sqlite. Points of note:
- A Unix socket is used for communication. This avoids buffer-size limitations, as the objects we send for writing may exceed said limits.
- No daemon is used if running under buck or in genrule mode, as this usually means a single-threaded job capturing into the DB.
- When the daemon is running, read-only access is *not* enforced for other processes. This makes starting and stopping the daemon during Infer execution easier and more robust. In WAL mode this should not have any effect on performance.
- This version is not economical with connections, it uses one per query, todo.
Reviewed By: jvillard
Differential Revision: D17077183
fbshipit-source-id: fa9877d6c
Summary:
Write contention is becoming a problem in parallel capture (eg when make runs with high parallelism) or when analysis writes CFGs to the DB in parallel (eg when analysing blocks in ObC). This is believed to lead to BUSY errors in Sqlite.
This is step 1 of a process where all writes are cordoned-off in one module, and fixing the interface for that module.
Reviewed By: skcho
Differential Revision: D16985034
fbshipit-source-id: 3d7ce381b
Summary:
Currently, the specs directory is cleaned after running capture. This means that the `changed-files` are interpreted in the context of the second set of source files. Therefore if a procedure is deleted from the second set of source files, its specs file will not be deleted.
This moves the cleaning of the specs directory to before capture, to avoid this problem.
Reviewed By: ngorogiannis
Differential Revision: D17093980
fbshipit-source-id: e1a8d8a54
Summary:
This adds logging of the number of nodes in the reverse analysis call graph, and the number of these nodes that are invalidated by incremental analysis.
This data will show the precision of incremental analysis.
Reviewed By: ngorogiannis
Differential Revision: D16939101
fbshipit-source-id: 1e465f1a6
Summary:
It doesn't make sense to use incremental analysis without specifying changed files. This is a possible source of future bugs.
This commit causes infer to die if incremental analysis is used without changed files.
---
Previously:
I think this code is currently a bit brittle because the CI shadow builds sometimes use `--incremental-analysis` because they are called with the same command as the diff analysis.
I am worried that in the future the shadow builds could be broken by this, although everything looks like it works right now. This diff would prevent breaking smoke builds in future because the shadow builds don't set changed-files (afaik).
However, possibly this is not the right place to fix the problem. It might be better to change the CI, I'm not sure.
Reviewed By: mityal
Differential Revision: D16829192
fbshipit-source-id: 5ee1ce9d0
Summary:
Use whatever information we can to decide whether to use C or Java
syntax when outputting an access expression, now that we store them as
such.
Also, make cluster callbacks explicitly set the language, as this was not done before and led to some confusion (Clang being set when analysing a Java file).
Reviewed By: skcho
Differential Revision: D16884160
fbshipit-source-id: 40adf9f35
Summary: Remove duplicate `s` when printing the time taken by the analysis
Reviewed By: ngorogiannis
Differential Revision: D16710277
fbshipit-source-id: 3ba7f6693
Summary:
Add logging for the number of procedures whose summaries are invalidated by incremental analysis
This will help verify that incremental analysis is working as expected in production
Reviewed By: ngorogiannis
Differential Revision: D16686911
fbshipit-source-id: 53c89c3bb
Summary:
The models are only for biabduction so try to make that clearer in the
code and documentation.
Reviewed By: skcho
Differential Revision: D16603147
fbshipit-source-id: 4a2be53de
Summary:
Count the following:
- how many procedures were *actually* analyzed (i.e. some checkers ran
on them)
- how many times an analysis result was retrieved from the local cache
and how many times it was missed
Reviewed By: skcho
Differential Revision: D16561867
fbshipit-source-id: 8c43ce13c
Summary:
Instead of `let incr_foo () = global_stats.foo <- global_stats.foo + 1` where you have to
check that you copy/pasted the right stuff and substituted `foo`
everywhere, write `let incr_foo () = incr Fiels.summary_foo` where
there's less room for errors.
Reviewed By: artempyanykh
Differential Revision: D16561868
fbshipit-source-id: 77ea09bef
Summary: The reverse analysis call-graph is logged if `--debug-level-analysis` > 0, so that its value can be inspected for tests
Reviewed By: jvillard
Differential Revision: D16440567
fbshipit-source-id: 1ec6af1f3
Summary: This implements incremental diff analysis by deleting only the summaries that need to be re-analyzed, keeping all summaries corresponding to procedures that have not been changed (or had a callee change).
Reviewed By: jvillard
Differential Revision: D16358474
fbshipit-source-id: 660a704a0
Summary:
There was a little bit of code duplication around `analyze_proc` to deal with
the fact that we may be starting from either a proc name or a proc desc. Create
a new `callee` type that represents this more explicitly. This allows not
loading the `proc_desc` eagerly when we don't need it, although that doesn't
seem to impact perf measurably.
Reviewed By: ezgicicek
Differential Revision: D16442221
fbshipit-source-id: 8e8ebbd6b
Summary:
:
In previous commits we introduced deriving capabilities in Backend stats.
Now we can rewrite the code so that usage of all fields is enforced at compile time.
Reviewed By: jvillard
Differential Revision: D16458130
fbshipit-source-id: aef751440
Summary:
Minor improvements to a bunch of stuff
- mostly, always log backend stats and time spent analysing, regardless of `-j 1`
- output stats more like they appear in the record
- split `InferAnalyze.main` to be more readable (hopefully)
Reviewed By: mityal
Differential Revision: D16440586
fbshipit-source-id: 6f91f53cd
Summary:
It's a bit more annoying to `incr` but is more uniform with the other
`mutable` field.
Reviewed By: ngorogiannis
Differential Revision: D16359027
fbshipit-source-id: 817cd94a0
Summary:
Modify the scheduler to collect results from children at the end of the
parallel execution. Use this to collect backend stats and log their
aggregated sum.
Reviewed By: ezgicicek
Differential Revision: D16358867
fbshipit-source-id: 775792ef7
Summary:
Use it to trace summary stats. It will be used more/better in future
diffs that aggregates stats across parallel workers.
Reviewed By: ezgicicek
Differential Revision: D16358868
fbshipit-source-id: 764614153
Summary:
Summary.ml defines both a bunch of types and how to use them and a
mechanism to save and store summaries on disk while maintaining a
complex in-memory cache of what's on disk. Make the distinction clear.
Reviewed By: ngorogiannis
Differential Revision: D16358869
fbshipit-source-id: 9d4c6cb77
Summary:
Fixes#1126
Different checks contain some ad hoc places that look at this param, but there is no systematic way to suppress this.
The centralized place that is filtering results is `reporting.ml`.
Note that this diff does not remove other usages, because they do more than mere filtering results.
Reviewed By: jvillard
Differential Revision: D16339655
fbshipit-source-id: afabdc97a
Summary:
The analysis call graph is the call graph from the perspective of the analyses run in infer
This commit creates the reverse analysis call graph to be used for incremental diff analysis
Reviewed By: ezgicicek
Differential Revision: D16335938
fbshipit-source-id: 0cbab3298
Summary: The reverse call graph will be constructed by adding edges one-by-one, so expose functionality in CallGraph to add a single edge to the graph
Reviewed By: jvillard
Differential Revision: D16285016
fbshipit-source-id: 553fe1ecf
Summary:
Add a function to delete a summary from disk and caches
This is needed so that summaries corresponding to invalidated procedures can be removed (as part of incremental diff analysis)
Reviewed By: ngorogiannis
Differential Revision: D16332752
fbshipit-source-id: 7d3c7a121
Summary:
Write a function to read in the summaries from the `.specs` folder
This is needed so the reverse analysis call graph can be constructed from the summaries
Reviewed By: ngorogiannis
Differential Revision: D16282333
fbshipit-source-id: 101ce2c5b
Summary:
Move the logic that is general to any call graph from SyntacticCallGraph.ml into CallGraph.ml
This will allow the call graph logic to be re-used in a later diff
Reviewed By: ezgicicek
Differential Revision: D16265150
fbshipit-source-id: 10a067f28
Summary:
newer is better, right?
All the code changes in infer are because of core being bumped to v0.12.
Reviewed By: jberdine
Differential Revision: D16223183
fbshipit-source-id: f3c339966
Summary:
The domain supported path sensitivity wrt to a specific boolean guard `Branch.unlikely`. This isn't used in actual code, so remove it.
Also
- add an .mli to the domain;
- unabbreviate domain name to match analyser name;
- use Payload.read instead of calling Ondemand directly;
- adjust tests.
Reviewed By: mbouaziz
Differential Revision: D16203953
fbshipit-source-id: 743aa4400
Summary:
`CallGraph.ml` computes a call graph using the explicit procedure calls in the source code (ie computes a syntactic call graph)
I am going to be adding code for an 'analysis call graph' that gives the callees of a procedure from the perspective of the analyses in infer
This diff renames `CallGraph.ml` to avoid confusion with the new analysis call graph logic
Reviewed By: ngorogiannis, jvillard
Differential Revision: D16204436
fbshipit-source-id: 67bed8e28
Summary: This function is suspected to be slow, let's take a look at realtime distribution
Reviewed By: ngorogiannis
Differential Revision: D16221864
fbshipit-source-id: 2698602a9
Summary:
- Change the method `Ondemand.analyze_proc_name` so that `caller_summary` is not optional
- Introduce a new method `analyze_proc_name_no_caller` to replace `analyze_proc_name` when there is no caller
Reviewed By: ngorogiannis
Differential Revision: D16183378
fbshipit-source-id: c0c67f869
Summary: Refactor the methods `analyze_proc_desc` and `analyze_proc_name` in `ondemand.ml` so that they no longer share code
Reviewed By: ngorogiannis
Differential Revision: D16182733
fbshipit-source-id: 5aee03092
Summary: Register the callees of a procedure in the set `Summary.callee_pnames`
Reviewed By: ngorogiannis
Differential Revision: D16165016
fbshipit-source-id: 364aa948c
Summary:
Store a set callee names (`Typ.Procname.Set`) in the summary of a procedure
This will allow a call graph to be constructed showing the dependencies between procedures from the perspective of the analyses
Reviewed By: ngorogiannis
Differential Revision: D16148907
fbshipit-source-id: ab6f5d616
Summary:
The fields `tenv` and `integer_type_widths` can be obtained from the `exe_env` field of `proc_callback_args`
This commit removes the redundant fields
Reviewed By: ngorogiannis
Differential Revision: D16149520
fbshipit-source-id: d37526fd4
Summary:
Supply the caller `Summary.t` to `Ondemand.analyze_proc_name` and `Ondemand.analyze_proc_desc` instead of the caller `Procdesc.t`
This change will enable a later commit to record the procedures that are called by a procedure in its summary
Reviewed By: ngorogiannis
Differential Revision: D16148677
fbshipit-source-id: cf353e89a
Summary:
Cluster checkers call `SummaryPayload.read` but set the `caller_summary` to correspond to the same summary as gives the `callee_pname`
This change introduces a new method `read_toplevel_procedure` that does not require a `caller_summary`, to be used by the cluster checkers
Reviewed By: ngorogiannis
Differential Revision: D16131660
fbshipit-source-id: 12caa1000
Summary:
Change the datatype `ProcData` to include a field of type `Summary.t` instead of a field of type `Procdesc.t`
This will enable a later commit to supply a summary to `Ondemand.analyze_proc_desc` and `Ondemand.analyze_proc_name`
Reviewed By: ngorogiannis
Differential Revision: D16121405
fbshipit-source-id: 342374121
Summary:
`proc_desc` is an argument to the function `iterate_procedure_callbacks` in `callbacks.ml` but can always be obtained from another argument (`summary`)
This commit removes the redundant argument
Reviewed By: ngorogiannis
Differential Revision: D16107332
fbshipit-source-id: 21c21921e
Summary:
The record `proc_callback_args` (defined in `callbacks.ml`) contains the fields `proc_desc` and `summary`.
The field `proc_desc` is redundant because it can be obtained from `summary`.
This diff removes `proc_desc` and uses the summary to obtain it where needed.
Reviewed By: ngorogiannis
Differential Revision: D16090783
fbshipit-source-id: 5632d1f4a
Summary: Refactor `ondemand.ml` so that the function `analyze_proc` does not need to be passed around as a function argument
Reviewed By: ngorogiannis, jvillard
Differential Revision: D16089689
fbshipit-source-id: 97ba07619
Summary:
Move control of the number of remaining task from the taskbar [1] to each task generator [2]. This means that the call graph scheduler can count all procedures in mutually-recursive cycles as dealt with when only those procedures are left.
[1] : `infer/src/base/TaskBar.ml`
[2] : type defined in `/infer/src/base/ProcessPool.ml`
Reviewed By: ngorogiannis
Differential Revision: D16071497
fbshipit-source-id: aa9436638
Summary:
Currently, `Callbacks.analyze_procedures` creates a function to call the method `Callbacks.iterate_procedure_callbacks`. This is supplied as an argument to functions in `ondemand.ml`, so that it can be invoked. This is done to avoid a cyclic dependancy.
This diff moves the functions that `ondemand.ml` needs to call into `ondemand.ml`, avoiding the need to supply them as arguments.
Reviewed By: ngorogiannis
Differential Revision: D16028836
fbshipit-source-id: 16ae27a3e
Summary:
This can take a minute or so during which the user would have no idea
what infer is doing.
Reviewed By: ngorogiannis
Differential Revision: D16005393
fbshipit-source-id: 586812527
Summary:
No need to hide the real reason for the crash behind another crash when
trying to print the error message.
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16005394
fbshipit-source-id: dc3d9437e
Summary:
Replace Hashtbl.clear with Hashtbl.reset
This saves memory because the reset method shrinks the hash-table, whereas the clear method just empties it
Reviewed By: jvillard
Differential Revision: D16004966
fbshipit-source-id: f32b00b0f
Summary: Preanalysis is performed at the frontend now. Hence, we don't need to repeatedly check/set when/if it is performed.
Reviewed By: mbouaziz
Differential Revision: D15863175
fbshipit-source-id: f9c6b7ae1
Summary:
One "interesting" feature of the approach of merging the captured targets in Java, is that we union their type environments, as opposed to store partial tenvs together with each source file, which is the case for Clang.
This means
- the final global type environment is potentially huge because it contains all the types in all targets.
- all analysis workers start by loading that tenv in memory, meaning we consume `|size of tenv| x #cpus` memory, which can tip the balance towards OOMs
This diff attempts to economise on global tenv size. This is done by increasing sharing which is then preserved by marshalling. It's done in a brute force way, with hashtables for each struct component, and is not fully effective due to the recursion amongst types and types names, as well types appearing inside other constructs such as procnames.
This is done when calling `Tenv.store` so that
- the computation can be parallelised somewhat (capture is parallel, merging is not)
- buck caching will benefit from smaller tenvs.
This saves about 24% of total memory devoted to the type environment.
Reviewed By: mbouaziz
Differential Revision: D15840054
fbshipit-source-id: 6f03be1a4
Summary:
- Add allocation costs to `costs-report.json` and enable diffing over allocation costs.
- Also, let's be more consistent and modular in naming our cost issues.
- introduce a generic issue type `X_TIME_COMPLEXITY_INCREASE` where `X` can be one of the cost kinds. If the function is on the cold start, issue can have the `COLD_START` suffix. Similarly for infinite/zero/expensive calls.
- Change `PERFORMANCE_VARIATION` -> `EXECUTION_TIME_COMPLEXITY_INCREASE`
- Add new issue type for `ALLOCATION_COMPLEXITY_INCREASE_COLD_START` which will be enabled by default
- Refactor cost issues to be more modular and succinct. This also makes addition of a new cost kind very easy by adding the kind into the `enabled_cost_kinds` list in `CostKind.ml`
Reviewed By: mbouaziz
Differential Revision: D15822681
fbshipit-source-id: cf89ece59
Summary:
I realized that there was a discrepancy in the # of instructions between whether we run a single analysis or multiple analyses at the same time. It turns out that in biabduction, bufferoverrun and other HIL analyses we did Preanalysis step (which adds scope instructions and invokes liveness etc.) but not in others. This discrepancy results in inconsistent analysis results (e.g. in the new inefficient-keyset-iterator) that rely on instructions. We should be consistent. Hence, we now invoke Preanalysis in the frontend and remove all other uses in the rest of the checkers.
Consequently, I had to update the inefficient-keyset-checker to take the CFG resulting from Preanalysis with extra scoping instructions.
Reviewed By: mbouaziz, ngorogiannis, jvillard
Differential Revision: D15803492
fbshipit-source-id: 4e21eb610
Summary:
The synthetic methods from `topl.Property` are now nonempty: they
simulate a nondeterministic automaton.
Reviewed By: jvillard
Differential Revision: D15668471
fbshipit-source-id: 050408283
Summary:
Instrument SIL according to TOPL properties. Roughly, the
instrumentation is a set of calls into procedures that simulate a
nondeterministic automaton. For now, those procedures are NOP dummies.
Reviewed By: jvillard
Differential Revision: D15063942
fbshipit-source-id: d22c2f6fa
Summary: There can be A LOT of procedures -- currently we log two lines (started/done) for each one, when doing call graph scheduling. This leads to ridiculously long log files. Switch to only log these messages in the log file and only if we are verbose logging.
Reviewed By: jvillard
Differential Revision: D15413330
fbshipit-source-id: 6e26693e8
Summary:
Before: no links to procedure summary and nodes in header file debug html
Now: some or all of them if you are lucky enough
Reviewed By: jvillard
Differential Revision: D15279379
fbshipit-source-id: a145f9e66
Summary:
Before: they are written only when the file is fully analyzed.
Now: a first version is written as soon as the file gets analyzed so that we get links to nodes, the final version overwrites it
Reviewed By: jvillard
Differential Revision: D15279351
fbshipit-source-id: a3120aa31
Summary:
API and stub implementation for real-time logging capabilities.
Low-level implementation requires interaction with FB-specific deployment of Scribe, hence it is stubbed out.
Reviewed By: jberdine
Differential Revision: D15259559
fbshipit-source-id: 712cb99e1
Summary:
A more dynamic scheduling scheme will potentially run into the situation where no new work packets can be scheduled, but more work will be possible to schedule in the future, perhaps when some dependent work packet finishes being analysed.
The current implementation prevents that, as it expects that if a worker goes idle, it stays idle.
The changes here address this in two parts:
- the `select` call is always given a finite timeout. If given an infinite timeout, we will not be able to poll the task generator for more work, where none were previously possible.
- when the `select` call times out without updates, check if there is an idle child, and if so if the task generator has more work right now.
See also ProcessPool.mli for comments.
Reviewed By: mbouaziz
Differential Revision: D15197749
fbshipit-source-id: babe5da8e
Summary:
Before moving to any kind of non-trivial scheduling, we need to change the Tasks interface.
In particular, it's too restrictive to expect that the tasks to be scheduled are provided as a list before starting execution. For example, dynamic scheduling does not fit the bill here. Also, the list expectation means all scheduling work has to be done up front.
The solution here is to move to a `Sequence`-like interface with one difference:
- The function returning the next task expects a task option argument. That argument is the task that was just finished (if any) by the worker expecting new work. This will be useful for things like task dependencies (for instance, a procedure has been analysed, and can be marked so).
Reviewed By: mbouaziz
Differential Revision: D15181613
fbshipit-source-id: 21f3ba825
Summary:
Adds option `--summary-stats` to `infer report`.
The formatting is not perfect yet but it gives what I want.
Reviewed By: ngorogiannis
Differential Revision: D15064162
fbshipit-source-id: 56c4b4929
Summary: Using `Fields.to_list` also makes sure we don't forget fields.
Reviewed By: ezgicicek
Differential Revision: D15062353
fbshipit-source-id: aaac9be99
Summary:
TOPL properties are essentially automata, which will be modeled as a set
of procedures. The code-to-analyze makes calls into these procedures,
thereby driving the automaton. In this commit, these calls do not do
anything. The point is to prepare the hook-up mechanism.
Reviewed By: jvillard
Differential Revision: D14819650
fbshipit-source-id: d95ecdb3d
Summary:
This is in preparation of interprocedural pulse. The abstract addresses
generator keeps a reference to create fresh addresses, but that's a
piece of global state that needs to persist across ondemand analyses.
Reviewed By: jberdine
Differential Revision: D14324760
fbshipit-source-id: 5cdb1d3f5
Summary:
Instead of emitting an ad-hoc builtin on variable declaration emit a new
metadata instruction. This allows us to remove the code matching on that
ad-hoc builtin that had to be inserted in several checkers.
Inferbo & pulse used that information meaningfully and had to undergo
some minor changes to cope with the new metada instruction.
Reviewed By: ezgicicek
Differential Revision: D14833100
fbshipit-source-id: 9b3009d22
Summary:
Bundle all non-semantic-bearing instructions into a `Metadata _`
instruction in SIL.
- On a documentation level this makes clearer the distinction between
instructions that encode the semantics of the program and those that are
just hints for the various backend analysis.
- This makes it easier to add more of these auxiliary instructions in
the future. For example, the next diff introduces a new `Skip` auxiliary
instruction to replace the hacky `ExitScope([], Location.dummy)`.
- It also makes it easier to surface all current and future such
auxiliary instructions to HIL as the datatype for these syntactic hints
can be shared between SIL and HIL. This diff brings `Nullify` and
`Abstract` to HIL for free.
Reviewed By: ngorogiannis
Differential Revision: D14827674
fbshipit-source-id: f68fe2110
Summary:
`Utils.with_intermediate_temp_file_out` is conceptually simpler. Plus,
this removes a dependency on Unix.flock, which is not portable under
Windows.
Pull Request resolved: https://github.com/facebook/infer/pull/1066
Differential Revision: D14208138
Pulled By: jvillard
fbshipit-source-id: 7587007e5
Summary:
Add an option to specify some classes that we really want to warn about
with the liveness checker, even when they appear used because of the
implicit destructor call inserted by the compiler.
Reviewed By: mbouaziz
Differential Revision: D13991129
fbshipit-source-id: 7fafdba84
Summary:
Add the `--source-files-cfg` option to emit CFGs as .dot files just as if one
had run with `--debug` to begin with. The usual `--source-files-filter`
applies. For example:
```
$ cd examples
$ infer -- clang -c hello.c
$ infer --continue -- javac Hello.java
$ infer --continue -- make -C c_hello
$ infer explore --source-files --source-files-cfg --source-files-filter ".*\.c$"
hello.c
c_hello/example.c
CFGs written in /home/jul/infer.fb/examples/infer-out/captured/*/icfg.dot
```
Reviewed By: ezgicicek, mbouaziz
Differential Revision: D13973062
fbshipit-source-id: 3077e8b91
Summary:
Printing "N specs" next to function definitions in the HTML debug is
misleading because there are more checkers than just biabduction.
Reviewed By: mbouaziz
Differential Revision: D13572456
fbshipit-source-id: 209b874df
Summary: Publish solutions to the lab, and a Docker file and image to get started more quickly with infer hacking.
Reviewed By: mbouaziz
Differential Revision: D13648847
fbshipit-source-id: daf48ad03
Summary:
- `Printer.NodesHtml.start_node` prints the instructions rather than doing it in the callee
- use color class for `<LISTING>` rather than wrapping them in `<span>` (also fixes a wrong nesting between `<LISTING>` and `<span>`)
- `Summary.pp_html` is always `Black`
- New line before `<hr>` and `<LISTING>`
- `Io_infer.Html.create` takes a `SourceFile.t` rather than a `path_kind`
- typo
Reviewed By: jvillard
Differential Revision: D13572247
fbshipit-source-id: 65f57df25
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52
Summary:
Useful to understand the changes in the pre-analysis, or to inspect the
CFG that checkers actually get.
This means that the pre-analysis always runs when we output the dotty,
but I don't really see a reason why not. In fact, we could probably
*always* store the CFGs as pre-analysed.
Reviewed By: mbouaziz
Differential Revision: D13102952
fbshipit-source-id: 89f3102ec
Summary: Experimental feature: Use memcached for summaries as a look-aside cache during analysis.
Reviewed By: jvillard
Differential Revision: D12939311
fbshipit-source-id: 9f78994e2
Summary:
Messages emitted by cost-analysis now look like the following:
Complexity of this function has **increased** from `O(1)` to `O(n)`.
Reviewed By: mbouaziz
Differential Revision: D13058008
fbshipit-source-id: 119037703
Summary:
When initialising a variable via semi-exotic means, the frontend loses
the information that the variable was initialised. For instance, it
translates:
```
struct Foo { int i; };
...
Foo s = {42};
```
as:
```
s.i := 42
```
This can be confusing for backends that need to know that `s` actually
got initialised, eg pulse.
The solution implemented here is to insert of dummy call to
`__variable_initiazition`:
```
__variable_initialization(&s);
s.i := 42;
```
Then checkers can recognise that this builtin function does what its
name says.
Reviewed By: mbouaziz
Differential Revision: D12887122
fbshipit-source-id: 6e7214438
Summary:
Seems useful to know when we're printing one instruction only, but not when we
print lots of them for readability.
Reviewed By: mbouaziz
Differential Revision: D12823481
fbshipit-source-id: 2beb339f2
Summary: First version of an analyzer collecting classes transitively touched.
Reviewed By: mbouaziz
Differential Revision: D10448025
fbshipit-source-id: 0ddfefd46
Summary:
It gets built-in integer type widths of C from the clang plugin. For Java, it uses fixed widths.
The controller you requested could not be found.: facebook-clang-plugins
Reviewed By: jvillard
Differential Revision: D10397409
fbshipit-source-id: 73958742e
Summary:
When the backend crashes we print which instruction/file/... we were analysing,
but because of recursion we can end up repeating that information all
the way to the toplevel call.
This makes sure we only print the innermost one, we don't care about the
calling context because the analysis is compositional.
Reviewed By: mbouaziz
Differential Revision: D10381141
fbshipit-source-id: 1c92bb861
Summary:
Load proc descs from the "procedures" sqlite table instead of from
file-wide cfgs stored in the "source_files" table. This removes the need
for a cache of these file-wide CFGs, which was needed because loading
them is expensive and potentially needed in case we need to load the
proc descs of several procedures in the same file. Now we can just load
the proc descs one by one and not worry about caching.
Reviewed By: jberdine
Differential Revision: D10173355
fbshipit-source-id: 665636121
Summary:
Fix the logic for computing duplicate symbols. It was broken at some point and some duplicate symbols creeped into our tests. Fix these, and add a test to avoid duplicate symbols detection to regress again.
Also, this removes one use of `Cfg.load`, on the way to removing file-wide CFGs from the database.
Reviewed By: ngorogiannis
Differential Revision: D10173349
fbshipit-source-id: a0d2365b3
Summary:
There's nothing to analyse for declared procedures, and if there is then
that's because they are defined outside the source file and should not
be analysed unless ondemand needs them.
Reviewed By: ngorogiannis
Differential Revision: D10173353
fbshipit-source-id: 39c42eb7a
Summary:
Keep `--analyzer` around for now for integrations that depend on it.
Also deprecate the `--infer-blacklist-path-regex`,
`--checkers-blacklist-path-regex`, etc. in favour of
`--report-blacklist-path-regex` which more accurately represents what these do
as of now.
Rely on the current subcommand instead of the analyzer where needed, as most of
the code already does.
Reviewed By: jeremydubreil
Differential Revision: D9942809
fbshipit-source-id: 9380e6036
Summary: If we get to that point, it means we already want to run the analysis so no need for this check.
Reviewed By: mbouaziz
Differential Revision: D9942702
fbshipit-source-id: e89e22c91
Summary: Use `PerfEvent` to record the execution time of individual checkers.
Reviewed By: jeremydubreil, mbouaziz
Differential Revision: D9832102
fbshipit-source-id: 678fca155
Summary:
This adds an option `--trace-events` that generates a Chrome trace event[1] to
quickly visualise the performance of infer.
Reviewed By: mbouaziz
Differential Revision: D9831599
fbshipit-source-id: 96a33c627
Summary:
Now we see which file/procedure/instruction is responsible for a crash in the
backend. Biabduction and eradicate not supported yet for the instruction-level
debug.
Reviewed By: mbouaziz, da319
Differential Revision: D9915666
fbshipit-source-id: 279472305
Summary:
First version of differential for costs, based on polynomial's degree's variation. The rule is very simple:
For a given polynomial that is available before and after a diff, `if degree_before > degree_after`, then the issue becomes `fixed`. Instead, `if degree_before < degree_after`, then the issue becomes `introduced`.
Reviewed By: ezgicicek
Differential Revision: D9810150
fbshipit-source-id: d08285926
Summary:
Callsites of `Reporting.log_error/warning` always use `Exceptions.Checkers`, let's simplify the API.
Under the hood it still creates an exception, but this can be cleaned up later.
Reviewed By: jeremydubreil
Differential Revision: D9799860
fbshipit-source-id: 6492a60b4
Summary:
For some unexplained reason, some of the functions registered in the Epilogues would sometimes be executed several times. I could not figure out why.
This diff fixes that, but also has more explainable benefits:
- Do not run epilogues registered in the parent in the children. Previously it
would do so, but probably only if the children registered some epilogue given
that `at_exit` must be called again once on the child (but the value of the ref
in `Pervasives` would not have been reset).
- Unified behaviour for early and late epilogues given that we now handle both of these directly
We already have all the control needed to run epilogues when needed: we know
when infer exits, and we know when children processes exit.
Reviewed By: mbouaziz
Differential Revision: D9752046
fbshipit-source-id: 13af40081
Summary:
It detaches the Summary module from BufferOverrunDomain.
Depends on D9194130
Reviewed By: jvillard
Differential Revision: D9194375
fbshipit-source-id: 30392b5ce
Summary:
Now that we got rid of dummy nodes used non-dummily (biabduction state, reporting), `pname` don't need to be an option anymore.
Let's save a boxing on all nodes.
Reviewed By: jeremydubreil
Differential Revision: D9654152
fbshipit-source-id: 83b00f239