Summary:
Also ondemand doesn't need to call Topl itself after an analysis.
Inline last call site of SummaryReporting (ondemand.ml) and delete the
file.
Reviewed By: ngorogiannis
Differential Revision: D21424386
fbshipit-source-id: 064f4e261
Summary:
- make quandary an `interprocedural`
- move quandary traces to absint/ since they are also used at least by
SIOF
- no more `Summary.OnDisk.dummy` :')
Reviewed By: ngorogiannis
Differential Revision: D21408391
fbshipit-source-id: cc53d2c6c
Summary:
Needed to move a bunch of files around to make this happen. Notably,
moving "preanal.ml" outside of checkers/ into backend/ since it needs to
modify the proc desc in the summary. Also hoisting goes to cost/.
Reviewed By: skcho
Differential Revision: D21407069
fbshipit-source-id: ebb9b78ec
Summary:
Changing inferbo required changing all the analyses that depend on it
too. This introduces a new feature of the the new framework: the ability
for the checkers to read other analyses' payloads in a more functional
way.
Reviewed By: ezgicicek, skcho
Differential Revision: D21401819
fbshipit-source-id: f9b99e344
Summary:
An easy one. Will be needed eventually to make checkers/ its own dune
library.
Reviewed By: ngorogiannis
Differential Revision: D21401818
fbshipit-source-id: 64e8a4bf4
Summary:
`RegisterCheckers` is now doing a bunch of work to convert the new-style
checkers to the good old Callbacks datatypes. Extract the conversion
code into another module before more is added. A good opportunity to
document them.
Reviewed By: ezgicicek
Differential Revision: D21401817
fbshipit-source-id: 4cba3aa7b
Summary:
This module needs to be above all checkers since it knows about all the
checkers; put it at the same level of ondemand and callbacks.
Reviewed By: ezgicicek
Differential Revision: D21401821
fbshipit-source-id: f40dba6dd
Summary: So it can be used by dune libraries without depending on backend/.
Reviewed By: dulmarod
Differential Revision: D21351908
fbshipit-source-id: d288f9179
Summary:
This turned into a pretty big refactoring: the
`IntraproceduralAnalysis.t` needs to be passed around through quite a
few functions. These functions usually depended on
tenv/proc_desc/proc_name that are already in the `analysis_data` so
refactored that too (checking that these were indeed the same as in the
`analysis_data` record (using my own eyeballs). Interestingly there is
one place where we actually turn the analysis to other proc descs than
the original one in eradicate.ml so I've added a small comment there.
Reviewed By: mityal
Differential Revision: D21351909
fbshipit-source-id: 6e6330e5b
Summary:
That function was testing if a proc is defined (ok) and if so if it also
had no summary. The second check is wrong because it could be that the
function hasn't been analysed yet. We should call "Ondemand" instead.
But, it seems like a weird check so I just deleted it and replaced is
with just the "is defined" check.
Renamed the "is_library" field in typeOrigin to "is_defined" to better
reflect the expectations. While I was at it, inlined record type
definitions in `TypeOrigin.t`. They were already inlined except for 2!
Reviewed By: dulmarod
Differential Revision: D21351453
fbshipit-source-id: 1acc7ff90
Summary:
These exceptions were caught earlier before but D21257474 made absint
log an error every time before reraising them.
The exception type had to move to IR/ or absint/, so I moved
"SchedulerTypes" to "absint/TaskSchedulerTypes" and added the restart
scheduler's exception there. There is already a "Scheduler.ml" file in
absint/ so to address the ambiguity I added "Task" in front of that one.
Reviewed By: ngorogiannis
Differential Revision: D21348593
fbshipit-source-id: 58055c9b7
Summary:
This shows how the strategy for making each analysis its own library
looks like. Starting with biabduction since it's one of the worst ones.
A checker cannot depend on ProcData (in its current form) as it contains
a Summary.t. We don't want to change all the analyzers at once so let's
make up a new type. The new type also contains callbacks for Ondemand
since a checker that is its own dune library cannot call Ondemand
directly either (because Ondemand needs Summary.OnDisk).
Once all checkers are migrated we should be able to reclaim the
callbacks set up by callbacks.ml in exchange for these new callbacks,
therefore achieving a neutral callback karma.
Reviewed By: dulmarod, skcho
Differential Revision: D21261638
fbshipit-source-id: fa91ca50f
Summary:
- needed to make biabduction/ its own library as it shouldn't depend on
backend/Printer as a result
- it's a higher level of abstraction
Changed NodePrinter to not swallow biabduction timeouts, thereby
removing its footgunness.
Reviewed By: ngorogiannis
Differential Revision: D21261645
fbshipit-source-id: 592526d85
Summary:
`State` is used by the AI framework, which isn't supposed to know about
biabduction/. Split the biabduction-specific parts into
biabduction/State.ml and keep the rest in absint/AnalysisState.ml.
Reviewed By: ngorogiannis
Differential Revision: D21257470
fbshipit-source-id: e01d1fed3
Summary:
`ProcData.t` contains a `Summary.t`. Eventually we want to fix this too
so that checkers don't depend on backend/, i.e. on all the other
checkers via Summary.ml. But in order to migrate progressively we can
first migrate absint/ and one step on the way is for it to not know what
kind of analysis data it is passing around.
This extra flexibility only costs us passing an extra `Procdesc.t` in a
couple more functions so it's actually not a bad change in itself.
Reviewed By: ngorogiannis
Differential Revision: D21257466
fbshipit-source-id: a91f7b191
Summary:
This is needed to make absint/ its own dune library. Modules in absint/
cannot depend on modules in backend/. This is because backend/Summary.ml
depends on *all* the (inteprocedural) analyses since its stores elements
of their domains. But analyses are supposed to depend on absint/ to
use the AI framework and others. Thus absint/ cannot depend on backend/
otherwise it creates a dependency cycle.
Right now infer builds with this cycle because we consider all these
directories to be one library and there is no dependency cycle between
the individual modules, but making absint/ a library makes the situation
coarser.
One actual change in there: one function in PatternMatch was specific to
nullsafe and was moved there.
Reviewed By: ngorogiannis
Differential Revision: D21257467
fbshipit-source-id: 42af43315
Summary:
This is a step in disentangling the various analyses: that file used to
make every checker on biabduction because of a few of its functions that
use biabduction datatypes.
Split reporting.ml into:
- Reporting.ml: the functions all checkers need to report errors. This
is put in absint/ with the other files that are needed by all
checkers.
- SummaryReporting.ml: functions that need to depend on Summary.ml
(useful for later). This is put in backend/ where Summary.ml lives.
- BiabductionReporting.ml: for the biabduction analysis
The rest of the changes are renames to use the appropriate module
amongst the above.
Reviewed By: ngorogiannis
Differential Revision: D21257468
fbshipit-source-id: fa28cefbc
Summary:
This fixes a long-standing TODO. I re-ordered the lines by:
- first putting each entry on a single line (replacing "^J " -> " "
in emacs)
- then using emacs' M-x reverse-region
- then auto-formatting again
Reviewed By: skcho
Differential Revision: D21257475
fbshipit-source-id: 91b780a5d
Summary:
Good night, sweet prince. This was never used and hasn't seen progress
in a while.
Reviewed By: jberdine
Differential Revision: D21201932
fbshipit-source-id: e6f537b30
Summary: This diff suppresses cost issues on lambda and auto-generated procedures, since they were too noisy.
Reviewed By: ezgicicek
Differential Revision: D21153619
fbshipit-source-id: 65ad6dcc3
Summary:
When encountering a constant, pulse creates an abstract value (a
variable) to represent it, and remembers that it's equal to it. The
problem is that pulse doesn't yet know how to deal with the fact that
some variables are going to be equal to each other.
This hacks around this issue in the case of constants, within the same
procedure, by remembering which constants have been assigned to which
place-holder variables, and serving those variables again when the same
constant is translated again.
Limitation: this doesn't work across procedure calls as the "constant
maps" are not saved in summaries.
Something to look out for: we don't want to make `if (p == NULL)` create
a path where `p` is invalid (we only make null invalid when we see an
assignment from 0, i.e. `p = NULL;`).
Reviewed By: ezgicicek
Differential Revision: D21089961
fbshipit-source-id: 5ebb85d0a
Summary:
One source of false positives on container races is when the container member field is initialised to a concurrent version in a constructor, but the static type of the field doesn't reflect the thread safety of it.
This solution
- tracks flows from constructors of safe data structures to abstract addresses;
- initialises the initial attribute state when analysing a non-constructor method to that achieved by all constructors/class-initializers.
- checks for that attribute when recording container accesses.
Reviewed By: jvillard
Differential Revision: D21089428
fbshipit-source-id: 02a88f6e8
Summary:
The full inventory of everything in infer-out/. The main change is
around "issues directories": instead of registering them dynamically
they are now all declared statically (well, they kind of were already in
Config.ml).
Reviewed By: ngorogiannis
Differential Revision: D20894305
fbshipit-source-id: 1a06ec09d
Summary:
This is an entry in infer-out/, we want these to be predictable, not
user-defined.
Reviewed By: ngorogiannis
Differential Revision: D20894307
fbshipit-source-id: 332f85969
Summary: Another entry to practice adding entries to ResultsDirEntryName.
Reviewed By: skcho
Differential Revision: D20894299
fbshipit-source-id: 9c387e0f3
Summary:
First real step to keep an inventory of all the entries in infer-out/ in
a single place, and associate meta-data to each entry. In particular, we
want to avoid adding things in infer-out/ without a clear idea of
whether they should be cleaned up before going into a cache, or before
an incremental analysis.
Migrate infer-out/tmp/ first just as an example and an excuse to write
the scaffolding code needed for all the other entries.
Reviewed By: skcho
Differential Revision: D20894300
fbshipit-source-id: f796fca55
Summary:
The only accurate entry it contains is the number of files analysed and
we can already get that information from the log file (and console
output).
Reviewed By: dulmarod
Differential Revision: D20894303
fbshipit-source-id: bc180015a
Summary:
I wanted to change the order in which we try loading them but it changes
analysis results too much for the cost analysis.
Reviewed By: skcho
Differential Revision: D20891172
fbshipit-source-id: f972f314e
Summary:
Simplifies the models story. The model jar is still needed to look up
fields in modelled classes (we save the .class of each model!). This is
sad as we should be able to get these from the models' tenv, which
should be more compact, but that's a bigger change.
Reviewed By: skcho
Differential Revision: D20891117
fbshipit-source-id: fcb5104ae
Summary:
Instead of looking up each proc name in models/, pre-compute the list of
models and do lookups there instead of in the filesystem.
Reviewed By: ngorogiannis
Differential Revision: D16603148
fbshipit-source-id: 5eb534a14
Summary:
This information can be useful for tooling responsible for further
processing (e.g. metric calculation and logging)
Reviewed By: artempyanykh
Differential Revision: D20914583
fbshipit-source-id: 61804d88f
Summary:
Fix all the docstrings that `odoc` or `ocamlformat` is not happy about.
Delete all `[@@ocamlformat "parse-docstring = false"]` pragmas as a
result.
Reviewed By: jberdine, ngorogiannis
Differential Revision: D20798913
fbshipit-source-id: 728d9e45c
Summary:
- Add `no_return` models for Java's `exit(...)` methods (can be extended further later on)
- handle throw-catch better by short-cutting throw nodes to not exit node but to all **catch nodes** that are reachable by the node. If there is no catch node, we short-cut to the exit node as before.
This removes a FP from deadstore tests because before we simply were not able to handle CF from throw-> catch nodes at all.
Reviewed By: skcho
Differential Revision: D20769039
fbshipit-source-id: e978f6cdb
Summary:
The attribute `[no_return]` signifies that a function doesn't return. Previously, pre-analysis had cut the links to successor nodes of such no-return function nodes. This was intended to help with suppressing reporting on unreachable paths for some analyses. However, this results in having these nodes as dangling, with no connection to exit nodes.
This diff additionally shortcuts these no-return function nodes to exit node. This would allow us to enhance inter-procedural analyses like pulse to kepp track of paths that do not return since we will be keeping their connections at exit node rather than completely cutting them of as before. It would also allow us to assume that all paths start at the one start node and end at the one exit node (at least syntactically in the CFG).
Reviewed By: skcho
Differential Revision: D20736043
fbshipit-source-id: 0eace1bdb
Summary:
This diff adds a procedure name to the head of the trace in order to distinguish issues in the same line.
"Updated Cost is ..." is changed to "Updated Cost of <proc name> is ..."
Reviewed By: ezgicicek
Differential Revision: D20672214
fbshipit-source-id: 303b4492f
Summary: Use the an LRUCache in Ondemand.LocalCache to avoid clearing it after every toplevel analysis.
Reviewed By: ngorogiannis
Differential Revision: D20281932
fbshipit-source-id: 752c8e1ea
Summary:
This is likely not the final refinement, rather one step forward.
We classify all classes by 3 categories:
- Nullsafe and 0 issues
- can add Nullsafe and will be 0 issues
- the rest (class needs improvement)
Each class will fall into exactly one category.
Error messaging is WIP, they are not intended to be surfaced to the user
just yet.
Note how this diff uses the result of the previous refactoring.
Reviewed By: artempyanykh
Differential Revision: D20512999
fbshipit-source-id: 7f462d29d
Summary:
This diff is doing three things:
1. Finishes work paved in D20115024, and applies it to nullsafe. In that diff, we hardened API for
file level analysis. Here we use this API in nullsafe, so now we can
analyze things on file-level, not only in proc-level like it was before!
2. Introduces a class-level analysis. For Nullsafe purposes, file is not
an interesting granularity, but we want to analyze a lot of things on
file level. Interesting part here is anonymous classes and how we link
them to their corresponding user-defined classes.
3. Introduces a first (yet to be improved) implementation of class-level
analysis. Namely it is "meta-issues" that tell what is going with class
on high level. For now these are two primitive issues, and we will
refine them in follow up diffs. They are disabled by default.
Follow ups include:
1. Refining semantics of meta-issues.
2. Adding other issues that we could not analyze before or analyzed not
user friendly. Most importantly, we will use it to improve reporting for
FIELD NOT INITIALIZED, which is not very user friendly exactly because
of lack of class-level aggregation.
Reviewed By: artempyanykh
Differential Revision: D20417841
fbshipit-source-id: 59ba7d2e3
Summary: Just for fun, and because killing InferPrint.ml is just so satisfying.
Reviewed By: ngorogiannis
Differential Revision: D20362643
fbshipit-source-id: 039cfec61
Summary:
Now that this module only prints report.json and costs_report.json, we
can dis-entangle the whole callback spaghetti.
Reviewed By: ngorogiannis
Differential Revision: D20362640
fbshipit-source-id: 56ffa4e08
Summary:
At this point in the stack, here's what's left in InferPrint.ml:
1. how to output report.json and costs_report.json
2. how to print summaries from the command line (`infer report`)
3. how to print --issues-tests stuff (`infer report --issues-tests`)
Keep only 1. in there. 2. goes to SpecsFiles.ml directly from infer.ml,
and 3. goes to its own module (also the fields to output in
--issues-tests get a non-poly variant).
1. does some extra stuff sometimes, eg in test-determinator mode. Keep
this for now.
Reviewed By: ngorogiannis
Differential Revision: D20362642
fbshipit-source-id: 0d9f0e8e2
Summary:
Make <infer-out>/report.json the default value for this option, as this
is what is used 99% of the time. Clean up test options using this.
Reviewed By: ngorogiannis
Differential Revision: D20362644
fbshipit-source-id: a1bb18757
Summary:
InferPrint hasn't been in charge of writing bugs.txt since forever.
This will be re-implemented as a post-processing of report.json instead
(like it is now, but in OCaml instead of python).
Reviewed By: ngorogiannis
Differential Revision: D20362641
fbshipit-source-id: 83d8cb53d
Summary:
I don't think anyone uses this. Meta-goal: cleaning up InferPrint.ml.
Measuring stats about summaries is good in principle, but we should do
it somewhere else instead of in the InferPrint callback hell. For
instance when we record each summary. Meanwhile, delete this.
Reviewed By: ngorogiannis
Differential Revision: D20362639
fbshipit-source-id: c73d431a5
Summary:
Warning: This might be a bit brutal.
PerfStats and EventLogger are pretty much subsumed by `ScubaLogging`.
It seems no one has been looking at the data they generate recently.
Let's delete them! If we need to re-implement some parts later on, let's
do that using `ScubaLogging`, which is better (eg, still produces data
when infer crashes).
Things we lose:
- errors in the clang frontend due to missing decl translation, etc.
- errors in biabduction due to timeouts, functions not found, etc.
We could also re-implement these using BackendStats and ScubaLogging
instead of brutally deleting everything.
Reviewed By: ngorogiannis
Differential Revision: D20343087
fbshipit-source-id: 90a3121ca
Summary:
The goals are to have all the checker definitions and documentation in one
place (except how to actually run them, since that's not quite the same
concept; for example inferbo is one checker but several analyses depend on its
symbolic execution), and later on to be able to link issues reported by infer
back to the checker that generated them.
This makes apparent that the documentation of our checkers is lacking,
not touching that in this diff.
Not sure if "analysis" would be a better name than "checker" at this
point? For instance "Linters" is one of the checkers, which historically
at least we have not considered to be the case.
Reviewed By: mityal
Differential Revision: D20252386
fbshipit-source-id: fc611bfb7
Summary:
Problem: `infer report <specs file name>` is called manually sometimes to see analysis results in CLI. However, giving the specs file name is sometimes annoying, because the specs file name may be quite long and include special characters sometimes.
This diff introduces `--procedures-summary` to lookup the summaries interactively in `infer explore`.
example1: There are 8 procedures that include "max" in their names, then I selected one of them by entering a number.
```
$ infer explore --procedures --procedures-filter '.*max.*' --procedures-summary
0: minmax_div_const2_Bad_FN
1: minmax_div_const_Good
2: use_int64_max_Bad
3: use_uint64_max_Good
4: use_int64_max_Good
5: minmax_div_const_Bad
6: minmax_div_const2_Good
7: use_uint64_max_Bad
Select one number (type 'a' for selecting all, 'q' for quit): 2
void use_int64_max_Bad()
Analyzed
ERRORS: BUFFER_OVERRUN_L1
WARNINGS:
FAILURE:NONE SYMOPS:0
BufferOverrunAnalysis: StackLocs: { } MemPure: { } Alias: { ret= }
BufferOverrunChecker: Safety conditions:
{ }
```
example2: If there is only one specs file that satisfies the given filter, it reports the summary of that procedure without an interaction.
```
$ infer explore --procedures --procedures-filter '.*add_in_loop_ok.*' --procedures-summary
Selected proc name: void ArrayListTest.add_in_loop_ok()
void void ArrayListTest.add_in_loop_ok()(ArrayListTest* this)
Analyzed
ERRORS:
WARNINGS:
FAILURE:NONE SYMOPS:0
BufferOverrunAnalysis: StackLocs: { } MemPure: { } Alias:
{ i=size(__new-390022197-0-1.elements), ret= }
LatestPrune: latest { i -> (5, { }, { }) by ((5, { }, { }) >= (5, { }, { })),
__new-390022197-0-1.elements -> (⊥, { }, { __new-390022197-1-1 -> length : 5 }) by ((5, { }, { }) >= (5, { }, { })) }
BufferOverrunChecker: Safety conditions:
{ }
```
Reviewed By: jvillard
Differential Revision: D20284052
fbshipit-source-id: 2131339f1
Summary:
It's a lot of code to maintain for something that no one ever uses
anymore.
Reviewed By: ngorogiannis
Differential Revision: D20282794
fbshipit-source-id: 28422c415
Summary: Restore the global state also when `RestartScheduler.ProcnameAlreadyLocked` exceptions are catched.
Reviewed By: ngorogiannis
Differential Revision: D20189524
fbshipit-source-id: 8f8de5309
Summary:
This is so that all pre-analyses are together instead of spread across
several modules.
PS: this function is the worst.
Reviewed By: ngorogiannis
Differential Revision: D19973285
fbshipit-source-id: b326e99cd
Summary:
This diff renames `ZERO_XXX` issues to more appropriately named and descriptive
`XXX_UNREACHABLE_AT_EXIT` and replaces bottom with
unreachable in cost kinds and issues.
Reviewed By: skcho
Differential Revision: D20140301
fbshipit-source-id: eb6076b30
Summary:
The issue type `ZERO_EXECUTION_TIME` actually corresponds to bottom state but has been mistakenly used to mean
- unreachable nodes (program never reaching exit state)
- having zero cost (e.g. for allocations).
Note that, for execution costs, the latter doesn't make sense since we always incur a unit cost for the start node. Hence, a function with empty body will have unit cost. For allocations or IO however, we only incur costs for specific primitives, so a function with no allocations/IO could have a zero cost. However, there is no point reporting functions with zero cost as a specific issue type. Instead, what we want to track is the former, i.e. functions whose cost becomes 0 due to program never reaching exit state.
This diff aims to split these cases into two by only reporting on the latter and adds traces to bottom/unreachable cost by creating a special category in polynomials.
Next diff will rename `ZERO_XXX` to `XXX_UNREACHABLE_AT_EXIT`.
Reviewed By: skcho
Differential Revision: D20005774
fbshipit-source-id: 46b9abd5a
Summary: The analysis time wall time was logged but not it wasn't a part of `BackendStats`. The analysis time has metric has been moved to `BackendStats` and also includes the user and sys times of the scheduler process.
Reviewed By: ngorogiannis
Differential Revision: D20115345
fbshipit-source-id: bd3f3d276
Summary:
# Current design
Infer analysis is currently two staged:
1) proc-level callbacks calculate summary, including writing down the
issues if applicable.
2) file-level callbacks (formerly cluster callbacks, see the prev diff) are executed next; they are supposed to emit
additional issues that are impossible to emit based on mere
proc-context.
Currently RacerD and Starvation use file-level callback; in near future
we plan to onboard Nullsafe checker as well.
# Problem
Contract of callback (1) is clear: given a proc and existing
summary, the checker updates it and returns a modified summary. This
summary later on gets serialized (in-memory + external) and can be consumed by
other chechers. Issues written in summary will get reported when
analysis is over.
In constrast, contract of (2) is wild west: the function returns unit.
In practice, what the checkers do is create IssueLog and serialize it to
checker-specific directory.
Then another part of program (InferPrint.ml) knows about this side
effect, reads the error log for checkers and ultimately get it reported
together with errors written at stage (1).
This is problematic because it is hard to reason about the system and it
makes onboarding new checkers to (2) error-prone.
# This diff
This diff brings (2) on par with (1): now file-level callback has a
clear contract: it should be side effect free, and the only
responsibility is to fill out and return IssueLog.
Additionally, we make the notion of "checker-specific issue directory"
an official thing, so the checker only needs to specify the name,
everything else will be made automatically by orchestation layer,
including cleanup.
# Starvation
Implementing the new contract is starvation is possible and desirable, but involved: see comment
in the code, so we leave it up to the future work to fix that.
Reviewed By: ngorogiannis
Differential Revision: D20115024
fbshipit-source-id: fb2f9b7e6
Summary: Add the wall time to the ExecutionDuration. If this is not included we are not considering off-cpu time.
Reviewed By: ngorogiannis
Differential Revision: D20099667
fbshipit-source-id: 49dbfd739
Summary:
Currently the call graph of all captured procedures is loaded and then traversed to flag reachable procedures from modified files, followed by deleting the unflagged part, and unflagging the rest. This is a bit wasteful, and doesn't lend itself nicely to constructing directly the reverse call graph, which further diffs will do.
This diff loads all captured procedures and callees in a hashconsed table, and performs a BFS from procedures in modified files, to build the call graph in one pass.
Reviewed By: fgasperij
Differential Revision: D19888965
fbshipit-source-id: eeb59356e
Summary:
To ease scheduling, it would be best to only load the procnames of procedures that are (a) defined and (b) reachable from the modified files. The frontends play various games with the DB properties:
- In Clang all methods have a CFG even if they are undefined. Also, looking for non-NULL CFG rows in the DB brings up methods unreachable from modified files (?).
- In Java, some procedures have NULL CFGs. In addition, some of those have `attr_kind!=0`.
We only load those procedures that have both non-NULL CFGs and `attr_kind!=0`. That seems to give meaningful numbers, esp. wrt reachable procedures from files.
Reviewed By: jberdine
Differential Revision: D20068376
fbshipit-source-id: 992b65b4a
Summary:
1. Some invariants are tricky enough to be documented. This is especially
important for cases related with error reporting. Lets document it.
2. Cluster callback -> File callback rename.
Reviewed By: ngorogiannis
Differential Revision: D20093932
fbshipit-source-id: e716f1f5b
Summary: Count the time used by the `RestartScheduler` for analysis (useful) and the time wasted because a a worker was not able to take a lock to have a metric to compare different versions of the scheduler. The wasted time is not actually count but it can be calculated by substracting useful time from the total time. This was implemented like these to avoid substractions that may make the floating point calculations more complicated.
Reviewed By: ngorogiannis
Differential Revision: D19969960
fbshipit-source-id: 68a1132ca
Summary: When a worker fails because it can't a get the lock of a `Procname` it will include it in the exception that it throws so the `RestartScheduler` can record it as a dependency. Then when scheduling a new work item from `RestartScheduler.next` it will check if this dependency is already met, if it isn't it will not schedule the `Procname` yet.
Reviewed By: ngorogiannis
Differential Revision: D19820331
fbshipit-source-id: b48cacc9a
Summary: Add files and procedures to the `RestartScheduler`'s work queue. This makes the chaining with the FileScheduler unnecessary so it's removed.
Reviewed By: ngorogiannis
Differential Revision: D19942354
fbshipit-source-id: 59e25c1c2
Summary:
The `--continue-analysis` option enables continuing analysis after more targets are captured by
`--continue`. For example,
```
$ infer capture -- buck build tgt1
$ infer analyze --merge
$ infer capture --continue -- bucck build tgt2
$ infer analyze --merge --continue-analyze
```
In the last analysis, it reuses the analysis results of `tgt1` from the previous analysis. If
`tgt1` and `tgt2` have a same dependency to a library, the analysis results of the library is also
reused.
Reviewed By: dulmarod
Differential Revision: D19996598
fbshipit-source-id: bb6874a6f
Summary: Change the ProcLocker implementation to use symlinks instead of files. Tests have indicated that they may use less resources.
Reviewed By: ngorogiannis
Differential Revision: D19822048
fbshipit-source-id: 991ababf2
Summary: In line with changes to loom query in D19903057, let's adjust Infer's processing of the results.
Reviewed By: martintrojer
Differential Revision: D19902933
fbshipit-source-id: 200b3a03e
Summary:
Previous implementation supported only stringy params (strings and
stringified bools). Current one exposes a proper variant `Annot.t`,
with support for all possible param values in Java except
numbers (more on that below).
This change is required for implementing `Nullsafe(LOCAL)` as the
annotation used to specify nullsafe behaviour has a more complex
structure than what we've dealt with before.
**Why support for number values was not added**: supporting numbers
requires using `int64`. Unfortunately, adding another variant `Vnum
int64` to `Annot.t` causes a runtime failure on assert in
`MaximumSharing.ml:133`. It seems that it might be enough to flip
`fail_on_nonstring` from `true` to `false`, but since this would
require additional testing and is not required for my case, I'll leave
checking this to whoever needs to use numeric annot params in future.
Reviewed By: ezgicicek
Differential Revision: D19855923
fbshipit-source-id: 878e33856
Summary: No reason to use a set when an integer will suffice. This further reduces GC churn.
Reviewed By: fgasperij
Differential Revision: D19888300
fbshipit-source-id: 9fc8c73f5
Summary: Queues are implemented using a circular array, so should be less GC-heavy than continually allocating/freeing list nodes.
Reviewed By: jberdine, fgasperij
Differential Revision: D18504104
fbshipit-source-id: 93d29c253
Summary:
Building the call graph should be done only in the scheduler process after having forked all workers. This was achieved by a lazy init pattern, whereby the first time `next` was called, it would build the call graph, on the assumption that `next` is only ever called in the scheduler after forking.
D19769741 made this compulsory regardless the scheduler by passing a thunk to `ProcessPool` which is called to obtain the actual scheduler, on the right process and after the fork.
This means we don't need the custom lazy init logic any more. In addition, that set up used a DB query to overapproximate the number of procedures to analyse, because this was supposed to be provided *before* forking. Now this is also not needed, and on top of that we can provide the exact number after building the call graph.
Reviewed By: ezgicicek
Differential Revision: D19833974
fbshipit-source-id: 7f6d51d93
Summary:
The big one:
- stop using polymorphic `<>`, `<`, `>`, ..
- add `<>` to `PolyVariantEqual` escape hatch now that `<>` is as taboo as `=`
- Interestingly, there were a lot of uses of `Z.(x < y)`, which although
they seem to use `Z.lt` actually used polymorphic comparison. The actual
comparison infix operators of `Z` are cleverly hidden in `Z.Compare`
instead, which makes them impractical to use...
Reviewed By: jberdine
Differential Revision: D19861584
fbshipit-source-id: 5dce08ad9
Summary: Instrument the ProcLocker to get the aggregated systime that is used by `lock` and `unlocks`.
Reviewed By: ngorogiannis
Differential Revision: D19814554
fbshipit-source-id: 5fd928b9c
Summary: The main job the schedulers do is building their work queues. That's being performed before the workers are forked which means they get copied into all of them. These changes push the initialization of the schedulers just after the forking takes place.
Reviewed By: ngorogiannis
Differential Revision: D19769741
fbshipit-source-id: 0b20ddd5c
Summary: We don't use allocation costs in prod at the moment. There is no plan to do so in the near future. Let's not report them anymore and also save some space in `costs-report.json`.
Reviewed By: skcho
Differential Revision: D19766828
fbshipit-source-id: 06dffa61d
Summary:
The ProcLocker uses files as locks and relies on the guarantees of the `Unix.open_file` function when using `O_CREAT` and `O_EXCL` simultaneously.
- `setup`: creates a directory for the lock files inside `infe-out` and deletes its content if it already existed.
- `clean`: does nothing for now. Any file locks that may have been left unlocked are removed by the `setup` in the next run. This way the user can see what locks were taken if the program crashes.
- `lock_exn`: try to lock the `Procname` and if it can't releases all the locks that is currently holding.
- `unlock`: removes the corresponding file.
Reviewed By: ngorogiannis
Differential Revision: D19639402
fbshipit-source-id: e02f277ff
Summary:
`replace_make_shared` is finding a constructor by iterating all fields of type structs, which may be
expensive if there are lots of fields or `replace_make_shared` is called many times with the same parameters.
Reviewed By: ngorogiannis
Differential Revision: D19446260
fbshipit-source-id: c5c279ca5
Summary: Add the interface that will be used to lock Procedures.
Reviewed By: ngorogiannis
Differential Revision: D19580578
fbshipit-source-id: b5e334b18
Summary: Currently, empty summaries are passed to the cluster callbacks. This is pointless and could potentially lead to recomputing already analysed summaries. This change passes only procnames to the callback, and `Ondemand` is used to load the summary or analyse as needed.
Reviewed By: jvillard
Differential Revision: D19544043
fbshipit-source-id: 28ab642c3
Summary: There is no need to provide type environments to cluster analysers, since the execution environment can be used to retrieve those on demand.
Reviewed By: jvillard
Differential Revision: D19543561
fbshipit-source-id: f9b064011
Summary:
The RestartScheduler needs to know if the worker finished it's task
because:
1. there was no more work to do or
2. found that a needed Procname was already taken (this part is not yet implemented)
This need was addressed by (i) making the functions that the workers execute
return a value of task_result.t intead of unit and (ii) adding a
constructor to the worker_message.t (FinishedTask).
Reviewed By: ngorogiannis
Differential Revision: D19467783
fbshipit-source-id: a76b02b6c
Summary:
The restart scheduler will now have two phases:
1. Analyze all the procs obtained from the sources.
2. Run the FileScheduler on the sources.
The second step aims to analyze only the File level analyzers
requirements.
Reviewed By: ngorogiannis
Differential Revision: D19430244
fbshipit-source-id: b4f9ee69b
Summary:
The RestartScheduler now uses it's own of_list function to build a
ProcessPool.TaskGenerator to be able to use a queue as the underlying
data structure.
Reviewed By: ngorogiannis
Differential Revision: D19390055
fbshipit-source-id: d57b493b7
Summary:
That was used only by the now-defunct Java Buck integration without
genrules.
Reviewed By: ngorogiannis
Differential Revision: D19176545
fbshipit-source-id: 8253b614a
Summary:
These changes introduce the RestartScheduler which for now is a copy of the FileScheduler:
- added it as a possible argument to the recently added `--scheduler` option.
- made the necessary changes in `InferAnalyze` to call it if it was chosen.
Reviewed By: ngorogiannis
Differential Revision: D19373505
fbshipit-source-id: 98f065057
Summary:
We cannot necessarily know if the previous capture phase was java or
clang, hence to be on the safe side we should always merge the global
tenvs when merging Buck targets.
Reviewed By: ngorogiannis
Differential Revision: D19175685
fbshipit-source-id: 8e7492e14
Summary:
Old versions of sawja/javalib got the line numbers slightly wrong. The workaround was to do a regexp search in the source file for the right line.
My understanding is that this is no longer necessary. This diff removes it.
Reviewed By: jvillard
Differential Revision: D19033415
fbshipit-source-id: 2da19d66d
Summary:
Move most of IR/Sil.ml into a new file biabduction/Predicates.ml to
reflect the fact that they are only useful for the biabduction analysis.
Unfortunately this is a huge change.
I tried to keep the change to a minimum, it's mostly about doing
s/Sil/Predicates/ in lots of places but sometimes I used the trick of
specifying parameters or return value types to avoid specifying the
module altogether. This isn't done consistently because there were just
too many places to change for poor me.
Reviewed By: ngorogiannis
Differential Revision: D19158530
fbshipit-source-id: d6dbcfe72
Summary:
Part of making Sil.ml about SIL only.
In order to not introduce a dependency istd/Pp -> base/Config, the
utilities in Pp don't know when to introduce "diff" colours. Fix it by
wrapping them in Sil using the Config option. (we may want to just kill
that option at some point).
Similarly, move stuff from Io_infer to Pp.
Reviewed By: ngorogiannis
Differential Revision: D19158534
fbshipit-source-id: 8110cb7f9
Summary:
- Remove `to_flat_string` as there is `get_field_name` that unambiguously does the same thing.
- Make `pp` print only the field in all languages.
- Fix `to_full_string` so that it has unified behaviour across java/clang and so that it doesn't print `class Foo.x`, but rather `Foo.x`.
Reviewed By: ezgicicek
Differential Revision: D18963033
fbshipit-source-id: e2c803c7d
Summary:
Remove Clang and Java submodules of Typ.Fieldname. They are unnecessary and they reflect a fake dichotomy: there is only one fieldname type. To distinguish between fields of Java classes and other C constructs, there is a helper function provided, but the idea is simple: obtain the class type the field belongs to, and check if it's a Java class.
This diff still preserves behaviour, but removes as many functions as possible from the interface, to leave a small surface.
Reviewed By: mityal
Differential Revision: D18962423
fbshipit-source-id: ffe6933ee
Summary:
In order to handle the example added:
changed domain of `MethodCalled`
from `CreatedLocation -> (IsBuildCalled X IsChecked X Set(MethodCall))`
to `(CreatedLocation X IsBuildCalled) -> (IsChecked X Set(MethodCall))`
This avoids joining of two method calls where one is build-called and the other is not, e.g.,
```
if(b) {
o.build();
} else {
// no build call
}
```
changed domain of `NewDomain`
from `Created X MethodCalled`
to `(Created X MethodCalled) X (Created X MethodCalled)`
One is for no returned memory and the other is returned memory. This keeps precision some join
points of branches, e.g.,
```
if(b) {
return;
} else {
// no return
}
```
Reviewed By: ezgicicek
Differential Revision: D18909768
fbshipit-source-id: c39d1a1ef
Summary: It is not used anywhere and there are no plans to revive it. Kill it!
Reviewed By: skcho
Differential Revision: D18934719
fbshipit-source-id: b9b069b96
Summary:
This diff enables parsing and auto-formatting documentation
comments (aka docstrings).
I have looked at this entire diff and manually made some changes to
improve the formatting. In some cases it looked like it would take too
much time, or benefit from someone more familiar with the code doing
it, and I instead disabled auto-formatting docstrings in those files.
Also, there are some source files where the docstrings are invalid,
and some where the structure detected by the parser appears not to
match what was intended. Auto-formatting has been disabled for these
files.
Reviewed By: ezgicicek
Differential Revision: D18755888
fbshipit-source-id: 68d72465d
Summary:
This was never set to true except in a wrong way in the Java frontend
(see previous diff).
Reviewed By: dulmarod
Differential Revision: D18573927
fbshipit-source-id: 4c9d1a855
Summary:
A plugin update allows infer to know when a function doesn't return
according to its attributes. This propagates this info all the way to
the attributes of each function, and then use this information in a new
pre-analysis that cuts the links to successor nodes of each `Call`
instruction to a function that does not return.
NOTE: The "no_return" `CallFlag.t` was dead code, following diffs deal
with that (by removing it).
Reviewed By: dulmarod
Differential Revision: D18573922
fbshipit-source-id: 85ec64eca
Summary:
This also prints the CFGs *after* pre-analysis for individual procedures
in infer-out/captured/<filename>/<proc>.dot. One can also look up the
CFGs before pre-analysis in infer-out/captured/proc_cfgs_frontend.dot.
Context: I want to add a pre-analysis that needs to look at proc
attributes inter-procedurally. For this to make sense it has to happen
*after* all of capture, and before analysis.
Thus, this diff brings back the lazy running of the pre-analysis like in
D15803492, except that we still make sure to run the pre-analyses
systematically regardless of the checkers being run by running the
pre-analysis from ondemand.ml. Also we don't need to re-introduce the
"did_preanalysis" proc attribute for the same reason that the
pre-analysis is now run once and for all by ondemand.ml (instead of each
individual checker back in the days).
This has the benefit of running the pre-analysis only when needed, and
the drawback that several concurrent processes analysing the same proc
descs will duplicate work. Since pre-analyses are supposed to be very
fast I assume that neither is a big deal. If they become more expensive
then the benefit gets bigger and the drawback is just the same as with
regular analyses.
Reviewed By: skcho
Differential Revision: D18573920
fbshipit-source-id: de350eaef
Summary:
This allows us to move the CFG rendering to IR/.
The parts of that file concerning CFGs and those concerning Biabduction
specs were entirely disjoint, it turns out, so that was easy.
Reviewed By: jberdine
Differential Revision: D18573924
fbshipit-source-id: 0a5ab6478
Summary: Let's make UI/Cold start parts of the message bold to get more attention.
Reviewed By: jvillard
Differential Revision: D18504857
fbshipit-source-id: b0f199f55
Summary:
We consider Java collections to be like c++ std::vectors and add models for
- `Collections.get(..)`
- `__cast`
Reviewed By: skcho
Differential Revision: D18449607
fbshipit-source-id: 448206c84
Summary:
Let's introduce a set of new cost analysis issue types that are raised when the function is statically determined to run on the UI thread. For this, we rely on the existing `runs_on_ui_thread` check that is developed for RacerD. We also update the cost summary and `jsonbug.cost_item` to include whether a method is on the ui thread so that we don't repeatedly compute this at diff time for complexity increase issues.
Note that `*_UI_THREAD` cost issues are assumed to be more strict than `*_COLD_START` reports at the moment. Next, we can also consider adding a new issue type that combines both such as `*_UI_THREAD_AND_COLD_START` (i.e. for methods that are both on cold start and run on ui thread).
Reviewed By: ngorogiannis
Differential Revision: D18428408
fbshipit-source-id: f18805716
Summary:
The order was wrong: the map from procnames to node-ids was cleaned first, but to clean the map from node-ids to nodes, we need the id and we have already removed it.
The symptom was that effectively no new leaves are created by "removing" a node, so the call graph scheduler quickly devolves to the file-based one.
Reviewed By: skcho
Differential Revision: D18448209
fbshipit-source-id: f272a8112
Summary:
There was some over-general treatment of reachability, in anticipation of changes that didn't happen.
In particular, we only need to flag/remove single nodes, as they must be leaves to be scheduled,
therefore we never need to traverse their successors, because there aren't any.
Reviewed By: jvillard
Differential Revision: D18425905
fbshipit-source-id: b86490542
Summary:
- Convert `task_generator` into a module of `ProcessPool` and collect inside the two combinators which were in semi-random places.
- Make `SyntacticCallGraph` export a `task_generator` as opposed to a call-graph builder.
- Separate `target` type and put it in its own module to avoid dependency cycles.
Reviewed By: skcho
Differential Revision: D18425718
fbshipit-source-id: 7957edac8
Summary: `Str.regexp_string` should be used to find a method name instead of `Str.regexp`.
Reviewed By: ezgicicek
Differential Revision: D18136598
fbshipit-source-id: c4b56dd64
Summary:
The wrong function was used when we tried to see if the class is
annotated with NullsafeStrict. This made it work only for non-static
methods.
Now we use the proper way.
Reviewed By: ngorogiannis
Differential Revision: D18113848
fbshipit-source-id: 02b7555be
Summary:
See explanations in D17955104.
This renames `AbstractAddress` to `AbstractValue` since they are not
necessarily addresses.
Reviewed By: ezgicicek
Differential Revision: D17955290
fbshipit-source-id: 8bb4c61f2
Summary:
- Putting test determinator in own directory
- Putting Java procname creation stuff in its own module
Reviewed By: skcho
Differential Revision: D17929885
fbshipit-source-id: 4f2578566
Summary:
By some unfortunate logic, OCaml often decides to use
`sexp_list`/`sexp_option` instead of just `list`/`option`. Sometimes
these get copy/pasted in interface files.
It would be good to tell OCaml not to do that in the first place but in
the meantime: this diff.
Reviewed By: ngorogiannis
Differential Revision: D17907938
fbshipit-source-id: 7546834a2
Summary:
Describe what the --report-*-* options actually do instead of their
outdated documentation from the time where this was
`--checkers-blacklist-regex`, `--infer-blacklist-regex` and the like.
Reviewed By: dulmarod, mityal
Differential Revision: D17906015
fbshipit-source-id: 204349e9e
Summary: Not used feature and with no obvious roadmap anymore.
Reviewed By: jberdine, jvillard
Differential Revision: D17855860
fbshipit-source-id: fd75b9d62
Summary: `litho` checker contained two checkers: required-props and graphQL field accesses. Although they use the same domain, their reporting conditions and analysis details are different. However, they were bundled into the same analysis by adding disjunctions to `exec_instr` to handle both cases. Let's separate them into two different checkers, keeping a modular transfer function and analyzer that is reused by these two checkers.
Reviewed By: skcho
Differential Revision: D17788834
fbshipit-source-id: 47d77063b
Summary:
The documentation and uses of filtering disagree. One typical usage is deduplication.
Split that where obvious, add comments where not obvious, and leave alone when obviously unrelated to deduplication.
Reviewed By: mityal
Differential Revision: D17715329
fbshipit-source-id: ec757927b
Summary: The analysis is not intra-procedural, hence we don't really read the payload. Let's remove it.
Reviewed By: ngorogiannis
Differential Revision: D17603911
fbshipit-source-id: c92b5c602
Summary: When we have an annotation like `Prop(varArg = X)` or ` ThreadSafe(enableChecks = true)`, we were not able to pick up the names of the parameters like `varArg` or `enableChecks`. This diff fixes that.
Reviewed By: skcho, ngorogiannis
Differential Revision: D17571377
fbshipit-source-id: 5293b5810
Summary:
Introduce a new experimental checker (`--impurity`) that detects
impurity information, tracking which parameters and global variables
of a function are modified. The checker relies on Pulse to detect how
the state changes: it traverses the pre and post pairs starting from
the parameter/global variable and finds where the pre and post heaps
diverge. At diversion points, we expect to see WrittenTo/Invalid attributes
containing a trace of how the address was modified. We use these to
construct the trace of impurity.
This checker is a complement to the purity checker that exists mainly
for Java (and used for cost and loop-hoisting analyses). The aim of
this new experimental checker is to rely on Pulse's precise
memory treatment and come up with a more precise im(purity)
analysis. To distinguish the two checkers, we introduce a new issue
type `IMPURE_FUNCTION` that reports when a function is impure, rather
than when it is pure (as in the purity checker).
TODO:
- improve the analysis to rely on impurity information of external
library calls. Currently, all library calls are assumed to be nops,
hence pure.
- de-entangle Pulse reporting from analysis.
Reviewed By: skcho
Differential Revision: D17051567
fbshipit-source-id: 5e10afb4f
Summary:
The code was already trying to do that but failing. Now it works.
This revealed a slight bug where the progress bar would always stop at
N-1/N 99% jobs. Fixed by moving the progress bar updates *after* the
operation that might decrease the number of jobs left.
Reviewed By: mityal
Differential Revision: D17423978
fbshipit-source-id: fc32db5f3
Summary:
Previously we would incorrectly report the time for the whole process
and this could include capture time too.
Reviewed By: mityal
Differential Revision: D17423977
fbshipit-source-id: b3ed754b3
Summary: We should be able to run this processing ast steps without running linters or capture. This also adds a new module ProcessAST to do the processing, Capture.ml should not know anything else than calling the respective modules for capture, linting or processing.
Reviewed By: ngorogiannis
Differential Revision: D17501453
fbshipit-source-id: 30adba5b1
Summary:
- Instead of merging one target DB into the main DB at a time, merge all target DBs into an in-memory DB (thus, no writing) and then dump it into the main DB at the end. This makes merging faster.
- When using the sqlite write daemon, there is no reason to drive the merge process from the master, sending each individual target to merge down the socket and doing one DB merge at a time. Here we move all the DB merging logic in the daemon, and expose a single function that does it all.
- Refactor some common functionality (notably the `iter_infer_deps` function is now in `Utils`) and remove dead files.
This can be also done using a temporary DB (which is not limited to memory) but this showed worse perf in tests than the in-memory solution as well as the current state of things (! possibly Sqlite-version related?).
Reviewed By: skcho
Differential Revision: D17182862
fbshipit-source-id: a6f81937d
Summary: This calls the method `delete_capture_and_analysis_data` introduced in D17184424 once the appropriate specs files for incremental analysis have been deleted. This fixes two bugs that I observed in incremental analysis that were arising because of stale state left in the results directory.
Reviewed By: ngorogiannis
Differential Revision: D17184424
fbshipit-source-id: d63f59db9