Summary:
At some point it was thought that we can assume that any annotation starting with "On" means the method is on the UI thread.
That's too imprecise and has led to false positives and negatives. Restrict to a well-known safe set.
Reviewed By: ezgicicek
Differential Revision: D17769376
fbshipit-source-id: 0f8fee059
Summary:
Currently, lock state is a map from locks to stacks of lock acquisitions.
Since we now have the separate acquisitions component, we no longer need to
remember the stack of acquisitions for a lock. Instead we only need a lock count,
thus reducing the memory footprint.
At the same time, change acquire and release so that they make one tree operation per
component (map/acquisition) as opposed to two (search/update) operations.
Reviewed By: ezgicicek
Differential Revision: D17736727
fbshipit-source-id: 7579eb61e
Summary:
If we acquire n nested locks we end up with n critical pairs, and for each pair the held-locks component goes up linearly, thus total space/time is O(n^2).
If the sets of held-locks are constructed with maximal sharing (intuitively, if they derive from the same set by additions) then the total space/time is O(n logn).
To do this, we must avoid constructing a new set every time we ask what the currently held locks are. Here we maintain (care is needed in the presence of recursive locks) that set across lock acquisitons and removals.
Reviewed By: skcho
Differential Revision: D17736252
fbshipit-source-id: 0f9b292c4
Summary:
The old domain keeps two sets:
- `events` are things (including lock acquisitions) which eventually happen during the execution of a procedure.
- `order` are pairs of `(lock, event)` such that there is a trace through the procedure which at some point acquires `lock` and before releasing it performs `event`.
A deadlock would be reported if for two procedures, `(lock1,lock2)` is in `order` of procedure 1 and `(lock2,lock1)` is in `order` of procedure 2. This condition/domain allowed for the false positive fixed in the tests, as well as was unwieldy, because it required translating between the two sets.
The new domain has only one set of "critical pairs" `(locks, event)` such that there is a trace where `event` occurs, and *right before it occurs* the locks held are exactly `locks` (no over/under approximation). This allows keeping all information in one set, simplifies the procedure call handling and eliminates the known false positive.
Reviewed By: mityal
Differential Revision: D17686944
fbshipit-source-id: 3c68bb957
Summary:
Previously deduplication was always on which is not great for testing.
Also split tests so that we can still test deduplication separately.
Reviewed By: mityal
Differential Revision: D17686877
fbshipit-source-id: 280d91473
Summary:
The documentation and uses of filtering disagree. One typical usage is deduplication.
Split that where obvious, add comments where not obvious, and leave alone when obviously unrelated to deduplication.
Reviewed By: mityal
Differential Revision: D17715329
fbshipit-source-id: ec757927b
Summary:
Ideally the analyser should equate locks `this.x.f` and `a.x.f` in different methods if they can alias.
The heuristic removed here was rarely used and is in the way of a re-write of the analysis.
It was also badly implemented, as this should ideally be the comparison relation of `Lock`.
Reviewed By: mityal
Differential Revision: D17602827
fbshipit-source-id: 4f4576c1a
Summary: When we have an annotation like `Prop(varArg = X)` or ` ThreadSafe(enableChecks = true)`, we were not able to pick up the names of the parameters like `varArg` or `enableChecks`. This diff fixes that.
Reviewed By: skcho, ngorogiannis
Differential Revision: D17571377
fbshipit-source-id: 5293b5810
Summary:
Events can be many things, including lock acquisitions. Lock state keeps a set of events, all of which must be lock acquisitions.
Enforce this via the type checker by specialising the types so that lock state satisfies this by construction.
Reviewed By: ezgicicek
Differential Revision: D17571428
fbshipit-source-id: 2f5a33b98
Summary:
Instead of polluting the signature of trace endpoints, have
the call printer be a module argument to the functors
producing trace elements.
Reviewed By: skcho
Differential Revision: D17550111
fbshipit-source-id: ab5af94c6
Summary:
The access path format forced some weird patterns on this code, simplify
using the access expression structure.
Reviewed By: ezgicicek
Differential Revision: D16960660
fbshipit-source-id: e8faf619e
Summary:
Use whatever information we can to decide whether to use C or Java
syntax when outputting an access expression, now that we store them as
such.
Also, make cluster callbacks explicitly set the language, as this was not done before and led to some confusion (Clang being set when analysing a Java file).
Reviewed By: skcho
Differential Revision: D16884160
fbshipit-source-id: 40adf9f35
Summary:
Access paths are too coarse to properly address C/C++ instructions, and lead to false positives and negatives. Begin the process of porting the underlying domains to access expressions, in a results-preserving way. This roughly consists in:
- Adding missing functions in `AccessExpression` to mirror those in `AccessPath`.
- Replacing `AccessExpression` for `AccessPath` and removing conversions from the former to the latter except in:
- Printing functions, to ensure formatting issues won't change tests/CI.
- Reporting/deduplication still happens through access path conversion, as we need an analogue of `ModuloThis` for `AccessExpression`.
- In selected places, ignore any access type not present in `AccessPath` (ie. dereference/take address of).
Reviewed By: jberdine
Differential Revision: D16856721
fbshipit-source-id: 5e3a88b75
Summary: The pattern "check if an access has already been reported, otherwise see if it is a violation, report it, then add it to the set of reported accesses" is too much copy pasta. Push that into the reporting functions.
Reviewed By: ezgicicek
Differential Revision: D16859208
fbshipit-source-id: 5370efd41
Summary: I'm not entirely sure why this function was written in such a horrendous way :)
Reviewed By: skcho
Differential Revision: D16858396
fbshipit-source-id: d998e17f3
Summary:
Summary.ml defines both a bunch of types and how to use them and a
mechanism to save and store summaries on disk while maintaining a
complex in-memory cache of what's on disk. Make the distinction clear.
Reviewed By: ngorogiannis
Differential Revision: D16358869
fbshipit-source-id: 9d4c6cb77
Summary:
The fields `tenv` and `integer_type_widths` can be obtained from the `exe_env` field of `proc_callback_args`
This commit removes the redundant fields
Reviewed By: ngorogiannis
Differential Revision: D16149520
fbshipit-source-id: d37526fd4
Summary:
Cluster checkers call `SummaryPayload.read` but set the `caller_summary` to correspond to the same summary as gives the `callee_pname`
This change introduces a new method `read_toplevel_procedure` that does not require a `caller_summary`, to be used by the cluster checkers
Reviewed By: ngorogiannis
Differential Revision: D16131660
fbshipit-source-id: 12caa1000
Summary:
Change the datatype `ProcData` to include a field of type `Summary.t` instead of a field of type `Procdesc.t`
This will enable a later commit to supply a summary to `Ondemand.analyze_proc_desc` and `Ondemand.analyze_proc_name`
Reviewed By: ngorogiannis
Differential Revision: D16121405
fbshipit-source-id: 342374121
Summary:
The record `proc_callback_args` (defined in `callbacks.ml`) contains the fields `proc_desc` and `summary`.
The field `proc_desc` is redundant because it can be obtained from `summary`.
This diff removes `proc_desc` and uses the summary to obtain it where needed.
Reviewed By: ngorogiannis
Differential Revision: D16090783
fbshipit-source-id: 5632d1f4a
Summary:
Bundle all non-semantic-bearing instructions into a `Metadata _`
instruction in SIL.
- On a documentation level this makes clearer the distinction between
instructions that encode the semantics of the program and those that are
just hints for the various backend analysis.
- This makes it easier to add more of these auxiliary instructions in
the future. For example, the next diff introduces a new `Skip` auxiliary
instruction to replace the hacky `ExitScope([], Location.dummy)`.
- It also makes it easier to surface all current and future such
auxiliary instructions to HIL as the datatype for these syntactic hints
can be shared between SIL and HIL. This diff brings `Nullify` and
`Abstract` to HIL for free.
Reviewed By: ngorogiannis
Differential Revision: D14827674
fbshipit-source-id: f68fe2110
Summary:
Add support for GuardedBy: we deviate from the spec as follows:
- No warnings issued for any access within a private method, unless that method is called from a public method and the lock isn't held when the access occurs.
- Warnings are suppressed with the general RacerD mechanism, ie `ThreadSafe(enableChecks=false)`
- GuardedBy warnings override thread-safety violation warnings on the same access, because GuardedBy has a clearer and simpler contract.
Also, some simplifications, cleanups and perf improvements (eg avoid unreportable procs at the top level as opposed to on each of their accesses).
Reviewed By: jeremydubreil
Differential Revision: D14506161
fbshipit-source-id: b7d794051
Summary:
Context: "quandary" traces optimise for space by only storing a call site (plus analysis element) in a summary, as opposed to a list of call sites plus the element (i.e., a trace). When forming a report, the trace is expanded to a full one by reading the summary of the called function, and then matching up the current element with one from the summary, iterating until the trace cannot be expanded any more. In the best case, this can give a quadratic saving, as a real trace gets longer the higher one goes in the call stack, and therefore the total cost of saving that trace in each summary is quadratic in the length of the trace. Quandary traces give a linear cost.
HOWEVER, these have been a source of many subtle bugs.
1. The trace expansion strategy is very arbitrary and cannot distinguish between expanded traces that are invalid (i.e., end with a call and not an originating point, such as a field access in RacerD). Plus the strategy does not explore all expansions, just the left-most one, meaning the left most may be invalid in the above sense, but another (not left-most) isn't even though it's not discovered by the expansion. This is fixable with major surgery.
2. All real traces that lead to the same endpoint are conflated -- this is to save space because there may be exponentially many such traces. That's OK, but these traces may have different locking contexts -- one may take the lock along the way, and another may not. The expansion cannot make sure that if we are reporting a trace we have recorded as taking the lock, will actually do so. This has resulted in very confusing race reports that are superficially false positives (even though they point to the existence of a real race).
3. Expansion completely breaks down in the java/buck integration when the trace goes through f -> g -> h and f,g,h are all in distinct buck targets F,G,H and F does not depend directly on H. In that case, the summary of h is simply not available when reporting/expanding in f, so the expanded trace comes out as truncated and invalid. These are filtered out, but the filtering is buggy and kills real races too.
This diff completely replaces quandary traces in RacerD with plain explicit traces.
- This will incur the quadratic space/time cost previously saved. See test plan: there is indeed a 30% increase in summary size, but there is no slowdown. In fact, on openssl there is a 10-20% perf increase.
- For each endpoint, up to a single trace is used, as before, so no exponential explosion. However, because there is no such thing as expansion, we cannot get it wrong and change the locking context of a trace.
- This diff is emulating the previous reporting format as much as possible to allow good signal from the CI. Further diffs up this stack will remove quandary-trace specific things, and simplify further the code.
- 2 is not fully addressed -- it will require pushing the `AccessSnapshot` structure inside `TraceElem`. Further diffs.
Reviewed By: jberdine
Differential Revision: D14405600
fbshipit-source-id: d239117aa