You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

59 lines
1.9 KiB

(*
* Copyright (c) 2018-present, Facebook, Inc.
*
* This source code is licensed under the MIT license found in the
* LICENSE file in the root directory of this source tree.
*)
open! IStd
[racerd] replace quandary traces with explicit ones Summary: Context: "quandary" traces optimise for space by only storing a call site (plus analysis element) in a summary, as opposed to a list of call sites plus the element (i.e., a trace). When forming a report, the trace is expanded to a full one by reading the summary of the called function, and then matching up the current element with one from the summary, iterating until the trace cannot be expanded any more. In the best case, this can give a quadratic saving, as a real trace gets longer the higher one goes in the call stack, and therefore the total cost of saving that trace in each summary is quadratic in the length of the trace. Quandary traces give a linear cost. HOWEVER, these have been a source of many subtle bugs. 1. The trace expansion strategy is very arbitrary and cannot distinguish between expanded traces that are invalid (i.e., end with a call and not an originating point, such as a field access in RacerD). Plus the strategy does not explore all expansions, just the left-most one, meaning the left most may be invalid in the above sense, but another (not left-most) isn't even though it's not discovered by the expansion. This is fixable with major surgery. 2. All real traces that lead to the same endpoint are conflated -- this is to save space because there may be exponentially many such traces. That's OK, but these traces may have different locking contexts -- one may take the lock along the way, and another may not. The expansion cannot make sure that if we are reporting a trace we have recorded as taking the lock, will actually do so. This has resulted in very confusing race reports that are superficially false positives (even though they point to the existence of a real race). 3. Expansion completely breaks down in the java/buck integration when the trace goes through f -> g -> h and f,g,h are all in distinct buck targets F,G,H and F does not depend directly on H. In that case, the summary of h is simply not available when reporting/expanding in f, so the expanded trace comes out as truncated and invalid. These are filtered out, but the filtering is buggy and kills real races too. This diff completely replaces quandary traces in RacerD with plain explicit traces. - This will incur the quadratic space/time cost previously saved. See test plan: there is indeed a 30% increase in summary size, but there is no slowdown. In fact, on openssl there is a 10-20% perf increase. - For each endpoint, up to a single trace is used, as before, so no exponential explosion. However, because there is no such thing as expansion, we cannot get it wrong and change the locking context of a trace. - This diff is emulating the previous reporting format as much as possible to allow good signal from the CI. Further diffs up this stack will remove quandary-trace specific things, and simplify further the code. - 2 is not fully addressed -- it will require pushing the `AccessSnapshot` structure inside `TraceElem`. Further diffs. Reviewed By: jberdine Differential Revision: D14405600 fbshipit-source-id: d239117aa
6 years ago
val default_pp_call : Format.formatter -> CallSite.t -> unit
(** A powerset domain of traces, with bottom = empty and join = union *)
module type FiniteSet = sig
include AbstractDomain.FiniteSetS
val with_callsite : t -> CallSite.t -> t
(** Push given callsite onto all traces in set. Cf [TraceElem.with_callsite] *)
end
module type Element = sig
include PrettyPrintable.PrintableOrderedType
val pp_human : Format.formatter -> t -> unit
(** Pretty printer used for trace construction; [pp] is used for debug output. *)
[racerd] replace quandary traces with explicit ones Summary: Context: "quandary" traces optimise for space by only storing a call site (plus analysis element) in a summary, as opposed to a list of call sites plus the element (i.e., a trace). When forming a report, the trace is expanded to a full one by reading the summary of the called function, and then matching up the current element with one from the summary, iterating until the trace cannot be expanded any more. In the best case, this can give a quadratic saving, as a real trace gets longer the higher one goes in the call stack, and therefore the total cost of saving that trace in each summary is quadratic in the length of the trace. Quandary traces give a linear cost. HOWEVER, these have been a source of many subtle bugs. 1. The trace expansion strategy is very arbitrary and cannot distinguish between expanded traces that are invalid (i.e., end with a call and not an originating point, such as a field access in RacerD). Plus the strategy does not explore all expansions, just the left-most one, meaning the left most may be invalid in the above sense, but another (not left-most) isn't even though it's not discovered by the expansion. This is fixable with major surgery. 2. All real traces that lead to the same endpoint are conflated -- this is to save space because there may be exponentially many such traces. That's OK, but these traces may have different locking contexts -- one may take the lock along the way, and another may not. The expansion cannot make sure that if we are reporting a trace we have recorded as taking the lock, will actually do so. This has resulted in very confusing race reports that are superficially false positives (even though they point to the existence of a real race). 3. Expansion completely breaks down in the java/buck integration when the trace goes through f -> g -> h and f,g,h are all in distinct buck targets F,G,H and F does not depend directly on H. In that case, the summary of h is simply not available when reporting/expanding in f, so the expanded trace comes out as truncated and invalid. These are filtered out, but the filtering is buggy and kills real races too. This diff completely replaces quandary traces in RacerD with plain explicit traces. - This will incur the quadratic space/time cost previously saved. See test plan: there is indeed a 30% increase in summary size, but there is no slowdown. In fact, on openssl there is a 10-20% perf increase. - For each endpoint, up to a single trace is used, as before, so no exponential explosion. However, because there is no such thing as expansion, we cannot get it wrong and change the locking context of a trace. - This diff is emulating the previous reporting format as much as possible to allow good signal from the CI. Further diffs up this stack will remove quandary-trace specific things, and simplify further the code. - 2 is not fully addressed -- it will require pushing the `AccessSnapshot` structure inside `TraceElem`. Further diffs. Reviewed By: jberdine Differential Revision: D14405600 fbshipit-source-id: d239117aa
6 years ago
val pp_call : Format.formatter -> CallSite.t -> unit
end
module type TraceElem = sig
type elem_t
(** An [elem] which occured at [loc], after the chain of steps (usually calls) in [trace]. *)
type t = private {elem: elem_t; loc: Location.t; trace: CallSite.t list}
(** Both [pp] and [pp_human] simply call the same function on the trace element. *)
include Element with type t := t
val make : elem_t -> Location.t -> t
[racerd] replace quandary traces with explicit ones Summary: Context: "quandary" traces optimise for space by only storing a call site (plus analysis element) in a summary, as opposed to a list of call sites plus the element (i.e., a trace). When forming a report, the trace is expanded to a full one by reading the summary of the called function, and then matching up the current element with one from the summary, iterating until the trace cannot be expanded any more. In the best case, this can give a quadratic saving, as a real trace gets longer the higher one goes in the call stack, and therefore the total cost of saving that trace in each summary is quadratic in the length of the trace. Quandary traces give a linear cost. HOWEVER, these have been a source of many subtle bugs. 1. The trace expansion strategy is very arbitrary and cannot distinguish between expanded traces that are invalid (i.e., end with a call and not an originating point, such as a field access in RacerD). Plus the strategy does not explore all expansions, just the left-most one, meaning the left most may be invalid in the above sense, but another (not left-most) isn't even though it's not discovered by the expansion. This is fixable with major surgery. 2. All real traces that lead to the same endpoint are conflated -- this is to save space because there may be exponentially many such traces. That's OK, but these traces may have different locking contexts -- one may take the lock along the way, and another may not. The expansion cannot make sure that if we are reporting a trace we have recorded as taking the lock, will actually do so. This has resulted in very confusing race reports that are superficially false positives (even though they point to the existence of a real race). 3. Expansion completely breaks down in the java/buck integration when the trace goes through f -> g -> h and f,g,h are all in distinct buck targets F,G,H and F does not depend directly on H. In that case, the summary of h is simply not available when reporting/expanding in f, so the expanded trace comes out as truncated and invalid. These are filtered out, but the filtering is buggy and kills real races too. This diff completely replaces quandary traces in RacerD with plain explicit traces. - This will incur the quadratic space/time cost previously saved. See test plan: there is indeed a 30% increase in summary size, but there is no slowdown. In fact, on openssl there is a 10-20% perf increase. - For each endpoint, up to a single trace is used, as before, so no exponential explosion. However, because there is no such thing as expansion, we cannot get it wrong and change the locking context of a trace. - This diff is emulating the previous reporting format as much as possible to allow good signal from the CI. Further diffs up this stack will remove quandary-trace specific things, and simplify further the code. - 2 is not fully addressed -- it will require pushing the `AccessSnapshot` structure inside `TraceElem`. Further diffs. Reviewed By: jberdine Differential Revision: D14405600 fbshipit-source-id: d239117aa
6 years ago
val map : f:(elem_t -> elem_t) -> t -> t
val get_loc : t -> Location.t
(** Starting location of the trace: this is either [loc] if [trace==[]], or the head of [trace]. *)
val make_loc_trace : ?nesting:int -> t -> Errlog.loc_trace
val with_callsite : t -> CallSite.t -> t
(** Push given callsite onto trace, extending the call chain by one. *)
(** A powerset of traces. *)
module FiniteSet : FiniteSet with type elt = t
end
(* The [compare] function produced ignores traces but *not* locations *)
module MakeTraceElem (Elem : Element) : TraceElem with type elem_t = Elem.t
(* The [compare] function produced ignores traces *and* locations -- it is just [Elem.compare] *)
module MakeTraceElemModuloLocation (Elem : Element) : TraceElem with type elem_t = Elem.t