Summary:
This continues work of getting rid of using low-level Annot.Item.t in
favor of a new, more specific and flexible data structure.
Migrating this code further to NullsafeRules is bit tricky right now, so
let's defer it.
Reviewed By: ngorogiannis
Differential Revision: D17549479
fbshipit-source-id: 418b4b394
Summary:
This diff also introduces "subtyping function" into NullsafeRule, which
will be the core check for other rules we will introduce in follow up
diffs
Reviewed By: ngorogiannis
Differential Revision: D17500466
fbshipit-source-id: 5821caa6e
Summary:
As suggested by artempyanykh:
1. Since we recently introduced InferredNullability, AnnotatedNullability deserves its own class which now plays nicely with its counterpart.
2. AnnotatedType is more specific then NullsafeType
Reviewed By: artempyanykh
Differential Revision: D17547988
fbshipit-source-id: 785def23a
Summary:
The LLVM semantics and translation was not consistently treating the
1-bit word value condition as signed or unsigned.
Reviewed By: jberdine
Differential Revision: D17605766
fbshipit-source-id: 77edf63b7
Summary:
Previously the LLVM semantics did the phi instructions at the head of a
block as part of executing the branch into that block. This looked a bit
weird, but had the advantage that the semantics knew which block was
being jumped from, which is necessary to run the phi instructions.
However, it meant that the rules for doing phi instructions would need
to show up with each branching construct. It was also annoying for the
LLVM->llair proof, since the phis are removed and their effect happens as
a distinct step from the branch.
Here we add a distinct Phi_ip instruction pointer to indicate that the
phi instructions at the start of the block should execute next, and then
be incremented to the usual numeric instruction pointer that points to
the non-phi instructions. The Phi_ip contains the identity of the
previous block.
Reviewed By: jberdine
Differential Revision: D17452416
fbshipit-source-id: 78fef7cca
Summary:
Give the llair semantics observable side effects (writes to global
variables) and a semantic function mirroring the LLVM semantics. Start
sketching out the LLVM/llair translation equivalence proof in a top-down
way from the obvious statement of equality of the semantics.
Reviewed By: jberdine
Differential Revision: D17399654
fbshipit-source-id: 2170678a8
Summary:
The simple LLVM semantics steps one instruction at a time, but the
generated llair does whole blocks at a time, since many individual LLVM
instructions can become a single llair expression. We add a bigger-step
LLVM semantics that does whole blocks at a time (except that it also
stops at function calls, since those end blocks in llair). The steps in
this bigger-step semantics should be at the same granularity as the
llair steps, making it easier to verify the translation.
We add a notion of observation to the LLVM semantics (right now, just
global variable writes) and use that to define two top-level semantic
functions, which we prove to be equivalent.
Reviewed By: jberdine
Differential Revision: D17396016
fbshipit-source-id: ee632fb92
Summary: The analysis is not intra-procedural, hence we don't really read the payload. Let's remove it.
Reviewed By: ngorogiannis
Differential Revision: D17603911
fbshipit-source-id: c92b5c602
Summary:
This diff avoids giving the top value to unknown globals in Java,
because they harm precision of the cost checker. Instead, it doesn't
subst the global symbols at function calls.
Reviewed By: ezgicicek
Differential Revision: D17498714
fbshipit-source-id: d1215b3aa
Summary:
This diff adds an eval mode for the substitutions of the cost results, in order to avoid precision
loss by joining two symbols.
The usual join of two different symbolic values, `s1` and `s2`, becomes top due to the limitation of
our domain. On the other hand, in the new eval mode, it returns an upperbound `s1+s2`, because the
cost values only care about the upperbounds.
Reviewed By: ezgicicek
Differential Revision: D17573400
fbshipit-source-id: 2c84743d5
Summary:
This was causing a crash, because when trying to create a procname from a block at that point we don't have the block return type, which is needed for the name. I don't understand why BlockDecl doesn't contain the type, but I looked again and it doesn't (also in clang). So in general we need to pass it from the context, but that's not possible in this case.
Also, one could argue that such a block is not a method from the struct, since it's just a block that is assigned to a field as initialization.
Reviewed By: skcho
Differential Revision: D17575197
fbshipit-source-id: 3974ead3f
Summary:
This diff allows domains to specify which abstract states can or can't
be merged together by the worklist. In particular, this is needed for
relational domains to ensure that Hoare triples are joined only when
they share a precondition.
Reviewed By: jberdine
Differential Revision: D17571148
fbshipit-source-id: d9345fdc9
Summary: When we have an annotation like `Prop(varArg = X)` or ` ThreadSafe(enableChecks = true)`, we were not able to pick up the names of the parameters like `varArg` or `enableChecks`. This diff fixes that.
Reviewed By: skcho, ngorogiannis
Differential Revision: D17571377
fbshipit-source-id: 5293b5810
Summary:
Events can be many things, including lock acquisitions. Lock state keeps a set of events, all of which must be lock acquisitions.
Enforce this via the type checker by specialising the types so that lock state satisfies this by construction.
Reviewed By: ezgicicek
Differential Revision: D17571428
fbshipit-source-id: 2f5a33b98
Summary:
Instead of polluting the signature of trace endpoints, have
the call printer be a module argument to the functors
producing trace elements.
Reviewed By: skcho
Differential Revision: D17550111
fbshipit-source-id: ab5af94c6
Summary:
This diff adds a "-prenalyze-globals" flag to all analyze targets
which, when set, computes used-globals sets for all reachable
functions and then uses that information to track only relevant
global variables at calls in the main analysis.
Reviewed By: jberdine, jvillard
Differential Revision: D17526746
fbshipit-source-id: 1a114285c
Summary:
Fixes a bug in Llair.Frontend.xlate_value where the l-val register
of LLVM instruction calls was being marked as global.
Reviewed By: jberdine
Differential Revision: D17570458
fbshipit-source-id: e1b5924e2
Summary:
This proceed the work of getting rid of Annot.Item.t.
This diff:
- Moves "check assignment rule" to recently supported NullsafeRules
- Implements their own "check overannotated" (defers consolidating this
check into NullsafeRule for the future diffs).
Note that we don't need PropagatesNullable logic anymore because it is
already ported to NullsafeType (return value will be marked as Nullable
in NullsafeType)!
implicit_nullable (a.k.a. Void types) will require a follow up diff to
model.
Reviewed By: artempyanykh
Differential Revision: D17499246
fbshipit-source-id: 14b473f29
Summary:
In nutshell, Nullsafe is driven by relatively simple set of rules.
It is currently not well reflected in code: we are duplicating the same logic in different places, which is:
- error prone (we need to adjust ensure all places are addressed if a new feature is introduced)
- complicates understanding of nullsafe
Consolidating checks will simplify introducing Unknown Nullability and
strict/partial check modes.
## this diff
This diff does it for one particular check.
See follow up diffs re that proceed consolidation.
## future diffs
Future diffs will:
- consolidate other checks that use 'assignment rule'
- introduce other rules, most notably 'dereference rule' and
'inheritance rule'
Reviewed By: artempyanykh
Differential Revision: D17498630
fbshipit-source-id: 079d36518
Summary:
Now, after series of modifications with TypeAnnotation, we are ready to
rename it to reflect what it means in the code.
See the documentation in the class for details.
Also:
- renamed methods for clarity
- added some documentation
- renamed local variables around the usages
Reviewed By: jvillard
Differential Revision: D17480799
fbshipit-source-id: d4408887a
Summary:
This continues work for eliminating Annot.Item.t from Nullsafe low-level
code.
The introduced function `from_nullsafe_type` is called when we infer
initial type of the equation based on the function or field formal signature.
Before that, we did it via reading the annotation directly, which
complicates the logic and making introducing Unknown nullability tricky.
## Clarifying the semantics of PropagatesNullable
This diff also clarifies (and changes) the behavior of PropagatesNullable params.
Previously, if the return value of a function that has PropagatesNullable params was
annotated as Nullable, nullsafe was effectively ignoring PropagatesNullable effect.
This is especially bad because one can add Nullable annotation based on the logic "if the function can return `null`, it should be annotated with Nullable`.
In the new design, there is no possibility for such a misuse: the code that
applies the rule "any param is PropagatesNullable hence the return
value is nullable even if not explicitly annotated" lives in NullsafeType.ml, so
this will be automatically taken into account.
Meaning that now we implicitly deduce Nullable annotation for the return value, and providing it explicitly as an alternative that does not change the effect.
In the future, we might consider annotating the return value with `Nullable` explicit.
Reviewed By: jvillard
Differential Revision: D17479157
fbshipit-source-id: 66c2c8777
Summary:
In the cost checker, the range of selected control variables are used to estimate the number of loop iteration. However, sometimes the ranges of control variables are not related to how many times the loop iteration. This diff strengthens the condition for them as:
1. integers from `size` models
2. integers constructed from `+` or `-`
3. integers constructed from `*`
For the last one, the loop iteration is likely to be log scale of the range of the control variable:
```
while (i < c) {
i *= 2;
}
```
We will address this in the future.
Reviewed By: ezgicicek
Differential Revision: D17365796
fbshipit-source-id: c1e709ae8
Summary: Our annotation parameter parsing is too primitive to identify `resType` and before we only assumed that all Prop's can be set by any of the two suffixes: `Attr` and `Res`. After talking to Litho team, there is 3 more additions to these suffixes: `Dip`, `Sip`, and `Px`.
Reviewed By: ngorogiannis
Differential Revision: D17528482
fbshipit-source-id: 8d7f49130
Summary: Before, we were mistakenly checking any annotation that ends with Prop such as TreeProp. This was wrong. Instead, we should only check Prop as adviced by the Litho team.
Reviewed By: ngorogiannis
Differential Revision: D17527769
fbshipit-source-id: b753dd87a
Summary:
Introduce a new experimental checker (`--impurity`) that detects
impurity information, tracking which parameters and global variables
of a function are modified. The checker relies on Pulse to detect how
the state changes: it traverses the pre and post pairs starting from
the parameter/global variable and finds where the pre and post heaps
diverge. At diversion points, we expect to see WrittenTo/Invalid attributes
containing a trace of how the address was modified. We use these to
construct the trace of impurity.
This checker is a complement to the purity checker that exists mainly
for Java (and used for cost and loop-hoisting analyses). The aim of
this new experimental checker is to rely on Pulse's precise
memory treatment and come up with a more precise im(purity)
analysis. To distinguish the two checkers, we introduce a new issue
type `IMPURE_FUNCTION` that reports when a function is impure, rather
than when it is pure (as in the purity checker).
TODO:
- improve the analysis to rely on impurity information of external
library calls. Currently, all library calls are assumed to be nops,
hence pure.
- de-entangle Pulse reporting from analysis.
Reviewed By: skcho
Differential Revision: D17051567
fbshipit-source-id: 5e10afb4f
Summary:
As per previous diff, attempt to allocate fewer strings. This doesn't
seem to affect perf although allocating less might reduce memory
pressure.
Reviewed By: mityal
Differential Revision: D17423973
fbshipit-source-id: e2e37b071
Summary:
My spidey senses were tingling. Next diff uses the `pp` functions
everywhere it was kind of obvious how to change the code to do so. It
doesn't improve perf but is less clowny that way. It might lessen memory
pressure since allocating strings is expensive and this code was doing a
lot of it.
Reviewed By: ngorogiannis
Differential Revision: D17450324
fbshipit-source-id: 632cee584
Summary:
The code was already trying to do that but failing. Now it works.
This revealed a slight bug where the progress bar would always stop at
N-1/N 99% jobs. Fixed by moving the progress bar updates *after* the
operation that might decrease the number of jobs left.
Reviewed By: mityal
Differential Revision: D17423978
fbshipit-source-id: fc32db5f3
Summary:
Previously we would incorrectly report the time for the whole process
and this could include capture time too.
Reviewed By: mityal
Differential Revision: D17423977
fbshipit-source-id: b3ed754b3
Summary: A year after the last upgrade of Buck, it's time to upgrade again.
Reviewed By: artempyanykh
Differential Revision: D17527266
fbshipit-source-id: 57e7f157d
Summary: We should be able to run this processing ast steps without running linters or capture. This also adds a new module ProcessAST to do the processing, Capture.ml should not know anything else than calling the respective modules for capture, linting or processing.
Reviewed By: ngorogiannis
Differential Revision: D17501453
fbshipit-source-id: 30adba5b1
Summary:
`ModeledRange` represents how many times the interval value can be updated by modeled functions. This
domain is to support the case where there are mismatches between value of a control variable and
actual number of loop iterations. For example,
```
while((c = file_channel.read(buf)) != -1) { ... }
```
the loop will iterates as the file size, but the control variable `c` does not have that value. In
these cases, it assigns a symbolic value of the file size to the modeled range of `c`, then which
is used when calculating the overall cost.
Reviewed By: jvillard
Differential Revision: D17476621
fbshipit-source-id: 9a81376e8
Summary:
Fixes a bug where are all calls are treated as intrinsics in used
globals analysis, since exec_intrinsic is invoked at _all_ calls
to determine which are intrinsic, not only at call sites known to
target intrinsics.
Reviewed By: jberdine
Differential Revision: D17499406
fbshipit-source-id: 41f7621f2
Summary:
While the symbolic heap analysis ends its search upon hitting the
bound on recursion depth, the used-globals analysis should instead
simply skip recursive calls beyond the depth. Note that this is
unsound for arbitrary abstract domains, however, and the flag
controlling this feature should be used with caution.
Note that procedure calls are still not handled correctly, since
Used_globals.exec_intrinsic does not properly check whether callees
are intrinsic. A forthcoming commit will fix that, as well.
Reviewed By: jberdine
Differential Revision: D17479753
fbshipit-source-id: aa92e0ef3
Summary:
Include global variables used in function callees in used globals
analysis. Also adds support for arbitrary changes to symbolic
state while resolving callees in other analyses.
Reviewed By: jberdine
Differential Revision: D17479352
fbshipit-source-id: e3cd9f179
Summary:
1/ Nikos Gorogiannis pointed out that
- for highly reused public types, records (especially when >= 3 params) are generally more readable than tuples.
- Records simplify code modifications, especially adding new fields. And we are going to add some, namely param flags, in the future.
2/ Let's make the fact that annotated signature is deprecated more
visible; it will also simplify searching for usages when we will be
getting rid of them.
Reviewed By: ngorogiannis
Differential Revision: D17475033
fbshipit-source-id: 7740c979b
Summary:
- Instead of merging one target DB into the main DB at a time, merge all target DBs into an in-memory DB (thus, no writing) and then dump it into the main DB at the end. This makes merging faster.
- When using the sqlite write daemon, there is no reason to drive the merge process from the master, sending each individual target to merge down the socket and doing one DB merge at a time. Here we move all the DB merging logic in the daemon, and expose a single function that does it all.
- Refactor some common functionality (notably the `iter_infer_deps` function is now in `Utils`) and remove dead files.
This can be also done using a temporary DB (which is not limited to memory) but this showed worse perf in tests than the in-memory solution as well as the current state of things (! possibly Sqlite-version related?).
Reviewed By: skcho
Differential Revision: D17182862
fbshipit-source-id: a6f81937d
Summary:
Replace custom version reporting support using a shell script with
code using dune's Build_info API.
Note that after this diff, the executables under _build/<context> are
not version-stamped, but those under _build/_install are. The symlinks
in bin point to the latter, stamped, exes.
Reviewed By: bennostein
Differential Revision: D16985446
fbshipit-source-id: 7afac87be
Summary:
`get_field_annotation` is (together with
`get_modelled_annotated_signature`) an entry point when Nullsafe fetches
annotation information.
In follow up diffs we are going to utilize added information; see also
TODO in the code
Reviewed By: ngorogiannis
Differential Revision: D17475034
fbshipit-source-id: dab77bc7b
Summary:
"Unannotated" is misleading and ambiguous concept, it can have different
meanings depending on agreements.
The current logic treats them as Nonnull, which is exactly what we want
to preserve.
(If we need to partially model some functions where we don't have
opinion on some of types in the signature, we can explicitly model
unknown nullability later on).
Note that I am not aiming for substantial refactoring of
modelsTables.ml; the scope of this diff is merely to clarify things.
Reviewed By: ngorogiannis
Differential Revision: D17449347
fbshipit-source-id: 43c798ce7
Summary:
This function is the main entry point for getting annotated signature
for nullsafe.
We will modify it and its callees in follow up diffs to migrate other
features of Annot.items to specialized types.
Reviewed By: ngorogiannis
Differential Revision: D17448082
fbshipit-source-id: be00b4737
Summary:
Adds an abstract domain to track global variable usages, as well as supporting
changes to the frontend, IR and CLI. This analysis will support optimizations
to the main symbolic-heap analysis, but for now can be invoked independently
through the `-domain` flag on `analyze` targets of the Sledge executable.
Reviewed By: jberdine
Differential Revision: D17422212
fbshipit-source-id: 74bed0a76
Summary:
This is a central abstraction for coming future unknown nullability support.
# Context
Annot.ml is a low-level module:
- it contains lists of raw (string) annotations
- no algebraic datatypes for annotations
- it mixes annotations that Nullsafe should be aware of with all sorts of other annotations
- some annotations make sense for return values, some make sense for params, and some make sense for methods.
But, most importantly, it does not contain information about source of an annotation, making it hard to distinct things like "Nonnull as default" vs "Nonnull as explicitly annotated" vs "Nonnull as modelled". Ditto for nullable.
Because of this, it is tricky to introduce unknown nullability in an elegant way.
Let's get rid of using Annot.Item.t in nullsafe code in the following way:
- Move nullability information associated with the Java type to a dedicated algebraic DT.
- Split other annotations that are important for nullsafe into param flags, ret value flags, and method flags, and introduce corresponding datatypes.
# This diff
This diff introduces NullsafeType and adds this to AnnotatedSignature.
It is not used yet, hence the diff is a no-op.
In future diffs, we are going to (see also TODOs in the code):
- actually use this information instead of accessing Annot.item
- add more information to AnnotatedSignature
- remove Annot.item from AnnnotatedSignature
- when this is done, introduce notion of unknown nullability.
Reviewed By: ngorogiannis
Differential Revision: D17420595
fbshipit-source-id: b30706d9b
Summary:
This diff extends the `Simple` alias domain to address Java's
temporary variables better. It now has an additional field to denote
an alias temporary variable.
Reviewed By: jvillard
Differential Revision: D17421907
fbshipit-source-id: 8b8b47461