Summary:
The problem is that `Models.is_field_nonnullable` didn't differentiate
between
- having a nullable model (in which case we want the field to be NULLABLE),
- not having a model at all (in which case we want the field to be THIRDPARTY_NONNULL).
The problem was noticed only now because previously we didn't have any
NULLABLE field models.
Reviewed By: ngorogiannis
Differential Revision: D27709508
fbshipit-source-id: b98c8f86f
Summary:
This diff evaluates a cpp vector given as a parameter symbolically. Especially, it addresses it as an array, so the cost checker can use its symbolic length correctly.
**About handling `cpp.vector_elem` field:**
The field is a virtual field of vector object that points to the array of vector elements. It was introduced in Inferbo to model semantics of vector operations.
Since many semantics of Inferbo depends on type information, it had collected type information of vector elements, whenever `cpp.vector_elem` field was introduced, as a *side-effect*. A problem is that it has *side-effect*, which means it may introduce non-deterministic analysis results depending on the type information of the virtual field.
This diff changes it not to collect the type information on `cpp.vector_elem` as a side-effect. Instead, it tries to write the information to the abstract states (abstract location) when possible.
Reviewed By: ezgicicek
Differential Revision: D27674935
fbshipit-source-id: f3d52cae7
Summary:
In D27430485 (a6ab4d38cf), we used the static cost of the callee to determine whether it was cheap/expensive. This diff improves on that by taking the whole instantiated cost of the function call (not just the callee's cost).
Also, if the callee is an unmodeled call, we consider it to be expensive as before.
Note: cost instantiation was used by hoisting. I refactored bunch of code there to reuse as much as code possible.
Reviewed By: skcho
Differential Revision: D27649302
fbshipit-source-id: 07d11f3dd
Summary: When instantiating the callee's cost, we have picked up the InferBo memory at the node corresponding to the last instruction. Instead, we should pick up right at the call instruction. Picking it up later might cause arguments to go out of scope.
Reviewed By: skcho
Differential Revision: D27652474
fbshipit-source-id: 5ab35cabb
Summary:
The output differs on Java 11 compared to Java 8: one prints an
interface, the other resolves to a class name.
Reviewed By: ezgicicek
Differential Revision: D27678552
fbshipit-source-id: c5a5d0c39
Summary:
The computation of common prefixes and suffixes was wrong. In
particular, the computation of the common suffix did not correctly
consider the common prefix. This manifested in case one entire
sequence is a suffix of the other.
Reviewed By: ngorogiannis
Differential Revision: D27564874
fbshipit-source-id: 267a75102
Summary:
If either all or no tracing is enabled, it is not necessary to compute
the module and function names for each Trace operation to determine
whether or not it is enabled. This diff optimizes this case.
Reviewed By: ngorogiannis
Differential Revision: D27564869
fbshipit-source-id: bf766eb2f
Summary:
Switch to a stronger but more expensive operation to check if a
symbolic heap is unsatisfiable.
Reviewed By: ngorogiannis
Differential Revision: D27564872
fbshipit-source-id: c8e306408
Summary:
By default compiling with `-O0` adds `optnone` annotations, which
prevents the transformations sledge explicitly performs. This diff
adds a flag to disable this uncooperative behavior.
Reviewed By: ngorogiannis
Differential Revision: D27564879
fbshipit-source-id: 061397e3f
Summary:
`Typ.equivalent` relates types that denote the same sets of values in
the semantic model, such as pointers and integers of the appropriate
size. This diff strengthens the treatment of casts between such types
in the first-order solver by translating `(s)(t)e` to `e` for
equivalent types `s` and `t`.
These casts are usually simplified out of the bitcode produced by
clang. However, code using `_Atomic(...)` leads to `load atomic` llvm
instructions that, for some reason, cast pointers to i64 and back.
Reviewed By: ngorogiannis
Differential Revision: D27564881
fbshipit-source-id: 6138eb4f1
Summary:
When solving equations such as `α[o,l) = β`, the memory theory solver
computes a term to represent the length of α. This fails if α is a
variable (which might itself be equal to a sized term elsewhere in the
formula). This diff fixes such failures by generating a fresh variable
for the length in such situations.
Reviewed By: ngorogiannis
Differential Revision: D27564871
fbshipit-source-id: e5e066b77
Summary:
It is desirable to programmatically enable backtrace printing in debug
mode, but in other modes it should not forcably disabled. For example,
after this diff, executing
```
$ OCAMLRUNPARAM=b sledge.trc ...
```
now prints a backtrace if an unhandled exception is raised.
Reviewed By: ngorogiannis
Differential Revision: D27564883
fbshipit-source-id: 66ea5efe2
Summary:
This diff `#define`s the functions to which `assert` compiles to
`abort`. This enables the analyzer to report assertion violations, and
is independent of platform.
Also add a simple test to check that it is detected correctly.
Reviewed By: ngorogiannis
Differential Revision: D27564870
fbshipit-source-id: 5517ca910
Summary:
The removal of CAMLprim left the code in need of an application of
clang-format. There are various other changes made by clang-format
which it seems ought to be rolled together into this diff.
Upstream Differential Revision: https://reviews.llvm.org/D99477
Reviewed By: ngorogiannis
Differential Revision: D27564876
fbshipit-source-id: 8e22eb099
Summary:
The CAMLprim macro has not been needed since OCaml 3.11, and is
defined to the empty string. This diff removes all instances of it.
Upstream Differential Revision: https://reviews.llvm.org/D99476
Reviewed By: ngorogiannis
Differential Revision: D27564877
fbshipit-source-id: 14745646b
Summary:
The current code does not follow the simple interface to the OCaml GC,
where GC roots are registered conservatively, only initializing
allocations are performed, etc. This is intentional, as stated in the
opening file comments. On the other hand, the current code does
register GC roots in many situations where it is not strictly
necessary. This diff omits many of them.
Upstream Differential Revision: https://reviews.llvm.org/D99475
Reviewed By: ngorogiannis
Differential Revision: D27564873
fbshipit-source-id: f9e5b183f
Summary:
Using the `cstr_to_string` function that allocates and initializes an
OCaml `string` value enables simplifications in several cases. This
change also has the effect of avoiding calling `memcpy` on NULL
pointers even if only 0 bytes are to be copied.
Upstream Differential Revision: https://reviews.llvm.org/D99474
Reviewed By: ngorogiannis
Differential Revision: D27564875
fbshipit-source-id: b316adfe1
Summary:
Using the `caml_alloc_some` and `ptr_to_option` functions that
allocate OCaml `option` values enables simplifications in many
cases. These simplifications also result in avoiding unnecessary
double initialization in many cases, so yield a minor optimization as
well.
Also, change to avoid using the old unprefixed functions such as
`alloc_small` and instead use the current `caml_alloc_small`.
A few of the changed functions were slightly rewritten in the
early-return style.
Upstream Differential Revision: https://reviews.llvm.org/D99473
Reviewed By: ngorogiannis
Differential Revision: D27564884
fbshipit-source-id: 17883785c
Summary:
In several functions an OCaml block is allocated and no further OCaml
allocation functions (or other functions that might trigger allocation
or collection) are performed before the block is fully initialized. In
these cases, it is safe and slightly more efficient to allocate an
uninitialized block.
Also, the code does not become more complex after the non-initializing
allocation, since in the case that a non-small allocation is made, the
initial values stored are definitely not pointers to OCaml young
blocks, and so initializing via direct assignment is still safe. That
is, in general if `caml_alloc_small` is called, initializing it with
direct assignments is safe, but if `caml_alloc_shr` is
called (e.g. for a block larger than `Max_young_wosize`), then
`caml_initialize` should be called to inform the GC of a potential
major to minor pointer. But if the initial value is definitely not a
young OCaml block, direct assignment is safe.
Upstream Differential Revision: https://reviews.llvm.org/D99472
Reviewed By: ngorogiannis
Differential Revision: D27564878
fbshipit-source-id: 20cd69e92
Summary:
Using `Store_field` to initialize fields of blocks allocated with
`caml_alloc_small` is unsafe. The fields of blocks allocated by
`caml_alloc_small` are not initialized, and `Store_field` calls the
OCaml GC write barrier. If the uninitialized value of a field happens
to point into the OCaml heap, then it will e.g. be added to a conflict
set or followed and have what the GC thinks are color bits
changed. This leads to crashes or memory corruption.
This diff fixes a few (I think all) instances of this problem. Some of
these are creating option values. OCaml 4.12 has a dedicated
`caml_alloc_some` function for this, so this diff adds a compatible
function with a version check to avoid conflict. With that, macros for
accessing option values are also added.
Upstream Differential Revision: https://reviews.llvm.org/D99471
Reviewed By: ngorogiannis
Differential Revision: D27564868
fbshipit-source-id: 1dfdd0530
Summary:
This diff pulls in upstream changes to the LLVM OCaml
bindings. Includes upstream commits:
> Add (get/set)_module_identifer functions
> Fix documentation for verify_function and const_of_int64
> DebugInfo support for OCaml bindings
> llvmbuildectomy - compatibility with ocaml bindings
> Remove ConstantPropagation
> Remove and move tests to SCCP.
Also includes updates to llvm-dune to adapt the build to the added
header file.
Reviewed By: ngorogiannis
Differential Revision: D27564717
fbshipit-source-id: af63e2aba
Summary: We have been referring to the arguments of a function call as "params". This has been bothering me. Let's fix it!
Reviewed By: ngorogiannis
Differential Revision: D27649158
fbshipit-source-id: 10e0b28cb
Summary:
To avoid too big abstract states due to instantiated templates in C++,
this diff loosens the compare functions of field names and ungated
callees.
Reviewed By: ezgicicek
Differential Revision: D27625775
fbshipit-source-id: e33e9d34c
Summary:
Nullsafe/biabduction tests were sensitive to Java version: they were recorded for Java 8 but if the machine that is used to run the tests had Java 11, tests would fail. This diff aims to resolve this issue by
- making our tests produce java8-compatible bytecode so that tests don't fail on Java 11 machines
- removing nullsafe tests that exercise obscure Java 8 behavior that cannot be alleviated with backward compatible bytecode on Java 11
- changing lambda argument printing to be Java 11 compatible
Reviewed By: martintrojer
Differential Revision: D27500731
fbshipit-source-id: 77fe302ea
Summary:
Reporting all ungated (un configed?) function calls causes many FPs. Instead, we rely on complexity analysis to determine whether a function is cheap/expensive: if the callee's complexity is not symbolic (e.g. constant), we consider it as cheap and don't keep track of it.
Note that we don't take the instantiated/modeled cost into account yet. So, if we have `foo(int n)` with complexity `O(n)`, and call it as `foo(3)`, we would still keep track of it. Similarly, if `foo` is a modeled function with constant time complexity, we would have no summary for it hence would keep track of it.
These will be improved later.
Reviewed By: skcho
Differential Revision: D27430485
fbshipit-source-id: d5f66320d
Summary:
This diff removes additional inferbo options `--bufferoverrun` from cost tests, since printing
inferbo issues is not that useful to understand cost results.
Reviewed By: ngorogiannis
Differential Revision: D27592496
fbshipit-source-id: 6ab3e6528
Summary:
Whenever an equality "t = v" (t an arbitrary term, v a variable) is
added (or "v = t"), remember the "t -> v" mapping after canonicalising t
and v. Use this to detect when two variables are equal to the same term:
`t = v` and `t = v'` now yields `v = v'` to be added to the equality
relation of variables. This increases the precision of the arithmetic
engine.
Interestingly, the impact on most code I've tried is:
1. mostly same perfs as before, if a bit slower (could be within noise)
2. slightly more (latent) bugs reported in absolute numbers
I would have expected it to be more expensive and yield fewer bugs (as
fewer false positives), but there could be second-order effects at play
here where we get more coverage. We definitely get more latent issues
due to dereferencing pointers after testing nullness, as can be seen in
the unit tests as well, which may alone explain (2).
There's some complexity when adding term equalities where the term
is linear, as we also need to add it to `linear_eqs` but `term_eqs` and
`linear_eqs` are interested in slightly different normal forms.
Reviewed By: skcho
Differential Revision: D27331336
fbshipit-source-id: 7314e127a
Summary:
It's better (=possibly more efficient) to take the opportunity to
normalize linear terms when we can instead of possibly having to apply
the same normalization over and over on individual terms until the next
round of proper normalization.
Reviewed By: skcho
Differential Revision: D27464885
fbshipit-source-id: 0dc01a089
Summary:
When we don't know the value being shifted it may help to translate
bit-shifting into multiplication by a constant as it might surface
linear terms, eg `x<<1` is `2*x`.
Reviewed By: skcho
Differential Revision: D27464847
fbshipit-source-id: 9b3b5f0d0
Summary:
The simplifications done by `simplify_shallow` are all taken care of by
`eval_const_shallow` as well, they just also happen to help when not
*all* of the term is a constant. However, they might be less
precise/efficient than in the constant case, in particular in the next
diff that translates `x << c` into `x * 2^c` when `c` is constant.
Reviewed By: skcho
Differential Revision: D27464805
fbshipit-source-id: 452bc6ab1
Summary:
On some pathological examples of crypto primitives like libsodium, later
diffs make pulse grind to a halt due to an explosion in the size of
literals. This is at least partly due to the fact the arithmetic doesn't
operate modulo 2^64.
Due to the fact the arithmetic is confused in any case when we reach
such large numbers, cap them, currently at 2^128. This removes pathological
cases for now, even now on libsodium Pulse is ~5 times faster than before!
Take this opportunity to put the modified Q/Z modules in the own files.
Reviewed By: jberdine
Differential Revision: D27463933
fbshipit-source-id: 342d941e2
Summary: Just some scaffolding to save a bit of churn from the next diff.
Reviewed By: skcho
Differential Revision: D27328348
fbshipit-source-id: 4f5bfcc65
Summary:
This was added in C++14. Was investigating how SIOF dealt with this but
it turns out it already does the right thing as the translation unit of
global variable templates shows up as the place they are instantiated
(not the one where they are declared), which works well for SIOF
checking.
Reviewed By: da319
Differential Revision: D27500998
fbshipit-source-id: b8b9b9c48
Summary:
This is better suited than the generic "cGeneral_utils", and saves
exporting one of them too.
Reviewed By: da319
Differential Revision: D27500933
fbshipit-source-id: f4224f63b
Summary:
There are not too many cases where the function name is not enough to
disambiguate a trace message, but it is still perhaps more
approachable to include the module names as well.
Reviewed By: jvillard
Differential Revision: D27396914
fbshipit-source-id: ea4c8b44f
Summary: One source of non-deterministic diff result is when there are multiple overloaded methods the cardinals of unchecked callees of which are the same. This diff tries to select one of them in a more deterministic manner.
Reviewed By: ezgicicek, ngorogiannis
Differential Revision: D27430757
fbshipit-source-id: 38ba5d8dc