Summary:
This one isn't caught because we don't destruct temporaries that are
bound to a const reference. According to the C++ standard these should
get destroyed when the const reference gets destroyed but instead we
just don't destroy them for now.
Reviewed By: mbouaziz
Differential Revision: D15760209
fbshipit-source-id: 32c935ec0
Summary:
In a next diff temporaries will get destructed at the end of their
lifetimes and that naive model would be causing false positives.
The flipside is that we lose all reports on closures for now, will need
to model them separately later.
Reviewed By: mbouaziz
Differential Revision: D15695943
fbshipit-source-id: c2c482c02
Summary:
Needed for next diff: we'll need to do 2 passes on the AST to collect
the temporaries to destroy at the end of an `ExprWithCleanups`, but the
SIL names of these temporaries are generated freshly on the fly so they
would get different names if we do it naively.
This adds a hashmap to the translation context so the temporary
corresponding to a given `MaterializeTemporyExpr` is only generated once
and then reused.
Reviewed By: mbouaziz
Differential Revision: D15674212
fbshipit-source-id: 0e16062d9
Summary:
This started as an attempt to understand how to modify the frontend to
inject destructors for C++ temporaries (see next diffs).
This diff rewrites the existing logic for computing the list of
variables that should be destroyed at the end of each statement, either
because it's the end of their syntactic scope or because control flow
branches outside of their syntactic scope.
The frontend translates a function from the last instructions to the
first, but scope computation needs to be done in the other direction, so
it's done in a separate pass *before* the main translation happens. That
first pass creates a map from statements in the AST to the list of
variables that should be destroyed at the end of these statements. This
is still the case now.
Before, that map would be computed in a bit of a weird way: scopes are
naturally a stack but instead of that the structure maintained was a
flat list + a counter to know where the current scope ended in that
list.
In this diff, redo the computation maintaining a stack of scopes
instead, which is a bit cleaner. Also treat more instructions as
introducing a new scope, eg if, for, ...
Reviewed By: mbouaziz
Differential Revision: D15674208
fbshipit-source-id: c92429e82
Summary:
Somewhat trivial: add a string to "Destruction" nodes to indicate why
they were created. Rename the main `instruction_aux` function into
`instruction_translate` (see next diff for why).
Reviewed By: mbouaziz
Differential Revision: D15674211
fbshipit-source-id: 8a7eda72c
Summary:
I rewrote the test so it doesn't need any C++ headers so that:
- it's easier to see what's going on
- it's easier to debug: the whole AST is now somewhat readable vs before
the headers made it impossibly long
Reviewed By: ezgicicek
Differential Revision: D15674213
fbshipit-source-id: d98941983
Summary:
This diff introduces a `-lib-fuzz` flag to `buck link`, which links in a
simple main that calls the LLVMFuzzerTestOneInput function, which is the
entry point of libFuzzer fuzzer.
Reviewed By: jberdine, jvillard
Differential Revision: D15821512
fbshipit-source-id: cff731ed3
Summary: This allows to match `foo<int_&>` and many other horrible names.
Reviewed By: mbouaziz
Differential Revision: D15825403
fbshipit-source-id: c892033aa
Summary:
I realized that there was a discrepancy in the # of instructions between whether we run a single analysis or multiple analyses at the same time. It turns out that in biabduction, bufferoverrun and other HIL analyses we did Preanalysis step (which adds scope instructions and invokes liveness etc.) but not in others. This discrepancy results in inconsistent analysis results (e.g. in the new inefficient-keyset-iterator) that rely on instructions. We should be consistent. Hence, we now invoke Preanalysis in the frontend and remove all other uses in the rest of the checkers.
Consequently, I had to update the inefficient-keyset-checker to take the CFG resulting from Preanalysis with extra scoping instructions.
Reviewed By: mbouaziz, ngorogiannis, jvillard
Differential Revision: D15803492
fbshipit-source-id: 4e21eb610
Summary:
Previous change to allow bitcasts in call instructions was too strict
and did not allow for indirect calls.
Reviewed By: jberdine
Differential Revision: D15803262
fbshipit-source-id: 40d828b59
Summary:
Currently printing symbolic heaps is unreadable, because there are too
many quantified variables, that are mostly just equal to other
variables.
This diff tries to replace all variables in an equivalence class with a
single variable and remove the unneccesary variables.
It also introduces two modes for printing state domains:
`-t +State_domain.pp_full` prints the state domain as is
`-t +State_domain.pp` uses the simplification before printing.
Reviewed By: jberdine
Differential Revision: D15738748
fbshipit-source-id: 7c85b580e
Summary:
Print pre- and post- conditions (aka, summaries) when analyzer hits a
function return
- plumbing the precondition through the analyzer
so that it is available when return is hit
Reviewed By: jberdine
Differential Revision: D15713725
fbshipit-source-id: b10b6206f
Summary:
This diff adds a `__llair_alloc` intrinsic which is modeled
as a non-failing malloc. Using it instead of `malloc` increases
the readbility of symbolic heaps, because it removes all the cases
where malloc failed.
Note that `assert(malloc())` does not have the desired effect.
Reviewed By: ngorogiannis
Differential Revision: D15778817
fbshipit-source-id: d02784077
Summary:
Some call instructions in LLVM bitcast the function,
for example
`%call = call i32 (i64, ...) bitcast (i32 (...)* @__llair_alloc to i32 (i64, ...)*)(i64 %conv)`
This would cause sledge to crash in LLVM when build with assertions.
Reviewed By: jberdine
Differential Revision: D15779003
fbshipit-source-id: c273f92db
Summary:
This is a simple checker that identifies inefficient uses of `keySet` iterator where (not only the key but also) the value is accessed via `get(key)`. It is more efficient to use `entrySet` iterator which already returns both key-value pairs. This optimization would get rid of many extra lookups which can be expensive.
We simply traverse the CFG starting from the loop head upwards and pick up the map that is iterated over. Then, we check in the loop nodes if there is a call to `get(...)` over this map. If, so we report.
Reviewed By: ngorogiannis
Differential Revision: D15737779
fbshipit-source-id: 702465b4e
Summary:
This diff adds a formal parameter to each non-void-returning function
to name the return value, and similarly a formal parameter for the
thrown exception value. These are interpreted as call-by-reference
parameters, so that they can be constrained in formulas to e.g. be
equal to the return value, and are still in scope when the function
returns, and so can be passed to the return block. Prior to
summarizing functions, this means that these formals need to be
tracked on the analyzer's control stack.
This will be needed to express function specs/summaries in terms of
formals, and fixes a bug where in some cases return values were not
tracked correctly.
Reviewed By: kren1
Differential Revision: D15738026
fbshipit-source-id: fff2d107c
Summary:
Previously the locals of a function were computed after backpatching
the blocks in its cfg. This resulted in loss of sharing, and incorrect
locals if queried through the parent of a block.
Reviewed By: kren1
Differential Revision: D15738027
fbshipit-source-id: d7e70530a
Summary:
Disable exceptional control flow
- treat throw as unreachable
- confidence in the correctness of the frontend's treatment of
exception handling is very low, and making summaries that are
expressive enough to talk about exceptions is a complication
that isn't needed for the first iteration
To facilitate, start on a struct that holds all the CL options.
Reviewed By: jberdine, jvillard
Differential Revision: D15713601
fbshipit-source-id: ee92dfbd8
Summary:
Move genrule capture integration logic from shell to OCaml.
Also, stop relying on side-effects of buck compilation for constructing the infer-deps.txt file used for merging. Now this is obtained by passing `--show-output` to buck, which spits out the `buck-out` output paths to the targets we asked to build.
Reviewed By: ezgicicek
Differential Revision: D15715608
fbshipit-source-id: 8fa896ba6
Summary:
The synthetic methods from `topl.Property` are now nonempty: they
simulate a nondeterministic automaton.
Reviewed By: jvillard
Differential Revision: D15668471
fbshipit-source-id: 050408283
Summary:
Instrument SIL according to TOPL properties. Roughly, the
instrumentation is a set of calls into procedures that simulate a
nondeterministic automaton. For now, those procedures are NOP dummies.
Reviewed By: jvillard
Differential Revision: D15063942
fbshipit-source-id: d22c2f6fa
Summary:
This diff adapts the test scripts to the new sledge CLI, and reworks
them to enable checking changes with respect to a baseline. In
particular, now
```
make -C test
```
has exit code 0 if the current test results match the expected ones,
and otherwise prints the diff. Also,
```
make -C test promote
```
promotes the current test results to the new baseline.
Reviewed By: kren1
Differential Revision: D15706573
fbshipit-source-id: 0cbf3231e
Summary:
Include cxa_default_handlers.cpp to bring in definitions for
__cxa_terminate_handler and __cxa_unexpected_handler.
Reviewed By: kren1
Differential Revision: D15712980
fbshipit-source-id: f536930a8
Summary:
Sometimes calls to the `abort` C stdlib function appear as `invoke`
instructions in LLVM. They should be translated to the LLAIR abort
instruction just like the non-raising `call abort` case.
Reviewed By: kren1
Differential Revision: D15706574
fbshipit-source-id: 1509ed0e3
Summary:
Most binary and ternary operations have the same type as their
arguments, so try to compute the type of arguments in these cases.
Reviewed By: kren1
Differential Revision: D15706576
fbshipit-source-id: 4749d6e32
Summary:
When multiple buck java tests use the same `buck-out` they sometimes fail. This isn't surprising, as they presumably clobber each other's output when running on the same files.
Since there is no reason to have this global, shared buck repo, create one for each test, inside the test directory. Also, clean up the Makefiles a bit -- they provide bogus compile targets, for example, and have mostly wrong source dependencies.
That done, remove the `testlock` crutch which enforces mutual exclusion between tests, from the buck/java tests.
I do not understand why the buck clang tests can share the global repo without failure, but there you go.
Reviewed By: jvillard
Differential Revision: D15579133
fbshipit-source-id: 7eff79173
Summary:
When LLVM is built with assertions, it crash
`add_sym` if you try to get the global scope of a non global value.
This patch special cases add_sym, to just do nothing when `llv` is
an `UndefValue`.
Also enhances debuging printout of transalte to include the number of
functions and globals.
Reviewed By: jvillard
Differential Revision: D15669447
fbshipit-source-id: 4b5483810
Summary:
The entry point functions are used in a couple of places, this
puts them in a single source of truth in the config file.
Reviewed By: jvillard
Differential Revision: D15651976
fbshipit-source-id: a572e8d4d
Summary:
This adds a globalopt optimization pass to sledge.
Consider code like:
```
const char *a_string = "I'm a string";
int an_int = 0;
int c() {
return an_int;
}
int main() {
char *c1 = a_string;
return c();
}
```
When compiled there are 2 levels of indirection. For example
`return an_int` Get's compiled as
```
%0 = load i32, i32* an_int1
ret i32 %0
```
Global opt reduces this (if `an_int` is internal) to just
` ret i32 0`.
Similarly and more importantly
`c1 = a_string;` get's compiled into
```
@.str = private unnamed_addr constant [13 x i8] c"I'm a string\00"
a_string = dso_local global i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i32 0, i32 0)
%c1 = alloca i8*, align 8
%0 = load i8*, i8** a_string, align 8, !dbg !25
store i8* %0, i8** %c1, align 8, !dbg !24
```
So there is a level of indirection between `c1` and `.str` where the string is stored.
With global opt, this gets reduced to:
```
@.str = private unnamed_addr constant [13 x i8] c"I'm a string\00"
%c1 = alloca i8*, align 8
store i8* getelementptr inbounds ([13 x i8], [13 x i8]* @.str, i64 0, i64 0), i8** %c1, align 8, !dbg !23
```
and `a_string` variable gets deleted.
On sledge this has the effect of reducing the complexity of the symbolic heap significantly.
Without this optimisation, running
`sledge.dbg llvm analyze -trace Domain.call global_vars.bc`
Gives prints the following segments:
```
∧ %.str -[)-> ⟨13,{}⟩
* %a_string -[)-> ⟨8,%.str⟩
* %an_int -[)-> ⟨4,0⟩
* %c1 -[)-> ⟨8,%.str⟩
* %retval -[)-> ⟨4,0⟩
```
So there are `an_int` and `a_string` segments, which are redundant.
with the optimisation, the heap looks like:
`∧ %.str -[)-> ⟨13,{}⟩ * %c1 -[)-> ⟨8,%.str⟩ * %retval -[)-> ⟨4,0⟩`,
Where we only have the `.str` segment and the `c1` segment, which are the two we need.
Reviewed By: ngorogiannis
Differential Revision: D15649195
fbshipit-source-id: 5f71e56e8
Summary: Not sure how that happens but it does. Instead of crashing, log the error and continue.
Reviewed By: martintrojer
Differential Revision: D15660008
fbshipit-source-id: c87e724d4
Summary: The previous commit broke the `--foo arg` case because it matched `--foo` in the case looking for `--foo=`.
Reviewed By: mbouaziz
Differential Revision: D15670472
fbshipit-source-id: ab81c7357
Summary: There's currently no way to skip these when they are passed to clang.
Reviewed By: martintrojer
Differential Revision: D15669132
fbshipit-source-id: be97d2638