Summary:
I realized that there was a discrepancy in the # of instructions between whether we run a single analysis or multiple analyses at the same time. It turns out that in biabduction, bufferoverrun and other HIL analyses we did Preanalysis step (which adds scope instructions and invokes liveness etc.) but not in others. This discrepancy results in inconsistent analysis results (e.g. in the new inefficient-keyset-iterator) that rely on instructions. We should be consistent. Hence, we now invoke Preanalysis in the frontend and remove all other uses in the rest of the checkers.
Consequently, I had to update the inefficient-keyset-checker to take the CFG resulting from Preanalysis with extra scoping instructions.
Reviewed By: mbouaziz, ngorogiannis, jvillard
Differential Revision: D15803492
fbshipit-source-id: 4e21eb610
Summary:
Instrument SIL according to TOPL properties. Roughly, the
instrumentation is a set of calls into procedures that simulate a
nondeterministic automaton. For now, those procedures are NOP dummies.
Reviewed By: jvillard
Differential Revision: D15063942
fbshipit-source-id: d22c2f6fa
Summary:
This was hardcoded to `true` and its purpose is unclear to me. I kill
what confuses me.
Reviewed By: jeremydubreil
Differential Revision: D15294783
fbshipit-source-id: 3c1c469ee
Summary:
TOPL properties are essentially automata, which will be modeled as a set
of procedures. The code-to-analyze makes calls into these procedures,
thereby driving the automaton. In this commit, these calls do not do
anything. The point is to prepare the hook-up mechanism.
Reviewed By: jvillard
Differential Revision: D14819650
fbshipit-source-id: d95ecdb3d
Summary:
We get messages like " object returned by `getArguments()` at line 101."
instead of " object returned by `getArguments()` could be null and is
dereferenced at line 101.". Tracking it down, it happens for
nullable-looking values, but I don't know why.
It seems that something regressed but I couldn't track it down.
So, just generate the error message in the same way as for non-nullable
objects in this case to fix the non-sensical message.
Reviewed By: jeremydubreil
Differential Revision: D14972325
fbshipit-source-id: 2a97501cc
Summary:
Increases precision a bit. I didn't observe speed problems on what I tested. (But, who knows?)
Closes https://github.com/facebook/infer/pull/799
Reviewed By: jvillard
Differential Revision: D6284206
Pulled By: rgrig
fbshipit-source-id: 6f1e8631f
Summary:
Instead of emitting an ad-hoc builtin on variable declaration emit a new
metadata instruction. This allows us to remove the code matching on that
ad-hoc builtin that had to be inserted in several checkers.
Inferbo & pulse used that information meaningfully and had to undergo
some minor changes to cope with the new metada instruction.
Reviewed By: ezgicicek
Differential Revision: D14833100
fbshipit-source-id: 9b3009d22
Summary:
Bundle all non-semantic-bearing instructions into a `Metadata _`
instruction in SIL.
- On a documentation level this makes clearer the distinction between
instructions that encode the semantics of the program and those that are
just hints for the various backend analysis.
- This makes it easier to add more of these auxiliary instructions in
the future. For example, the next diff introduces a new `Skip` auxiliary
instruction to replace the hacky `ExitScope([], Location.dummy)`.
- It also makes it easier to surface all current and future such
auxiliary instructions to HIL as the datatype for these syntactic hints
can be shared between SIL and HIL. This diff brings `Nullify` and
`Abstract` to HIL for free.
Reviewed By: ngorogiannis
Differential Revision: D14827674
fbshipit-source-id: f68fe2110
Summary:
TOPL properties are essentially automata, which specify a bad pattern.
This commit is just a parser for them.
Reviewed By: jvillard
Differential Revision: D14477671
fbshipit-source-id: c38a8ef37
Summary:
While adding a footprint frame during rearrangement, the footprint
variables should be fresh with respect to the current state too, not
only with respect to he footprint, because the frame is added to the
state.
Reviewed By: jberdine
Differential Revision: D14401026
fbshipit-source-id: 20ea4485a
Summary:
To meet the pure parts of formulas, the process was to (a) call Rename.extend
with variables occuring in similar places and (b) extract substitutions out of
those. Two matching primed vars would both be replaced by some fresh primed var.
However, equivalence classes of primed variables would *not* be replaced by
one fresh (primed) variable. Now, that should work.
Reviewed By: mbouaziz
Differential Revision: D14150192
fbshipit-source-id: 90ca9216c
Summary:
Before, the liveness pre-analysis would place extra instructions in the
CFG for either:
1. marking an `Ident.t` as dead, or
2. marking a `Pvar.t` as `= 0`
But we have no way of marking pvars dead without setting them to 0. This
is bad because setting pvars to 0 is not possible everywhere they are
dead. Indeed, we only do it when we haven't seen their address being
taken anyway. This prevents the following situation, recorded in our tests:
```
int address_taken() {
int** x;
int* y;
int i = 7;
y = &i;
x = &y;
// if we don't reason about taken addresses while adding nullify instructions,
// we'll add
// `nullify(y)` here and report a false NPE on the next line
return **x;
}
```
So we want to mark pvars as dead without nullifying them. This diff
extends the `Remove_temps` SIL instruction to accept pvars as well, and
so renames it to `ExitScope`.
Reviewed By: da319
Differential Revision: D13102953
fbshipit-source-id: aa7f03a52
Summary:
When initialising a variable via semi-exotic means, the frontend loses
the information that the variable was initialised. For instance, it
translates:
```
struct Foo { int i; };
...
Foo s = {42};
```
as:
```
s.i := 42
```
This can be confusing for backends that need to know that `s` actually
got initialised, eg pulse.
The solution implemented here is to insert of dummy call to
`__variable_initiazition`:
```
__variable_initialization(&s);
s.i := 42;
```
Then checkers can recognise that this builtin function does what its
name says.
Reviewed By: mbouaziz
Differential Revision: D12887122
fbshipit-source-id: 6e7214438
Summary:
The upcoming ocamlformat has the ability to parse and format
docstrings. This requires that the docstrings conform to the ocamldoc
spec a bit more strongly. If a docstring does not parse, it is left
alone, but if it is morally ill-formed but parses by chance, it can be
reformatted incorrectly. This patch fixes the existing instances of
this problem.
Reviewed By: mbouaziz
Differential Revision: D12911937
fbshipit-source-id: 1c2eb590b
Summary:
Seems useful to know when we're printing one instruction only, but not when we
print lots of them for readability.
Reviewed By: mbouaziz
Differential Revision: D12823481
fbshipit-source-id: 2beb339f2
Summary:
In a future commit `Attributes` will depend on `Procdesc` and that
creates a cycle for the functions concerned with specialising proc
descs, which need `Attributes`.
Reviewed By: jberdine
Differential Revision: D10173354
fbshipit-source-id: 6c4ff82f0
Summary:
This adds an option `--trace-events` that generates a Chrome trace event[1] to
quickly visualise the performance of infer.
Reviewed By: mbouaziz
Differential Revision: D9831599
fbshipit-source-id: 96a33c627
Summary:
The constructor `` `Typ`` is never used to build values. Removing type
substitutions from Sil.ml had knock-on effect on Typ.ml etc., resulting in more
deleted code around type substitutions \o/
Reviewed By: mbouaziz
Differential Revision: D9769340
fbshipit-source-id: 509cbd284
Summary:
Now that we got rid of dummy nodes used non-dummily (biabduction state, reporting), `pname` don't need to be an option anymore.
Let's save a boxing on all nodes.
Reviewed By: jeremydubreil
Differential Revision: D9654152
fbshipit-source-id: 83b00f239
Summary:
Using a dummy node here made the whole reporting wrong because it didn't fail getting a `node_key` when reporting issues from checkers not using the biabduction state.
Now that it's fixed, let's fail hard if someone ever tries again.
Reviewed By: jeremydubreil
Differential Revision: D9654137
fbshipit-source-id: c00273e53
Summary:
Separate and rename error reporting functions that use the biabduction state.
No checkers should call these functions.
Reviewed By: da319
Differential Revision: D9633579
fbshipit-source-id: 884fcee66
Summary:
After some testing, it looks like getting the pdesc via
`Ondemand.get_proc_desc` will also load models' proc descs from their
summaries, so this code should not be needed.
Reviewed By: jeremydubreil, mbouaziz, martintrojer
Differential Revision: D9197176
fbshipit-source-id: 1b8603bfa
Summary:
`Errlog` will merge similar issues (same severity, name, description) reported at the same location, so let's make sure the locaiton is mandatory.
Issues:
- errors happening in `Ondemand` still use the `State` which makes sense only for biabduction and eradicate
- a case of `NullabilitySuggest` didn't have a location, I did my best to patch it but I'm sure the location could be more precise
Reviewed By: jvillard
Differential Revision: D9332840
fbshipit-source-id: ee7898146
Summary: Keeping pushing arguments higher in the stack, `node_id_key` is not used in calls to `log_warning/error`
Reviewed By: jvillard
Differential Revision: D9332826
fbshipit-source-id: e5c48c686
Summary:
Before we would convert it to string in `Reporting` and pass it to `Errlog` which would use it only to 'log events'.
I guess the reason is that there was a cyclic dependency between `Errlog` and `clang_method_kind` defined in `ProcAttributes`.
This diff:
- moves it to its own module
- defers the conversion to string
Reviewed By: jvillard
Differential Revision: D9332819
fbshipit-source-id: 43a028b61
Summary:
- abstracted the type for a node key
- moved it to its own module with an ugly `compute` to avoid cyclic dependencies...
- renamed `node_id` to `node_id_key` where needed
- moved key computation from `State` to `Procdesc.Node`
Reviewed By: jvillard
Differential Revision: D9332803
fbshipit-source-id: fe1ae8c1c
Summary:
Use `ignore` instead, as this will warn if the argument is an arrow type,
unlike `let _ = ...`. This makes the code more future-proof: if an argument is
added to a function called in `let _ = f x` then the compiler will complain
instead of silently turning a value into a partial evaluation.
Also got rid of particularly irksome `let _ = <stuff returning unit> in` where I could.
Reviewed By: mbouaziz
Differential Revision: D9217176
fbshipit-source-id: 3be463405
Summary:
Using the proc name directly should be equivalent and does not rely on
`Summary` caching summaries for us.
Reviewed By: mbouaziz
Differential Revision: D9196503
fbshipit-source-id: dea67999a
Summary:
We want to kill the cache in `Summary`, and calling `Summary.get` relies on
that cache existing for efficiency. However in this case it's not needed
because we can pass the summary from above instead.
Reviewed By: mbouaziz
Differential Revision: D9195234
fbshipit-source-id: 5b7023242
Summary:
Exposing the in-memory cache seems dangerous. There were only 2 uses anyway:
eradicate and biabduction. I think the biabduction one is safe to remove. The
eradicate one I do not really understand as we return a different summary than
the one we cache... the tests pass though.
Reviewed By: jeremydubreil
Differential Revision: D9150167
fbshipit-source-id: cf30af232
Summary:
When we see `pthread_create(..., ..., foo, ...)`, we want to call the function
`foo` to check that its precondition is met. The initial goal was to get rid
of the uncouth call to `Summary.get` when what we really want is to analyse
`foo` instead of just betting on the fact that it has been analysed already.
Besides switching to `Ondemand.analyze_proc_name`, this also changes the
matching of the function pointer in the arguments of `pthread_create()` to
detect the common case of a constant function name. I also added tests.
Reviewed By: jeremydubreil
Differential Revision: D9195159
fbshipit-source-id: dfec79f14
Summary: Errors that include temporary variables are difficult to understand. Do not report stack variable address escape on temporary variables.
Reviewed By: jvillard
Differential Revision: D9117517
fbshipit-source-id: 9ebd75ecc
Summary: `IntLit.to_int` could raise, was not documented until recently and was not named `_exn`. Switch to option type and fix uses.
Reviewed By: jeremydubreil
Differential Revision: D8865525
fbshipit-source-id: f5ec2f221
Summary: Otherwise the dead code checker sometimes crashes with a not-totally-related error.
Reviewed By: mbouaziz
Differential Revision: D8732546
fbshipit-source-id: 65caabd
Summary: This was showing up during memory profiling with statistical-memprof. The surrounding code shows that we are trying not to allocate if nothing changes, but the `List.map` in the middle would defeat that.
Reviewed By: ngorogiannis
Differential Revision: D8661763
fbshipit-source-id: d44f7a7
Summary: One step towards elminating the unsafe cache of procedure summaries in `Summary` and only rely on the one in `Ondemand`
Reviewed By: dulmarod, jvillard
Differential Revision: D8619782
fbshipit-source-id: 4096390
Summary:
Replace the previous outputting of "." and "F" with an actual progress bar and
a multiline display of what procedure each process is currently busy analysing.
Observe:
```lang=text
Found 19 source files to analyze in /home/jul/code/openssl-1.1.0d/infer-out
7/19 [######################......................................] 36%
⊢ [ 1.14s] crypto/mem.c: CRYPTO_malloc
⊢ [ 1.68s] crypto/o_time.c: julian_adj
⊢ [ 0.50s] crypto/mem.c: CRYPTO_zalloc
⊢ [ 1.80s] crypto/o_str.c: OPENSSL_strlcpy
```
This works by setting up a worker pool (as before) that waits to receive jobs
(not as before: we used to fork for each new job). Unix pipes are used for
communication.
The new worker pool can be used to experiment with other concurrency models,
such as reviving per-procedure-parallelism, or making sure each procedure is
analysed only once.
Perf tests indicate that this version is no slower than the previous one,
either on laptops or devserver: about 3% worse user time but ~40% better system time.
This new version forks <jobs> processes whereas the previous version would
fork `O(number of source files)` times.
`infer -j 1` shows a progress bar that doesn't update timing info (because it
would need a second process to do that).
Reviewed By: mbouaziz
Differential Revision: D8517507
fbshipit-source-id: c8ca104
Summary: I don't understand what this function is for. Let's remove it.
Reviewed By: mbouaziz
Differential Revision: D8320839
fbshipit-source-id: eeb14f7
Summary: There is a number of dangling pointer dereference false positives coming from our treatment of union in c/cpp. For now, do not treat union fields as uninitialised.
Reviewed By: mbouaziz
Differential Revision: D8279802
fbshipit-source-id: a339b0e
Summary:
Now the result won't depend on the visit order of instructions
Depends on D8235834
Reviewed By: jeremydubreil
Differential Revision: D8235907
fbshipit-source-id: a6eb469
Summary:
Having the `Node` module including in the `CFG` one is confusing.
Let's keep it separate.
Reviewed By: ngorogiannis
Differential Revision: D8185754
fbshipit-source-id: 62077e6
Summary:
Change the license of the source code from BSD + PATENTS to MIT.
Change `checkCopyright` to reflect the new license and learn some new file
types.
Generated with:
```
git grep BSD | xargs -n 1 ./scripts/checkCopyright -i
```
Reviewed By: jeremydubreil, mbouaziz, jberdine
Differential Revision: D8071249
fbshipit-source-id: 97ca23a
Summary:
For now: just moving this list behind an abstract type.
Next: changing the internal representation.
Reviewed By: ngorogiannis
Differential Revision: D8140926
fbshipit-source-id: 5b959b0
Summary:
We never really need the list of nodes/succs/preds, we only need to fold over them.
This will reduce garbage for computed lists like in the Exceptional CFG or the OneInstrPerNode CFG.
Reviewed By: ngorogiannis
Differential Revision: D8185665
fbshipit-source-id: d042beb
Summary:
The main motivation is preparing for a future change.
This also reduces lifetime of potential garbage.
Reviewed By: jeremydubreil
Differential Revision: D8185648
fbshipit-source-id: 6d0a568
Summary:
Preparing for bigger changes...
- Rename `payload` field to `payloads`
- Move `payload` type to `Payloads.t`
- `SummaryPayload`s only have to implement a change on `Payloads.t` rather than `Summary.t`
Reviewed By: sblackshear
Differential Revision: D7987211
fbshipit-source-id: c9d7a74
Summary:
Attempt at a better naming scheme:
- `Specs.summary` are now `Summary.t`. The `Summary` module (replacing `Specs`) contains the summary of a procedure: the results of all the analyses, etc.
- `Summary.ml` is now `SummaryPayload.ml`. This concerns how each (AI) analysis extracts its payload from the master summary.
- Accordingly, checkers now define a `Payload` module where previously they defined a `Summary` module. The type is also cleaned up to use `t` instead of `payload`, etc.
- Cleaned up some names as a result, for instance `Specs.get_summary` -> `Summary.get`, etc.
Reviewed By: ngorogiannis
Differential Revision: D7935883
fbshipit-source-id: 1766545
Summary:
Move the biabduction-specific payloads (the "`'a spec`" stuff) from specs.ml
into a new `BiabductinoSummary` module, similar to other checkers.
Reviewed By: ngorogiannis
Differential Revision: D7935815
fbshipit-source-id: bdff3b9
Summary:
This is an attempt to make things more consistent, and maybe save some work
from the `Format` module in case flambda doesn't have our backs.
Reviewed By: jberdine
Differential Revision: D7775496
fbshipit-source-id: 59a6314
Summary:
This simplifies the frontends and backends in most cases. Before this diff,
returning `void` could be modelled either with a `None` return, or a dummy
return variable with type `Tvoid`. Now it's always the latter.
Reviewed By: sblackshear, dulmarod
Differential Revision: D7832938
fbshipit-source-id: 0a403d1
Summary: Returning the list of sub-expressions is not right and can cause assertion failures elsewhere in the frontend.
Reviewed By: dulmarod
Differential Revision: D7813493
fbshipit-source-id: 33ac9c1
Summary:
Add warning 60 (unused module) to the list of fatal warnings. Whitelisting
modules at toplevel is tricky (see inline comments) but doable.
Reviewed By: mbouaziz
Differential Revision: D7790073
fbshipit-source-id: 6f591c4
Summary:
Upgrade ocamlformat, and base which needs to be done in sync in order to build
ocamlformat, and the other deps can come for the ride.
Reviewed By: jvillard
Differential Revision: D7663537
fbshipit-source-id: 3e90970
Summary: Currently when we look for already abduced expression and find an assertion [exp|->strexp:typexp], we use typexp rather than strexp.
Reviewed By: sblackshear
Differential Revision: D7617193
fbshipit-source-id: c089720
Summary:
Now that everything can run at the same time and we have preanalyses, it can be quite hard to read debug sessions.
Here come session names!
Depends on D7607336
Reviewed By: sblackshear
Differential Revision: D7607481
fbshipit-source-id: 676af86
Summary:
There's actually a nice separation between IR/, base/, istd/, and the rest of
infer, so they can be made into separate jbuilder libraries so that the
separation remains. This helps make sense of the infer codebase.
Also:
- move everything biabduction-related out of backend/ and into a new
biabduction/ directory. This clarifies the current situation where backend/
contains a mix of analysis-independent code (still there now), and
biabduction-specific code (moved to biabduction/).
- move everything from base/ that is not infer-specific into istd/, e.g. IList.ml
- kill unused `FbTraceCalls`
- A couple of files needed to move around to complete the separation of base/ and IR/
Reviewed By: mbouaziz
Differential Revision: D7381842
fbshipit-source-id: cd86dea