Summary: More preparation for extending HIL with dereference and address-of. The left-hand side of an assignment also needs to include dereference and address-of.
Reviewed By: sblackshear
Differential Revision: D6976150
fbshipit-source-id: 47d1d76
Summary: Preparing to extend HIL with Dereference and AddressOf expressions. Next steps: (1) change SIL -> HIL translation to preserve address of and dereference; (2) adapt analyses based on HIL to make use of access expressions.
Reviewed By: jeremydubreil
Differential Revision: D6961928
fbshipit-source-id: 51da919
Summary: Some tags like `Bucket` are used, but a lot are just added to the list of tags and never read.
Reviewed By: mbouaziz
Differential Revision: D6886980
fbshipit-source-id: 4474d7f
Summary:
There's a lot of code for building up and moving around `Tags`.
When working on cleaning up some of the `Errlog` code, I noticed that `Tags` were included in the JSON and wondered why.
The answer is surprisingly just one thing: only the line tags get used, and even then they are only used to decide what frame to select as the start frame for the trace (i.e., the one that is highlighted first).
That seems like overkill; starting the trace on the actual line where the error occurs, starting at the beginning of the procedure where the error occurs, or starting at the first line of the trace all seem equally reasonable.
If we are happy with any of these alternatives, we can kill `Tags` altogether and potentially save a decent amount of space in our JSON artifacts.
Reviewed By: mbouaziz
Differential Revision: D6876752
fbshipit-source-id: 1580127
Summary:
- Combine two fields from ProcAttributes.t into a single field `method_kind` with more information
- New field details whether the procedure is an `OBJC_INSTANCE`, `CPP_INSTANCE`, `OBJ_CLASS`, `CPP_CLASS`, `BLOCK`, or `C_FUNCTION`
- `is_objc_instance_method` and `is_cpp_instance_method` fields no longer necessary
- Changed `is_instance` field in CMethod_signature to `method_kind` field of type ProcAttributes.method_kind
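As a rough illustration of the idea (written in Python rather than OCaml, with a hypothetical, heavily reduced stand-in for `ProcAttributes.t`), a single `method_kind` field can answer the questions the two removed boolean fields used to answer:

```python
from dataclasses import dataclass
from enum import Enum, auto


class MethodKind(Enum):
    # Variant names copied from the summary above.
    OBJC_INSTANCE = auto()
    CPP_INSTANCE = auto()
    OBJ_CLASS = auto()
    CPP_CLASS = auto()
    BLOCK = auto()
    C_FUNCTION = auto()


@dataclass
class ProcAttributes:
    # Hypothetical stand-in: the real record has many more fields.
    name: str
    method_kind: MethodKind


def is_objc_instance_method(attrs: ProcAttributes) -> bool:
    # Derived from method_kind instead of being stored as its own field.
    return attrs.method_kind is MethodKind.OBJC_INSTANCE


def is_cpp_instance_method(attrs: ProcAttributes) -> bool:
    return attrs.method_kind is MethodKind.CPP_INSTANCE
```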
Reviewed By: dulmarod
Differential Revision: D6884402
fbshipit-source-id: 4b916c3
Summary:
- small optimization by starting deconstructing procnames/types in the dispatcher rather than the matchers
- as a consequence, returns fast for unhandled constructs like Java procnames or types
- Java is still not handled but at least does not crash
- re-enable Inferbo for Java
Reviewed By: jberdine
Differential Revision: D6912304
fbshipit-source-id: 76e95a8
Summary:
Record "capture phases" in the runstate and in the source files table of the
database. Use this instead of filesystem timestamps to decide which files need
re-analyzing in the reactive analysis.
Reviewed By: jeremydubreil
Differential Revision: D6760833
fbshipit-source-id: 7955621
Summary:
- During backend execution, infer will log detailed stats about procedure analysis
- Logging is integrated with EventLogger
- `events_to_log` field added to Stats.t record in InferPrint
- New format in InferPrint - Logs
- `format_list` type changed to have a Utils.Outfile option to support Logs format
Reviewed By: dulmarod
Differential Revision: D6834538
fbshipit-source-id: 8c847f5
Summary:
Make dead code detection part of `make test` so that dead code stops creeping
in. It's only enabled if all the analysers are enabled and if this is a
facebook build, because the dead code detection will have false positives
otherwise.
Reviewed By: mbouaziz
Differential Revision: D6807395
fbshipit-source-id: ebbd835
Summary:
I needed to do this for something, now I don't know if I want to do the thing
anymore but this seems generally useful to decrease a little bit the size of
Config.ml.
Reviewed By: sblackshear, mbouaziz
Differential Revision: D6796427
fbshipit-source-id: d9c009d
Summary:
Also, make it explicit when we load the global tenv instead of the per-file tenv.
This allows for some nice simplifications in some places, notably:
- `tenv_file` is gone from `Exe_env.file_data`
- `DB.global_tenv_fname` is no more
This will help moving the tenv from the capture/source_file/ directories on the
filesystem to the database, as keys for the relevant table are `SourceFile.t`.
Reviewed By: mbouaziz
Differential Revision: D6796594
fbshipit-source-id: 1ffd5b0
Summary:
They were constructed for each source file, and then joined into a global call
graph, only to get per-file lists of procedures. A tad wasteful.
Get this list from cfgs instead. Still record them in `exe_env` for now as
changing that code is a whole other beast.
One test falls victim to the flakiness of the analysis of recursive functions.
Reviewed By: jeremydubreil, mbouaziz
Differential Revision: D6324268
fbshipit-source-id: d5ff58f
Summary:
In preparation for getting rid of call graphs, we need to find another way to
get the list of defined procedures (which is the only place where we use the
globally-computed call graph for now).
The natural way to get the list of procedures defined in a file is to load the
cfg for that file and look at the proc names that are the keys of the cfg. This
is way too expensive, as the CFG is big. Thus, we cache this list of proc names
as another column in the SQLite database of cfgs. This gives good performance
in benchmarks.
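A minimal sketch of the caching idea, in Python with `sqlite3`. The table layout, column names, and JSON serialization are assumptions made for illustration and are not the actual schema:

```python
import json
import sqlite3


def create_cfg_table(db):
    # Hypothetical schema: one row per source file, holding the serialized
    # CFG and a cached list of the procedure names it defines.
    db.execute(
        "CREATE TABLE cfg (source_file TEXT PRIMARY KEY, cfgs BLOB, proc_names TEXT)"
    )


def store_cfg(db, source_file, cfg):
    # `cfg` maps procedure names to (potentially large) procedure bodies.
    db.execute(
        "INSERT OR REPLACE INTO cfg VALUES (?, ?, ?)",
        (source_file, json.dumps(cfg).encode(), json.dumps(sorted(cfg))),
    )


def defined_procedures(db, source_file):
    # Reads only the small cached column; the CFG blob is never deserialized.
    row = db.execute(
        "SELECT proc_names FROM cfg WHERE source_file = ?", (source_file,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

The point of the extra column is that listing the procedures of a file costs one small read instead of loading the whole CFG.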
Reviewed By: jeremydubreil
Differential Revision: D6621142
fbshipit-source-id: ed265fe
Summary: At each call to `Component$Builder.build()`, checks that the required props for `Component` have been set via prior calls to the `Builder`. Does not yet handle `Prop(optional = true)`, but will address that in a follow-up.
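A toy Python model of the check, assuming a single builder whose calls are seen in order and that each non-`build` call sets the prop of the same name (both simplifications are assumptions, not the checker's real domain):

```python
def check_builder_calls(required_props, calls):
    """Return the required props still unset at each `build` call.

    `calls` is the sequence of method names invoked on one Builder, in order.
    """
    set_props = set()
    missing_at_build = []
    for call in calls:
        if call == "build":
            missing_at_build.append(sorted(set(required_props) - set_props))
        else:
            # Any other call is treated as setting the prop of the same name.
            set_props.add(call)
    return missing_at_build
```

For example, `check_builder_calls({"title", "color"}, ["title", "build"])` returns `[["color"]]`, i.e. one `build` call with `color` missing.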
Reviewed By: jeremydubreil
Differential Revision: D6735524
fbshipit-source-id: 0c812fd
Summary:
Was trying to decide where to add a new Java utility function and realized that things are a bit disorganized.
Some operations on `Typ.Name.t`'s live in `Typ.Procname`, and some live inside an inner `Java` module whereas some are outside of the module with a `java_` prefix.
Let's move toward putting all Java/C/Objc/C++-specific functions in dedicated modules.
This diff does some of the work for Java.
There are Java-specific functions that operate on `Typ.Procname.t`'s that will have to be converted to work on `Typ.Procname.Java.t`'s, but changing those clients will be more involved.
Will also move C/Objc/C++ functions in a follow-up.
Reviewed By: jeremydubreil
Differential Revision: D6737724
fbshipit-source-id: cdd6e68
Summary: Use the Hashtbl functions directly as `Cfg` knows that a cfg is a hashtbl.
Reviewed By: sblackshear, jeremydubreil
Differential Revision: D6727732
fbshipit-source-id: 2cdda91
Summary:
`&::.*-->` allows matching any path end.
Used for models of `std::array` to force unmodelled functions (and types) to have a Skip summary
Depends on D6408415
Reviewed By: jvillard
Differential Revision: D6611203
fbshipit-source-id: 6663b2c
Summary:
Found the dead code with the script in the next commit, iteratively until no
warnings remained.
Methodology:
1. I kept pretty-printers for values, which can be useful to use from infer's REPL (or
when printf-debugging infer in general)
2. I kept functions that formed some consistent API (but not often, so YMMV), for instance if it looked like `Set.S`, or if it provides utility functions for stuff in development (mostly the procname dispatcher functions)
3. I tried not to lose comments associated with values no longer exported: if the value is commented in the .mli and not the .ml, I moved the comment
4. Some comments needed updating (not claiming I caught all of those)
5. Sometimes I rewrote the comments a bit when I noticed mis-attached comments
Reviewed By: mbouaziz
Differential Revision: D6723482
fbshipit-source-id: eabaafd
Summary:
In Java, static variables are distinguished by package/class:
the file where they are defined doesn't matter.
Fixes #831.
Closes https://github.com/facebook/infer/pull/833
Reviewed By: jeremydubreil
Differential Revision: D6661240
Pulled By: sblackshear
fbshipit-source-id: beeb2f9
Summary:
This avoids relying on the directories in infer-out/captured/ being created,
and instead gets the list of captured source files from the DB. This gives a
better type to clusters: `SourceFile.t` instead of `DB.source_dir`, which makes
the code a bit nicer too.
Reviewed By: jeremydubreil
Differential Revision: D6620460
fbshipit-source-id: c0edbf6
Summary: Get the error message from the database when there's an error, together with the error type.
Reviewed By: mbouaziz
Differential Revision: D6621695
fbshipit-source-id: 6bc706d
Summary:
This makes sure that sqlite doesn't hold read locks for longer than necessary,
which could starve the process of cleaning up the WAL file. This ensures that
the statement is reset as soon as we're done reading.
I haven't observed a difference with this change, and could not find evidence
that it should change something in the docs. Internet wisdom pointed at this as
a potential issue and I was observing it in another change, so it's good to
rule it out.
Reviewed By: mbouaziz
Differential Revision: D6404353
fbshipit-source-id: a123cd6
Summary: This should avoid making copies of procedure descriptions, which are mutable data structures.
Reviewed By: sblackshear
Differential Revision: D6658527
fbshipit-source-id: 688a142
Summary: There were several implementations of the same function across the codebase.
Reviewed By: sblackshear
Differential Revision: D6658266
fbshipit-source-id: e12507b
Summary:
Upgrade ocamlformat to 0.3, and (necessarily) base to v0.10.0.
- Fix accumulated mis-formatting
- Update opam.lock to unbreak clean build
- Update to base v0.10.0
- Update opam.lock for base
- Update offline opam repo
- Everyone should already have removed their ocamlformat pin
- ocamlformat 0.3 supports output to stdout natively
- bump version of ocamlformat
Reviewed By: jeremydubreil
Differential Revision: D6636741
fbshipit-source-id: 41a56a8
Summary:
Model for `folly::split` that handles the representation in the cpp model.
Depends on D6544992
Reviewed By: jvillard
Differential Revision: D6545006
fbshipit-source-id: 2b7a139
Summary:
Allows:
- matching function arguments with or without capturing,
- capturing part of an argument, e.g. expression only,
- optional arguments, wrapped into an OCaml option if captured.
Reviewed By: jvillard
Differential Revision: D6544992
fbshipit-source-id: a64ba45
Summary: This is to allow the bi-abduction analysis and the nullable checker for Clang languages to run together without stepping on each other's toes.
Reviewed By: sblackshear
Differential Revision: D6567934
fbshipit-source-id: a318c33
Summary: This factors out some duplicated code for {,de}serializing source files.
Reviewed By: mbouaziz
Differential Revision: D6324234
fbshipit-source-id: 1741657
Summary:
Instead of storing the cfgs of source files inside their own individual files,
put them in results.db, in their own table. (that table may change in the
future to map source files to more than just their cfgs, eg their tenv as well)
Reviewed By: jberdine
Differential Revision: D6297201
fbshipit-source-id: 7fa891d
Summary: There was a back and forth conversion between `string` and `IssueType.t` which was not necessary.
Reviewed By: sblackshear
Differential Revision: D6562747
fbshipit-source-id: 70b57a2
Summary:
This diff adds a layer of report deduplication logic in addition to
the existing scheme.
Suppose issue 1 with trace1a and trace1b, and issue 2 with trace2a and
trace2b. If trace1a ends at the same location as trace2a (resp.,
trace2b) and trace1b ends at the same location as trace2b (resp.,
trace2a), then consider issues 1 and 2 to be duplicates.
This chooses to report the issue with the smaller sum of trace
lengths, breaking ties using the issue hashes, and eventually the
entire issue. Therefore there is a potential for flakiness with
respect to the choice of which report to make within a
hash-equivalence class.
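The rule above can be sketched in Python as follows. The issue representation (a dict with `hash`, `trace_a`, `trace_b`) is hypothetical, and the final tie-break on the entire issue is left out:

```python
def dedup_by_trace_ends(issues):
    """Keep one issue per unordered pair of trace end locations.

    Each issue is a dict with a `hash` string and two traces, `trace_a` and
    `trace_b`, each a non-empty list of comparable locations.
    """
    best = {}
    for issue in issues:
        # Sorting makes (end_a, end_b) and (end_b, end_a) the same key.
        ends = tuple(sorted([issue["trace_a"][-1], issue["trace_b"][-1]]))
        # Smaller sum of trace lengths wins; the issue hash breaks ties.
        rank = (len(issue["trace_a"]) + len(issue["trace_b"]), issue["hash"])
        if ends not in best or rank < best[ends][0]:
            best[ends] = (rank, issue)
    return [issue for _, issue in best.values()]
```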
Reviewed By: sblackshear
Differential Revision: D6519607
fbshipit-source-id: 63210ab
Summary: As Dulma pointed out, adding or removing parameters in a method in Objective-C changes the name of the method. Such changes should not cause pre-existing issues to be reported as introduced. This diff prevents this by only keeping in the bug hash the part of the name that is before the first colon.
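For illustration only, the truncation can be written in Python as below; the function name is made up for this sketch:

```python
def method_name_for_bug_hash(objc_method_name):
    # Drop everything from the first colon on, so that adding or removing
    # parameters leaves the hashed name unchanged.
    return objc_method_name.split(":", 1)[0]
```

With this, `initWithName:` and `initWithName:age:` both contribute `initWithName` to the hash.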
Reviewed By: dulmarod
Differential Revision: D6491215
fbshipit-source-id: 3c00fae
Summary: I always get confused by `accessPath.ml` not being next to HIL when trying to open files
Reviewed By: sblackshear
Differential Revision: D6462980
fbshipit-source-id: 8ba9b71
Summary:
It seems that the abstraction instructions were not previously added to the CFG.
This is a functional change to make sure that the abstraction state is always added. We can simplify the code later and just run this step before storing the CFG instead of after loading them.
Reviewed By: sblackshear, jvillard
Differential Revision: D6383672
fbshipit-source-id: cedcb8a
Summary:
Deduping issues when generating a single report and then diffing the
reports can lead to introduced issues being considered duplicates of
existing issues.
Reviewed By: sblackshear
Differential Revision: D6414673
fbshipit-source-id: bba81fd
Summary:
Extends `ProcnameDispatcher` to allow matching typenames only.
There isn't much new here, mainly moving stuff so that we only have to open one module to use the operators.
Reviewed By: skcho
Differential Revision: D6408245
fbshipit-source-id: afc6533
Summary:
The diff is very big but it's mostly removing code. It was inspired by the fact that we were getting Dead Store FPs because we were modeling some functions from CoreFoundation and CoreGraphics directly as alloc in the frontend, which caused the parameters of the function to be seen as dead. See the new test.
To deal with this, if we are going to skip the function, we model it as malloc instead. Given how many models we had for those "model as malloc" functions, I removed them to rely solely on the new mechanism.
The modeling of malloc and release was still based on the old retain count implementation, even though all we do here is a malloc/free kind of analysis. I also changed
that to be actually malloc/free which removed many Assert false in the tests. CFRelease is not exactly free though, and it's possible to use the variable afterwards. So used a custom free builtin that only cares about removing the Memory attribute and focuses on minimizing Memory Leaks FPs.
Otherwise we were translating CFBridgingRelease as a special cast, and this wasn't working. To simplify this as well, I removed all the code for the special cast, and just modeled CFBridgingRelease and CFAutorelease also as free_cf, to avoid Memory Leak false positives. I also treated the cast __bridge_transfer as a free_cf model. This means we stopped trying to report Memory Leaks on those objects.
The modeling of CoreGraph release functions was done in the frontend, but seemed simpler to also simplify that code and model all the relevant functions.
Reviewed By: sblackshear
Differential Revision: D6397150
fbshipit-source-id: b1dc636
Summary: There is a lot of code to create LaTeX output of the Infer datastructures, but this does not seem to be used anymore.
Reviewed By: jvillard
Differential Revision: D6355686
fbshipit-source-id: 55de8e9
Summary:
This resolves #796. Effectively, it adds a file-specific suffix to the names of all global initializers (so the initializers of two global variables of the same name will have unique Typ.Procname values), which is the same rule currently used when constructing the Procname for static functions. However, this change applies to the initializers of all global variables and not just static ones (arguably the right thing, since GCC used to allow multiple global variables with the same name).
A consequence of this change is that it becomes impossible to know the name of the generated initialization function of global ('extern') variables. However, the get_initializer_pname function is only referenced by the frontend (when creating the initializer for defined global variables) and by the SIOF checker.
Closes https://github.com/facebook/infer/pull/801
Reviewed By: jvillard
Differential Revision: D6335034
Pulled By: dulmarod
fbshipit-source-id: 1a92c08
Summary:
Allow capturing function arguments.
Model functions don't have to match on a list any more.
Depends on D6347829
Reviewed By: jvillard
Differential Revision: D6350628
fbshipit-source-id: e88b758
Summary: When not matching overloads and the wrong number of arguments is given, we may want to fail instead of just not matching the function, e.g. for internal-use functions.
Reviewed By: jvillard
Differential Revision: D6347829
fbshipit-source-id: 48f41be
Summary:
Naming a variable `_foo` makes the compiler not warn about them if they are
unused, but there are lots of instances of such variables in the code where
they are in fact used, defeating the warning and introducing confusion for
those used to this naming convention.
Basically `sed -i -e "s/ _\([a-zA-Z][a-zA-Z0-9_']*\)/ \1_/g" **/*.ml` followed
by manual fixing of compilation errors (lots of `compare__foo` ->
`compare_foo_`).
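For readers less used to sed, the same substitution expressed with Python's `re` module (a sketch of the rewrite only; it does not perform the manual fixes):

```python
import re

# Same pattern as the sed command: a space, an underscore, then an identifier.
LEADING_UNDERSCORE = re.compile(r" _([a-zA-Z][a-zA-Z0-9_']*)")


def rename_leading_underscores(source):
    # Move the underscore from the front of the identifier to the back.
    return LEADING_UNDERSCORE.sub(r" \1_", source)
```

A lone wildcard `_` is left alone because the pattern requires a letter right after the underscore.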
Reviewed By: mbouaziz
Differential Revision: D6358837
fbshipit-source-id: 7ffb4ac
Summary: This information is already available in the procedure name.
Reviewed By: jeremydubreil, jvillard
Differential Revision: D6119459
fbshipit-source-id: f07bfde
Summary:
First steps of a dispatcher for C++ functions/methods overloads.
For now only used on Inferbo C modeled functions so most of the features are still unused.
Reviewed By: jvillard
Differential Revision: D6336088
fbshipit-source-id: ebd5b6f
Summary:
We need to use the procedure description of the callees for lazy dynamic dispatch and for the resolution of the lambda. We may also need this information in other analyses, e.g. for RacerD. This diff makes the procedure descriptions of the callees part of the summary.
The procedure description has been part of the summary for a while already without noticeable decrease in performance.
Reviewed By: mbouaziz
Differential Revision: D6322038
fbshipit-source-id: 84101cb
Summary: In the translation from SIL to HIL we ignore the right-hand side expression if it consists of a single access path, e.g. unary operator. This diff preserves the right-hand side expression.
Reviewed By: sblackshear
Differential Revision: D6271814
fbshipit-source-id: c27e913
Summary:
Change ocamlformat installation procedure to use opam instead of
pinning.
Reformat all code with v0.2, which has a few improvements.
Reviewed By: jvillard
Differential Revision: D6292057
fbshipit-source-id: 759967f
Summary:
This diff adds a new way of executing blocks when they are passed as parameters to a method. So far we just skipped the block in this case.
Now we can execute it. Let's demonstrate with an example. Say we have
//foo has a block parameter that it executes in its body
foo (Block block) { block();}
// bar calls foo with a concrete block
bar() {
foo (^(){
self->x = 10;
});
};
Now, when we call the method foo with a concrete block, we create a copy of foo instantiated with the concrete block, which in itself is translated as a method with a made-up name.
The copy of foo will get a name that is foo extended with the name of the block parameter, the call to the block parameter will be replaced by a call to the concrete block, and the captured variables
of the concrete block (self in this case), will be added to the formals of the specialized method foo_block_name.
This is turned on at the moment for ObjC methods with ObjC blocks as parameters, and called with concrete blocks. Later on we can extend it to other types of methods, and to C++ lambdas, that are handled similarly to blocks.
Another extension is to check when the block has been called with nil instead of an actual block, and raise an error in that case.
After this diff, we can also model various methods and functions from the standard library that take blocks as parameters, and remove frontend hacks to deal with that.
Reviewed By: ddino
Differential Revision: D6260792
fbshipit-source-id: 0b6f22e
Summary:
When fuzzy-matching cpp names, allow matching only a prefix of
blacklist entries.
Reviewed By: da319
Differential Revision: D6233055
fbshipit-source-id: a3a4913
Summary:
When C++ functions are translated to SIL procedures, their type is C rather than C++. In RacerD, we want to treat C++ functions the same as C++ methods.
Added a function to check if the procedure is Objc/Objc++/C/C++.
Reviewed By: sblackshear
Differential Revision: D6209523
fbshipit-source-id: 293f938
Summary:
As we want to model many C++ methods, using a lot of matchers with `if / else if` will be tiring.
This diff introduces a dispatcher which is a nicer way to write the same thing.
No new model for now, just a refactoring.
Ideally we'd need a parser generator for C++ names...
Reviewed By: jvillard
Differential Revision: D6209234
fbshipit-source-id: 49fae5e
Summary:
If you write
```
boolean readUnderLockOk() {
synchronized (mLock) {
return mField;
}
}
```
it will be turned into
```
lock()
irvar0 = mField
unlock()
return irvar0
```
in the bytecode. Since HIL eliminates reads/writes to temporaries, it will make the above code appear to perform a read of `mField` outside of the lock.
This diff fixes the problem by forcing HIL to perform all pending reads/writes before you exit a critical section.
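A toy Python model of the problem and the fix, over a straight-line instruction list. The tuple-based instruction encoding and the `flush_on_unlock` switch are inventions of this sketch and do not mirror the real SIL/HIL types:

```python
def to_hil(instrs, flush_on_unlock=True):
    """Toy model of temporary elimination.

    Instructions are tuples: ("lock",), ("unlock",), ("load", temp, field),
    ("return", operand).
    """
    pending = {}  # temp -> field whose read has been deferred
    out = []
    for instr in instrs:
        kind = instr[0]
        if kind == "load":
            # Defer the read: it will be substituted into its use later.
            pending[instr[1]] = instr[2]
        elif kind == "unlock":
            if flush_on_unlock:
                # Materialize deferred reads while the lock is still held.
                for temp, field in pending.items():
                    out.append(("load", temp, field))
                pending.clear()
            out.append(instr)
        elif kind == "return":
            # Substitute a deferred read directly into its use, if any.
            out.append(("return", pending.pop(instr[1], instr[1])))
        else:
            out.append(instr)
    return out
```

Without the flush, the example above becomes `lock; unlock; return mField`, which looks like an unprotected read. With the flush, the read of `mField` stays between `lock` and `unlock`.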
Reviewed By: jberdine
Differential Revision: D6138749
fbshipit-source-id: e8ad9a0
Summary: In HIL, allow deref'ing a magic address like `0xdeadbeef` for debugging purposes. Previously, we would crash on code like this.
Reviewed By: mbouaziz
Differential Revision: D6143802
fbshipit-source-id: 4151924
Summary: This check is deprecated and will be replaced by a dedicated checker to detect uninitialized values.
Reviewed By: mbouaziz
Differential Revision: D6133108
fbshipit-source-id: 1c0e9ac
Summary:
Install ocamlformat from github as part of `make devsetup`, and use it
for formatting OCaml (and jbuild) code.
Reviewed By: jvillard
Differential Revision: D6092464
fbshipit-source-id: 4ba0845
Summary:
This will allow most of the checkers, except the bi-abduction, to skip the analysis on the specialized clone of the methods used to handle dynamic dispatch. Doing this, we can run the bi-abduction analysis using:
infer -a checkers --biabduction
without risk of conflicts on the resolution of dynamic dispatch.
Reviewed By: sblackshear
Differential Revision: D6052347
fbshipit-source-id: 0c75bf3
Summary: This removes cases of duplicated warnings when the dynamic dispatch handling specializes a method Infer already reported on.
Reviewed By: sblackshear
Differential Revision: D6060337
fbshipit-source-id: dbefeca
Summary:
9c7fc65 introduced a large performance regression, this diff eliminates it and a bit more.
Instead of constructing the quotiented access list map in a two-step process of first constructing a map of all accesses and then quotienting it, the quotiented map is constructed directly by using a coarser comparison function on keys. Partitioning the access map O(number of access paths) times, using an apparently expensive partition predicate, seems to be causing trouble based on rough profile data.
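A Python analogue of the one-pass construction. Here the coarser comparison is modelled as a `coarse_key` function supplied by the caller, which is an assumption of this sketch:

```python
from collections import defaultdict


def quotient_accesses(accesses, coarse_key):
    # One pass: accesses whose coarse keys are equal land in the same class,
    # so no separate partitioning step over a full access map is needed.
    classes = defaultdict(list)
    for access in accesses:
        classes[coarse_key(access)].append(access)
    return dict(classes)
```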
Reviewed By: da319
Differential Revision: D6005262
fbshipit-source-id: 077846c
Summary:
`pp_instr_list` was not tail-recursive, causing a stack overflow on big code.
Also simplified a few things.
Reviewed By: jvillard
Differential Revision: D5995451
fbshipit-source-id: 40a4911
Summary: The may alias analysis relation that the thread safety analysis uses is very specific to Java and causes many false alarms for C++ code. This diff disables it when analyzing C++ code. Improving it to avoid false negatives is left for later.
Reviewed By: sblackshear
Differential Revision: D5974182
fbshipit-source-id: 9c7fc65
Summary:
The previous domain for SIOF was duplicating some work with the generic Trace
domain, and basically was a bit confused and confusing. A sink was a set of
global accesses, and a state contains a set of sinks. Then the checker has to
needlessly jump through hoops to normalize this set of sets of accesses into a
set of accesses.
The new domain has one sink = one access, as suggested by sblackshear. This simplifies
a few things, and makes the dedup logic much easier: just grab the first report
of the list of reports for a function.
We only report on the fake procedures generated to initialise a global, and the
filtering means that we keep only one report per global.
Reviewed By: sblackshear
Differential Revision: D5932138
fbshipit-source-id: acb7285
Summary: If we know for sure we won't need to store an attribute in the DB, there's no need to compute its marshalled value.
Reviewed By: jberdine
Differential Revision: D5891050
fbshipit-source-id: cf4534e
Summary:
This adds more structure to the SQL schema backing attributes. With that, we
can transfer the logic for updating attributes in SQLite, instead of doing
optimistic concurrency in the client.
Reviewed By: jberdine
Differential Revision: D5891038
fbshipit-source-id: 6577ba2
Summary:
- use a similar key as for specs in the attributes table
- cache blob computations
- this improves memory usage a lot
Reviewed By: mbouaziz
Differential Revision: D5824177
fbshipit-source-id: c318577
Summary:
Use an SQLite database to store proc attributes, instead of files on disk.
Wrap SQLite operations in two layers:
1. `SqliteUtils` provides helper functions to make sure DB operations succeed
2. `KeyValue` provides a functor to expose a simple and type-safe key/value store backed by the SQLite DB.
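To give a feel for the second layer, here is a small Python sketch of a key/value store over one SQLite table. The class shape, the method names `replace` and `find`, and the use of `pickle` for serialization are all assumptions of this sketch, not the OCaml functor's interface:

```python
import pickle
import sqlite3


class KeyValue:
    """Tiny key/value store over one SQLite table (illustrative only)."""

    def __init__(self, db, table):
        # `table` is interpolated into SQL, so it must be a trusted identifier.
        self.db, self.table = db, table
        db.execute(
            f"CREATE TABLE IF NOT EXISTS {table} (key BLOB PRIMARY KEY, value BLOB)"
        )

    def replace(self, key, value):
        # Insert the binding, overwriting any previous value for the key.
        self.db.execute(
            f"INSERT OR REPLACE INTO {self.table} VALUES (?, ?)",
            (pickle.dumps(key), pickle.dumps(value)),
        )

    def find(self, key):
        # Return the stored value, or None when the key is absent.
        row = self.db.execute(
            f"SELECT value FROM {self.table} WHERE key = ?", (pickle.dumps(key),)
        ).fetchone()
        return pickle.loads(row[0]) if row else None
```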
Reviewed By: jberdine
Differential Revision: D5640053
fbshipit-source-id: 31050e5
Summary: Remove functions that are redundant with Base.List functions on sorted lists.
Reviewed By: mbouaziz
Differential Revision: D5931792
fbshipit-source-id: caec210
Summary: The case of closures was not considered for the conversion of SIL instructions into HIL instructions.
Reviewed By: sblackshear
Differential Revision: D5929675
fbshipit-source-id: bb6920a
Summary:
Calling functions that raise exceptions (even if they get caught) may smudge
the backtraces we get from OCaml. We need to record the original backtrace
*before* calling such functions on the path between catching an exception and
reraising it.
Also change the heptuple returned by `Exceptions.recognize_exception` into a
record type, and make that function not raise when classifying exceptions.
Reviewed By: jberdine
Differential Revision: D5882934
fbshipit-source-id: 8e99fe8
Summary:
We need to make sure that destructors of virtual base classes are called only once. Similarly to what clang does, we have two destructors for a class: a destructor wrapper and an inner destructor.
Destructor wrapper is called from outside, i.e., when variables go out of scope or when destructors of fields are being called.
Destructor wrappers have calls to inner destructors of all virtual base classes in the inheritance as their bodies.
Inner destructors have destructor bodies and calls to destructor wrappers of fields and inner destructors of non-virtual base classes.
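The scheme can be simulated with a short Python sketch that logs which destructor bodies run. The class table format is hypothetical, and two details are assumptions of this sketch rather than statements about the frontend: that the wrapper calls the class's own inner destructor first, and the order in which virtual bases are visited.

```python
def virtual_bases(cls, classes):
    # All virtual bases reachable through the inheritance graph, no repeats.
    found = []

    def visit(name):
        info = classes[name]
        for base in info["virtual_bases"]:
            if base not in found:
                found.append(base)
            visit(base)
        for base in info["bases"]:
            visit(base)

    visit(cls)
    return found


def inner_destructor(cls, classes, log):
    info = classes[cls]
    log.append(f"body {cls}")
    for field_type in info["fields"]:
        destructor_wrapper(field_type, classes, log)
    # Only non-virtual bases: virtual ones are left to the wrapper.
    for base in info["bases"]:
        inner_destructor(base, classes, log)


def destructor_wrapper(cls, classes, log):
    inner_destructor(cls, classes, log)
    for base in virtual_bases(cls, classes):
        inner_destructor(base, classes, log)
```

On a diamond where `B` and `C` both inherit virtually from `A` and `D` inherits from `B` and `C`, calling `destructor_wrapper("D", ...)` logs the body of `A` exactly once.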
Reviewed By: dulmarod
Differential Revision: D5834555
fbshipit-source-id: 51db238
Summary:
The "placement new" operator `new (e) T` constructs a `T` in the pre-allocated memory address `e`.
We weren't translating the `e` part, which was leading to false positives in the dead store analysis.
Reviewed By: dulmarod
Differential Revision: D5814191
fbshipit-source-id: 05c6fa9
Summary:
Simple instance of the problem: analyzing the following program times out.
```
#include <tuple>
void foo() {
std::tuple<std::tuple<int>> x;
}
```
Replacing `std::tuple<std::tuple<int>>` by `std::tuple<int>` makes the analysis
terminate.
In the AST, both tuple<tuple<int>> and tuple<int> have the same template
specialization type: "Pack" (which means we're supposed to go look into the
arguments of the template to get their values). This is not enough information,
and that's the plugin's fault.
On the backend side, this means that two types have the same Typ.Name.t, namely
"std::tuple<_>", so they collide in the tenv. The definition of
tuple<tuple<int>> is the one making it into the tenv. One of the fields of the
corresponding CxxRecord is of type "tuple<int>", which we see as the same
"tuple<_>", which causes the loop.
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D5775840
fbshipit-source-id: 0528604
Summary:
The reporting phase iterates over each procedure summary and prints all the issues from each procedure.
That's nice because we don't have to build a big list of the issues in-memory, but it's not so nice if you want to output the reports in a certain order or de-duplicate them.
This diff builds the in-memory list and outputs the issues afterward. By itself, this isn't very useful. But in the near future it will allow us to:
- Group all of the issues from the same file (finally!!!)
- Get rid of duplicate issues on multiple instantiations of the same C++ template
- Probably other cool stuff too
Reviewed By: jeremydubreil, mbouaziz
Differential Revision: D5816646
fbshipit-source-id: 799bcd0
Summary: Try to preserve the original backtrace. Introduce `reraise` in the global namespace.
Reviewed By: jberdine
Differential Revision: D5804121
fbshipit-source-id: 0947a47
Summary:
We supported globals as sources before, but we did so by allowing ClangTrace etc. to match against any access path in the footprint of the trace.
This is very powerful/flexible, but it's ultimately not a good idea because it leads to traces that are hard to read.
This is because a footprint source doesn't have any information about its provenance: we might know that the value came from a global, but we don't know where the read occurred.
The mechanism for handling procedure calls as sources already knows how to solve this problem.
This diff implements globals as sources as a special case of procedure call sources instead.
This will give us much nicer traces with full provenance of the read from the global.
Reviewed By: mbouaziz
Differential Revision: D5772299
fbshipit-source-id: 491ae81
Summary: This is almost equivalent to the previous one except in the case where new and old are both undefined: before, we would just pick "old", but now we pick the biggest according to their source files. I think the previous behaviour was a bug because it was more non-deterministic.
Reviewed By: jberdine
Differential Revision: D5649481
fbshipit-source-id: aeb527d
Summary:
- failwith police: no more `failwith`. Instead, use `Logging.die`.
- Introduce the `SimpleLogging` module for dying from modules where `Logging`
cannot be used (usually because that would create a cyclic dependency).
- always log backtraces, and show backtraces on the console except for usage errors
- Also point out in the log file where the toplevel executions of infer happen
Reviewed By: jeremydubreil
Differential Revision: D5726362
fbshipit-source-id: d7a01fc
Summary:
Instead of a whitelist and blacklist and default issue types and default
blacklist and filtering, consider a simpler semantics where
1. checkers can be individually turned on or off on the command line
2. most checkers are on by default
3. `--no-filtering` turns all issue types on, but they can then be turned off again by further arguments
This provides a more flexible CLI and is similar to other options in the infer
CLI, where "global" behaviour is generally avoided.
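The left-to-right semantics can be sketched in Python as a fold over the options. The option encoding as `(action, issue_type)` pairs is made up for this sketch and is not the actual command-line syntax:

```python
def enabled_issue_types(all_types, default_on, options):
    """Fold command-line options, left to right, into the enabled set.

    `options` is a list of ("enable", type), ("disable", type), or
    ("no-filtering", None) pairs, in command-line order.
    """
    enabled = set(default_on)
    for action, issue_type in options:
        if action == "enable":
            enabled.add(issue_type)
        elif action == "disable":
            enabled.discard(issue_type)
        elif action == "no-filtering":
            # Turns everything on; later options may still turn types off.
            enabled = set(all_types)
    return enabled
```

Because later options override earlier ones, `no-filtering` followed by a `disable` leaves every issue type on except the disabled one.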
Dynamically created checkers (eg, AL linters) cause some complications in the
implementation but I think the semantics is still clear.
Also change the name of the option to mention "issue types" instead of
"checks", since the latter can be confused with "checkers".
Reviewed By: jberdine
Differential Revision: D5583238
fbshipit-source-id: 21de476
Summary:
Use jbuilder to build infer instead of ocamlbuild. This is mainly to get faster builds:
```
times in 10ms, ±differences measured in speedups, 4 cores
| | ocb total | jb | ±total | ocb user | jb | ±user | ocb cpu | jb | ±cpu | ocb sys | jb | ±sys |
|-----------------------------------+-----------+------+--------+----------+------+-------+---------+-----+------+---------+------+------|
| byte from scratch | 6428 | 2456 | 2.62 | 7743 | 6662 | 1.16 | 138 | 331 | 2.40 | 1184 | 1477 | 0.80 |
| native from scratch | 9841 | 4289 | 2.29 | 9530 | 8834 | 1.08 | 110 | 245 | 2.23 | 1373 | 1712 | 0.80 |
| byte after native | 29578 | 1602 | 18.46 | 4514 | 4640 | 0.97 | 170 | 325 | 1.91 | 543 | 576 | 0.94 |
| change infer.ml byte | 344 | 282 | 1.22 | 292 | 215 | 1.36 | 96 | 99 | 1.03 | 040 | 066 | 0.61 |
| change infer.ml native | 837 | 223 | 3.75 | 789 | 174 | 4.53 | 98 | 99 | 1.01 | 036 | 47 | 0.77 |
| change Config.ml byte | 451 | 339 | 1.33 | 382 | 336 | 1.14 | 97 | 122 | 1.26 | 056 | 80 | 0.70 |
| change Config.ml native | 4024 | 1760 | 2.29 | 4585 | 4225 | 1.09 | 127 | 276 | 2.17 | 559 | 644 | 0.87 |
| change cFrontend_config.ml byte | 348 | 643 | 0.54 | 297 | 330 | 0.90 | 96 | 67 | 0.70 | 038 | 102 | 0.37 |
| change cFrontend_config.ml native | 1480 | 584 | 2.53 | 1435 | 906 | 1.58 | 106 | 185 | 1.75 | 136 | 178 | 0.76 |
#+TBLFM: $4=$2/$3;f2::$7=$5/$6;f2::$10=$9/$8;f2::$13=$11/$12;f2
50 cores
| | ocb total | jb | ±total | ocb user | jb | ±user | ocb cpu | jb | ±cpu | ocb sys | jb | ±sys |
|---------------------+-----------+------+--------+----------+------+-------+---------+----+------+---------+------+------|
| byte from scratch | 9114 | 2061 | 4.42 | 9334 | 5133 | 1.82 | | | 0/0 | 2566 | 1726 | 1.49 |
| native from scratch | 13481 | 3967 | 3.40 | 12291 | 7608 | 1.62 | | | 0/0 | 3003 | 2100 | 1.43 |
| byte after native | 3467 | 1476 | 2.35 | 5067 | 3912 | 1.30 | | | 0/0 | 971 | 801 | 1.21 |
#+TBLFM: $4=$2/$3;f2::$7=$5/$6;f2::$10=$9/$8;f2::$13=$11/$12;f2
```
Menu:
1. Write a jbuild file, autogenerated from jbuild.in because we need to fill in
some information at build-time (really, at configure time, but TODO), such as
whether or not clang is enabled.
2. Nuke lots of stuff from infer/src/Makefile that is now in the jbuild file
3. The jbuild file lives in infer/src/ so it can see all the sources. If we put it somewhere else, eg, infer/, then `jbuilder` scans too many files (all irrelevant) and takes 2.5s to start instead of .8s. Adding irrelevant directories to jbuild-ignore does not help.
4. jbuilder does not support subdirectories, so resort to listing all the
source files in the generated jbuild (only source directories need to be
manually listed in jbuild.in though). Still, the generated .merlin is wrong
and makes merlin find source files in _build, so manually tune it to get
good merlin support. We also lose some of merlin for unit tests as it
cannot see their build artefacts anymore.
5. checkCopyright gets its own jbuild because it's standalone. Also, remove
some deprecation warnings in checkCopyright due to the new version of Core from
a while ago.
6. Drop less-used Makefile features (they had regressed anyway) such as
building individual modules. Also, building mod_dep.pdf now takes all the
source files available so they better build (before, it would only take the
source files from the config, eg with or without clang) (that's pretty minor).
7. The toplevel is now built as a custom toplevel because that was easier. It
should soon be even easier: https://github.com/janestreet/jbuilder/issues/210
8. Move BUILTINS.mli to BUILTINS.ml because jbuilder is not happy about
interface files without implementations.
In particular, I did not try to migrate too much of the Makefile logic to jbuilder,
more can be done in the future.
Reviewed By: jberdine
Differential Revision: D5573661
fbshipit-source-id: 4ca6d8f
Summary: Useful for identifying user-controlled array accesses that could lead to buffer overflows
Reviewed By: mbouaziz
Differential Revision: D5520985
fbshipit-source-id: 92984f6
Summary: This is unused, as far as I can tell. If we want to revive it, we can do it in AL or as a simple checker; it certainly doesn't require the full might of bi-abduction.
Reviewed By: jeremydubreil
Differential Revision: D5556325
fbshipit-source-id: e3895c2
Summary:
Record the list of access paths (if any) used in the index expression for each array access.
This will make it possible to use array accesses as sinks in Quandary
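A rough sketch of what such a representation could look like, assuming an access path is a base variable plus a list of accesses. The type and constructor names are illustrative and may not match the actual `AccessPath` module:

```ocaml
(* Hypothetical, simplified access paths: accesses and access paths are mutually recursive. *)
type var = string

type access_path = var * access list

and access =
  | FieldAccess of string
  | ArrayAccess of access_path list  (* access paths read by the index expression *)

(* Collect every access path that appears inside an index expression, at any depth. *)
let rec index_paths ((_, accesses) : access_path) : access_path list =
  let of_access = function
    | FieldAccess _ -> []
    | ArrayAccess paths -> paths @ List.concat (List.map index_paths paths)
  in
  List.concat (List.map of_access accesses)

(* a[i].f reads the access path i in its index expression *)
let example : access_path = ("a", [ArrayAccess [("i", [])]; FieldAccess "f"])

let () = List.iter (fun (base, _) -> print_endline base) (index_paths example)
```

A taint analysis could then treat any tainted path returned by `index_paths` as reaching an array-access sink.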
Reviewed By: jeremydubreil
Differential Revision: D5531356
fbshipit-source-id: 8204909
Summary:
It's nice to have "raw" as the default kind of access path, since it's used much more often than the abstraction.
This is also a prereq for supporting index expressions in access paths, since we'll need mutual recursion between accesses and access paths.
Reviewed By: jeremydubreil
Differential Revision: D5529807
fbshipit-source-id: cb3f521
Summary:
Bumps facebook-clang-plugins to a version that outputs sizeof() info in bytes and not bits.
update-submodule: facebook-clang-plugins
Reviewed By: akotulski
Differential Revision: D5526747
fbshipit-source-id: 6019542
Summary: Using a dedicated abstract domain, like Quandary does, is more suitable for taint analysis.
Reviewed By: sblackshear
Differential Revision: D5473794
fbshipit-source-id: c917417
Summary:
Both `stringWithUTF8String` and `stringWithString` implement copy semantics: they copy the content of their parameter into a newly allocated buffer. We modeled this as pointer assignment in the past, which means that once we write
```
NSString* foo() {
char buf[...];
...
return [NSString stringWithUTF8String:buf];
}
```
we get a spurious stack variable address escape report, because the local pointer `buf` is assigned to the newly created string and the string gets returned.
This diff tries to address the issue by heap-allocating a buffer and copying the contents into it with `memcpy` in `stringWithUTF8String` and `stringWithString`. But this change creates another problem: the backend will report the allocated buffer as leaked, while in reality those buffers are not leaked because they are allocated in a region that is periodically autoreleased. To suppress these spurious memory leak reports, I added another attribute, `Awont_leak`, which suppresses the leak report on any expression tagged with it.
Reviewed By: jeremydubreil
Differential Revision: D5403084
fbshipit-source-id: df6de7f
Summary:
Pretty basic: warn when we see an assignment instruction `x = ...` and `x` is not live in the post of the instruction.
Only enabled for Clang at the moment because linters already warn on this for Java. But we can enable it later if we want to (should be fully generic).
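As an illustration of the idea only, here is a sketch restricted to a straight-line block of assignments. The `instr` type and `dead_stores` function are hypothetical; the real checker runs a liveness analysis over the CFG:

```ocaml
module VarSet = Set.Make (String)

(* One assignment [lhs = f(rhs_vars)]; a deliberately tiny, hypothetical instruction type. *)
type instr = {lhs: string; rhs_vars: string list}

(* Walk the block backwards, tracking the variables live after each instruction.
   A store is dead when its left-hand side is not live in the post of the instruction. *)
let dead_stores ~live_at_exit instrs =
  let step instr (live_post, dead) =
    let dead = if VarSet.mem instr.lhs live_post then dead else instr.lhs :: dead in
    let live_pre =
      VarSet.union (VarSet.remove instr.lhs live_post) (VarSet.of_list instr.rhs_vars)
    in
    (live_pre, dead)
  in
  snd (List.fold_right step instrs (VarSet.of_list live_at_exit, []))

let () =
  (* x = 1; x = 2; y = x;  with y live at exit: the first store to x is dead *)
  let block =
    [{lhs= "x"; rhs_vars= []}; {lhs= "x"; rhs_vars= []}; {lhs= "y"; rhs_vars= ["x"]}]
  in
  List.iter print_endline (dead_stores ~live_at_exit:["y"] block)
```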
Reviewed By: jeremydubreil
Differential Revision: D5450439
fbshipit-source-id: 693514c
Summary:
This just makes the warnings silent for now. We may improve the analysis to check whether the null checks on the captured fields are consistent with the annotations on the corresponding parameters.
Eradicate also has the same issue. I added a test to outline this. The biabduction analysis will probably also fail on the same kind of annotation lookup. We may want to implement the proper fix at the level of `Annotation.field_has_annot`.
Reviewed By: sblackshear
Differential Revision: D5419243
fbshipit-source-id: 6460de8
Summary:
Conversion and reformat of infer source using ocamlformat
auto-formatting tool.
Current status:
- Because Reason does not handle docstrings, the output of the
conversion is not 'Warning 50'-clean, meaning that there are
docstrings with ambiguous placement. I'll need to manually fix
them just before landing.
Reviewed By: jvillard
Differential Revision: D5225546
fbshipit-source-id: 3bd2786
Summary: This will allow replacing type variables with concrete types in expressions.
Reviewed By: jvillard, mbouaziz
Differential Revision: D5209276
fbshipit-source-id: c1650f8
Summary:
The thread safety domain manipulates access paths that are a variable
followed by a sequence of field or index accesses. Some expressions
from C++ code do not fit that form, such as cases where subtraction of
an offset from a pointer is used to obtain another pointer, whose
fields are then accessed. Previously the analyzer would crash on such
expressions. This diff partially treats them by introducing dummy
variables.
Reviewed By: da319
Differential Revision: D5343567
fbshipit-source-id: f73b520
Summary:
Indexing into a string literal expression would generate a fresh
variable on every application of a transformer. This violated
finiteness of the domain, and caused divergence.
Reviewed By: da319
Differential Revision: D5342951
fbshipit-source-id: e95e84e
Summary:
Change the API of `Logging` with respect to writing to files and to the console (see
changes in logging.mli).
Write only to one log file: infer-out/log. Prefix each line with the kind of
warning and the PID of the process emitting it. Writing with `O_APPEND` is
atomic so the file should not get garbled by concurrent writes. To get the
output of a single process, find out which one interests you by looking at
infer-out/log, then `grep ^[<PID>] infer-out/log`.
Introduce 3 log levels for debug output and command-line options to set them
for various categories individually.
Change tons of `"\n"` to `"@\n"` so the `Format` module is aware of newlines
without us having to look through every character of every logged string for
`\n` characters.
Reviewed By: mbouaziz
Differential Revision: D5165317
fbshipit-source-id: 93c922f
Summary: Allow type variables in `Typ.desc`. It will be used to store template type arguments.
Reviewed By: jberdine
Differential Revision: D5154757
fbshipit-source-id: 55b8e81
Summary:
ThreadSafety.may_alias crashed on C++ code because it assumed Java
field names.
Reviewed By: sblackshear
Differential Revision: D5147284
fbshipit-source-id: d10841f
Summary: The previous error message recommended annotating the method in question with `GuardedBy`, which doesn't actually work.
Reviewed By: jeremydubreil
Differential Revision: D5149661
fbshipit-source-id: d935aec
Summary:
A recent diff tried to replace `L.out "error message"; assert false` with
`failwith "error message"` but infer relies on the type of raised exceptions to
sometimes keep going. A more careful change will be needed but in the meantime
restore the old behaviour.
Reviewed By: jberdine
Differential Revision: D5112969
fbshipit-source-id: 713fe20
Summary:
Try and enforce the following rules:
- stderr is for updating the user about progress or errors
- Introduce Logging.progress that outputs to stderr, but honours --quiet
- Logging.stderr is as before
- Logging.out now prints to stderr (or to log files as before if set up) and
not stdout. If some information should go on stdout then the user should be
able to rely on it (ie, it's not just some progress message). For now only
the summary of the errors is printed on stdout by default.
- Logging.err* functions are gone. If the error is user-visible, it should be
Logging.stderr, or `failwith`. If not, go to the same log file as other
output, which personally I find much more convenient than having to dig through
2 log files every time I'm looking for some output.
Reviewed By: jberdine
Differential Revision: D5095720
fbshipit-source-id: 68999c9
Summary:
An array has a static or dynamic length (number of elements), but it also has a
stride, determined by the type of the element: `sizeof(element_type)`. We don't
have a good `sizeof()` function available on SIL types, so record that stride
in the array type.
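A small sketch of the idea, assuming both the length and the stride are optional. The record and field names are hypothetical and are not the actual `Typ` definition:

```ocaml
(* Hypothetical array type: [length] is None when only known dynamically,
   [stride] is the element size in bytes when the frontend could compute it. *)
type array_typ = {elt: string; length: int option; stride: int option}

(* Total size in bytes, available only when both components are known statically. *)
let size_in_bytes {length; stride; _} =
  match (length, stride) with Some n, Some s -> Some (n * s) | _ -> None

let () =
  match size_in_bytes {elt= "int"; length= Some 10; stride= Some 4} with
  | Some bytes -> Printf.printf "%d bytes\n" bytes
  | None -> print_endline "size unknown"
```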
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D4969697
fbshipit-source-id: 98e0670
Summary:
HIL had only been tested in Java, and it made some assumptions about what array expressions look like (the LHS is always resolvable to an access path) and assignments (the LHS is always an access path) that aren't true in C.
Fixed the code so we won't crash in this case.
Thanks to jeremydubreil for catching this.
Reviewed By: jeremydubreil
Differential Revision: D5047649
fbshipit-source-id: e8484f4
Summary: A lot of C++ library functions look like this, so it's important to have.
Reviewed By: mbouaziz
Differential Revision: D5026082
fbshipit-source-id: 6f421b6
Summary:
Ran the build with -w,-32, deleted code, and repeated until reaching a fixpoint with no more warnings.
Unfortunately we cannot fatal on w32 because ppx_compare can generate dead code (eg `compare_t` and only `compare` is used).
Reviewed By: mbouaziz
Differential Revision: D4945800
fbshipit-source-id: c95afb6
Summary:
`Location.dummy` is often used in a situation where we know the source file, but not the line/column.
Use `Location.none` for this instead.
Reviewed By: jeremydubreil
Differential Revision: D4991232
fbshipit-source-id: fc361a4
Summary:
Don't store redundant information in C++ template Type.Name.t.
New signature:
```
| CppClass (qual_name, template_args)
```
For example, for `std::shared_ptr<int>`, it will look like this:
```
| CppClass (["std", "shared_ptr"], Template [int])
```
While it used to be:
```
| CppClass (["std", "shared_ptr<int>"], Template (["std", "shared_ptr"], [int]))
```
Reviewed By: jberdine, mbouaziz
Differential Revision: D4834512
fbshipit-source-id: cb1c570
Summary: This function is always used in the frontend where summaries don't exist yet
Reviewed By: akotulski
Differential Revision: D4979132
fbshipit-source-id: 8d49c52
Summary:
Add `volatile` and `restrict` type qualifiers. Change `Ast_expressions.create_*_type` functions
to always take an optional type qualifiers argument.
update-submodule: facebook-clang-plugins
Reviewed By: jberdine
Differential Revision: D4969634
fbshipit-source-id: 9a63bf7
Summary:
Title.
The way types are printed is completely valid, but a little weird for some C++ programmers:
`int const` - same as `const int`
`int * const` - pointer is `const`, value under it is not
`int const *` - pointer is not `const`, but the value is
`int const * const` - both pointer and value are const
Reviewed By: jberdine
Differential Revision: D4962180
fbshipit-source-id: dcb02e3
Summary:
Modify the type of `Exp.Sizeof ...` to include the value that the expression
evaluates to according to the compiler, or None if it cannot be known
statically.
Use this information in inferbo.
Mostly unused in the BiAbduction checker for now, although it could be useful
there too.
update-submodule: facebook-clang-plugins
Reviewed By: mbouaziz
Differential Revision: D4953634
fbshipit-source-id: be0999d
Summary: Sometimes reports need traces to be fully understood, but sometimes reporting where the exception takes place can save developers time.
Reviewed By: jvillard
Differential Revision: D4914037
fbshipit-source-id: 039ab63
Summary: The purpose of the annotation reachability analysis is to report when a method annotated with `X` calls, directly or indirectly, another method annotated with `Y`. However, there can be different call stacks following different execution paths from `X` to `Y`. Reporting more than one call stack ending with the same annotated procedure does not bring more signal to the end user. So this diff avoids those duplicated reports and reports at most one annotation reachability issue per end of call stack.
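A sketch of the deduplication, assuming a call stack is simply a list of procedure names from the source method to the annotated callee. The function name and representation are hypothetical:

```ocaml
(* Keep the first call stack seen for each final (annotated) procedure, drop the rest. *)
let one_report_per_sink (stacks : string list list) =
  let keep (seen, kept) stack =
    match List.rev stack with
    | [] -> (seen, kept)
    | sink :: _ when List.mem sink seen -> (seen, kept)
    | sink :: _ -> (sink :: seen, stack :: kept)
  in
  let _, kept = List.fold_left keep ([], []) stacks in
  List.rev kept

let () =
  let stacks = [["x"; "a"; "y"]; ["x"; "b"; "y"]; ["x"; "z"]] in
  (* The two stacks ending in "y" collapse into one report. *)
  List.iter (fun stack -> print_endline (String.concat " -> " stack)) (one_report_per_sink stacks)
```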
Reviewed By: sblackshear
Differential Revision: D4942765
fbshipit-source-id: 46325a7
Summary: Sawja assigns them on multiple control-flow paths, so they're not SSA.
Reviewed By: peterogithub
Differential Revision: D4896745
fbshipit-source-id: c805216
Summary:
These were showing up as allocating huge amounts of memory on some analysis
profiles from OpenSSL using Spacetime. Doesn't hurt to make them allocate less.
Reviewed By: jeremydubreil
Differential Revision: D4884520
fbshipit-source-id: e79b815
Summary:
Limit the use of `SourceFile.invalid` (renamed from `SourceFile.empty`) as much
as possible. In particular, do not generate bogus procnames for external global
variables: their translation unit was set to the invalid source file, now we
distinguish between extern/non-extern global variables more explicitly.
`SourceFile.invalid` is still used in too many places to actually remove it, often as a dummy initial value that never gets used, but sometimes as an actual value... Worse, we cannot fail on all operations on `SourceFile.Invalid` yet: the `SourceFile.to_string` method is used in too many places where it could get `SourceFile.Invalid` as argument. It's easy to see where it's used by making it raise in the code, then running the tests. This results in spaghetti backtraces that are hard to trace back to a root cause.
Reviewed By: akotulski, jeremydubreil
Differential Revision: D4860019
fbshipit-source-id: 45be040
Summary: This is required to upgrade OCaml as our ancient Reason is not available on 4.04.0.
Reviewed By: yunxing
Differential Revision: D4851582
fbshipit-source-id: 994a9a8
Summary:
Make it possible to write one model which will be used by all template instantiations.
There is one big missing piece: infer never tries to do template instantiation by itself. With the current code, it's possible to use generic models
as long as the header contains the `__infer_generic_model` annotation (see the test for an example).
It is not viable to modify all headers with this annotation, hence infer will try to do template instantiation for generic models in later diffs.
Reviewed By: jberdine
Differential Revision: D4826365
fbshipit-source-id: 2233e42
Summary:
Initial version of naming, required for generic models. It's simply the non-mangled name stripped of any template arguments.
This makes it impossible to have two generic models with
1. different template arguments
2. different overloads (function with same name, but different types of arguments)
Reviewed By: jberdine
Differential Revision: D4826358
fbshipit-source-id: 42ac763
Summary: This avoids having to redefine the Map and Set modules as pretty-printable when they are used to create abstract domains.
Reviewed By: sblackshear
Differential Revision: D4811849
fbshipit-source-id: e2f6763
Summary: This representation is more natural and results in less `List.rev` calls.
Reviewed By: jberdine
Differential Revision: D4762629
fbshipit-source-id: 481cbe4
Summary:
Don't pass names as strings in the clang frontend. Instead use QualifiedCppName, which preserves
each identifier of the qualified name.
Done by
1. change return type of `Cast_utils.get_qualified_name` to return `QualifiedCppName.t`
2. change types in `Typ.Name.t` and `Typ.Procname.t` to use qualified names where applicable
3. Keep changing the code until it compiles
Reviewed By: jberdine
Differential Revision: D4754242
fbshipit-source-id: 9d723cb
Summary:
Improve type of `Fieldname.t` in `Clang` variant - make it store qualified classname and method name.
Based on those changes, fix matching in `Errdesc` to use `QualifiedCppName.Match` instead of string comparisons
Reviewed By: jberdine
Differential Revision: D4746735
fbshipit-source-id: 6f52413
Summary:
Split Fieldname.t into `Java` and `Clang`. Each of them has different naming conventions, and this way it's easier to differentiate between them.
Make the `Java` variant store a string instead of a mangled name, since the mangled part was always empty.
Changes to the `Clang` variant are coming in the next diff.
Reviewed By: jeremydubreil
Differential Revision: D4746708
fbshipit-source-id: c5858a8
Summary: Add `QualifiedCppName.t` and some functions to manipulate it. More places will start using this type (such as `Procnames` or `Typ.Name`) in later diffs.
Reviewed By: jberdine
Differential Revision: D4738991
fbshipit-source-id: 8f20dd6
Summary: There was a lot of indirection going on in the `Typ.Name` type definition. Inline all those indirections into a single variant type.
Reviewed By: jberdine
Differential Revision: D4737644
fbshipit-source-id: c5e181b
Summary:
One limitation of Eradicate is that certain nullability patterns are not expressible using simply the `Nullable` annotation.
One such pattern is using the knowledge that a function returns null when passed null, but returns an object otherwise.
The annotation `PropagatesNullable` is a variant of `Nullable` applied to parameters when their value propagates to the return value.
A method annotated
```
B m(@PropagatesNullable A x) { return x == null ? null : new B(x); }
```
indicates that `m` returns null if `x` is null, or an object of class `B` if the argument is not null.
Examples with multiple parameters are in the test cases.
This diff builds some infrastructure for annotation transformers: the example above represents the identity function on nullability annotations.
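One way to picture such a transformer is sketched below. It assumes that the return value is nullable exactly when some argument passed for a `PropagatesNullable` parameter is nullable; that rule for multiple parameters, and all the names here, are assumptions rather than Eradicate's actual code:

```ocaml
type nullability = Nonnull | Nullable

(* Hypothetical parameter description: only the PropagatesNullable flag matters here. *)
type param = {name: string; propagates_nullable: bool}

(* Compute the nullability of the return value from the nullability of the arguments.
   Raises Invalid_argument if [params] and [args] have different lengths. *)
let return_nullability params args =
  let propagates p arg =
    match arg with Nullable -> p.propagates_nullable | Nonnull -> false
  in
  if List.exists2 propagates params args then Nullable else Nonnull

let () =
  let params = [{name= "x"; propagates_nullable= true}] in
  match return_nullability params [Nonnull] with
  | Nonnull -> print_endline "m(nonnull) returns a non-null B"
  | Nullable -> print_endline "m(nonnull) may return null"
```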
Reviewed By: jvillard
Differential Revision: D4705938
fbshipit-source-id: 9f6194e
Summary: That tuple has 3 elements already, there may be 4th element coming.
Reviewed By: mbouaziz
Differential Revision: D4721342
fbshipit-source-id: cba44ef
Summary: Fail early when there are no registered callbacks to run the analysis of a procedure on-demand
Reviewed By: sblackshear
Differential Revision: D4573726
fbshipit-source-id: a8ee74b
Summary:
Procnames files are now reversed qualifier lists with `#` as separator (instead of `::` which needs to be escaped in bash).
Because of the mechanism that is used to obtain qualifiers, it also affects naming for ObjC classes.
Examples:
```
std::unique_ptr<int>::get -> get#unique_ptr<int>#std#__MANGLED,...__ // C++ method
folly::split -> split#folly#__MANGLED,..._ // function within namespace
NSNumber numberWithBool: -> numberWithBool:#NSNumber#class // ObjC method
```
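A sketch of how such a name could be assembled from a qualifier list. The function and its `suffix` parameter are hypothetical; `suffix` stands in for whatever trailing component (mangled part, `class`, ...) the real code appends:

```ocaml
(* Reverse the qualifiers and join them with '#', then append the trailing component. *)
let procname_file_stem ~qualifiers ~suffix =
  String.concat "#" (List.rev qualifiers @ [suffix])

let () =
  print_endline (procname_file_stem ~qualifiers:["NSNumber"; "numberWithBool:"] ~suffix:"class")
```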
Reviewed By: jvillard
Differential Revision: D4689701
fbshipit-source-id: c3acfc6
Summary:
Add a new command-line option `--per-procedure-parallelism`, to change the granularity of parallelism of the analysis from file to procedure.
This is intended for `--reactive` mode where e.g. a single file is changed and the analysis currently uses just one core.
When the option is used, the Makefile mechanism is replaced by using forking instead.
The parent process does as little allocation as possible, to avoid taxing the kernel.
Caveats:
- Not active in Java (issues with camlzip).
- Not active in checkers, yet.
Example use:
```
infer --reactive --changed-files-index index.txt --per-procedure-parallelism -- analyze
```
Reviewed By: jberdine
Differential Revision: D4634884
fbshipit-source-id: e358c18
Summary:
Provide proper constructor functions for all Typenames following `Typename.Java` module.
Always use those constructor functions.
Reviewed By: jeremydubreil
Differential Revision: D4673943
fbshipit-source-id: 81625c2
Summary:
It used to be a string, which:
1. Doesn't carry enough information for parametric models
2. Doesn't have a good type
Changing this blows up in the clang frontend, but I think it's for the better
Reviewed By: jberdine
Differential Revision: D4667633
fbshipit-source-id: 9f61bf1
Summary: I encountered cases where the class name part of the method name was passed as `(None, "package.Class")` instead of `("package", "Class")` and therefore incorrectly failing some inequality checks
Reviewed By: sblackshear
Differential Revision: D4662617
fbshipit-source-id: 98ee3e3
Summary:
Given two analysis results, it's now possible to compare them with the following command:
infer --diff --report-current reportA.json --report-previous reportB.json --file-renamings file_renamings.json
This command will then generate 3 files in `infer-out/differential/{introduced, fixed, preexisting}.json`, whose meaning is the following:
- `introduced.json` has all issues in `current` that are not in `previous`
- `fixed.json` has all issues in `previous` that are not in `current`
- `preexisting.json` has all issues that are in both `current` and `previous`
The json files generated can then be used to categorise results coming from incremental analyses of a codebase.
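The three categories are plain set operations. A sketch, assuming each issue is reduced to an opaque string key and ignoring file renamings entirely (the record and function names are hypothetical):

```ocaml
module IssueSet = Set.Make (String)

type differential = {introduced: IssueSet.t; fixed: IssueSet.t; preexisting: IssueSet.t}

(* introduced = current \ previous, fixed = previous \ current, preexisting = intersection *)
let diff ~current ~previous =
  { introduced= IssueSet.diff current previous
  ; fixed= IssueSet.diff previous current
  ; preexisting= IssueSet.inter current previous }

let () =
  let current = IssueSet.of_list ["issue_a"; "issue_b"] in
  let previous = IssueSet.of_list ["issue_b"; "issue_c"] in
  let d = diff ~current ~previous in
  Printf.printf "introduced=%d fixed=%d preexisting=%d\n" (IssueSet.cardinal d.introduced)
    (IssueSet.cardinal d.fixed) (IssueSet.cardinal d.preexisting)
```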
Reviewed By: jvillard
Differential Revision: D4482517
fbshipit-source-id: 1f7df3e
Summary: Make backend aware of some template instantiation arguments for template classes.
Reviewed By: jberdine
Differential Revision: D4421338
fbshipit-source-id: f7d72b4
Summary:
For writes of serialized data, write directly to the file instead of using a temporary one, and lock the file before writing.
Also added an `update` function to the API, to update an existing version of the data file instead of just replacing it with a new value.
Reviewed By: jberdine
Differential Revision: D4619958
fbshipit-source-id: 9642408
Summary:
Polymorphic models, and type environment refinements, need mutual
references between general types and struct types.
Reviewed By: cristianoc
Differential Revision: D4620076
fbshipit-source-id: f9d01e6
Summary:
Instead of translating all structs/C++ classes and putting them into the type environment, translate only the ones that are used. This now follows a mechanism similar to on-demand function translation. This change should significantly decrease the disk space and memory needed to store type environments
+ small change to fix build
Reviewed By: dulmarod
Differential Revision: D4597723
fbshipit-source-id: c8b0365
Summary:
Reimplement whitelists as a match against a single regexp. This allows one to
precompile the whitelist regexp so that checking against a whitelist of fuzzy
qualifiers is fast, instead of being linear in the number of items in the whitelist.
Reviewed By: akotulski
Differential Revision: D4588278
fbshipit-source-id: 3bac614
Summary:
This gives a way for users to flag safe methods regarding SIOF, for instance if
the problematic paths in the method cannot happen before `main()` has started.
Reviewed By: akotulski
Differential Revision: D4578700
fbshipit-source-id: 6542dcf
Summary: In C++ there are types that contain `<>` in their names (templates). When printing a type to HTML, those characters should be escaped
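A minimal sketch of such escaping. The `escape_html` name is hypothetical, and escaping `&` as well is an assumption added so the output stays well-formed:

```ocaml
(* Replace the characters that would otherwise be interpreted as HTML markup. *)
let escape_html s =
  let buf = Buffer.create (String.length s) in
  String.iter
    (function
      | '<' -> Buffer.add_string buf "&lt;"
      | '>' -> Buffer.add_string buf "&gt;"
      | '&' -> Buffer.add_string buf "&amp;"
      | c -> Buffer.add_char buf c)
    s ;
  Buffer.contents buf

let () = print_endline (escape_html "std::shared_ptr<int>")
```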
Reviewed By: jeremydubreil
Differential Revision: D4572506
fbshipit-source-id: a180537
Summary: This fixes a wrong level of indirection when performing the type substitution.
Reviewed By: sblackshear
Differential Revision: D4521008
fbshipit-source-id: 7324ea6
Summary: Being forced to separately define `pp_element`/`pp_key` is unnecessary and makes it more cumbersome to create a set/map from an existing module that already defines `pp`.
Reviewed By: jeremydubreil
Differential Revision: D4517308
fbshipit-source-id: 9b17c9c
Summary: This should fix the issue with broken invariants when the method specialization on pointer ends up doing a substitution on non pointer types
Reviewed By: sblackshear
Differential Revision: D4487232
fbshipit-source-id: f3fce84
Summary:
Remove the remaining uses of polymorphic equality `=`.
In case of basic types, this is replaced by String.equal or Int.equal.
In case of `= []`, this is replaced by `List.is_empty`.
In case of `= None`, this is replaced by `is_none`.
In case of a datatype definition such as `type a = A | B`,
a `compare_a` function is defined by adding `type a = A | B [@@deriving compare]`
and an `equal_a` function is defined as `let equal_a = [%compare.equal : a]`.
In case of comparison with a polymorphic variant ``= `Yes``, the equality
defined in `PVariant.(=)` is used. Typically, `open! PVariant` is added
at the beginning of the file to cover all the uses.
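For readers without the ppx at hand, the replacements amount to roughly the following. This is a stdlib-only sketch (OCaml 4.08 or later for `Int.equal` and `Option.is_none`); `compare_a` is hand-written here as a stand-in for the derived function, and `is_empty` mirrors `List.is_empty` from Core:

```ocaml
type a = A | B

(* Hand-written stand-in for what the compare deriver would generate. *)
let compare_a (x : a) (y : a) =
  match (x, y) with A, A | B, B -> 0 | A, B -> -1 | B, A -> 1

(* Monomorphic equality built from the comparison, instead of polymorphic (=). *)
let equal_a x y = Int.equal (compare_a x y) 0

(* Replacement for [l = []]. *)
let is_empty = function [] -> true | _ :: _ -> false

let () =
  assert (equal_a A A) ;
  assert (not (equal_a A B)) ;
  assert (is_empty ([] : a list)) ;
  (* Replacement for [o = None]. *)
  assert (Option.is_none (None : a option))
```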
Reviewed By: jberdine
Differential Revision: D4456129
fbshipit-source-id: f31c433
Summary:
Races on volatile fields are less concerning than races on non-volatile fields because at least the read/write won't result in garbage.
For now, let's de-prioritize these writes by ignoring them.
Reviewed By: peterogithub
Differential Revision: D4434023
fbshipit-source-id: 05043ba
Summary:
Make the html output available to checkers when -g is used on the command-line.
A checker needs to call a function to start and finish the processing of each node,
and add prints during the processing.
This diff illustrates the case for Eradicate, by adding printing of the pre-state
and post-states.
Reviewed By: sblackshear
Differential Revision: D4421379
fbshipit-source-id: 67501ba
Summary: The logic for filtering reports based on their buckets lives in InferPrint, so this code isn't doing anything.
Reviewed By: jvillard
Differential Revision: D4379966
fbshipit-source-id: 5a69304
Summary: Record in the attributes, during the translation, that a procedure has been modelled, instead of getting this information from where the summary is loaded. This is more consistent with the use of the attributes in other parts of the analysis, but is also useful in the context of the lazy dynamic dispatch algorithm where the procedures, including the models, are cloned and reanalyzed with more specialized parameters. The information about whether a procedure is a model must persist when cloning the procedures.
Reviewed By: sblackshear
Differential Revision: D4356892
fbshipit-source-id: 40ff5ca
Summary:
The two concepts are not negations of each other. The type environment created by the different frontends is not guaranteed to contain a full view of the type hierarchy. In this case, there can be holes preventing Infer from proving that `t <: t'` if a type definition between `t` and `t'` is missing. There are now two functions:
# `is_known_subtype` when the subtyping relation can be proven
# `is_known_not_subtype` when it can be proven that there is no subtyping relation between two types
This diff is intended to make no functional changes but to add the functionality needed to detect cast errors angelically, i.e. assuming that the program is probably fine when there is not enough information to prove the cast error.
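To illustrate why the two functions are not negations of each other, here is a sketch over a toy type environment. The `tenv` representation and the three-valued `answer` type are hypothetical, and the hierarchy is assumed to be acyclic:

```ocaml
(* Hypothetical type environment: maps a class name to its direct supertypes.
   A class absent from the table has an unknown definition (a hole in the hierarchy). *)
type tenv = (string * string list) list

type answer = Yes | No | Unknown

let rec subtype (tenv : tenv) t t' =
  if String.equal t t' then Yes
  else
    match List.assoc_opt t tenv with
    | None -> Unknown
    | Some supers ->
        List.fold_left
          (fun acc super ->
            match (acc, subtype tenv super t') with
            | Yes, _ | _, Yes -> Yes
            | Unknown, _ | _, Unknown -> Unknown
            | No, No -> No)
          No supers

(* True only when the relation can be proven. *)
let is_known_subtype tenv t t' =
  match subtype tenv t t' with Yes -> true | No | Unknown -> false

(* True only when the absence of the relation can be proven. *)
let is_known_not_subtype tenv t t' =
  match subtype tenv t t' with No -> true | Yes | Unknown -> false

let () =
  let tenv = [("C", ["B"]); ("B", ["A"]); ("A", []); ("D", ["Missing"])] in
  (* The definition of "Missing" is absent, so neither answer can be proven for D <: A. *)
  Printf.printf "%b %b\n" (is_known_subtype tenv "D" "A") (is_known_not_subtype tenv "D" "A")
```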
Reviewed By: jberdine
Differential Revision: D4345803
fbshipit-source-id: 39b79bc
Summary:
The specialization of the methods based on the type of the arguments should only be performed when the type is an object type. In theory this should always be the case according to the Java semantics, but the previous version of the code relied on Infer being correct all the way down to the method call.
Before this diff, the analysis mishandled examples like this:
String foo(Object object) {
object.toString();
}
String bar() {
int[] array = {1, 2, 3};
foo(array);
}
This is legitimate code that Infer gets wrong because Java objects are translated as C objects instead of objects containing a C-style object. There may be other issues like this, so it is safer to filter out the types when performing the substitution.
Reviewed By: jberdine
Differential Revision: D4345760
fbshipit-source-id: 1c74593
Summary: The pattern matching we had before let many unintended functions through (such as `max_element`). Make matching much stricter
Reviewed By: jvillard
Differential Revision: D4313428
fbshipit-source-id: 189c522
Summary:
Most of the time, code outside of the project root is not interesting to the user: it's either a system library or an infer C++ model. Skip all of it.
The previous logic was doing something similar, but in a more selective way.
I also need this change for D4313428
Reviewed By: jvillard
Differential Revision: D4339298
fbshipit-source-id: c7b5544
Summary:
This will simplify the InferPrint logic of checking what should/should-not be reported.
I will remove the issue names in Localise in a next diff.
Reviewed By: ddino
Differential Revision: D4334327
fbshipit-source-id: ebcfd6c
Summary:
We've had some issues with names like `arr$` appearing in error reports.
Any identifier name that contains $ cannot have come from source code because it is not a legal Java identifier.
This change should stop these reports because Errdesc.ml refuses to use temporary var names in error reports.
Reviewed By: jeremydubreil
Differential Revision: D4322305
fbshipit-source-id: 16237fe
Summary: Allow the backend to trigger compilation of extra files when it needs them. This will allow infer to capture fewer files initially and possibly speed up compilation
Reviewed By: cristianoc, jberdine
Differential Revision: D4231581
fbshipit-source-id: 181abea
Summary:
Use In_channel and Out_channel operations instead of those in Pervasives. Don't
use physical equality on values that aren't heap-allocated since it doesn't help
the compiler generate faster code and the semantics is unspecified. Also use
phys_equal for physical equality.
Reviewed By: sblackshear
Differential Revision: D4232459
fbshipit-source-id: 36fcfa8
Summary:
Utils contains definitions intended to be in the global namespace for
all of the infer code-base, as well as pretty-printing functions, and
assorted utility functions mostly for dealing with files and processes.
This diff changes the module opened into the global namespace to
IStd (Std conflicts with extlib), and moves the pretty-printing
definitions from Utils to Pp.
Reviewed By: jvillard
Differential Revision: D4232457
fbshipit-source-id: 1e070e0
Summary: Globals that are constexpr-initializable do not participate in SIOF.
Reviewed By: sblackshear
Differential Revision: D4277216
fbshipit-source-id: fd601c8
Summary:
Functions related to source files were already namespaced by the `source_file_` prefix. Make a separate module for them.
At a high level, this replaces all `source_file_` with `SourceFile.` and then fixes all remaining compilation errors.
Reviewed By: jvillard
Differential Revision: D4299053
fbshipit-source-id: 20b1d39
Summary: This is very useful to debug issues that have to do with types, for example the cast errors
Reviewed By: sblackshear
Differential Revision: D4289790
fbshipit-source-id: ef5a8bf
Summary:
Remember which globals are static locals.
It's useful to distinguish those from global variables in objc and in the SIOF
checker. Previously in ObjC we would accomplish that by looking at the name of
the variable, but that wouldn't work reliably in C++. Keep the old method around for
now as the way we deal with static locals in ObjC needs some fixing.
Reviewed By: akotulski
Differential Revision: D4198993
fbshipit-source-id: 357dd11
Summary:
`DB.source_file_to_string` is very easy to misuse and it shouldn't even exist.
In preparation for that day, replace most of `source_file_to_string` with `source_file_pp`
Reviewed By: jvillard
Differential Revision: D4258390
fbshipit-source-id: 447cf5a
Summary:
Currently cfg nodes are written into dot files in whatever order they
appear in a hash table. This seems unnecessarily sensitive, so this
diff sorts the nodes.
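A rough sketch of the idea, assuming a hypothetical `node` record keyed by an integer id (the real cfg nodes and dot printer are more involved):

```ocaml
(* Hypothetical node type; the real cfg nodes carry much more information. *)
type node = { id : int; label : string }

(* Collect the nodes from the hash table and sort them by id, so that the
   output order no longer depends on the table's internal layout. *)
let sorted_nodes (tbl : (int, node) Hashtbl.t) : node list =
  Hashtbl.fold (fun _ n acc -> n :: acc) tbl []
  |> List.sort (fun n1 n2 -> compare n1.id n2.id)

(* One dot line per node, emitted in sorted order. *)
let dot_lines (tbl : (int, node) Hashtbl.t) : string list =
  List.map
    (fun n -> Printf.sprintf "%d [label=\"%s\"]" n.id n.label)
    (sorted_nodes tbl)
```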
Reviewed By: dulmarod
Differential Revision: D4232377
fbshipit-source-id: a907cc6
Summary:
Introduce `AttributesTable.load_defined_attributes`, which returns proc attributes only if the procedure is defined. To avoid interfering
with the existing caching, create another hashmap to store those procdescs.
We need to do this because, with reactive capture, we can no longer assume that all proc attributes are final before analysis starts.
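A simplified sketch of the caching idea, not the actual AttributesTable code. The `attributes` record, the `defined_cache` table, and the `load_from_disk` parameter are all assumptions made for illustration. The point is that only defined attributes are cached, since a declaration may later be replaced by a definition.

```ocaml
(* Hypothetical, much-reduced stand-in for proc attributes. *)
type attributes = { name : string; is_defined : bool }

(* Separate cache that only ever holds attributes of defined procedures. *)
let defined_cache : (string, attributes) Hashtbl.t = Hashtbl.create 16

(* [load_from_disk] stands in for reading the attributes file. A result is
   cached only when the procedure is defined, so a later definition is
   still picked up. *)
let load_defined_attributes ~load_from_disk (name : string) : attributes option =
  match Hashtbl.find_opt defined_cache name with
  | Some attrs -> Some attrs
  | None -> (
      match load_from_disk name with
      | Some attrs when attrs.is_defined ->
          Hashtbl.replace defined_cache name attrs;
          Some attrs
      | _ -> None )
```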
Reviewed By: jberdine
Differential Revision: D4231575
fbshipit-source-id: e795bcb
Summary: Pure refactoring that simplifies the code doing the case analysis for executing the cast instruction.
Reviewed By: dulmarod
Differential Revision: D4215238
fbshipit-source-id: 9f0f163
Summary:
1. Always store cpp model source_file with relative path. This will make them cache friendly independent of infer location
2. Distinguish between "relative to project root" and "relative to infer models src"
3. Unify `source_file_from_path` used by java and C frontends into one function. There are no improvements to that logic yet
4. Move `is_cpp_model_file` to use `source_file` instead of `filename`
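One way to picture the distinction in point 2 is a variant type. The constructor names and the `to_abs_path` helper below are assumptions for illustration and may not match the real `source_file` definition.

```ocaml
(* Hypothetical representation: a path is tagged with what it is relative to. *)
type source_file =
  | Absolute of string
  | RelativeProjectRoot of string (* relative to the project root *)
  | RelativeInferModel of string (* relative to the infer models src *)

(* Resolve a source file to an absolute path, given both base directories. *)
let to_abs_path ~project_root ~models_src = function
  | Absolute p -> p
  | RelativeProjectRoot p -> Filename.concat project_root p
  | RelativeInferModel p -> Filename.concat models_src p

(* With this representation, recognizing a model file needs no string matching. *)
let is_cpp_model_file = function RelativeInferModel _ -> true | _ -> false
```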
Reviewed By: jberdine
Differential Revision: D4204548
fbshipit-source-id: 6e21771
Summary: D4189956 killed the phantom space printed after types, but the Context leak message was relying on it :).
Reviewed By: jeremydubreil
Differential Revision: D4208591
fbshipit-source-id: 9f0d709
Summary: These are dangerous if you are trying to compare a type to a string, and they're also unsightly.
Reviewed By: jvillard
Differential Revision: D4189956
fbshipit-source-id: 14ce127
Summary:
SIOF is only for interactions between objects of non-POD types. Previously the
checker was also reporting for POD types.
Reviewed By: akotulski
Differential Revision: D4197620
fbshipit-source-id: 7c56571
Summary:
Record an abstraction of the bug traces in the tests. The abstraction of a
trace is the sequence of descriptions. In practice, descriptions are either
empty, or of the form "start/end/return from/call to procedure X". They seem
pretty stable.
Motivation: there is nothing testing the traces reported by Infer right now,
even though they are surfaced to developers. For instance, Quandary uses
--issues-txt instead of --issues-tests to make sure the traces do not regress.
This change would make this approach more widespread.
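The abstraction described above amounts to projecting each trace element onto its description. A minimal sketch, assuming a hypothetical `trace_elem` record and a comma-separated output format (both are illustrative, not the actual test format):

```ocaml
(* Hypothetical trace element: only the description survives abstraction. *)
type trace_elem = { line : int; description : string }

(* The abstraction of a trace is the sequence of its descriptions. *)
let abstract_trace (trace : trace_elem list) : string list =
  List.map (fun e -> e.description) trace

(* Render the abstraction on one line, e.g. for a test expectation file. *)
let trace_to_string (trace : trace_elem list) : string =
  String.concat "," (abstract_trace trace)
```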
Reviewed By: sblackshear
Differential Revision: D4159597
fbshipit-source-id: 9c83952
Summary: The thread safety checker is run independently of other analyses, using the command "infer -a threadsafety -- <build-command>".
Reviewed By: sblackshear
Differential Revision: D4148553
fbshipit-source-id: bc7b3f9
Summary: When searching for cast errors, types that were not Java objects (e.g. arrays of primitive types) were not taken into account, leading to incorrect class cast exception reports.
Reviewed By: sblackshear
Differential Revision: D4166184
fbshipit-source-id: 7157c95
Summary:
This adds generic support for reporting error traces as usual infer issues
traces (instead of putting them in the textual description of the error) to
Trace.ml and SinkTrace.ml.
The siof checker is made to use these new traces, and gets an improved error
message mentioning the name of the problematic global as well, which requires a
slight API change in Pvar.re.
The support in Trace.ml is incomplete: passthroughs are ignored. This missing
feature will be needed by Quandary to migrate its error messages.
Reviewed By: sblackshear
Differential Revision: D4159542
fbshipit-source-id: 8c1101d
Summary:
It was defined in two places and I'm about to add a third, so let's share
instead.
Reviewed By: sblackshear
Differential Revision: D4153420
fbshipit-source-id: 3d2c519
Summary:
Location.nLOC was introducing a lot of complexity for little benefit (and edge cases were wrong anyway).
We can restore it in some simplified way if we find that we need it
Reviewed By: jeremydubreil
Differential Revision: D4139868
fbshipit-source-id: 4f8e033
Summary:
Summaries are modified before being saved to disk; for example, the attributes of the postcondition can change.
I have observed flaky reports of the internal error NULL_TEST_AFTER_DEREFERENCE. Some attributes (e.g. assigned) are changed before saving, but the spec table in memory is not changed.
So in case:
1) the procedure is analyzed on-demand, then subsequent uses in the same process use the summary in memory with the unchanged attribute, and the issue is not reported.
2) the procedure is already on disk and loaded, then the loaded summary has the changed attributes, and the issue is reported.
Flakiness arises because, under parallelism, whether a procedure has already been analyzed or is analyzed on-demand can change from run to run.
The normalization function can change the instrumentation of a symbolic heap because it uses the existing comparison functions, which ignore instrumentations.
So normalization can replace part of a symbolic heap with an identical one whose instrumentation is different; this is what I have observed.
The diff uses a different comparison function where instrumentations are taken into account.
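The effect can be reproduced in miniature. The `hpred` and `inst` types below are hypothetical, heavily simplified stand-ins, and `normalize` is only a toy model of normalization, not the actual code:

```ocaml
(* Hypothetical, heavily simplified stand-ins for heap predicates and their
   instrumentation; the real types are much richer. *)
type inst = Ialloc | Iupdate of int

type hpred = { addr : string; value : int; inst : inst }

(* Comparison that ignores the instrumentation. *)
let compare_ignoring_inst h1 h2 = compare (h1.addr, h1.value) (h2.addr, h2.value)

(* Comparison that also takes the instrumentation into account. *)
let compare_with_inst h1 h2 =
  compare (h1.addr, h1.value, h1.inst) (h2.addr, h2.value, h2.inst)

(* Toy normalization: replace a predicate with an already-known one that the
   comparison function considers equal. *)
let normalize ~cmp (known : hpred list) (h : hpred) : hpred =
  match List.find_opt (fun k -> cmp k h = 0) known with
  | Some k -> k
  | None -> h
```

With `compare_ignoring_inst`, normalizing a predicate can silently swap its instrumentation for that of the known representative; with `compare_with_inst`, the predicate is left unchanged.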
Reviewed By: jberdine
Differential Revision: D4140031
fbshipit-source-id: f4f119a
Summary: Don't use a hardcoded string, and enable reports in --issues-tests.
Reviewed By: jvillard
Differential Revision: D4110731
fbshipit-source-id: 9922557
Summary:
This makes frontends no longer depend on SymExec.ml. `ModelBuiltins` was split into two modules:
- `BuiltinDecl` with procnames for builtins (used to determine whether some function is a builtin)
- `BuiltinDefn` with implementations used by `SymExec`
- both have a similar type defined in `BUILTINS.S`, which makes sure that a new builtin gets added to both modules.
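A reduced sketch of how a shared signature keeps the two modules in sync. The signature `S`, the concrete types chosen for `t`, and the bodies are assumptions for illustration; only the module names and the two procname strings come from this diff. Forgetting to add a builtin to either module is a compile error.

```ocaml
(* Shared signature: every builtin must be provided by each implementation. *)
module type S = sig
  type t

  val exit : t
  val objc_cpp_throw : t
end

(* Declarations: here a builtin is just its procname (simplified to a string). *)
module BuiltinDecl : S with type t = string = struct
  type t = string

  let exit = "_exit"
  let objc_cpp_throw = "__infer_objc_cpp_throw"
end

(* Definitions: here a builtin is an executable function (a toy one). *)
module BuiltinDefn : S with type t = unit -> string = struct
  type t = unit -> string

  let exit () = "executed " ^ BuiltinDecl.exit
  let objc_cpp_throw () = "executed " ^ BuiltinDecl.objc_cpp_throw
end
```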
During the refactor I ran some scripts:
`BuiltinDecl.ml`:
let X = create_procname "X"
cat BuiltinDecl.ml | grep "create_procname" | tail -70 | awk ' { print $1,$2,$3,$4,"\42"$2"\42"} '
then manually confirm string match. Exceptions:
"__exit" -> "_exit"
"objc_cpp_throw" -> "__infer_objc_cpp_throw"
__objc_dictionary_literal
nsArray_arrayWithObjects
nsArray_arrayWithObjectsCount
`BuiltinDefn.ml`:
let X = Builtin.register BuiltinDecl.X execute_X
cat BuiltinDecl.ml | grep "create_procname" | tail -70 | awk ' { print $1,$2,$3,"Builtin.register BuiltinDecl."$2,"execute_"$2} '
then, fix all compilation problems
Reviewed By: jberdine
Differential Revision: D3951035
fbshipit-source-id: f059602
Summary:
- do a semantic analysis of each variable initializer to figure out if they need initialization
- add a flag to globals that is true when they are `constexpr`. In that case, no analysis is needed as the user and the compiler guarantee that it is a compile-time constant.
Reviewed By: sblackshear
Differential Revision: D4081273
fbshipit-source-id: 44dbe29
Summary:
Merging the results directories of targets on buck projects involved creating symbolic links into buck-out.
The bulk of files are .attr files: one per procedure. Creating these links can be a bottleneck, and the merge phase can be slower than the analysis phases on projects with many procedures.
This diff introduces multilinks to speed up merge.
A multilink is a file `multilink.txt` containing a sequence of paths
```
path/to/file1.ext
path/to/file2.ext
...
```
A multilink file is a compact way to represent a link for each entry.
This diff creates a multilink file for each `attributes/dir` directory, instead of one symbolic link for each file.
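A minimal sketch of reading such a file, assuming that entries are looked up by their base name (this lookup scheme and the helper names `parse_multilink` and `resolve` are assumptions, not the actual implementation):

```ocaml
(* Parse the contents of a multilink file into (basename, target path) pairs,
   skipping blank lines. *)
let parse_multilink (contents : string) : (string * string) list =
  String.split_on_char '\n' contents
  |> List.map String.trim
  |> List.filter (fun line -> line <> "")
  |> List.map (fun path -> (Filename.basename path, path))

(* Resolve a file name to the path it links to, if the multilink has an entry. *)
let resolve (links : (string * string) list) (name : string) : string option =
  List.assoc_opt name links
```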
Reviewed By: jberdine
Differential Revision: D4067428
fbshipit-source-id: 911f8a9
Summary:
Declared and defined procedure attributes are now saved in different files (hashed_name.decl.attr and hashname.attr).
We always try to load using the filename of defined procedure attributes first,
and fall back to loading the file for declared ones if it does not exist.
The logic for replacing an existing file stays the same, with one extra thing:
when a file for a defined attribute is written, the one for the declared one
is deleted if it exists.
At the end of a capture, either a declared or a defined file exists, but not both.
The reason for this change is that when captures of different subprojects are
merged together, it can happen that a link gets created to a declared attributes
file even though a defined one exists, so the body of the procedure will not be analyzed.
After this diff, both links will be created, and the defined one will be loaded
by the back-end.
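The loading order can be sketched as follows. The function names are hypothetical, and `exists` is passed as a parameter so the logic can be exercised without touching the file system; in practice it could be `Sys.file_exists`.

```ocaml
(* File names for the two kinds of attributes, following the naming above. *)
let defined_file (hashed_name : string) : string = hashed_name ^ ".attr"
let declared_file (hashed_name : string) : string = hashed_name ^ ".decl.attr"

(* Prefer the file for defined attributes; fall back to the declared one. *)
let attributes_file ~(exists : string -> bool) (hashed_name : string) : string option =
  let defined = defined_file hashed_name in
  let declared = declared_file hashed_name in
  if exists defined then Some defined
  else if exists declared then Some declared
  else None
```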
Reviewed By: dulmarod
Differential Revision: D4037423
fbshipit-source-id: 74fb7e6
Summary:
Checker for the Static Initialization Order Fiasco pattern:
https://isocpp.org/wiki/faq/ctors#static-init-order
1. Collect all globals (transitively) accessed in any given procedure.
2. Once the interprocedural analysis has finished, look at globals accessed in
initializers that do not belong to the current translation unit.
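The two steps can be sketched on a toy call graph. This is a simplification under stated assumptions: the `global` and `proc` records are hypothetical, and the real checker computes step 1 through the interprocedural analysis rather than an explicit worklist.

```ocaml
module SS = Set.Make (String)

(* Hypothetical model: a global remembers the file that defines it. *)
type global = { name : string; defined_in : string }

(* A procedure lists the globals it touches directly and its callees. *)
type proc = { direct : global list; callees : string list }

(* Step 1: globals accessed by [pname], directly or through its callees.
   The visited set guards against recursion in the call graph. *)
let accessed_globals (procs : (string * proc) list) (pname : string) : global list =
  let rec go visited acc = function
    | [] -> acc
    | p :: rest when SS.mem p visited -> go visited acc rest
    | p :: rest -> (
        match List.assoc_opt p procs with
        | None -> go (SS.add p visited) acc rest
        | Some { direct; callees } ->
            go (SS.add p visited) (direct @ acc) (callees @ rest) )
  in
  go SS.empty [] [ pname ]

(* Step 2: of the globals reached from an initializer, keep those defined
   outside the current translation unit. *)
let siof_candidates procs ~init_proc ~current_file : global list =
  accessed_globals procs init_proc
  |> List.filter (fun g -> g.defined_in <> current_file)
```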
Reviewed By: sblackshear
Differential Revision: D3780266
fbshipit-source-id: 1d07161
Summary:
Create dummy functions representing the initializers of global variables. This
is so we can implement checks in the backend that can look at the initializer
expressions of global variables. We try not to create these dummy functions
when the initializer is not present, although for some reason we sometimes end
up with empty initializers.
Also add source file info to global variables in the backend (Pvar.re).
Reviewed By: sblackshear
Differential Revision: D3780238
fbshipit-source-id: 2dca87e
Summary: Nothing mutates those fields so there is no need to make them `mutable`
Reviewed By: cristianoc
Differential Revision: D4009166
fbshipit-source-id: b840a4b
Summary:
This diff removes the unused support for reporting props, which enables
refactoring so that the 'base' directory has no dependencies, and the
'IR' directory depends only on 'base'.
Reviewed By: jvillard
Differential Revision: D3981352
fbshipit-source-id: 3700a23
Summary: That data was never used and removing it can simplify frontends quite a bit.
Reviewed By: jberdine
Differential Revision: D3967389
fbshipit-source-id: d65c3da
Summary: The code does not have much to do with the IR and should be part of the backend/ directory.
Reviewed By: sblackshear
Differential Revision: D3950834
fbshipit-source-id: 315ea19
Summary:
In order to have only InferJava depend on JBasics, do not use
JBasics.java_lang_object in the IR or backend. Note that this implies
that the Java frontend should ideally translate JBasics.java_lang_object
to Typename.Java.java_lang_Object.
Reviewed By: jeremydubreil
Differential Revision: D3956468
fbshipit-source-id: def64dd
Summary:
Change Sil.Call instruction to have only a single optional return
identifier, instead of a list. Essentially none of the code handled
multiple return identifiers. Also, add the type of the return
identifier to Call instructions.
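Roughly, the shape of the change looks like this. The types below are a toy model for illustration; the real Sil.Call constructor carries more arguments and different types.

```ocaml
(* Hypothetical, reduced versions of identifiers and types. *)
type ident = string
type typ = Tint | Tvoid | Tptr of typ

(* A call now has at most one return identifier, paired with its type. *)
type instr =
  | Call of { ret : (ident * typ) option; callee : string; args : (string * typ) list }

(* Consumers match on an option instead of handling an arbitrary list. *)
let return_ident (i : instr) : ident option =
  match i with
  | Call { ret = Some (id, _); _ } -> Some id
  | Call { ret = None; _ } -> None
```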
Reviewed By: sblackshear
Differential Revision: D3919358
fbshipit-source-id: d2d4f72
Summary:
Refactor Sil.struct_typ and associated operations into a separate
StructTyp module. This is possible now that Typ.Tstruct only carries a
type name instead of the definition directly, and is helpful to simplify
module dependencies.
Reviewed By: cristianoc
Differential Revision: D3919357
fbshipit-source-id: a37a656
Summary:
It is no longer necessary to keep the name of a struct within the
struct, as the name will just have been used to look it up.
Reviewed By: cristianoc
Differential Revision: D3919355
fbshipit-source-id: ab65168
Summary:
Pass the exe_env to checker cluster callbacks, and add it to the domain
extras for BoundedCallTree, and use the Exe_env instead of
AttributesTable to obtain the tenv.
Reviewed By: sblackshear
Differential Revision: D3921850
fbshipit-source-id: 9edf324
Summary: The Infer builtins can be used in the e2e tests, but those tests should not depend on the Infer models, to avoid cyclic dependencies. This diff separates the models and the Infer builtins into two directories so that the tests can depend on the builtins without depending on the models.
Reviewed By: sblackshear
Differential Revision: D3929478
fbshipit-source-id: 7d0ab79
Summary:
The global reference `DB.current_source` is used internally in the module DB, by all the front-ends, and directly and indirectly by the back-end, including saving and restoring the state in case of on-demand procedure calls. In particular, it is heavily used in printing functions.
This diff cleans up the flow of information about what the current file is, making it explicit, and removes the reference.
Reviewed By: jberdine
Differential Revision: D3901247
fbshipit-source-id: ef596bd
Summary:
This diff removes the redundancy in the representation of types where
struct types could be represented either directly using Tstruct or
indirectly using Tvar to refer to the type environment. A consequence
is that it is much harder to construct large type values.
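A toy model of the resulting representation, in which a struct type carries only its name and the definition lives in the type environment. The type definitions and the helpers `add_struct` and `lookup` are assumptions for illustration, not the actual Typ/Tenv API.

```ocaml
(* A struct type refers to its definition by name only. *)
type typ = Tint | Tptr of typ | Tstruct of string

(* The definition itself lives in the type environment. *)
type struct_typ = { fields : (string * typ) list }

let tenv : (string, struct_typ) Hashtbl.t = Hashtbl.create 16

(* Register a struct under its name and return the small type value. *)
let add_struct (name : string) (fields : (string * typ) list) : typ =
  Hashtbl.replace tenv name { fields };
  Tstruct name

(* Definitions are recovered by looking the name up in the environment. *)
let lookup (t : typ) : struct_typ option =
  match t with Tstruct name -> Hashtbl.find_opt tenv name | _ -> None
```

Because fields mention other structs by name, even a recursive type such as a list node stays a small value.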
Reviewed By: sblackshear, cristianoc
Differential Revision: D3839753
fbshipit-source-id: cf04ea5
Summary:
Rename Typ.mk_struct to internal_mk_struct, and add Tenv.mk_struct that
ensures types are added to the environment under the right name.
Reviewed By: cristianoc
Differential Revision: D3791865
fbshipit-source-id: fd4b667