Summary:
Model array length in Java as returning an unknown interval [0, +inf] for now.
Ideally, we can deal with the size in a more precise manner in the future like in InferBo.
Reviewed By: skcho
Differential Revision: D19312123
fbshipit-source-id: 8c51059a4
Summary: Pulse doesn't care about exceptions yet. With Exceptional CFG, java analysis takes a lot of time due to having many disjuncts. Let's use Normal CFG for now.
Reviewed By: jvillard
Differential Revision: D19194479
fbshipit-source-id: f94bb6078
Summary:
In order to improve the impurity analysis, this diff adds models for
- `hasNext()` and - `Object.equals()` modeled as returning a non-deterministic value (havoc_id)
- `next()` modeled as `StdVector.get` with a fresh index
- `iterator` modeled as just returning the underlying list
Reviewed By: jvillard
Differential Revision: D19177392
fbshipit-source-id: 0babb037a
Summary:
Remove Clang and Java submodules of Typ.Fieldname. They are unnecessary and they reflect a fake dichotomy: there is only one fieldname type. To distinguish between fields of Java classes and other C constructs, there is a helper function provided, but the idea is simple: obtain the class type the field belongs to, and check if it's a Java class.
This diff still preserves behaviour, but removes as many functions as possible from the interface, to leave a small surface.
Reviewed By: mityal
Differential Revision: D18962423
fbshipit-source-id: ffe6933ee
Summary:
That class does some complicated accounting of memory that depends on
whether the string is "small", "medium" or "large". In the latter case
it does its own ref-counting and copy-on-write to save memory, and that
trips up pulse. Pretending all strings are small avoids that issue.
Reviewed By: skcho
Differential Revision: D18909030
fbshipit-source-id: 1c14d909b
Summary:
Including the current call state is useful because the contradiction
sometimes refers to abstract values that have been materialised since
the last call state so we cannot make sense of them unless we print the
current call state.
Reviewed By: skcho
Differential Revision: D18908424
fbshipit-source-id: 297f397a6
Summary:
- Do most of the work of `solve_arithmetic_constraints` inside `subst_attribute` instead, since we need to re-use the latter function for post-conditions where the first function is not appropriate.
- When substituting arithmetic constraints, we refine arithmetic information (both concrete intervals and inferbo), which can lead to inconsistent states. Instead of recording the new arithmetic facts by returning a new current state, just act as a map on attributes. This is to enable doing the point above.
- All this lead to a somewhat messy refactoring...
- Rename `CannotApplyPre` to `Contradiction` since it's used for post-conditions as well now
Reviewed By: skcho
Differential Revision: D18889120
fbshipit-source-id: d81647143
Summary:
After passing a `PRUNE` instruction we can refine the current inferbo
intervals for the values involved.
Reviewed By: ezgicicek
Differential Revision: D18889103
fbshipit-source-id: b521046aa
Summary:
Every time we add an arithmetic information, add the corresponding
inferbo one.
Reviewed By: skcho
Differential Revision: D18888863
fbshipit-source-id: ab4afd372
Summary:
Finally use information from the inferbo intervals in pulse's domain to
make decisions about whether conditionals are feasible or not.
Reviewed By: skcho
Differential Revision: D18811193
fbshipit-source-id: d80a28657
Summary:
Refine the type of inferbo intervals attributes to "pure" (non-bottom)
ones. This is because were we to get a Bottom value from inferbo we
should stop the abstract execution instead of recording it in the state.
Reviewed By: ezgicicek
Differential Revision: D18811165
fbshipit-source-id: fff8664b7
Summary:
Similarly to binops, let inferbo evaluate unary operations and record
the results.
Reviewed By: ezgicicek
Differential Revision: D18811146
fbshipit-source-id: 8f9e16bbd
Summary:
This diff enables parsing and auto-formatting documentation
comments (aka docstrings).
I have looked at this entire diff and manually made some changes to
improve the formatting. In some cases it looked like it would take too
much time, or benefit from someone more familiar with the code doing
it, and I instead disabled auto-formatting docstrings in those files.
Also, there are some source files where the docstrings are invalid,
and some where the structure detected by the parser appears not to
match what was intended. Auto-formatting has been disabled for these
files.
Reviewed By: ezgicicek
Differential Revision: D18755888
fbshipit-source-id: 68d72465d
Summary:
The introduction of inferbo intervals as pulse attributes creates the
first relational attributes. To make sense of inferbo intervals
appearing in summaries when in a caller context, we need to substitute
the abstract values they contain in the callee with the abstract values
they correspond to in the caller.
This has a significant consequence: we have to delay the check that
arithmetic constraints in the callee are satisfiable at the call-site
until *after* we have discovered all the relationships between callee
values and caller values from the heap. To solve this, we now run an
arithmetic constraints check *after* having materialised all the
addresses.
We also need to translate the abstract values in the attributes in the
post before recording them in the caller, for the same reasons.
Quite some code in this diff is concerned with substituting pulse values
inside inferbo intervals. There is a complication there too: even after
having discovered relationships between caller and callee abstract
values induced by the heap shapes, there could be abstract values in the
callee's attributes that we haven't seen yet. We need to make up new
values for these in the caller, so this substitution has to return a
potentially extended substitution.
Reviewed By: skcho
Differential Revision: D18745695
fbshipit-source-id: 077ae7670
Summary:
This gets rid of false positives when something invalid (eg null) is
passed by reference to an initialisation function. Havoc'ing what the
contents of the pointer to results in being optimistic about said
contents in the future.
Also surprisingly gets rid of some FNs (which means it can also
introduce FPs) in the `std::atomic` tests because a path condition
becomes feasible with havoc'ing.
There's a slight refinement possible where we don't havoc pointers to
const but that's more involved and left as future work.
Reviewed By: skcho
Differential Revision: D18726203
fbshipit-source-id: 264b5daeb
Summary:
It's a well-known fact that pulse should know too. To avoid splitting
the abstract state systematically, only act if we know the pointer is
exactly 0 to avoid reporting a nullptr dereference on `free(x)`.
Reviewed By: ezgicicek
Differential Revision: D18708575
fbshipit-source-id: 1cc3f6908
Summary:
Turns out code uses atomics in important places, modelling it removes
FPs.
The tests are copied from biabduction and adapted and extended a bit. I
didn't implement compare_exchange primitives for now (plus, giving them
a sequential semantics like in biabduction is probably a bit cheeky).
Reviewed By: skcho
Differential Revision: D18708576
fbshipit-source-id: a3581b8a4
Summary: This extends semantics of binary operator for BoItv. If there is no known interval value for a pulse value, it returns a symbolic value of the pulse value.
Reviewed By: jvillard
Differential Revision: D18726768
fbshipit-source-id: ed8ecf78b
Summary:
This diff adds inferbo's interval values to pulse's attributes. The added values will be used to
filter out infeasible passes in the following diffs.
Reviewed By: jvillard
Differential Revision: D18726667
fbshipit-source-id: c1125ac6e
Summary: Rather than repeatedly matching actuals, let's use `ProcnameDispatcher.ModeledCall` to pick up the actual arguments with their corresponding values. This simplifies the models.
Reviewed By: jvillard
Differential Revision: D18685855
fbshipit-source-id: 7788bd8bb
Summary:
This was never set to true except in a wrong way in the Java frontend
(see previous diff).
Reviewed By: dulmarod
Differential Revision: D18573927
fbshipit-source-id: 4c9d1a855
Summary:
Note: Disabled by default.
Having some support for values, we can report when a null or constant
value is being dereferenced. The particularity here is that we don't
report when 0 is a possible value for the address, or even if we know
that the value of the address can only be 0 in that branch! Instead, we
allow ourselves to report only when we the address has been *set* to
NULL (or any constant).
This is in line with how pulse deals with other issues: only report when
1. we see an address become invalid, and
2. we see the same address be used later on
Reviewed By: skcho
Differential Revision: D17665468
fbshipit-source-id: f1ccf94cf
Summary:
This was causing loads of false positives later in the stack.
Invalidating the address of the object seems to be enough here as it
doesn't break any tests.
Reviewed By: ezgicicek
Differential Revision: D18246090
fbshipit-source-id: 2ef9a6a5c
Summary:
We consider Java collections to be like c++ std::vectors and add models for
- `Collections.get(..)`
- `__cast`
Reviewed By: skcho
Differential Revision: D18449607
fbshipit-source-id: 448206c84
Summary:
When reporting null dereference it is useful to know where the null came
from.
Reviewed By: skcho
Differential Revision: D18206459
fbshipit-source-id: 0c8e6781b
Summary:
This simplifies the code overall. It also makes accessing the action of
a "trace" (which is now stored alongside it instead of deep inside it)
constant time instead of linear in the number of nested calls.
Reviewed By: skcho
Differential Revision: D18206460
fbshipit-source-id: 9546ff36f
Summary:
This adds a more interesting value domain to pulse: concrete intervals.
There are still two main limitations:
1. arithmetic operations are all over-approximated: any assignment involving arithmetic operations is replaced by non-determinism
2. abstract values that are discovered to be equal are not merged into one
Reviewed By: skcho
Differential Revision: D18058972
fbshipit-source-id: 0492a590f
Summary:
This does several things because it was hard to split it more:
1. Split most of the arithmetic reasoning to PulseArithmetic.ml. This
doesn't need to be reviewed thoroughly because an upcoming diff
changes the domain from just `EqualTo of Const.t` to an interval domain!
2. When going through a prune node intra-procedurally, abduce arithmetic
facts to the pre (instead of just propagating them). This is the "assume
as assert" trick used by biabduction 1.0 too and allows to propagate
arithmetic constraints to callers.
3. Use 2 when applying summaries by pruning specs whose preconditions
have un-satisfiable arithmetic constraints.
This changes one of the tests! Pulse now does a bit more work to find
the false positive, as can be seen in the longer trace.
Reviewed By: skcho
Differential Revision: D18117160
fbshipit-source-id: af3b2c8c0
Summary:
Instead of checking that each address in the pre that must be valid is
not invalid in the caller (and error out if it turns out it is invalid)
as we discover them, save these checks for after we are sure that the
precondition can be applied. It is in fact a bug that we can report an
error when trying to apply a precondition that is actually not
satisfiable in the current state for other reasons than lifetime issues.
We still want to skip calls in case of weird issues like mismatch in
number of formals vs actuals.
This will have more obvious effects later when we also check that
arithmetic facts in preconditions are satisfied at the call site: if a
pre mandates "x=1" and "y must be valid" and we have "x=0" and "y
invalid" then we shouldn't report an error.
Reviewed By: skcho
Differential Revision: D18115229
fbshipit-source-id: ad4ce72ff
Summary:
If a precondition cannot be applied, it means that this program path
somehow doesn't make sense for the caller and so should be pruned. Right
now we just treat this as skipping over the call instead.
This will become more important when specs start mentioning arithmetic
facts that must be satisfied at the call site. As it is we will only
stop if we discover aliasing in the pre not present at the call site or
vice versa.
Reviewed By: dulmarod
Differential Revision: D18115230
fbshipit-source-id: 4f1c7a583
Summary: The way `<=` is used in `AbstractDomain` prevents infix use and forces bracketing it everywhere. Replace with simple `leq`.
Reviewed By: jvillard
Differential Revision: D18201854
fbshipit-source-id: 8175224e4
Summary: This will be more useful later when adding another one.
Reviewed By: ezgicicek, jberdine
Differential Revision: D18115231
fbshipit-source-id: a0a01901a
Summary:
The business of translating `Top/True/False` to `true/false` can be
hidden more.
Reviewed By: skcho
Differential Revision: D18115228
fbshipit-source-id: 071fcbddf
Summary:
Warning 33 (unused open) is enabled but the module open is not really
unused, it's just also opened at the top of the file...
Reviewed By: skcho
Differential Revision: D18114385
fbshipit-source-id: 2a8f9512a
Summary:
That module's interface was repeated twice to avoid exposing its
internals to PulseDomain itself. It's also quite long so it makes sense
to move it to its own file.
Reviewed By: ezgicicek
Differential Revision: D17977209
fbshipit-source-id: 56a2dac24
Summary:
Another poorman's library, this time about Pulse Domains. Also renames
`PulseDomain` to `PulseBaseDomain`.
Reviewed By: ezgicicek
Differential Revision: D17955287
fbshipit-source-id: 9c947cf98
Summary:
The name had rotten: it should be `AddrHistPair`. There is little value
of exposing the type of the pair `AbstractValue.t * ValueHistory.t`,
just inline its definition everywhere.
Reviewed By: ezgicicek
Differential Revision: D17955283
fbshipit-source-id: d145251e0
Summary:
See explanations in D17955104.
This renames `AbstractAddress` to `AbstractValue` since they are not
necessarily addresses.
Reviewed By: ezgicicek
Differential Revision: D17955290
fbshipit-source-id: 8bb4c61f2
Summary:
See explanations in D17955104. I put Attributes inside PulseAttribute
instead of creating a new file to avoid exposing more internals about
ranks.
Reviewed By: ezgicicek
Differential Revision: D17955284
fbshipit-source-id: a8719a58f
Summary:
Problem: PulseDomain.ml is pretty big, and contains lots of small
modules. The Infer build being a bit monolithic at the moment, it is
hard to split all these small modules off without creating some
confusion about which abstraction barries lay where. For instance, it's
fine to use `PulseDomain.ValueHistory` anywhere, but using `PulseDomain`
itself is sometimes bad when one should use `PulseAbductiveDomain`
instead.
Proposal: a poorman's library mechanism based on module aliasing. This
stack of diffs creates new Pulse* modules for all these small, safe to
use modules, together with `PulseBasicInterface.ml`, which aliases these
modules to remove the `Pulse` prefix. At the end of the stack, it will
contain:
```
module AbstractValue = PulseAbstractValue
module Attribute = PulseAttribute
module Attributes = PulseAttribute.Attributes
module CallEvent = PulseCallEvent
module Diagnostic = PulseDiagnostic
module Invalidation = PulseInvalidation
module Trace = PulseTrace
module ValueHistory = PulseValueHistory
```
This "interface" module can be opened in other pulse modules freely.
Reviewed By: ezgicicek
Differential Revision: D17955104
fbshipit-source-id: 13d3aa2b5
Summary:
bigmacro_bender
There are 3 ways pulse tracks history. This is at least one too many. So
far, we have:
1. "histories": a humble list of "events" like "assigned here", "returned from call", ...
2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah"
3. "traces", which combine one history with one interproc action
This diff gets rid of interproc actions and makes histories include
"nested" callee histories too. This allows pulse to track and display
how a value got assigned across function calls.
Traces are now more powerful and interleave histories and interproc
actions. This allows pulse to track how a value is fed into an action,
for instance performed in callee, which itself creates some more
(potentially now interprocedural) history before going to the next step
of the action (either another call or the action itself).
This gives much better traces, and some examples are added to showcase
this.
There are a lot of changes when applying summaries to keep track of
histories more accurately than was done before, but also a few
simplifications that give additional evidence that this is the right
concept.
Reviewed By: skcho
Differential Revision: D17908942
fbshipit-source-id: 3b62eaf78
Summary:
I dunno, seemed wrong before. About to introduce another attribute with
similar arguments so making them consistent in advance.
Reviewed By: skcho
Differential Revision: D17930349
fbshipit-source-id: 944b58bac
Summary:
- add the variable being declared so we can report it back in the trace in addition to its location
- distinguish between local vars and formals
Reviewed By: skcho
Differential Revision: D17930348
fbshipit-source-id: a5b863e64
Summary:
When we make the decision to go into a branch "v = N" where some
abstract value is compared to a constant, remember the corresponding
equality. This allows to prune simple infeasible paths
intra-procedurally.
Further work is needed to make this useful interprocedurally, for
instance either or both of these ideas could be explored:
- abduce v=N in the precondition and do not apply summaries when the
equalities in the pre are not satisfied
- prune post-conditions that lead to unsat states where a value has to
be equal to several different constants
Reviewed By: skcho
Differential Revision: D17906166
fbshipit-source-id: 5cc84abc2
Summary:
When we know "x = 3" and we have a condition "x != 3" we know we can
prune the corresponding path.
Reviewed By: skcho
Differential Revision: D17665472
fbshipit-source-id: 988958ea6
Summary:
First step in having a value domain: record concrete values. We record
them as equalities to abstract values using a new attribute `Constant`.
In a way, attributes are already our "pure" part in the "formulas" that
are pulse abstract domains, so this is reminiscent of existing
separation logic implementations. Trying to add values directly in the
"heap" part proved very cumbersome whereas this approach is very simple,
allowing us to ignore values most of the time except when we actually
care.
Reviewed By: skcho
Differential Revision: D17665473
fbshipit-source-id: b8033ad9c
Summary:
Unfortunately it is very hard to predict when
`Typ.Procname.describe` will add `()` after the function name, so we
cannot make sure it is always there.
Right now we report clowny stuff like "error while calling `foo()()`",
which this change fixes.
Reviewed By: ezgicicek
Differential Revision: D17665470
fbshipit-source-id: ef290d9c0
Summary:
Having just numbers for abstract values is a tad confusing. The change
is also needed for having actual constant values later.
Reviewed By: ezgicicek
Differential Revision: D17665469
fbshipit-source-id: 20dff7bbe
Summary:
Turns out `Memory.add_attributes` was only used to add singletons so
deleted that in the process.
Reviewed By: skcho
Differential Revision: D17627725
fbshipit-source-id: 0abe3889d
Summary:
This was bogus: when evaluating `e[e']` we were checking that `e'` is a
valid pointer.
Reviewed By: ezgicicek
Differential Revision: D17627727
fbshipit-source-id: 536384e95
Summary:
Introduce a new experimental checker (`--impurity`) that detects
impurity information, tracking which parameters and global variables
of a function are modified. The checker relies on Pulse to detect how
the state changes: it traverses the pre and post pairs starting from
the parameter/global variable and finds where the pre and post heaps
diverge. At diversion points, we expect to see WrittenTo/Invalid attributes
containing a trace of how the address was modified. We use these to
construct the trace of impurity.
This checker is a complement to the purity checker that exists mainly
for Java (and used for cost and loop-hoisting analyses). The aim of
this new experimental checker is to rely on Pulse's precise
memory treatment and come up with a more precise im(purity)
analysis. To distinguish the two checkers, we introduce a new issue
type `IMPURE_FUNCTION` that reports when a function is impure, rather
than when it is pure (as in the purity checker).
TODO:
- improve the analysis to rely on impurity information of external
library calls. Currently, all library calls are assumed to be nops,
hence pure.
- de-entangle Pulse reporting from analysis.
Reviewed By: skcho
Differential Revision: D17051567
fbshipit-source-id: 5e10afb4f
Summary:
It uses inline record for Sil.Load and Sil.Store for preparing the
following extention.
Reviewed By: dulmarod
Differential Revision: D17161288
fbshipit-source-id: 637ea7bfa
Summary: Like we removed empty edges from the `pre_heap` in D16419183, let's do the same to `post_heap`.
Reviewed By: skcho
Differential Revision: D17111336
fbshipit-source-id: c35fcbabb
Summary:
Before this diff we would record when some values came from the "address
of" logical variables. This makes no sense and also was incorrectly
marking these addresses as "written to" when they appeared in the post
of a procedure, because their attributes weren't empty (they had the
"address of stack variable" attribute).
Reviewed By: ngorogiannis
Differential Revision: D17131210
fbshipit-source-id: 6cc3c465a
Summary:
Pulse didn't treat local variables going out of scope as invalidating the corresponding address in memory. This diff fixes that by
- marking all local variables that exits the scope with the attribute `AddressOfStackVariable`
- before we write the summary for the proc, we make sure to invalidate all such addresses local to the procedure as `Invalid.` If such an address is read, then we would raise a use-after-lifetime issue.
Reviewed By: jvillard
Differential Revision: D16458355
fbshipit-source-id: 3686524cb
Summary:
A common gotcha is the new test. Model the minimum amount of
`std::basic_string` to catch it.
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121090
fbshipit-source-id: 66f06cb43
Summary:
Be more flexible in what type of function calls are allowed in `ViaCall ...` actions to be able to include models.
Also get rid of `here here` in traces /o\
As a side-effect, get more precise (=qualified) procedure names in
traces (but not in messages so as not to be too verbose).
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121092
fbshipit-source-id: fb51b02f8
Summary:
The fields `tenv` and `integer_type_widths` can be obtained from the `exe_env` field of `proc_callback_args`
This commit removes the redundant fields
Reviewed By: ngorogiannis
Differential Revision: D16149520
fbshipit-source-id: d37526fd4
Summary:
Cluster checkers call `SummaryPayload.read` but set the `caller_summary` to correspond to the same summary as gives the `callee_pname`
This change introduces a new method `read_toplevel_procedure` that does not require a `caller_summary`, to be used by the cluster checkers
Reviewed By: ngorogiannis
Differential Revision: D16131660
fbshipit-source-id: 12caa1000
Summary:
Change the datatype `ProcData` to include a field of type `Summary.t` instead of a field of type `Procdesc.t`
This will enable a later commit to supply a summary to `Ondemand.analyze_proc_desc` and `Ondemand.analyze_proc_name`
Reviewed By: ngorogiannis
Differential Revision: D16121405
fbshipit-source-id: 342374121
Summary:
The record `proc_callback_args` (defined in `callbacks.ml`) contains the fields `proc_desc` and `summary`.
The field `proc_desc` is redundant because it can be obtained from `summary`.
This diff removes `proc_desc` and uses the summary to obtain it where needed.
Reviewed By: ngorogiannis
Differential Revision: D16090783
fbshipit-source-id: 5632d1f4a
Summary:
Sometimes the post of a function call has attributes on addresses that
were mentioned in the pre but are no longer reachable in the post. We
don't want to forget these, see added test.
Reviewed By: mbouaziz
Differential Revision: D16050050
fbshipit-source-id: 1ce522b97
Summary:
Previously we would union them with the previous attributes. I don't
think that makes sense.
Also change the interface a bit in preparation for the next commit.
Reviewed By: mbouaziz
Differential Revision: D16050051
fbshipit-source-id: 2e8f88f4e
Summary:
Noticed that:
- some option was always `Some _`
- recording the post never raises `Aliasing` (only exploring the pre does)
- a mutual recursion was unused
Reviewed By: mbouaziz
Differential Revision: D16050052
fbshipit-source-id: 7f77aae08
Summary:
Similar to D16005395: `folly::Optional` has a boolean field to know if
it needs to destroy the wrapped object and pulse ignores that
completely, causing false positives each time an `Optional` is created
around something with a non-trivial destructor.
Reviewed By: mbouaziz
Differential Revision: D16030149
fbshipit-source-id: aeed4a0b3
Summary: Not sure if anyone uses this but there, now it's modelled.
Reviewed By: mbouaziz
Differential Revision: D16008162
fbshipit-source-id: f4795dcba
Summary:
Prevent false positives about variables captured by value gone out of
scope.
Reviewed By: ezgicicek
Differential Revision: D16008165
fbshipit-source-id: d70e47db4
Summary: We know how to do interprocedural calls so let's use that!
Reviewed By: mbouaziz
Differential Revision: D16008164
fbshipit-source-id: 4c34bf704
Summary:
`function::operator=` is called whenever we assign a literal lambda to a
variable, so it's pretty useful to be able to report anything on
lambdas.
Reviewed By: mbouaziz
Differential Revision: D16008163
fbshipit-source-id: a9d07668d
Summary:
The constructor of `folly::SocketAddress` conditionally deletes some
object and then makes that condition false. The destructor then does the
same. Pulse ignores conditionals so will see a double delete.
Just skip that function for now, but it should be easy for pulse to be
more correct here if it knew how to compare constant values.
Reviewed By: mbouaziz
Differential Revision: D16005395
fbshipit-source-id: 036f5091b
Summary:
Printing `Exp.Const (Cfun proc_name)` adds `_fun_` in front of the
procedure name, eg `_fun_foo` instead of `foo`. This showed up in pulse
traces.
Reviewed By: mbouaziz
Differential Revision: D16004606
fbshipit-source-id: 72ac6866f