Summary:
Because in the real semantics CFRelease can be used more than once, and also the variables can be used after CFRelease in general, modelling this as `free` causes many `USE_AFTER_FREE` errors. Now we change the model to not add the `Invalid CFree` attribute, but to just remove the `Allocated` attribute. So we can model memory leaks in the simple case of `Create` and not `CFRelease` before going out of scope, but we avoid the `USE_AFTER_FREE`.
Since the model for CFRelease now diverges from free, changed the command line option for modelling to `pulse-model-release-pattern`.
Reviewed By: jvillard
Differential Revision: D21324895
fbshipit-source-id: ab323d981
Summary:
The idea was to keep track of why we know certain facts but actually
these traces are never read. Other arithmetic facts (BoItv and the path
condition) don't have histories so remove them from concrete intervals
too.
Reviewed By: dulmarod
Differential Revision: D21303353
fbshipit-source-id: eecf07b05
Summary:
We were invalidating "*(vec.__infer_backing_array)" instead of the
address of the field itself.
Reviewed By: ezgicicek
Differential Revision: D21280357
fbshipit-source-id: 48b984800
Summary:
We model `malloc` in Objective-C as `malloc_not_fail` I think because the null case is not normally handled in iOS apps because the OS will just killed the app after giving some memory warnings.
So adding `malloc_not_fail` model to Pulse.
Reviewed By: jvillard
Differential Revision: D21278527
fbshipit-source-id: 17a5008fe
Summary:
This translates the construct `ObjCBridgedCastExpr` when the cast_kind is `OBC_BridgeTransfer`, or in syntax, the cast (`__bridge_transfer`).
This cast means that the object is passed from manual memory management to ARC, so one doesn't need to call `release` manually. It is important to model this to avoid false positives.
It translates it as a builtin that we then model in Pulse, the same way we modelled `CFBridgingRelease` which does the same thing.
The name of the builtin is `__free_cf` which is not ideal but I left it like that for compatibility with biabduction. We can change it once we remove this check from biabduction.
update-submodule: facebook-clang-plugins
Reviewed By: jvillard
Differential Revision: D21176337
fbshipit-source-id: 736ceeb9b
Summary: Iterator invalidation traces were based on vector rather than iterator itself.
Reviewed By: ezgicicek
Differential Revision: D21202047
fbshipit-source-id: 62ce8a488
Summary:
We ignored allocator models for vectors, and were not able to initialize vectors properly. This diff fixes this issue.
It also adds a test which was a FN before.
Reviewed By: skcho, jvillard
Differential Revision: D21089492
fbshipit-source-id: 6906cd1d1
Summary: `CFBridgingRelease` and `__bridge_transfer` which I'll model later, transfer the memory model from manual memory ref count to ARC (automatic ref count), so to avoid false positives this needs to be modelled. We can simply remove the Allocated attribute from the state, which means we won't try to track that memory anymore.
Reviewed By: skcho
Differential Revision: D21088218
fbshipit-source-id: 3520a0d59
Summary:
When encountering a constant, pulse creates an abstract value (a
variable) to represent it, and remembers that it's equal to it. The
problem is that pulse doesn't yet know how to deal with the fact that
some variables are going to be equal to each other.
This hacks around this issue in the case of constants, within the same
procedure, by remembering which constants have been assigned to which
place-holder variables, and serving those variables again when the same
constant is translated again.
Limitation: this doesn't work across procedure calls as the "constant
maps" are not saved in summaries.
Something to look out for: we don't want to make `if (p == NULL)` create
a path where `p` is invalid (we only make null invalid when we see an
assignment from 0, i.e. `p = NULL;`).
Reviewed By: ezgicicek
Differential Revision: D21089961
fbshipit-source-id: 5ebb85d0a
Summary: Modeling vector iterator with two internal fields: an internal array and an internal pointer. The internal array field points to the internal array field of a vector; the internal pointer field represents the current element of the array. For now `operator++` creates a fresh element inside the array.
Reviewed By: ezgicicek
Differential Revision: D21043304
fbshipit-source-id: db3be49ce
Summary:
Instead of having to remember to update both the inferbo and the concrete
intervals domains of pulse, hide these details under a unified API. This
should help the transition to adding a third(!) numerical domain later
on (pudge!).
Reviewed By: ezgicicek
Differential Revision: D21022920
fbshipit-source-id: 783157464
Summary: Unify the models of malloc and for the Create and Copy functions for Core Graphics. This add the null case from the malloc model to the Core Graphics models.
Reviewed By: jvillard
Differential Revision: D20890956
fbshipit-source-id: 278ac9d2f
Summary:
This diff lifts the `PulseAbductiveDomain.t` in `PulseExecutionState` by tracking whether the program continues the analysis normally or exits unusually (e.g. by calling `exit` or `throw`):
```
type exec_state =
| ContinueProgram of PulseAbductiveDomain.t (** represents the state at the program point *)
| ExitProgram of PulseAbductiveDomain.t
(** represents the state originating at exit/divergence. *)
```
Now, Pulse's actual domain is tracked by `PulseExecutionState` and as soon as we try to analyze an instruction at `ExitProgram`, we simply return its state.
The aim is to recover the state at the time of the exit, rather than simply ignoring them (i.e. returning empty disjuncts). This allows us to get rid of some FNs that we were not able to detect before. Moreover, it also allows the impurity analysis to be more precise since we will know how the state changed up to exit.
TODO:
- Impurity analysis needs to be improved to consider functions that simply exit as impure.
- The next goal is to handle error state similarly so that when pulse finds an error, we recover the state at the error location (and potentially continue to analyze?).
Disclaimer: currently, we handle throw statements like exit (as was the case before). However, this is not correct. Ideally, control flow from throw nodes follows catch nodes rather than exiting the program entirely.
Reviewed By: jvillard
Differential Revision: D20791747
fbshipit-source-id: df9e5445a
Summary:
Malloc returns either an allocated object or a null pointer if there is no memory available. Modelling that.
This has always been a bit contentious because this leads to NPEs that people often ignores because they don't care. But if we don't model this, then we have FPs when people do take this into account when freeing the memory.
Reviewed By: jvillard
Differential Revision: D20791692
fbshipit-source-id: 6fd259f12
Summary: Modelling `CG.*Release ` and `CFRelease` as `free`. This is what we were doing in biabduction.
Reviewed By: skcho
Differential Revision: D20767174
fbshipit-source-id: c77c1cdc6
Summary:
This models all the Create and Copy functions from CoreGraphics, examples in the tests.
These functions all allocate memory that needs to be manually released.
The modelling of the release functions will happen in a following diff. Until then, we have some false positives in the tests.
This check is currently in biabduction, and we aim to move it to Pulse.
Reviewed By: jvillard
Differential Revision: D20626395
fbshipit-source-id: b39eae2d9
Summary:
- Model `System.exit()` as early_exit and add a test
- Tweak message of methods that are impure due to having no pulse summary (and add a test)
Reviewed By: skcho
Differential Revision: D20668979
fbshipit-source-id: 6b5589aae
Summary:
- Add more naive pulse models for:
- `System.arraycopy`
- `StringBuilder.setLength`
- `StringBuilder.delete`
- Model the following as pure
- `SparseArrayCompat.valueAt`
- `File.get...`
- Add a nice test
Reviewed By: jvillard
Differential Revision: D20513397
fbshipit-source-id: 6d412d13a
Summary:
This diff naively models the following as `StdVector.push_back`:
- `StringBuilder.append`
- `String.replace`
- `Queue.poll`
It also adds a FN test for `Iterator.next`.
Reviewed By: skcho
Differential Revision: D20469786
fbshipit-source-id: 2d8e8d117
Summary: In preparation for PulseArithmetic to be something else.
Reviewed By: ezgicicek
Differential Revision: D20393928
fbshipit-source-id: d93131e12
Summary:
Adding a model for malloc: we add an attribute "Allocated". This can be used for implementing memory leaks: whenever the variables get out of scope, we can check that if the variable has an attribute Allocated, it also has an attribute Invalid CFree.
Possibly we will need more details in the Allocated attribute, to know if it's malloc, or other allocation function, but we can add that later when we know how it should look like.
Reviewed By: jvillard
Differential Revision: D20364541
fbshipit-source-id: 5e667a8c3
Summary: This diff adds a model for Java's `Object.clone()` method (similar to existing shallow_copy).
Reviewed By: jvillard
Differential Revision: D20341073
fbshipit-source-id: 30ae40fe7
Summary:
Add let*/+ syntax to `result` types to simplify all the applications of
`>>=`, `>>|` that are followed by a binding (eg `>>= fun x -> ...`) in
pulse.
Reviewed By: skcho
Differential Revision: D19940728
fbshipit-source-id: 4df159029
Summary:
Model array length in Java as returning an unknown interval [0, +inf] for now.
Ideally, we can deal with the size in a more precise manner in the future like in InferBo.
Reviewed By: skcho
Differential Revision: D19312123
fbshipit-source-id: 8c51059a4
Summary:
In order to improve the impurity analysis, this diff adds models for
- `hasNext()` and - `Object.equals()` modeled as returning a non-deterministic value (havoc_id)
- `next()` modeled as `StdVector.get` with a fresh index
- `iterator` modeled as just returning the underlying list
Reviewed By: jvillard
Differential Revision: D19177392
fbshipit-source-id: 0babb037a
Summary:
Remove Clang and Java submodules of Typ.Fieldname. They are unnecessary and they reflect a fake dichotomy: there is only one fieldname type. To distinguish between fields of Java classes and other C constructs, there is a helper function provided, but the idea is simple: obtain the class type the field belongs to, and check if it's a Java class.
This diff still preserves behaviour, but removes as many functions as possible from the interface, to leave a small surface.
Reviewed By: mityal
Differential Revision: D18962423
fbshipit-source-id: ffe6933ee
Summary:
That class does some complicated accounting of memory that depends on
whether the string is "small", "medium" or "large". In the latter case
it does its own ref-counting and copy-on-write to save memory, and that
trips up pulse. Pretending all strings are small avoids that issue.
Reviewed By: skcho
Differential Revision: D18909030
fbshipit-source-id: 1c14d909b
Summary:
Every time we add an arithmetic information, add the corresponding
inferbo one.
Reviewed By: skcho
Differential Revision: D18888863
fbshipit-source-id: ab4afd372
Summary:
It's a well-known fact that pulse should know too. To avoid splitting
the abstract state systematically, only act if we know the pointer is
exactly 0 to avoid reporting a nullptr dereference on `free(x)`.
Reviewed By: ezgicicek
Differential Revision: D18708575
fbshipit-source-id: 1cc3f6908
Summary:
Turns out code uses atomics in important places, modelling it removes
FPs.
The tests are copied from biabduction and adapted and extended a bit. I
didn't implement compare_exchange primitives for now (plus, giving them
a sequential semantics like in biabduction is probably a bit cheeky).
Reviewed By: skcho
Differential Revision: D18708576
fbshipit-source-id: a3581b8a4
Summary: Rather than repeatedly matching actuals, let's use `ProcnameDispatcher.ModeledCall` to pick up the actual arguments with their corresponding values. This simplifies the models.
Reviewed By: jvillard
Differential Revision: D18685855
fbshipit-source-id: 7788bd8bb
Summary:
This was never set to true except in a wrong way in the Java frontend
(see previous diff).
Reviewed By: dulmarod
Differential Revision: D18573927
fbshipit-source-id: 4c9d1a855
Summary:
We consider Java collections to be like c++ std::vectors and add models for
- `Collections.get(..)`
- `__cast`
Reviewed By: skcho
Differential Revision: D18449607
fbshipit-source-id: 448206c84
Summary:
The name had rotten: it should be `AddrHistPair`. There is little value
of exposing the type of the pair `AbstractValue.t * ValueHistory.t`,
just inline its definition everywhere.
Reviewed By: ezgicicek
Differential Revision: D17955283
fbshipit-source-id: d145251e0
Summary:
Problem: PulseDomain.ml is pretty big, and contains lots of small
modules. The Infer build being a bit monolithic at the moment, it is
hard to split all these small modules off without creating some
confusion about which abstraction barries lay where. For instance, it's
fine to use `PulseDomain.ValueHistory` anywhere, but using `PulseDomain`
itself is sometimes bad when one should use `PulseAbductiveDomain`
instead.
Proposal: a poorman's library mechanism based on module aliasing. This
stack of diffs creates new Pulse* modules for all these small, safe to
use modules, together with `PulseBasicInterface.ml`, which aliases these
modules to remove the `Pulse` prefix. At the end of the stack, it will
contain:
```
module AbstractValue = PulseAbstractValue
module Attribute = PulseAttribute
module Attributes = PulseAttribute.Attributes
module CallEvent = PulseCallEvent
module Diagnostic = PulseDiagnostic
module Invalidation = PulseInvalidation
module Trace = PulseTrace
module ValueHistory = PulseValueHistory
```
This "interface" module can be opened in other pulse modules freely.
Reviewed By: ezgicicek
Differential Revision: D17955104
fbshipit-source-id: 13d3aa2b5
Summary:
bigmacro_bender
There are 3 ways pulse tracks history. This is at least one too many. So
far, we have:
1. "histories": a humble list of "events" like "assigned here", "returned from call", ...
2. "interproc actions": a structured nesting of calls with a final "action", eg "f calls g calls h which does blah"
3. "traces", which combine one history with one interproc action
This diff gets rid of interproc actions and makes histories include
"nested" callee histories too. This allows pulse to track and display
how a value got assigned across function calls.
Traces are now more powerful and interleave histories and interproc
actions. This allows pulse to track how a value is fed into an action,
for instance performed in callee, which itself creates some more
(potentially now interprocedural) history before going to the next step
of the action (either another call or the action itself).
This gives much better traces, and some examples are added to showcase
this.
There are a lot of changes when applying summaries to keep track of
histories more accurately than was done before, but also a few
simplifications that give additional evidence that this is the right
concept.
Reviewed By: skcho
Differential Revision: D17908942
fbshipit-source-id: 3b62eaf78
Summary:
A common gotcha is the new test. Model the minimum amount of
`std::basic_string` to catch it.
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121090
fbshipit-source-id: 66f06cb43
Summary:
Be more flexible in what type of function calls are allowed in `ViaCall ...` actions to be able to include models.
Also get rid of `here here` in traces /o\
As a side-effect, get more precise (=qualified) procedure names in
traces (but not in messages so as not to be too verbose).
Reviewed By: mbouaziz, ngorogiannis
Differential Revision: D16121092
fbshipit-source-id: fb51b02f8
Summary:
Similar to D16005395: `folly::Optional` has a boolean field to know if
it needs to destroy the wrapped object and pulse ignores that
completely, causing false positives each time an `Optional` is created
around something with a non-trivial destructor.
Reviewed By: mbouaziz
Differential Revision: D16030149
fbshipit-source-id: aeed4a0b3
Summary: Not sure if anyone uses this but there, now it's modelled.
Reviewed By: mbouaziz
Differential Revision: D16008162
fbshipit-source-id: f4795dcba
Summary: We know how to do interprocedural calls so let's use that!
Reviewed By: mbouaziz
Differential Revision: D16008164
fbshipit-source-id: 4c34bf704
Summary:
`function::operator=` is called whenever we assign a literal lambda to a
variable, so it's pretty useful to be able to report anything on
lambdas.
Reviewed By: mbouaziz
Differential Revision: D16008163
fbshipit-source-id: a9d07668d
Summary:
The constructor of `folly::SocketAddress` conditionally deletes some
object and then makes that condition false. The destructor then does the
same. Pulse ignores conditionals so will see a double delete.
Just skip that function for now, but it should be easy for pulse to be
more correct here if it knew how to compare constant values.
Reviewed By: mbouaziz
Differential Revision: D16005395
fbshipit-source-id: 036f5091b
Summary:
[apologies for the unreviewable diff...]
Get rid of HIL expressions in pulse. This finishes the HIL -> SIL
migration. The first step made pulse start from SIL instructions but
would translate most accesses to HIL to re-use most of the existing
pulse code. This diff gets rid of the intermediate translation of SIL
expressions to HIL expressions.
Big changes:
1. `PulseOperations` mostly rewritten, driven by using `Exp.t` instead of `HilExp.AccessExpression.t` for everything.
2. Stop trying to reverse-engineer what addresses mean in terms of
access paths from program variables. Rely on the trace pointing at
the right places in the code to be enough. This is because it wasn't
that useful (and could even be misleading when wrong) but could be
prohibitively expensive in degenerate cases (eg nodes with tens of
thousands of successive array accesses...)
3. `PulseAbductiveDomain.apply_post` now returns the computed return
value instead of recording it itself.
4. Change of vocabulary: `materialize` -> `eval`, `crumb` -> `event`
5. Function calls arguments are now evaluated prior to doing anything
else, which saves everything else from having to (remember to) do
that. In particular, this changes how models look quite a bit.
Reviewed By: mbouaziz
Differential Revision: D15986373
fbshipit-source-id: 1d79935de
Summary:
Now that HIL doesn't help us anymore we need to reconstruct its mapping
"SIL logical var -> program access path". We already have everything we
need in pulse: it suffices to walk the current memory graph starting
from program variables until we find the value of the temporary we are
interested in.
This diff also builds some type machinery to make sure all accesses are
explained.
Reviewed By: mbouaziz
Differential Revision: D15824959
fbshipit-source-id: 722c81b39
Summary:
Just moving code around.
This is needed later to make some types in `PulseTrace` depend on
a new that I'll have to define in `PulseDomain`.
Also, this gives better names all around I think
Reviewed By: mbouaziz
Differential Revision: D15881281
fbshipit-source-id: e86c1472e
Summary:
Just moving code around.
This is needed later to make some types in `PulseInvalidation` depend on
a new type that I'll have to define in `PulseDomain`.
Reviewed By: mbouaziz
Differential Revision: D15824962
fbshipit-source-id: 86cba2bfb
Summary:
This is useful for the model of `exit` that returns 0 disjuncts. All
other models return 1 disjunct for now, but in the future things like
`malloc()` will need to return 2 possible states for instance.
Reviewed By: ngorogiannis
Differential Revision: D14753491
fbshipit-source-id: 3e7387d6d
Summary:
This isn't needed now that this information is recorded in
`PulseTrace.action` instead.
Reviewed By: mbouaziz
Differential Revision: D14645089
fbshipit-source-id: 9c3f38722
Summary:
For each operation on the domain, try to record what it requires of the
precondition of the function. This is akin to what happens in the
biabduction backend, hence the terminology used.
Reviewed By: jberdine
Differential Revision: D14387148
fbshipit-source-id: a61fe30c8
Summary:
Instead of emitting an ad-hoc builtin on variable declaration emit a new
metadata instruction. This allows us to remove the code matching on that
ad-hoc builtin that had to be inserted in several checkers.
Inferbo & pulse used that information meaningfully and had to undergo
some minor changes to cope with the new metada instruction.
Reviewed By: ezgicicek
Differential Revision: D14833100
fbshipit-source-id: 9b3009d22
Summary: It's all grown up now and taking quite some space in src/checkers/.
Reviewed By: skcho
Differential Revision: D14568273
fbshipit-source-id: b843c031e