- where size of integer and floating point numbers matters (overflow behavior and interpretation of conversions)
+ cons
- perf: increases size of representation of Exp, perhaps a lot
- code complexity: need to plumb through target-specific data in order to e.g. be able to create equalities at intptr type
- instructions and globals could use accurate types to replace len fields with static sizeof type
- load instructions would need accurate types on reg to create equalities between it and its value in Exec
- memcpy and memmov would need types to create equality between src and dst in Exec
- formals would need types, to create equalities between formals and actuals in Domain
- types could be useful for approximate human-readable printing for general expressions
+ to print p+o as p.f, will likely need to consult what p is equal to, to find some meaningful type, and it could easily take much more work than this to produce reliably readable results
- target-specific types and layout
+ change Typ.target into a separate module
+ construct an instance in frontend as first step
+ use it during translation
+ return it as part of program
+ pass it from Control to Domain, etc.
- function types could include the types of throw continuation args
but they are currently the same for all functions: i8*
** ? change blocks to take all free variables as args
+ currently the scope of an identifier bound by e.g. Load is the continuation of the inst as well as all the conts that it dominates, this is somewhat messy
+ build a table from blocks to conts
+ build a table from blocks to free vars
+ need a fixed-point computation for blocks to vars table
+ to xlate a block
- get the terminator
- if all the destination blocks except the current block are already in the table
* then
- xlate block itself like now
+ when get to the terminal
+ look up free vars vector of the jump destinaton in table
+ map over the vector
* if the var is the name of a PHI instr
- find and translate the arg for the src block of the jmp instr
use the find_map of find_jump_args
* else use the var
+ use this vector for the jump args
- compute the free vars of its code
- use this vector for the cont params
- add free vars to table
- add block to cont mapping to table
* else recurse over the destination blocks except the current block
+ after entry block (and recursively everything reachable from it) is xlated, map over the function block list looking up from the table to get order of conts to match order of blocks
separation between xlate_intrinsic (which translates an intrinsic function name to an expression constructor) and the Call case of xlate_instr (which translates calls to intrinsic functions to instructions) is not clear
** extract struct field names from llvm debug info
** normalize cfg
- remove unreachable blocks
- combine blocks with cmnd= []; term= Unreachable into one
- nothing prevents the compiler from generating code that directly manipulates the target-specific va_list struct, and it appears to do so at least for amd64, so the only safe approach is to use the same representation:
- add a local variable 'top' to each function with a non-entry alloca; that points to a pointer that always points to the head of the stack; initially NULL
- alloca in non-entry blocks adds an element and stores the result of alloc in it, sets next to contents of 'top', and stores result into 'top'
- function return (and other popping terminators) traverses the stack, popping elements, calling free on the slot pointers, until finding NULL in next
- one solution: pre-process llvm to remove [optnone] attributes before running scalarizer pass
** ? remove Exp.Nondet, replace with free variables
it is not obvious whether it will be simpler to use free variables instead of Nondet in the frontend, or to treat Nondet as a single-occurrence existential variable in the analyzer
** llvm bugs?
- Why aren't shufflevector instructions with zeroinitializer masks eliminated by the scalarizer pass?
** should handle equality and disequality simplification
- equalities of equalities to integers currently handled by Sh.pure
- doing it in Exp leads to violations of the subexp assertion on app1
** optimize: change Cls.t and Use.t from a list to an unbalanced tree data structure
- only need empty, add, union, map, fold, fold_map to be fast, so no need for balancing
- detecting duplicates probably not worth the time since if any occur, the only cost is adding a redundant equation to pnd which will be quickly processed
** optimize: when called from extend, norm_extend calls norm unnecessarily
** revise mli to two sections, one for a "relation" api (with merge, mem/check, etc) and one for a "formula" api (with and_, or_, etc.)
** optimize: can identity mappings in lkp be removed?
* symbolic heap
** normalize exps in terms of reps
- add operation to normalize by rewriting in terms of reps
- check for unsat
- call it in Exec.assume
** eliminate existentials
by changing Congruence reps to avoid existentials if possible and then normalizing Sh ito reps
** add exps in pure and pto (including memory siz and arr) to carrier
** Sh.with_pure is an underspeced, tightly coupled, API: replace
Sh.with_pure assumes that the replaced pure part is defined in the same vocabulary, induces the same congruence, etc. This API is fragile, and ought to be replaced with something that has simpler assumptions without imposing an excessive pessimization.
** optimize Sh.and_ with direct implementation
** perhaps it would be better to allow us and xs to intersect
but to rename xs when binding them or otherwise operating under the quantifier. But it might be an unnecessary complication to always have to deal with the potential for shadowing.
** consider how to detect unsat formulas
in relation to also wanting to express formulas in terms of congruence
class representatives in order to perform quantifier elimination. Is
there a way to detect unsat at the same time / as part of the same
normalization?
** consider hoisting existentials over disjunction:
#+BEGIN_SRC ocaml
| _ ->
let us = Set.union q1.us q2.us in
let xs1, xs, xs2 = Set.diff_inter_diff q1.xs q2.xs in
let us1 = Set.union q1.us xs in
let us2 = Set.union q2.us xs in
{ us
; xs
; cong= Congruence.true_
; pure= []
; heap= []
; djns= [[{q1 with us= us1; xs= xs1}; {q2 with us= us2; xs= xs2}]] }
| _ ->
let xs1, vs1 = Set.inter_diff q1.xs q2.us in
let xs2, vs2 = Set.inter_diff q2.xs q1.us in
let us1 = Set.union q1.us vs1 in
let us2 = Set.union q2.us vs2 in
let us = Set.union q1.us q2.us in
let xs = Set.union vs1 vs2 in
{ us
; xs
; cong= Congruence.true_
; pure= []
; heap= []
; djns= [[{q1 with us= us1; xs= xs1}; {q2 with us= us2; xs= xs2}]] }
#+END_SRC
** consider how to arrange to have a complete set of variables
at the top of formulas so that freshening wrt them is guaranteed not to clash with subformulas. This would allow removing the call to freshen_xs in rename, which is called on every subformula for every freshen/rename operation. Is it complicated to make us always include xs, as well as the us of the subformulas? That would allow the top-level us to serve as such a complete set of vars. How often would we need to compute us - xs?
** think about how to avoid having to manipulate disjunct formulas
unnecessarily, e.g. freshening, etc.
** ? should star strengthen djns with stem's cong
** optimize: refactor Sh.pure to avoid `Congruence.(and_eq true_ ...)`
** consider strengthening cong of or_ at price of freshening existentials
** consider using the append case when freshening existentials is needed
** strengthen Sh.pure_approx
* solver
** solve more existential equations in excise_exp
If sub.pure contains an equation involving an existential, add equation to min, remove the var from xs, continue. If all pure atoms normalize to true, added equations induce good existential witnesses, and excise will return them as part of min.
in calls to exec_spec, only vars in post need appear in xs, others can be existential in foot
* domain
** implement resolve_virtual to not skip virtual calls
** consider lazy renaming
- instead of eagerly constructing renaming substitutions, traverse the formula and lazily construct the renaming substitution map
- may be better in case there are many variables that do not occur in the formula
* control
** change Depths.t from environment- to state-like treatment
- currently each waiting state has an associated depths map
- the depths of all edges into a destination are joined
- could the depths be just threaded through Work.run instead?
- this would involve changing type x to Depths.t -> t -> Depths.t * t, and removing Depths.t from waiting_states
- separate joining depths from joining states
- i.e. Change to repeatedly pop edges as long as the dst is the same, and only join the states for those. This would involve keeping the waiting states in the priority queue, and removing the waiting states map entirely.
** change Work.run to move Domain.join into ~f
** canonicalize renamings in stacks
It seems possible that two edges will be distinct only due to differences between choice of fresh variable names for shadowed variables. It is not obvious that this could not lead to an infinite number of Edge.t values even without recursion. Using predictable names for local variables, such as a pair of the declared name and the depth of the stack, would avoid these difficulties.