infer_clone

Commit Graph

Author	SHA1	Message	Date
Scott Owens	17b3c7a49f	[sledge sem] Add top-level llair semantics Summary: Give the llair semantics observable side effects (writes to global variables) and a semantic function mirroring the LLVM semantics. Start sketching out the LLVM/llair translation equivalence proof in a top-down way from the obvious statement of equality of the semantics. Reviewed By: jberdine Differential Revision: D17399654 fbshipit-source-id: 2170678a8	5 years ago
Scott Owens	30c301a3e8	[sledge sem] Add a more llair-like LLVM semantics Summary: The simple LLVM semantics steps one instruction at a time, but the generated llair does whole blocks at a time, since many individual LLVM instructions can become a single llair expression. We add a bigger-step LLVM semantics that does whole blocks at a time (except that it also stops at function calls, since those end blocks in llair). The steps in this bigger-step semantics should be at the same granularity as the llair steps, making it easier to verify the translation. We add a notion of observation to the LLVM semantics (right now, just global variable writes) and use that to define two top-level semantic functions, which we prove to be equivalent. Reviewed By: jberdine Differential Revision: D17396016 fbshipit-source-id: ee632fb92	5 years ago
Scott Owens	f298d728c5	[sledge sem] Start sketching translation correctness Summary: This includes a few changes and corrections to the semantics, to support the translation. This initial attempt to reason about LLVM -> llair showed three things that needed repair in the semantics, in addition to various bugs. We address them as follows. Refactor llair semantics to have only a single kind of flat value: integers that fit into specified bit widths. Operations on size values (e.g., offsets, indices and the like) can just take an integer and ignore its number of bits. Pointers can just be considered integers that fit into a certain size given by the constant pointer_size. Later on we can consider making this a parameter to the model. Change the generic memory model interface to use numbers rather than words as the generic encoding of a large value. This makes it more useful for llair where words are not used. Pay more careful attention to signed/unsigned issues. Neither LLVM nor llair has a concept of signed vs unsigned value. Instead individual operations interpret bit patterns in various ways, some of which are ambiguous in the LLVM manual. For example, since getelementptr's indices are explicitly said to be interpreted as signed 2's complement, we should probably do the same for insertvalue and extractvalue. However it is not clear how the argument to alloca is to be interpreted. For now we assume signed. Reviewed By: jberdine Differential Revision: D17164133 fbshipit-source-id: 31a8af635	5 years ago
Scott Owens	9f44bbc264	[sledge semantics] Refactor the memory model Summary: LLVM and llair have similar memory models, and we don't want to duplicate any definitions or theorems. This adds a new memory model theory which should be understandable in its own right. A heap is a mapping from addresses to bytes, alongside a set of valid addresses, and intervals that have been allocated already. Primitives are defined for allocating and de-allocating as well as reading and writing chunks of bytes. There is also a generic type of structured values, and functions for converting them to/from byte arrays. Reviewed By: jberdine Differential Revision: D17074470 fbshipit-source-id: bdab6089f	5 years ago
Scott Owens	808a61623f	Add types to the variable syntax in llair Summary: Each variable now contains its type, alongside its name. This is more uniform than in LLVM, where the name is usually paired with a type, but not always, for example, the register type of the result of an extractvalue is left implicit. Reviewed By: jberdine Differential Revision: D16984630 fbshipit-source-id: 1c3bc4985	5 years ago
Scott Owens	84883127af	Add a skeleton of an approach to llvm->llair Summary: This sketches out how translation can be approached. It is partially based on the Sledge code. For basic blocks, it isn't based on the Sledge code, but just on my own thoughts as a starting point. Essentially, we are trying to build up larger expressions, and so not assigning to temporary registers that don't live past the end of the block. This does remove sharing, so a fancier approach could check for multiple uses of end-of-block dead registers, or look at the sizes of expressions. The approach should be flexible enough to accommodate such changes. Fix icmp syntax. Using finite maps is elegant in the semantics, but awkward for writing the translation function. Refactor the mappings from labels to functions and from labels to blocks to use association lists instead. To remove phi nodes, the translation takes every edge in the control flow graph and makes a new basic block that contains a single parallel move instruction that corresponds to the action of the phi node of the target block. Reviewed By: jberdine Differential Revision: D16831051 fbshipit-source-id: 005663e26	5 years ago
Scott Owens	a635aff1bc	Finish proving sanity checking property Summary: There could very well still be bugs in the semantics, since the invariant here doesn't say all that much, and it completely ignores local registers. But most trivial things and typos are probably fixed. Reviewed By: jberdine Differential Revision: D16803281 fbshipit-source-id: 48ba2523b	5 years ago
Scott Owens	89c3da4510	Prove that Ret preserves the invariant Summary: Made progress on the sanity checking lemma (that the step relation preserves some simple invariants on the state). Proved the Ret instruction case of the state invariant lemma. To do this, I fixed a few bugs in the definition, and strengthened the invariants. Reviewed By: jberdine Differential Revision: D16786900 fbshipit-source-id: 6fa8cb170	5 years ago
Scott Owens	df5f20956f	Define a simple initial state that inits the globals Summary: Global variables need allocating and initialising before the machine can start. The definition here shouldn't constrain how and where they are allocated. For example, they don't all need to have separate allocations. We also tag allocated blocks so that the allocation for a global can never be deallocated. Start working on a sanity checking invariant on states. Reviewed By: jberdine Differential Revision: D16735068 fbshipit-source-id: 0d5e60e7a	5 years ago
Scott Owens	97eb280cb5	Add initial mini-LLVM semantics written in HOL4 Summary: Start working on a simple model of LLVM with the ultimate goal of handling relevant and/or tricky aspects of LLVM and LLAIR and then formalising the translation from LLVM to LLAIR. This is a complete initial model of everything that we are interested in except for exceptions, which should be tricky. Also no thought has gone into the treatment of poison and the undefined value, so the treatment is naive, which is at least partially justified because we are interested in the semantics of LLVM IR after the optimisation passes have run. Include some sanity checking theorems. Reviewed By: jberdine Differential Revision: D16731885 fbshipit-source-id: fd53949fe	5 years ago

10 Commits (17b3c7a49f0ce7b7aaac4e63992c4d53c1cbab30)