A tool to lift x86 binary code into logic, unrolling loops as needed.
Lift some x86 binary code into an ACL2 representation, by symbolic execution including inlining all functions and unrolling all loops.
Usually,
To inspect the resulting DAG, you can simply enter its name at the prompt to print it.
(def-unrolled lifted-name &key :target ; default :entry-point :executable ; default :none :inputs ; default :skip :output ; default :all :extra-assumptions ; default nil :suppress-assumptions ; default nil :inputs-disjoint-from ; default :code :assume-bytes ; default :non-write :stack-slots ; default 100 :existing-stack-slots ; default :auto :position-independent ; default :auto :type-assumptions-for-array-vars ; default t :prune-precise ; default 1000 :prune-approx ; default t :extra-rules ; default nil :remove-rules ; default nil :extra-assumption-rules ; default nil :remove-assumption-rules ; default nil :step-limit ; default 1000000 :step-increment ; default 100 :stop-pcs ; default nil :memoizep ; default t :monitor ; default nil :normalize-xors ; default nil :count-hits ; default nil :print ; default :brief :print-base ; default 10 :max-printed-term-size ; default 10000 :untranslatep ; default t :produce-function ; default t :non-executable ; default :auto :produce-theorem ; default nil :prove-theorem ; default nil :max-result-term-size ; default 10000 :restrict-theory ; default t )
A symbol, the name to use for the generated function. The name of the generated constant is created by adding stars to the front and back of this symbol.
Where to start lifting (a numeric offset, the name of a subroutine (a string), or the symbol :entry-point)
The x86 binary executable that contains the target function. Usually this is a string representing the file name/path of the executable. However, it can instead be a parsed executable (satisfying
parsed-executablep ).
Either the special value :skip (meaning generate no additional assumptions on the input) or a doublet list pairing input names with types. Types include things like u32, u32*, and u32[2].
An indication of which state component(s) will hold the result of the computation being lifted. See output-indicatorp.
Extra assumptions for lifting, in addition to the standard-assumptions
Whether to suppress the standard assumptions. This does not suppress any assumptions generated about the :inputs.
What to assume about the inputs (specified using the :inputs option) being disjoint from the sections/segments in the executable. The value :all means assume the inputs are disjoint from all sections/segments. The value :code means assume the inputs are disjoint from the code/text section. The value nil means do not include any assumptions of this kind.
Indication of which sections/segments to assume still have their original bytes, either
:all (meaning assume it for all sections/segments) or:non-write (meaning assume it for only non-writeable sections/segments). Note that global variables may be initialized to certain values but may have then been overwritten before the function being lifted is called, so it may not be appropriate to assume such variables still have their original values.
How much unused stack space to assume is available, in terms of the number of stack slots, which are 4 bytes for 32-bit executables and 8 bytes for 64-bit executables. The stack will expand into this space during (symbolic) execution.
How much available stack space to assume exists. Usually at least 1, for the saved return address.
Whether to attempt the lifting without assuming that the binary is loaded at a particular position.
Whether to put in type assumptions for the variables that represent elements of input arrays.
Whether to prune DAGs using precise contexts. Either t or nil or a natural number representing the smallest dag size that we deem too large for pruning (where here the size is the number of nodes in the corresponding term). This kind of pruning can blow up if attempted for DAGs that represent huge terms.
Whether to prune DAGs using approximate contexts. Either t or nil or a natural number representing the smallest dag size that we deem too large for pruning (where here the size is the number of nodes in the corresponding term). This kind of pruning should not blow up but doesn't use fully precise contextual information.
A symbol-list indicating rules to use, in addition to (unroller-rules32) or (unroller-rules64) plus a few others.
A symbol-list indicating rules to turn off.
Extra rules to be used when simplifying assumptions.
Rules to be removed when simplifying assumptions.
Limit on the total number of symbolic executions steps to allow (total number of steps over all branches, if the simulation splits).
Number of model steps to allow before pausing to simplify the DAG and remove unused nodes.
A list of program counters (natural numbers) at which to stop the execution, for debugging.
Whether to memoize during rewriting (when not using contextual information -- as doing both would be unsound).
Rule names (symbols) to be monitored when rewriting.
Whether to normalize BITXOR and BVXOR nodes when rewriting (t, nil, or :compact).
Whether to count rule hits during rewriting (t means count hits for every rule, :total means just count the total number of hits, nil means don't count hits)
Verbosity level.
Base to use when printing during lifting. Must be either 10 or 16.
Max term-size of a DAG that is allowed to be printed as a term. Larger DAGs will be printed as DAGs, not terms.
Whether to untranslate terms when printing.
Whether to produce a function, not just a constant DAG, representing the result of the lifting.
Whether to make the generated function non-executable, e.g., because stobj updates are not properly let-bound. Either t or nil or :auto.
Whether to try to produce a theorem (possibly skip-proofed) about the result of the lifting.
Whether to try to prove the theorem with ACL2 (rarely works, since Axe's Rewriter is different and more scalable than ACL2's rewriter).
Max term-size of a result if it is to be represented as a term (when printing it, and in the generated function). A larger result will be represented as a DAG, embedded in the function using an evaluator.
To be deprecated...