class Metasm::Disassembler

a disassembler holds a copy of a program's sections, a list of decoded instructions, and xrefs. It is able to backtrace an expression from an address by following the call flow backwards.

Attributes

address_binding[RW]

hash address => binding

addrs_todo[RW]

list of [addr to disassemble, (optional) who jumped to it, (optional) whether it was reached by a subfunction return]

backtrace_maxblocks[RW]

number of blocks to backtrace before aborting if no result is found (defaults to class.backtrace_maxblocks, 50 by default)

backtrace_maxblocks_data[RW]

maximum backtrace length for :r/:w, defaults to backtrace_maxblocks

backtrace_maxblocks_fast[RW]

maximum backtrace length for backtrace_fast blocks (default 0)

backtrace_maxcomplexity[RW]

maximum complexity of an Expression during backtrace before aborting

backtrace_maxcomplexity_data[RW]

maximum complexity of an Expression during a :r/:w (data) backtrace before aborting

c_parser[RW]

a C parser that has parsed some C header files; prototypes are converted to DecodedFunction when jumped to

callback_finished[RW]

callback called once all addresses have been disassembled

callback_newaddr[RW]

callback called whenever an instruction will backtrace :x (before the backtrace is started). Arguments: |addr of origin, array of exprs to backtrace|. Must return the replacement array; nil is treated as [].

callback_newinstr[RW]

called whenever an instruction is decoded and added to an instruction block. Argument: the new decoded instruction. Returns the new di to consider (nil to end the block).

callback_prebacktrace[RW]

callback called before each backtrace that may take some time

callback_selfmodifying[RW]

called whenever the disassembler tries to disassemble an address that has been written to. Argument: the address.

callback_stopaddr[RW]

called when the disassembler stops (stopexec/undecodable instruction)

check_smc[RW]

bool, true to check write xrefs on each instr disasm (default true)

comment[RW]

hash address => array of strings. The default dasm dump will only show comments at the beginning of code blocks.

cpu[RW]
debug_backtrace[RW]
decoded[RW]

hash addr => DecodedInstruction

disassemble_known_functions[RW]

if false, disassembler skips internal functions with a prototype defined in a C header (eg static libraries)

disassemble_maxblocklength[RW]

maximum number of instructions inside a basic block, split past this limit

entrypoints[RW]
funcs_stdabi[RW]

bool, set to true (default) if functions with undetermined binding should be assumed to return with ABI-conforming binding (conserve frame ptr)

function[RW]

hash addr => DecodedFunction (includes 'imported' functions)

gui[RW]

pointer to the gui widget we're displayed in

misc[RW]

arbitrary data stored by other objects

prog_binding[RW]

binding (union of @sections.values.exports)

program[RW]
sections[RW]

hash addr => edata

xrefs[RW]

hash addr => (array of) xrefs - access with add_xref/each_xref

Public Class Methods

autoexe_load(f, &b)

allows us to be AutoExe.loaded

# File metasm/disassemble.rb, line 2276
def self.autoexe_load(f, &b)
        d = load(f, &b)
        d.program
end
backtrace_maxblocks()

access the default value for @@backtrace_maxblocks for newly created Disassemblers

# File metasm/disassemble_api.rb, line 135
def self.backtrace_maxblocks ; @@backtrace_maxblocks ; end
backtrace_maxblocks=(b)
# File metasm/disassemble_api.rb, line 136
def self.backtrace_maxblocks=(b) ; @@backtrace_maxblocks = b ; end
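
The relationship between the class-level default and the per-instance attribute can be sketched without Metasm. ToyDisassembler below is a hypothetical stand-in, and it assumes (as the attribute description suggests) that each instance copies the class default when it is created:

```ruby
# ToyDisassembler is a hypothetical stand-in, not the Metasm class.
class ToyDisassembler
  @@backtrace_maxblocks = 50

  def self.backtrace_maxblocks ; @@backtrace_maxblocks ; end
  def self.backtrace_maxblocks=(b) ; @@backtrace_maxblocks = b ; end

  attr_accessor :backtrace_maxblocks

  def initialize
    # assumption: each instance copies the class-wide default when created
    @backtrace_maxblocks = self.class.backtrace_maxblocks
  end
end

first = ToyDisassembler.new
ToyDisassembler.backtrace_maxblocks = 100   # only affects instances created later
second = ToyDisassembler.new
puts first.backtrace_maxblocks
puts second.backtrace_maxblocks
```

Under that assumption, changing the class default does not alter disassemblers that already exist; set the instance attribute to adjust one of those.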
load(str, &b)

loads a disassembler from a saved file

# File metasm/disassemble_api.rb, line 1105
def self.load(str, &b)
        d = new(nil, nil)
        d.load(str, &b)
        d
end
new(program, cpu=program.cpu)

creates a new disassembler

# File metasm/disassemble.rb, line 448
def initialize(program, cpu=program.cpu)
        reinitialize(program, cpu)
end

Public Instance Methods

add_comment(addr, cmt)

adds a comment at the given address. Comments are stored in the hash @comment: {addr => [list of strings]}

# File metasm/disassemble_api.rb, line 140
def add_comment(addr, cmt)
        @comment[addr] ||= []
        @comment[addr] |= [cmt]
end
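
The union operator is what keeps the comment list free of duplicates. A minimal standalone sketch, where CommentStore is a hypothetical class standing in for the disassembler and its @comment hash:

```ruby
# Hypothetical stand-in for the disassembler's @comment hash; not part of Metasm.
class CommentStore
  attr_reader :comment

  def initialize
    @comment = {}
  end

  # Same two steps as add_comment: create the list on first use,
  # then union so an identical comment is not stored twice.
  def add_comment(addr, cmt)
    @comment[addr] ||= []
    @comment[addr] |= [cmt]
  end
end

store = CommentStore.new
store.add_comment(0x401000, 'entrypoint')
store.add_comment(0x401000, 'entrypoint')          # duplicate, ignored
store.add_comment(0x401000, 'sets up stack frame')
p store.comment[0x401000]
```

Insertion order is preserved, so comments appear in the order they were first added.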
add_section(encoded, base)

adds a section and updates prog_binding. The base addr is an Integer or a String (label name for offset 0).

# File metasm/disassemble.rb, line 477
def add_section(encoded, base)
        encoded, base = base, encoded if base.kind_of?(EncodedData)
        case base
        when ::Integer
        when ::String
                raise "invalid section base #{base.inspect} - not at section start" if encoded.export[base] and encoded.export[base] != 0
                if ed = get_edata_at(base)
                        ed.del_export(base)
                end
                encoded.add_export base, 0
        else raise "invalid section base #{base.inspect} - expected string or integer"
        end

        @sections[base] = encoded
        @label_alias_cache = nil
        encoded.binding(base).each { |k, v|
                @old_prog_binding[k] = @prog_binding[k] = v.reduce
        }

        # update section_edata.reloc
        # label -> list of relocs that refers to it
        @inv_section_reloc ||= {}
        @sections.each { |b, e|
                e.reloc.each { |o, r|
                        r.target.externals.grep(::String).each { |ext| (@inv_section_reloc[ext] ||= []) << [b, e, o, r] }
                }
        }

        self
end
add_xref(addr, x)
# File metasm/disassemble.rb, line 508
def add_xref(addr, x)
        case @xrefs[addr]
        when nil; @xrefs[addr] = x
        when x
        when ::Array; @xrefs[addr] |= [x]
        else @xrefs[addr] = [@xrefs[addr], x]
        end
end
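
The case statement stores a single xref bare and only promotes the entry to an array once a second, distinct xref arrives, which is why @xrefs is described as holding "(array of) xrefs". A standalone sketch of that scheme, using plain symbols in place of Xref objects; the xrefs_at helper is hypothetical and only shows why readers need to normalize the stored value:

```ruby
# Hypothetical stand-in for @xrefs; xrefs are plain symbols here, not Xref objects.
xrefs = {}

add_xref = lambda { |addr, x|
  case xrefs[addr]
  when nil then xrefs[addr] = x            # first xref: stored bare
  when x then nil                          # already the single stored xref
  when ::Array then xrefs[addr] |= [x]     # list exists: union, no duplicates
  else xrefs[addr] = [xrefs[addr], x]      # second distinct xref: promote to array
  end
}

# Hypothetical reader: always hands back an array, whatever the storage shape.
xrefs_at = lambda { |addr| Array(xrefs[addr]) }

add_xref[0x10, :r]
p xrefs[0x10]          # a bare value
add_xref[0x10, :r]     # same xref again, nothing changes
add_xref[0x10, :w]
p xrefs[0x10]          # now an array
add_xref[0x10, :w]     # duplicate inside the array, ignored
p xrefs_at[0x10]
p xrefs_at[0x20]       # unknown address: empty array
```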
addr_to_fileoff(addr)

transform an address into a file offset

# File metasm/disassemble_api.rb, line 482
def addr_to_fileoff(addr)
        addr = normalize(addr)
        @program.addr_to_fileoff(addr)
end
auto_label_at(addr, base='xref', *rewritepfx)

returns the label at the specified address, creating it if needed in the form "base_addr". Renames the existing label if it is in the form rewritepfx_addr. Returns nil if the address is not known and is not a string.

# File metasm/disassemble.rb, line 602
def auto_label_at(addr, base='xref', *rewritepfx)
        addr = Expression[addr].reduce
        addrstr = "#{base}_#{Expression[addr]}"
        return if addrstr !~ /^\w+$/
        e, b = get_section_at(addr)
        if not e
                l = Expression[addr].reduce_rec if Expression[addr].reduce_rec.kind_of?(::String)
                l ||= addrstr if addr.kind_of?(Expression) and addr.externals.grep(::Symbol).empty?
        elsif not l = e.inv_export[e.ptr]
                l = @program.new_label(addrstr)
                e.add_export l, e.ptr
                if @label_alias_cache ||= nil
                        (@label_alias_cache[b + e.ptr] ||= []) << l
                end
                @old_prog_binding[l] = @prog_binding[l] = b + e.ptr
        elsif rewritepfx.find { |p| base != p and addrstr.sub(base, p) == l }
                newl = addrstr
                newl = @program.new_label(newl) unless @old_prog_binding[newl] and @old_prog_binding[newl] == @prog_binding[l]       # avoid _uuid when a -> b -> a
                rename_label l, newl
                l = newl
        end
        l
end
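
The naming and renaming rules can be isolated from the section bookkeeping. The sketch below is a simplification under stated assumptions: pick_label is a hypothetical helper, the address is passed as already-formatted text (the '401000h' strings are illustrative, not a claim about how Expression renders integers), and label uniqueness handled by new_label is ignored:

```ruby
# Hypothetical helper mirroring only the naming decisions of auto_label_at.
# existing:   label already exported at the address, or nil
# addr_text:  the address rendered as text (illustrative format)
def pick_label(existing, base, addr_text, *rewritepfx)
  addrstr = "#{base}_#{addr_text}"
  return if addrstr !~ /^\w+$/          # refuse names that are not plain identifiers
  return addrstr if not existing        # no label yet: create base_addr
  if rewritepfx.find { |p| base != p and addrstr.sub(base, p) == existing }
    addrstr                             # existing label was auto-generated: upgrade it
  else
    existing                            # anything else is kept untouched
  end
end

p pick_label(nil, 'xref', '401000h')
p pick_label('loc_401000h', 'sub', '401000h', 'loc', 'xref')
p pick_label('main', 'sub', '401000h', 'loc', 'xref')
p pick_label(nil, 'loc', 'eax+4')
```

The point of the rewrite prefixes is that a generic label such as loc_… can be upgraded to sub_… once the address is known to be a function, while a user-chosen name is never overwritten.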
backtrace(expr, start_addr, nargs={})

backtraces the value of an expression from start_addr. Updates blocks' backtracked_for if type is set. Uses backtrace_walk. All values returned are from backtrace_check_found (which may generate xrefs, labels, addrs to dasm) unless :no_check is specified. Options:

:include_start => start backtracking including start_addr
:from_subfuncret =>
:origin => origin to set for xrefs when resolution is successful
:orig_expr => initial expression
:type => xref type (:r, :w, :x, :addr)  when :x, the results are added to #addrs_todo
:len => xref len (for :r/:w)
:snapshot_addr => addr (or array of) where the backtracker should stop
 if a snapshot_addr is given, values found are ignored if continuing the backtrace does not get to it (eg maxdepth/unk_addr/end)
:maxdepth => maximum number of blocks to backtrace
:detached => true if backtracking type :x and the result should not have from = origin set in @addrs_todo
:max_complexity{_data} => maximum complexity of the expression before aborting its backtrace
:log => Array, will be updated with the backtrace evolution
:only_upto => backtrace only to update bt_for for current block & previous ending at only_upto
:no_check => don't use backtrace_check_found (will not backtrace indirection static values)
:terminals => array of symbols with constant value (stop backtracking if all symbols in the expr are terminals) (only supported with no_check)
:cpu_context => disassembler cpu_context
# File metasm/disassemble.rb, line 1525
        def backtrace(expr, start_addr, nargs={})
                include_start   = nargs.delete :include_start
                from_subfuncret = nargs.delete :from_subfuncret
                origin          = nargs.delete :origin
                origexpr        = nargs.delete :orig_expr
                type            = nargs.delete :type
                len             = nargs.delete :len
                snapshot_addr   = nargs.delete(:snapshot_addr) || nargs.delete(:stopaddr)
                maxdepth        = nargs.delete(:maxdepth) || @backtrace_maxblocks
                detached        = nargs.delete :detached
                max_complexity  = nargs.delete(:max_complexity) || @backtrace_maxcomplexity
                max_complexity_data = nargs.delete(:max_complexity) || @backtrace_maxcomplexity_data
                bt_log          = nargs.delete :log   # array to receive the ongoing backtrace info
                only_upto       = nargs.delete :only_upto
                no_check        = nargs.delete :no_check
                terminals       = nargs.delete(:terminals) || []
                cpu_context   = nargs.delete :cpu_context
                raise ArgumentError, "invalid argument to backtrace #{nargs.keys.inspect}" if not nargs.empty?

                expr = Expression[expr]

                origexpr = expr if origin == start_addr

                start_addr = normalize(start_addr)
                di = @decoded[start_addr]

                if not snapshot_addr and @cpu.backtrace_is_stack_address(expr)
puts "  not backtracking stack address #{expr}" if debug_backtrace
                        return []
                end

                if type == :r or type == :w
                        max_complexity = max_complexity_data
                        maxdepth = @backtrace_maxblocks_data if backtrace_maxblocks_data and maxdepth > @backtrace_maxblocks_data
                end

                if vals = (no_check ? (!need_backtrace(expr, terminals) and [expr]) : backtrace_check_found(expr,
                                di, origin, type, len, maxdepth, detached, cpu_context, snapshot_addr))
                        # no need to update backtracked_for
                        return vals
                elsif maxdepth <= 0
                        return [Expression::Unknown]
                end

                # create initial backtracked_for
                if type and origin == start_addr and di
                        btt = BacktraceTrace.new(expr, origin, origexpr, type, len, maxdepth-1, cpu_context)
                        btt.address = di.address
                        btt.exclude_instr = true if not include_start
                        btt.from_subfuncret = true if from_subfuncret and include_start
                        btt.detached = true if detached
                        di.block.backtracked_for |= [btt]
                end

                @callback_prebacktrace[] if callback_prebacktrace

                # list of Expression/Integer
                result = []

puts "backtracking #{type} #{expr} from #{di || Expression[start_addr || 0]} for #{@decoded[origin]}" if debug_backtrace or $DEBUG
                bt_log << [:start, expr, start_addr] if bt_log
                backtrace_walk(expr, start_addr, include_start, from_subfuncret, snapshot_addr, maxdepth) { |ev, expr_, h|
                        expr = expr_
                        case ev
                        when :unknown_addr, :maxdepth
puts "  backtrace end #{ev} #{expr}" if debug_backtrace
                                result |= [expr] if not snapshot_addr
                                @addrs_todo << { :addr => expr, :from => (detached ? nil : origin), :cpu_context => cpu_context } if not snapshot_addr and type == :x and origin
                        when :end
                                if not expr.kind_of?(StoppedExpr)
                                        oldexpr = expr
                                        expr = backtrace_emu_blockup(h[:addr], expr)
puts "  backtrace up #{Expression[h[:addr]]}  #{oldexpr}#{" => #{expr}" if expr != oldexpr}" if debug_backtrace
                                        bt_log << [:up, expr, oldexpr, h[:addr],  :end] if bt_log and expr != oldexpr
                                        if expr != oldexpr and not snapshot_addr and vals = (no_check ?
                                                        (!need_backtrace(expr, terminals) and [expr]) :
                                                        backtrace_check_found(expr, nil, origin, type, len,
                                                                maxdepth-h[:loopdetect].length, detached, cpu_context, snapshot_addr))
                                                result |= vals
                                                next
                                        end
                                end
puts "  backtrace end #{ev} #{expr}" if debug_backtrace
                                if not snapshot_addr
                                        result |= [expr]

                                        btt = BacktraceTrace.new(expr, origin, origexpr, type, len, maxdepth-h[:loopdetect].length-1, cpu_context)
                                        btt.detached = true if detached
                                        @decoded[h[:addr]].block.backtracked_for |= [btt] if @decoded[h[:addr]]
                                        @function[h[:addr]].backtracked_for |= [btt] if @function[h[:addr]] and h[:addr] != :default
                                        @addrs_todo << { :addr => expr, :from => (detached ? nil : origin), :cpu_context => cpu_context } if type == :x and origin
                                end
                        when :stopaddr
                                if not expr.kind_of?(StoppedExpr)
                                        oldexpr = expr
                                        expr = backtrace_emu_blockup(h[:addr], expr)
puts "  backtrace up #{Expression[h[:addr]]}  #{oldexpr}#{" => #{expr}" if expr != oldexpr}" if debug_backtrace
                                        bt_log << [:up, expr, oldexpr, h[:addr], :end] if bt_log and expr != oldexpr
                                end
puts "  backtrace end #{ev} #{expr}" if debug_backtrace
                                result |= ((expr.kind_of?(StoppedExpr)) ? expr.exprs : [expr])
                        when :loop
                                next false if expr.kind_of?(StoppedExpr)
                                t = h[:looptrace]
                                oldexpr = t[0][0]
                                next false if expr == oldexpr               # unmodifying loop
puts "  bt loop at #{Expression[t[0][1]]}: #{oldexpr} => #{expr} (#{t.map { |z| Expression[z[1]] }.join(' <- ')})" if debug_backtrace
                                bt_log << [:loop, expr, oldexpr, t.map { |z| z[1] }] if bt_log
                                false
                        when :up
                                next false if only_upto and h[:to] != only_upto
                                next expr if expr.kind_of?(StoppedExpr)
                                oldexpr = expr
                                expr = backtrace_emu_blockup(h[:from], expr)
puts "  backtrace up #{Expression[h[:from]]}->#{Expression[h[:to]]}  #{oldexpr}#{" => #{expr}" if expr != oldexpr}" if debug_backtrace
                                bt_log << [:up, expr, oldexpr, h[:from], h[:to]] if bt_log

                                if expr != oldexpr and vals = (no_check ? (!need_backtrace(expr, terminals) and [expr]) :
                                                backtrace_check_found(expr, @decoded[h[:from]], origin, type, len,
                                                        maxdepth-h[:loopdetect].length, detached, cpu_context, snapshot_addr))
                                        if snapshot_addr
                                                expr = StoppedExpr.new vals
                                                next expr
                                        else
                                                result |= vals
                                                bt_log << [:found, vals, h[:from]] if bt_log
                                                next false
                                        end
                                end

                                if origin and type
                                        # update backtracked_for
                                        update_btf = lambda { |btf, new_btt|
                                                # returns true if btf was modified
                                                if i = btf.index(new_btt)
                                                        btf[i] = new_btt if btf[i].maxdepth < new_btt.maxdepth
                                                else
                                                        btf << new_btt
                                                end
                                        }

                                        btt = BacktraceTrace.new(expr, origin, origexpr, type, len, maxdepth-h[:loopdetect].length-1, cpu_context)
                                        btt.detached = true if detached
                                        if x = di_at(h[:from])
                                                update_btf[x.block.backtracked_for, btt]
                                        end
                                        if x = @function[h[:from]] and h[:from] != :default
                                                update_btf[x.backtracked_for, btt]
                                        end
                                        if x = di_at(h[:to])
                                                btt = btt.dup
                                                btt.address = x.address
                                                btt.from_subfuncret = true if h[:sfret] == :subfuncret
                                                if backtrace_check_funcret(btt, h[:from], h[:real_to] || h[:to])
puts "   function returns to caller" if debug_backtrace
                                                        next false
                                                end
                                                if not update_btf[x.block.backtracked_for, btt]
puts "   already backtraced" if debug_backtrace
                                                        next false
                                                end
                                        end
                                end
                                expr
                        when :di, :func
                                next if expr.kind_of?(StoppedExpr)
                                if not snapshot_addr and @cpu.backtrace_is_stack_address(expr)
puts "  not backtracking stack address #{expr}" if debug_backtrace
                                        next false
                                end

oldexpr = expr
                                case ev
                                when :di
                                        h[:addr] = h[:di].address
                                        expr = backtrace_emu_instr(h[:di], expr)
                                        bt_log << [ev, expr, oldexpr, h[:di], h[:addr]] if bt_log and expr != oldexpr
                                when :func
                                        expr = backtrace_emu_subfunc(h[:func], h[:funcaddr], h[:addr], expr, origin, maxdepth-h[:loopdetect].length)
                                        if snapshot_addr and snapshot_addr == h[:funcaddr]
                                                # XXX recursiveness detection needs to be fixed
puts "  backtrace: recursive function #{Expression[h[:funcaddr]]}" if debug_backtrace
                                                next false
                                        end
                                        bt_log << [ev, expr, oldexpr, h[:funcaddr], h[:addr]] if bt_log and expr != oldexpr
                                end
puts "  backtrace #{h[:di] || Expression[h[:funcaddr]]}  #{oldexpr} => #{expr}" if debug_backtrace and expr != oldexpr
                                if vals = (no_check ? (!need_backtrace(expr, terminals) and [expr]) : backtrace_check_found(expr,
                                                h[:di], origin, type, len, maxdepth-h[:loopdetect].length, detached, cpu_context, snapshot_addr))
                                        if snapshot_addr
                                                expr = StoppedExpr.new vals
                                        else
                                                result |= vals
                                                bt_log << [:found, vals, h[:addr]] if bt_log
                                                next false
                                        end
                                elsif expr.complexity > max_complexity
puts "  backtrace aborting, expr too complex" if debug_backtrace
                                        next false
                                end
                                expr
                        else raise ev.inspect
                        end
                }

puts '  backtrace result: ' + result.map { |r| Expression[r] }.join(', ') if debug_backtrace

                result
        end
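
The option handling at the top of backtrace follows a common Ruby idiom: delete each recognized key from the hash, apply a default where needed, and treat whatever is left as a caller error. A reduced sketch with three of the options; parse_backtrace_opts is a hypothetical helper, and unlike the listing above it works on a copy so the caller's hash is left intact:

```ruby
# Hypothetical helper, not part of Metasm: consume known keys, reject leftovers.
def parse_backtrace_opts(nargs, default_maxdepth)
  nargs = nargs.dup                      # the listing above mutates nargs directly
  opts = {
    :type      => nargs.delete(:type),
    :maxdepth  => nargs.delete(:maxdepth) || default_maxdepth,
    :terminals => nargs.delete(:terminals) || [],
  }
  raise ArgumentError, "invalid argument to backtrace #{nargs.keys.inspect}" if not nargs.empty?
  opts
end

p parse_backtrace_opts({ :type => :x }, 50)

begin
  parse_backtrace_opts({ :typo => 1 }, 50)
rescue ArgumentError => e
  puts e.message
end
```

The practical consequence for callers is that a misspelled option name raises immediately instead of being silently ignored.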
backtrace_check_found(expr, di, origin, type, len, maxdepth, detached, cpu_context, snapshot_addr=nil)

returns an array of expressions, or nil if expr needs more backtrace. It needs more backtrace if expr.externals include a Symbol != :unknown (symbol == register value). If it needs no more backtrace, expr's indirections are recursively resolved, xrefs are created, and di args are updated (immediate => label). If type is :x, addrs_todo is updated, and if di starts a block, expr is checked to see if it may be a subfunction return value.

expr indirections are solved by first finding the value of the pointer, and then rebacktracking for write-type access. detached is true if type is :x and from should not be set in addrs_todo (indirect call flow, eg external function callback). If the backtrace ends pre entrypoint, returns the value encoded in the raw binary.

XXX global variable (modified by another function), exported data, multithreaded app..
TODO handle memory aliasing (mov ebx, eax ; write [ebx] ; read [eax])
TODO trace expr evolution through backtrace, to modify immediates to an expr involving label names
TODO mov [ptr], imm ; <…> ; jmp [ptr] => rename imm as loc_XX

eg. mov eax, 42 ; add eax, 4 ; jmp eax  =>  mov eax, some_label-4
# File metasm/disassemble.rb, line 1830
        def backtrace_check_found(expr, di, origin, type, len, maxdepth, detached, cpu_context, snapshot_addr=nil)
                # only entrypoints or block starts called by a :saveip are checked for being a function
                # want to execute [esp] from a block start
                if type == :x and di and di == di.block.list.first and @cpu.backtrace_is_function_return(expr, @decoded[origin]) and (
                        # which is an entrypoint..
                        (not di.block.from_normal and not di.block.from_subfuncret) or
                        # ..or called from a saveip
                        (bool = false ; di.block.each_from_normal { |fn| bool = true if @decoded[fn] and @decoded[fn].opcode.props[:saveip] } ; bool))

                        # now we can mark the current address a function start
                        # the actual return address will be found later (we tell the caller to continue the backtrace)
                        addr = di.address
                        l = auto_label_at(addr, 'sub', 'loc', 'xref')
                        if not f = @function[addr]
                                f = @function[addr] = DecodedFunction.new
                                puts "found new function #{l} at #{Expression[addr]}" if $VERBOSE
                        end
                        f.finalized = false

                        if @decoded[origin]
                                f.return_address ||= []
                                f.return_address |= [origin]
                                @decoded[origin].add_comment "endsub #{l}"
                                # TODO add_xref (to update the comment on rename_label)
                        end

                        f.backtracked_for |= @decoded[addr].block.backtracked_for.find_all { |btt| not btt.address }
                end

                return if need_backtrace(expr)
                if snapshot_addr
                        return if expr.expr_externals(true).find { |ee| ee.kind_of?(Indirection) }
                end

puts "backtrace #{type} found #{expr} from #{di} orig #{@decoded[origin] || Expression[origin] if origin}" if debug_backtrace
                result = backtrace_value(expr, maxdepth)
                # keep the ori pointer in the results to emulate volatile memory (eg decompiler prefers this)
                #result << expr if not type   # XXX returning multiple values for nothing is too confusing, TODO fix decompiler
                result.uniq!

                # create xrefs/labels
                result.each { |e|
                        backtrace_found_result(e, di, type, origin, len, detached, cpu_context)
                } if type and origin

                result
        end
backtrace_check_funcret(btt, funcaddr, instraddr)

checks if the BacktraceTrace is a call to a known subfunction; if so, returns true and updates self.addrs_todo

# File metasm/disassemble.rb, line 1737
        def backtrace_check_funcret(btt, funcaddr, instraddr)
                if di = @decoded[instraddr] and @function[funcaddr] and btt.type == :x and
                                not btt.from_subfuncret and
                                @cpu.backtrace_is_function_return(btt.expr, @decoded[btt.origin]) and
                                retaddr = backtrace_emu_instr(di, btt.expr) and
                                not need_backtrace(retaddr)
puts "  backtrace addrs_todo << #{Expression[retaddr]} from #{di} (funcret)" if debug_backtrace
                        di.block.add_to_subfuncret normalize(retaddr)
                        if @decoded[funcaddr].kind_of?(DecodedInstruction)
                                # check that all callers :saveip returns (eg recursive call that was resolved
                                # before we found funcaddr was a function)
                                @decoded[funcaddr].block.each_from_normal { |fm|
                                        if fdi = di_at(fm) and fdi.opcode.props[:saveip] and not fdi.block.to_subfuncret
                                                backtrace_check_funcret(btt, funcaddr, fm)
                                        end
                                }
                        end
                        if not @function[funcaddr].finalized
                                # the function is not fully disassembled: arrange for the retaddr to be
                                #  disassembled only after the subfunction is finished
                                # for that we walk the code from the call, mark each block start, and insert the sfret
                                #  just before the 1st function block address in @addrs_todo (which is pop()ed by dasm_step)
                                faddrlist = []
                                todo = []
                                di.block.each_to_normal { |t| todo << normalize(t) }
                                while a = todo.pop
                                        next if faddrlist.include?(a) or not get_section_at(a)
                                        faddrlist << a
                                        if @decoded[a].kind_of?(DecodedInstruction)
                                                @decoded[a].block.each_to_samefunc(self) { |t| todo << normalize(t) }
                                        end
                                end

                                idx = @addrs_todo.index(@addrs_todo.find { |aa| faddrlist.include? normalize(aa[:addr]) }) || -1
                                @addrs_todo.insert(idx, { :addr => retaddr, :from => instraddr, :from_subfuncret => true, :cpu_context => btt.cpu_context })
                        else
                                @addrs_todo << { :addr => retaddr, :from => instraddr, :from_subfuncret => true, :cpu_context => btt.cpu_context }
                        end
                        true
                end
        end
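
The scheduling trick for unfinished functions relies on the todo list being consumed from its end: inserting the return address just before the first pending function block means it is popped only after that block. A toy sketch of the walk and the insertion; the succ graph and all addresses are made-up illustration data, not Metasm structures:

```ruby
# Toy control-flow graph: address => successor addresses (hypothetical data).
succ = { 0x2000 => [0x2010, 0x2020], 0x2010 => [0x2020], 0x2020 => [] }

# Walk every block reachable from the call target, visiting each one once.
faddrlist = []
todo = [0x2000]
while a = todo.pop
  next if faddrlist.include?(a) or not succ[a]
  faddrlist << a
  succ[a].each { |t| todo << t }
end

# Pending work; the consumer pops from the end of this array.
addrs_todo = [{ :addr => 0x1000 }, { :addr => 0x2020 }, { :addr => 0x3000 }]
ret = { :addr => 0x1005, :from_subfuncret => true }

# Insert before the first pending function block, or append (-1) if there is none.
idx = addrs_todo.index(addrs_todo.find { |aa| faddrlist.include?(aa[:addr]) }) || -1
addrs_todo.insert(idx, ret)

order = []
order << addrs_todo.pop[:addr] until addrs_todo.empty?
p order.map { |x| x.to_s(16) }
```

When no function block is pending, the -1 index makes insert append the entry, so the return address becomes the next item to be popped.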
backtrace_emu_blockup(addr, expr)

applies a location binding

# File metasm/disassemble.rb, line 1791
def backtrace_emu_blockup(addr, expr)
        (ab = @address_binding[addr]) ? Expression[expr.bind(ab).reduce] : expr
end
backtrace_emu_instr(di, expr)

applies one DecodedInstruction to an expression

# File metasm/disassemble.rb, line 1780
def backtrace_emu_instr(di, expr)
        @cpu.backtrace_emu(di, expr)
end
backtrace_emu_subfunc(func, funcaddr, calladdr, expr, origin, maxdepth)

applies one subfunction to an expression

# File metasm/disassemble.rb, line 1785
def backtrace_emu_subfunc(func, funcaddr, calladdr, expr, origin, maxdepth)
        bind = func.get_backtrace_binding(self, funcaddr, calladdr, expr, origin, maxdepth)
        Expression[expr.bind(bind).reduce]
end
backtrace_found_result(expr, di, type, origin, len, detached, cpu_context)

creates xrefs, updates addrs_todo, updates instr args

# File metasm/disassemble.rb, line 2001
def backtrace_found_result(expr, di, type, origin, len, detached, cpu_context)
        n = normalize(expr)
        fallthrough = true if type == :x and o = di_at(origin) and not o.opcode.props[:stopexec] and n == o.block.list.last.next_addr # delay_slot
        add_xref(n, Xref.new(type, origin, len)) if origin != :default and origin != Expression::Unknown and not fallthrough
        unk = true if n == Expression::Unknown

        add_xref(n, Xref.new(:addr, di.address)) if di and di.address != origin and not unk
        base = { nil => 'loc', 1 => 'byte', 2 => 'word', 4 => 'dword', 8 => 'qword' }[len] || 'xref'
        base = 'sub' if @function[n]
        n = Expression[auto_label_at(n, base, 'xref') || n] if not fallthrough
        n = Expression[n]

        # update instr args
        # TODO trace expression evolution to allow handling of
        #  mov eax, 28 ; add eax, 4 ; jmp eax
        #  => mov eax, (loc_xx-4)
        if di and not unk and expr != n # and di.address == origin
                @cpu.replace_instr_arg_immediate(di.instruction, expr, n)
        end
        if @decoded[origin] and not unk
                 @cpu.backtrace_found_result(self, @decoded[origin], expr, type, len)
        end

        # add comment
        if type and @decoded[origin] # and not @decoded[origin].instruction.args.include? n
                @decoded[origin].add_comment "#{type}#{len}:#{n}" if not fallthrough
        end

        # check if target is a string
        if di and type == :r and (len == 1 or len == 2) and s = get_section_at(n)
                l = s[0].inv_export[s[0].ptr]
                case len
                when 1; str = s[0].read(32).unpack('C*')
                when 2; str = s[0].read(64).unpack('v*')
                end
                str = str.inject('') { |str_, c|
                        case c
                        when 0x20..0x7e, ?\n, ?\r, ?\t; str_ << c
                        else break str_
                        end
                }
                if str.length >= 4
                        di.add_comment "#{'L' if len == 2}#{str.inspect}"
                        str = 'a_' + str.downcase.delete('^a-z0-9')[0, 12]
                        if str.length >= 8 and l[0, 5] == 'byte_'
                                rename_label(l, @program.new_label(str))
                        end
                end
        end

        # XXX all this should be done in  backtrace() { <here> }
        if type == :x and origin
                if detached
                        o = @decoded[origin] ? origin : di ? di.address : nil       # lib function callback have origin == libfuncname, so we must find a block somewhere else
                        origin = nil
                        @decoded[o].block.add_to_indirect(normalize(n)) if @decoded[o] and not unk
                else
                        @decoded[origin].block.add_to_normal(normalize(n)) if @decoded[origin] and not unk
                end
                @addrs_todo << { :addr => n, :from => origin, :cpu_context => cpu_context }
        end
end
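The string check near the end of backtrace_found_result can be tried outside of metasm. The sketch below is a rough stand-alone approximation, not the library's code: printable_prefix and label_for are hypothetical helper names, a plain array of character codes stands in for the section data, and the extra conditions the method applies before renaming (resulting name of at least 8 characters, existing label starting with byte_) are left out.

```ruby
# Hypothetical helpers, not part of Metasm: they mimic the string heuristic
# on a plain array of character codes.

# Collect characters until the first one that is neither printable ASCII
# nor a newline, carriage return or tab.
def printable_prefix(codes)
  codes.inject(+'') do |acc, c|
    case c
    when 0x20..0x7e, 0x0a, 0x0d, 0x09 then acc << c
    else break acc
    end
  end
end

# Build a label name as 'a_' followed by up to 12 lowercase alphanumeric
# characters. Returns nil for strings shorter than 4 characters.
def label_for(str)
  return nil if str.length < 4
  'a_' + str.downcase.delete('^a-z0-9')[0, 12]
end

codes = "Hello, World!\0junk".unpack('C*')
text  = printable_prefix(codes)   # stops at the NUL byte
label = label_for(text)
```

Here the scan stops at the first byte outside the accepted set, so anything after the terminator never reaches the comment or the label.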
backtrace_indirection(ind, maxdepth)

returns the array of values pointed to by the indirection at its invocation (ind.origin).
first resolves the pointer using backtrace_value; if it does not point in edata, keeps the original pointer.
then backtraces from ind.origin until it finds a :w xref origin.
if no :w access is found, returns the value encoded in the raw section data.
TODO handle unaligned (partial?) writes

# File metasm/disassemble.rb, line 1900
        def backtrace_indirection(ind, maxdepth)
                if not ind.origin
                        puts "backtrace_ind: no origin for #{ind}" if $VERBOSE
                        return [ind]
                end

                ret = []

                decode_imm = lambda { |addr, len|
                        edata = get_edata_at(addr)
                        if edata
                                Expression[ edata.decode_imm("u#{8*len}".to_sym, @cpu.endianness) ]
                        else
                                Expression::Unknown
                        end
                }

                # resolve pointers (they may include Indirections)
                backtrace_value(ind.target, maxdepth).each { |ptr|
                        # find write xrefs to the ptr
                        refs = []
                        each_xref(ptr, :w) { |x|
                                # XXX should be rebacktracked on new xref
                                next if not @decoded[x.origin]
                                refs |= [x.origin]
                        } if ptr != Expression::Unknown

                        if refs.empty?
                                if get_section_at(ptr)
                                        # static data, never written : return encoded value
                                        ret |= [decode_imm[ptr, ind.len]]
                                        next
                                else
                                        # unknown pointer : backtrace the indirection, hope it solves itself
                                        initval = ind
                                end
                        else
                                # wait until we find a write xref, then backtrace the written value
                                initval = true
                        end

                        # wait until we arrive at an xref'ing instruction, then backtrace the written value
                        backtrace_walk(initval, ind.origin, true, false, nil, maxdepth-1) { |ev, expr, h|
                                case ev
                                when :unknown_addr, :maxdepth, :stopaddr
puts "   backtrace_indirection for #{ind.target} failed: #{ev}" if debug_backtrace
                                        ret |= [Expression::Unknown]
                                when :end
                                        if not refs.empty? and (expr == true or not need_backtrace(expr))
                                                if expr == true
                                                        # found a path avoiding the :w xrefs, read the encoded initial value
                                                        ret |= [decode_imm[ptr, ind.len]]
                                                else
                                                        bd = expr.expr_indirections.inject({}) { |h_, i| h_.update i => decode_imm[i.target, i.len] }
                                                        ret |= [Expression[expr.bind(bd).reduce]]
                                                end
                                        else
                                                # unknown pointer, backtrace did not resolve...
                                                ret |= [Expression::Unknown]
                                        end
                                when :di
                                        di = h[:di]
                                        if expr == true
                                                next true if not refs.include? di.address
                                                # find the expression to backtrace: assume this is the :w xref from this di
                                                writes = get_xrefs_rw(di)
                                                writes = writes.find_all { |x_type, x_ptr, x_len| x_type == :w and x_len == ind.len }
                                                if writes.length != 1
                                                        puts "backtrace_ind: incompatible xrefs to #{ptr} from #{di}" if $DEBUG
                                                        ret |= [Expression::Unknown]
                                                        next false
                                                end
                                                expr = Indirection.new(writes[0][1], ind.len, di.address)
                                        end
                                        expr = backtrace_emu_instr(di, expr)
                                        # may have new indirections... recall bt_value ?
                                        #if not need_backtrace(expr)
                                        if expr.expr_externals.all? { |e| @prog_binding[e] or @function[normalize(e)] } and expr.expr_indirections.empty?
                                                ret |= backtrace_value(expr, maxdepth-1-h[:loopdetect].length)
                                                false
                                        else
                                                expr
                                        end
                                when :func
                                        next true if expr == true  # XXX
                                        expr = backtrace_emu_subfunc(h[:func], h[:funcaddr], h[:addr], expr, ind.origin, maxdepth-h[:loopdetect].length)
                                        #if not need_backtrace(expr)
                                        if expr.expr_externals.all? { |e| @prog_binding[e] or @function[normalize(e)] } and expr.expr_indirections.empty?
                                                ret |= backtrace_value(expr, maxdepth-1-h[:loopdetect].length)
                                                false
                                        else
                                                expr
                                        end
                                end
                        }
                }

                ret
        end
backtrace_update_function_binding(addr, func=@function[addr], retaddrs=func.return_address)
# File metasm/disassemble.rb, line 1795
def backtrace_update_function_binding(addr, func=@function[addr], retaddrs=func.return_address)
        @cpu.backtrace_update_function_binding(self, addr, func, retaddrs)
end
backtrace_value(expr, maxdepth)

returns an array of expressions with Indirections resolved (recursive with backtrace_indirection)

# File metasm/disassemble.rb, line 1879
def backtrace_value(expr, maxdepth)
        # array of expression with all indirections resolved
        result = [Expression[expr.reduce]]

        # solve each indirection sequentially, clone expr for each value (aka cross-product)
        result.first.expr_indirections.uniq.each { |i|
                next_result = []
                backtrace_indirection(i, maxdepth).each { |rr|
                        next_result |= result.map { |e| Expression[e.bind(i => rr).reduce] }
                }
                result = next_result
        }

        result.uniq
end
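The cross-product step in backtrace_value (one result per combination of candidate values) can be sketched without Expression objects. In this minimal illustration, cross_product_bindings is a hypothetical name, strings stand in for expressions, and substitution is plain text replacement, so placeholder names are assumed not to overlap with each other or with the substituted values.

```ruby
# Hypothetical sketch, not Metasm code: expand a template once per
# combination of candidate values, solving one placeholder at a time.
def cross_product_bindings(template, candidates)
  results = [template]
  candidates.each do |placeholder, values|
    next_results = []
    values.each do |value|
      # clone every partial result for this candidate value
      next_results |= results.map { |e| e.gsub(placeholder, value.to_s) }
    end
    results = next_results
  end
  results.uniq
end

combos = cross_product_bindings('A+B', 'A' => [1, 2], 'B' => [10])
```

As in the method above, a placeholder with no candidate values empties the whole result, since there is nothing to bind it to.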
backtrace_walk(obj, addr, include_start, from_subfuncret, stopaddr, maxdepth) { |:maxdepth, w_obj, :addr => w_addr, :loopdetect => w_loopdetect| ... }

walks the backtrace tree from an address, passing along an object

the steps are (1st = event, followed by hash keys)

for each decoded instruction encountered:
 :di            :di

when backtracking to a block through a decodedfunction:
(yield for each of the block's subfunctions)
(the decodedinstruction responsible for the call will be yielded next)
 :func          :func, :funcaddr, :addr, :depth

when jumping from one block to another (excluding :loop): # XXX include :loops ?
 :up            :from, :to, :sfret

when the backtrack has nothing to backtrack to (eg program entrypoint):
 :end           :addr

when the backtrack stops by taking too long to complete:
 :maxdepth      :addr

when the backtrack stops for encountering the specified stop address:
 :stopaddr      :addr

when rebacktracking a block already seen in the current branch:
(looptrace is an array of [obj, block end addr, from_subfuncret], from oldest to newest)
 :loop          :looptrace

when the address does not match a known instruction/function:
 :unknown_addr  :addr

the block return value is used as follows for :di, :func, :up and :loop:
 false => the backtrace stops for the branch
 nil => the backtrace continues with the current object
 anything else => the backtrace continues with this object

method arguments:

obj is the initial value of the object
addr is the address where the backtrace starts
include_start is a bool specifying if the backtrace should start at addr or just before
from_subfuncret is a bool specifying if addr points to a decodedinstruction that calls a subfunction
stopaddr is an [array of] address of instruction, the backtrace will stop just after executing it
maxdepth is the maximum depth (in blocks) for each backtrace branch.
(defaults to dasm.backtrace_maxblocks, which defaults to Dasm.backtrace_maxblocks)
# File metasm/disassemble.rb, line 1285
def backtrace_walk(obj, addr, include_start, from_subfuncret, stopaddr, maxdepth)
        start_addr = normalize(addr)
        stopaddr = [stopaddr] if stopaddr and not stopaddr.kind_of?(::Array)

        # array of [obj, addr, from_subfuncret, loopdetect]
        # loopdetect is an array of [obj, addr, from_type] of each end of block encountered
        todo = []

        # array of [obj, blockaddr]
        # avoids rewalking the same value
        done = []

        # updates todo with the addresses to backtrace next
        walk_up = lambda { |w_obj, w_addr, w_loopdetect|
                if w_loopdetect.length > maxdepth
                        yield :maxdepth, w_obj, :addr => w_addr, :loopdetect => w_loopdetect
                elsif stopaddr and stopaddr.include?(w_addr)
                        yield :stopaddr, w_obj, :addr => w_addr, :loopdetect => w_loopdetect
                elsif w_di = @decoded[w_addr] and w_di != w_di.block.list.first and w_di.address != w_di.block.address
                        prevdi = w_di.block.list[w_di.block.list.index(w_di)-1]
                        todo << [w_obj, prevdi.address, :normal, w_loopdetect]
                elsif w_di
                        next if done.include? [w_obj, w_addr]
                        done << [w_obj, w_addr]
                        hadsomething = false
                        w_di.block.each_from { |f_addr, f_type|
                                next if f_type == :indirect
                                hadsomething = true
                                o_f_addr = f_addr
                                f_addr = @decoded[f_addr].block.list.last.address if @decoded[f_addr].kind_of?(DecodedInstruction) # delay slot
                                if l = w_loopdetect.find { |l_obj, l_addr, l_type| l_addr == f_addr and l_type == f_type }
                                        f_obj = yield(:loop, w_obj, :looptrace => w_loopdetect[w_loopdetect.index(l)..-1], :loopdetect => w_loopdetect)
                                        if f_obj and f_obj != w_obj       # should avoid infinite loops
                                                f_loopdetect = w_loopdetect[0...w_loopdetect.index(l)]
                                        end
                                else
                                        f_obj = yield(:up, w_obj, :from => w_addr, :to => f_addr, :sfret => f_type, :loopdetect => w_loopdetect, :real_to => o_f_addr)
                                end
                                next if f_obj == false
                                f_obj ||= w_obj
                                f_loopdetect ||= w_loopdetect
                                # only count non-trivial paths in loopdetect (ignore linear links)
                                add_detect = [[f_obj, f_addr, f_type]]
                                add_detect = [] if @decoded[f_addr].kind_of?(DecodedInstruction) and tmp = @decoded[f_addr].block and
                                                ((w_di.block.from_subfuncret.to_a == [] and w_di.block.from_normal == [f_addr] and
                                                 tmp.to_normal == [w_di.address] and tmp.to_subfuncret.to_a == []) or
                                                (w_di.block.from_subfuncret == [f_addr] and tmp.to_subfuncret == [w_di.address]))
                                todo << [f_obj, f_addr, f_type, f_loopdetect + add_detect ]
                        }
                        yield :end, w_obj, :addr => w_addr, :loopdetect => w_loopdetect if not hadsomething
                elsif @function[w_addr] and w_addr != :default and w_addr != Expression::Unknown
                        next if done.include? [w_obj, w_addr]
                        oldlen = todo.length
                        each_xref(w_addr, :x) { |x|
                                f_addr = x.origin
                                o_f_addr = f_addr
                                f_addr = @decoded[f_addr].block.list.last.address if @decoded[f_addr].kind_of?(DecodedInstruction) # delay slot
                                if l = w_loopdetect.find { |l_obj, l_addr, l_type| l_addr == w_addr }
                                        f_obj = yield(:loop, w_obj, :looptrace => w_loopdetect[w_loopdetect.index(l)..-1], :loopdetect => w_loopdetect)
                                        if f_obj and f_obj != w_obj
                                                f_loopdetect = w_loopdetect[0...w_loopdetect.index(l)]
                                        end
                                else
                                        f_obj = yield(:up, w_obj, :from => w_addr, :to => f_addr, :sfret => :normal, :loopdetect => w_loopdetect, :real_to => o_f_addr)
                                end
                                next if f_obj == false
                                f_obj ||= w_obj
                                f_loopdetect ||= w_loopdetect
                                todo << [f_obj, f_addr, :normal, f_loopdetect + [[f_obj, f_addr, :normal]] ]
                        }
                        yield :end, w_obj, :addr => w_addr, :loopdetect => w_loopdetect if todo.length == oldlen
                else
                        yield :unknown_addr, w_obj, :addr => w_addr, :loopdetect => w_loopdetect
                end
        }

        if include_start
                todo << [obj, start_addr, from_subfuncret ? :subfuncret : :normal, []]
        else
                walk_up[obj, start_addr, []]
        end

        while not todo.empty?
                obj, addr, type, loopdetect = todo.pop
                di = @decoded[addr]
                if di and type == :subfuncret
                        di.block.each_to_normal { |sf|
                                next if not f = @function[normalize(sf)]
                                s_obj = yield(:func, obj, :func => f, :funcaddr => sf, :addr => addr, :loopdetect => loopdetect)
                                next if s_obj == false
                                s_obj ||= obj
                                if l = loopdetect.find { |l_obj, l_addr, l_type| addr == l_addr and l_type == :normal }
                                        l_obj = yield(:loop, s_obj, :looptrace => loopdetect[loopdetect.index(l)..-1], :loopdetect => loopdetect)
                                        if l_obj and l_obj != s_obj
                                                s_loopdetect = loopdetect[0...loopdetect.index(l)]
                                        end
                                        next if l_obj == false
                                        s_obj = l_obj if l_obj
                                end
                                s_loopdetect ||= loopdetect
                                todo << [s_obj, addr, :normal, s_loopdetect + [[s_obj, addr, :normal]] ]
                        }
                elsif di
                        # XXX should interpolate index if di is not in block.list, but what if the addresses are not Comparable ?
                        di.block.list[0..(di.block.list.index(di) || -1)].reverse_each { |di_|
                                di = di_   # XXX not sure..
                                if stopaddr and ea = di.next_addr and stopaddr.include?(ea)
                                        yield :stopaddr, obj, :addr => ea, :loopdetect => loopdetect
                                        break
                                end
                                ex_obj = obj
                                obj = yield(:di, obj, :di => di, :loopdetect => loopdetect)
                                break if obj == false
                                obj ||= ex_obj
                        }
                        walk_up[obj, di.block.address, loopdetect] if obj
                elsif @function[addr] and addr != :default and addr != Expression::Unknown
                        ex_obj = obj
                        obj = yield(:func, obj, :func => @function[addr], :funcaddr => addr, :addr => addr, :loopdetect => loopdetect)
                        next if obj == false
                        obj ||= ex_obj
                        walk_up[obj, addr, loopdetect]
                else
                        yield :unknown_addr, obj, :addr => addr, :loopdetect => loopdetect
                end
        end
end
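The event protocol of backtrace_walk (yield an event, then interpret false / nil / anything else) is easier to see on a reduced model. The sketch below is a toy stand-in under stated assumptions: toy_backtrace_walk and the hash-based block layout are hypothetical, only the :di, :end, :maxdepth and :unknown_addr events are modelled, and loops are skipped silently instead of yielding :loop.

```ruby
# Toy stand-in for backtrace_walk (hypothetical, not Metasm code).
# blocks maps a block name to { :instrs => [...], :from => [predecessor names] }.
def toy_backtrace_walk(blocks, obj, start, maxdepth)
  todo = [[obj, start, []]]            # [object, block name, path walked so far]
  until todo.empty?
    cur, name, path = todo.pop
    block = blocks[name]
    if block.nil?
      yield :unknown_addr, cur, :addr => name
      next
    end

    # walk the block's instructions from last to first
    stopped = false
    block[:instrs].reverse_each do |instr|
      ret = yield(:di, cur, :di => instr)
      if ret == false                  # false => stop this branch
        stopped = true
        break
      end
      cur = ret unless ret.nil?        # nil => keep the current object
    end
    next if stopped

    path += [name]
    if path.length > maxdepth
      yield :maxdepth, cur, :addr => name
    elsif block[:from].empty?
      yield :end, cur, :addr => name
    else
      # simplification: skip predecessors already on the path
      block[:from].each { |pred| todo << [cur, pred, path] unless path.include?(pred) }
    end
  end
end

# Each toy instruction is [destination, source]; track where :eax comes from.
blocks = {
  :entry => { :instrs => [[:ebx, 4]],    :from => [] },
  :a     => { :instrs => [[:eax, :ebx]], :from => [:entry] },
}
found = []
toy_backtrace_walk(blocks, :eax, :a, 10) do |ev, tracked, h|
  case ev
  when :di
    dst, src = h[:di]
    next nil unless dst == tracked     # unrelated instruction: keep tracking
    if src.kind_of?(Integer)
      found << src                     # resolved to a constant
      false                            # stop this branch
    else
      src                              # continue with the new object
    end
  when :end, :maxdepth, :unknown_addr
    found << :unknown
  end
end
```

The block carries all the analysis state; the walker only decides where to go next, which is the same division of labour the real method uses.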
backtrace_xrefs_di_rw(di)

trace the xrefs this di is responsible for

# File metasm/disassemble.rb, line 1125
def backtrace_xrefs_di_rw(di)
        get_xrefs_rw(di).each { |type, ptr, len|
                backtrace(ptr, di.address, :origin => di.address, :type => type, :len => len).each { |xaddr|
                        next if xaddr == Expression::Unknown
                        if @check_smc and type == :w
                                #len.times { |off| # check unaligned ?
                                waddr = xaddr      #+ off
                                if wdi = di_at(waddr)
                                        puts "W: disasm: #{di} overwrites #{wdi}" if $VERBOSE
                                        wdi.add_comment "overwritten by #{di}"
                                end
                                #}
                        end
                }
        }
end
backtrace_xrefs_di_x(di, cpu_context)

trace xrefs for execution

# File metasm/disassemble.rb, line 1143
def backtrace_xrefs_di_x(di, cpu_context)
        ar = @program.get_xrefs_x(self, di)
        ar = @callback_newaddr[di.address, ar] || ar if callback_newaddr
        ar.each { |expr| backtrace(expr, di.address, :origin => di.address, :type => :x, :cpu_context => cpu_context) }
end
block_at(addr)

returns the InstructionBlock containing the address at addr

# File metasm/disassemble_api.rb, line 159
def block_at(addr)
        di = di_at(addr)
        di.block if di
end
block_including(addr)

returns the InstructionBlock containing the byte at addr.
returns the one of di_including() on multiple matches (overlapping instrs)

# File metasm/disassemble_api.rb, line 182
def block_including(addr)
        di = di_including(addr)
        di.block if di
end
c_constants()

list the constants ([name, integer value]) defined in the C code (define / enums)

# File metasm/disassemble.rb, line 564
def c_constants
        @c_parser_constcache ||= @c_parser.numeric_constants
end
call_sites(funcaddr)

find the addresses of calls calling the address, handles thunks

# File metasm/disassemble_api.rb, line 1706
def call_sites(funcaddr)
        find_call_site = proc { |a|
                until not di = di_at(a)
                        if di.opcode.props[:saveip]
                                cs = di.address
                                break
                        end
                        if di.block.from_subfuncret.to_a.first
                                while di.block.from_subfuncret.to_a.length == 1
                                        a = di.block.from_subfuncret[0]
                                        break if not di_at(a)
                                        a = @decoded[a].block.list.first.address
                                        di = @decoded[a]
                                end
                        end
                        break if di.block.from_subfuncret.to_a.first
                        break if di.block.from_normal.to_a.length != 1
                        a = di.block.from_normal.first
                end
                cs
        }
        ret = []
        each_xref(normalize(funcaddr), :x) { |a|
                ret << find_call_site[a.origin]
        }
        ret.compact.uniq
end
check_noreturn_function(fa)

given an address, detect if it may be a noreturn function.
it is if all its end blocks are calls to noreturn functions.
if it is, create a @function with noreturn = true.
should only be called with fa = target of a call

# File metasm/disassemble.rb, line 1223
def check_noreturn_function(fa)
        fb = function_blocks(fa, false, false)
        return if fb.empty?
        lasts = fb.keys.find_all { |k| fb[k] == [] }
        if lasts.all? { |la|
                b = block_at(la)
                next if not di = b.list.last
                (di.opcode.props[:saveip] and b.to_normal.to_a.all? { |tfa|
                        tf = function_at(tfa) and tf.noreturn
                }) or (di.opcode.props[:stopexec] and not (di.opcode.props[:setip] or not get_xrefs_x(di).empty?))
        }
                # yay
                @function[fa] ||= DecodedFunction.new
                @function[fa].noreturn = true
        end
end
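The rule applied here (a function is noreturn when every end block either calls only noreturn functions or halts without transferring control) can be sketched on a simplified model. toy_noreturn? and the :kind / :targets keys are hypothetical names chosen for illustration; they do not exist in Metasm.

```ruby
# Hypothetical toy model, not Metasm code. Each end block is a hash with
#   :kind    => :call, :stop or :ret (how the block's last instruction ends)
#   :targets => names of the functions called (for :call blocks)
def toy_noreturn?(end_blocks, noreturn_funcs)
  end_blocks.all? do |blk|
    case blk[:kind]
    when :call then blk[:targets].all? { |t| noreturn_funcs.include?(t) }
    when :stop then true    # halts without setting the instruction pointer
    else false              # e.g. a plain return
    end
  end
end

only_exit   = [{ :kind => :call, :targets => ['exit'] }]
exit_or_ret = [{ :kind => :call, :targets => ['exit'] }, { :kind => :ret }]
```

A single returning end block is enough to disqualify the function, which is why the check uses all? over the end blocks.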
code_binding(*a)

computes the binding of a code sequence.
just a forwarder to CPU#code_binding

# File metasm/disassemble_api.rb, line 680
def code_binding(*a)
        @cpu.code_binding(self, *a)
end
compose_bt_binding(bd1, bd2)

compose two code/instruction's backtrace_binding.
assumes bd1 is followed by bd2 in the code flow.
eg inc edi + push edi =>

{ Ind[:esp, 4] => Expr[:edi + 1], :esp => Expr[:esp - 4], :edi => Expr[:edi + 1] }

XXX if bd1 writes to memory with a pointer that is reused in bd2, this function has to revert the change made by bd2, which only works with simple ptr addition now
XXX unhandled situations may be resolved using :unknown, or by returning incorrect values

# File metasm/disassemble_api.rb, line 1774
def compose_bt_binding(bd1, bd2)
        if bd1.kind_of? DecodedInstruction
                bd1 = bd1.backtrace_binding ||= cpu.get_backtrace_binding(bd1)
        end
        if bd2.kind_of? DecodedInstruction
                bd2 = bd2.backtrace_binding ||= cpu.get_backtrace_binding(bd2)
        end

        reduce = lambda { |e| Expression[Expression[e].reduce] }

        bd = {}

        bd2.each { |k, v|
                bd[k] = reduce[v.bind(bd1)]
        }

        # for each pointer appearing in keys of bd1, we must infer from bd2 what final
        # pointers should appear in bd
        # eg 'mov [eax], 0  mov ebx, eax'  => { [eax] <- 0, [ebx] <- 0, ebx <- eax }
        bd1.each { |k, v|
                if k.kind_of? Indirection
                        done = false
                        k.pointer.externals.each { |e|
                                # XXX this will break on nontrivial pointers or bd2
                                bd2.each { |k2, v2|
                                        # we dont want to invert computation of flag_zero/carry etc (booh)
                                        next if k2.to_s =~ /flag/

                                        # discard indirection etc, result would be too complex / not useful
                                        next if not Expression[v2].expr_externals.include? e

                                        done = true

                                        # try to reverse the computation made upon 'e'
                                        # only simple addition handled here
                                        ptr = reduce[k.pointer.bind(e => Expression[[k2, :-, v2], :+, e])]

                                        # if bd2 does not rewrite e, duplicate the original pointer
                                        if not bd2[e]
                                                bd[k] ||= reduce[v]

                                                # here we should not see 'e' in ptr anymore
                                                ptr = Expression::Unknown if ptr.externals.include? e
                                        else
                                                # cant check if add reversion was successful..
                                        end

                                        bd[Indirection[reduce[ptr], k.len]] ||= reduce[v]
                                }
                        }
                        bd[k] ||= reduce[v] if not done
                else
                        bd[k] ||= reduce[v]
                end
        }

        bd
end
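The core idea of the composition (evaluate bd2's values in the state produced by bd1, then keep whatever bd1 alone defines) can be checked numerically. The sketch below is an assumption-laden stand-in: lambdas over a state hash replace symbolic Expressions, apply_binding and compose_bindings are hypothetical names, and the pointer rewriting for Indirection keys is not modelled at all.

```ruby
# Hypothetical numeric stand-in for symbolic bindings, not Metasm code.
# A binding maps a key to a lambda computing its new value from the old state.

# Apply a binding: every value is computed from the state *before* the step.
def apply_binding(bd, state)
  after = state.dup
  bd.each { |key, fn| after[key] = fn.call(state) }
  after
end

# Compose bd1 followed by bd2 into a single binding.
def compose_bindings(bd1, bd2)
  composed = {}
  # bd2's values must see the effects of bd1
  bd2.each { |key, fn| composed[key] = lambda { |s| fn.call(apply_binding(bd1, s)) } }
  # keys only touched by bd1 are carried over unchanged
  bd1.each { |key, fn| composed[key] ||= fn }
  composed
end

inc_edi  = { :edi => lambda { |s| s[:edi] + 1 } }
push_edi = { :esp => lambda { |s| s[:esp] - 4 }, :pushed => lambda { |s| s[:edi] } }

start      = { :edi => 7, :esp => 100 }
sequential = apply_binding(push_edi, apply_binding(inc_edi, start))
composed   = apply_binding(compose_bindings(inc_edi, push_edi), start)
```

The :pushed key plays the role of Ind[:esp, 4] in the example above: after composition it holds edi + 1, not the original edi.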
decode_byte(addr)

read a byte at address addr

# File metasm/disassemble_api.rb, line 228
def decode_byte(addr)
        decode_int(addr, :u8)
end
decode_c_ary(structname, addr, len)
# File metasm/disassemble_api.rb, line 1845
def decode_c_ary(structname, addr, len)
        if c_parser and edata = get_edata_at(addr)
                c_parser.decode_c_ary(structname, len, edata.data, edata.ptr)
        end
end
decode_c_struct(structname, addr)

return a C::AllocCStruct from c_parser.
TODO handle program.class::Header.to_c_struct

# File metasm/disassemble_api.rb, line 1839
def decode_c_struct(structname, addr)
        if c_parser and edata = get_edata_at(addr)
                c_parser.decode_c_struct(structname, edata.data, edata.ptr)
        end
end
decode_dword(addr)

read a dword at address addr.
the dword is cpu-sized (eg 32 or 64bits)

# File metasm/disassemble_api.rb, line 234
def decode_dword(addr)
        decode_int(addr, @cpu.size/8)
end
decode_int(addr, type)

read an int of arbitrary type (:u8, :i32, …)

# File metasm/disassemble_api.rb, line 220
def decode_int(addr, type)
        type = "u#{type*8}".to_sym if type.kind_of? Integer
        if e = get_section_at(addr)
                e[0].decode_imm(type, @cpu.endianness)
        end
end
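The type handling in decode_int (an Integer byte count becomes an unsigned type symbol such as :u32) can be illustrated on a raw byte string. This is a minimal sketch under assumptions: int_type and decode_le are hypothetical helpers, only little-endian data is handled, and a plain String replaces the section's encoded data.

```ruby
# Hypothetical helpers, not part of Metasm.

# Turn a byte count into an unsigned type symbol, as decode_int does;
# symbols such as :i32 are passed through untouched.
def int_type(type)
  type.kind_of?(Integer) ? "u#{type * 8}".to_sym : type
end

# Decode a little-endian integer of the given type from a byte string.
def decode_le(bytes, type)
  name = int_type(type).to_s           # e.g. "u32" or "i8"
  bits = name[1..-1].to_i
  raw = bytes.bytes.first(bits / 8).each_with_index.inject(0) do |acc, (byte, i)|
    acc | (byte << (8 * i))            # byte i contributes bits 8*i .. 8*i+7
  end
  # reinterpret as signed when the type starts with 'i' and the top bit is set
  name[0] == 'i' && raw >= (1 << (bits - 1)) ? raw - (1 << bits) : raw
end

dword  = decode_le("\x78\x56\x34\x12".b, 4)
signed = decode_le("\xff".b, :i8)
```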
decode_strz(addr, maxsz=4096)

read a zero-terminated string from addr.
if no terminal 0 is found, return nil

# File metasm/disassemble_api.rb, line 240
def decode_strz(addr, maxsz=4096)
        if e = get_section_at(addr)
                str = e[0].read(maxsz).to_s
                return if not len = str.index(?\0)
                str[0, len]
        end
end
decode_wstrz(addr, maxsz=4096)

read a zero-terminated wide string from addr.
return nil if no terminal found

# File metasm/disassemble_api.rb, line 250
def decode_wstrz(addr, maxsz=4096)
        if e = get_section_at(addr)
                str = e[0].read(maxsz).to_s
                return if not len = str.unpack('v*').index(0)
                str[0, 2*len]
        end
end
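The terminator search used by decode_strz and decode_wstrz works the same way on any byte string, which makes it easy to try in isolation. In the sketch below, strz and wstrz are hypothetical names, and a plain String is assumed in place of the data read from a section.

```ruby
# Hypothetical stand-alone versions, not Metasm code: they take the raw
# bytes directly instead of looking up a section by address.

# Return the bytes before the first NUL, or nil if there is none.
def strz(data, maxsz = 4096)
  str = data[0, maxsz].to_s
  len = str.index("\0")
  len && str[0, len]
end

# Same for 16-bit little-endian characters: the terminator is a zero word,
# so the index is counted in words and doubled to get a byte length.
def wstrz(data, maxsz = 4096)
  str = data[0, maxsz].to_s
  len = str.unpack('v*').index(0)
  len && str[0, 2 * len]
end

narrow = strz("abc\0def")
wide   = wstrz("a\0b\0\0\0zz")
```

Note that the wide variant returns the raw 16-bit data, not a converted string: each character still occupies two bytes in the result.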
decompile(*addr)
# File metasm/disassemble.rb, line 2268
def decompile(*addr)
        decompiler.decompile(*addr)
end
decompile_func(addr)
# File metasm/disassemble.rb, line 2271
def decompile_func(addr)
        decompiler.decompile_func(addr)
end
decompiler()
# File metasm/disassemble.rb, line 2261
def decompiler
        parse_c '' if not c_parser
        @decompiler ||= Decompiler.new(self)
end
decompiler=(dc)
# File metasm/disassemble.rb, line 2265
def decompiler=(dc)
        @decompiler = dc
end
del_label_at(addr, name=get_label_at(addr))

remove a label at address addr

# File metasm/disassemble_api.rb, line 311
def del_label_at(addr, name=get_label_at(addr))
        ed = get_edata_at(addr)
        if ed and ed.inv_export[ed.ptr]
                ed.del_export name, ed.ptr
                @label_alias_cache = nil
        end
        each_xref(addr) { |xr|
                next if not xr.origin or not o = @decoded[xr.origin] or not o.kind_of? Renderable
                o.each_expr { |e|
                        next unless e.kind_of?(Expression)
                        e.lexpr = addr if e.lexpr == name
                        e.rexpr = addr if e.rexpr == name
                }
        }
        @old_prog_binding.delete name
        @prog_binding.delete name
end
demangle_cppname(name)

returns a demangled C++ name

# File metasm/disassemble_api.rb, line 741
def demangle_cppname(name)
        case name[0]
        when ??       # MSVC
                name = name[1..-1]
                demangle_msvc(name[1..-1]) if name[0] == ??
        when ?_
                name = name.sub(/_GLOBAL__[ID]_/, '')
                demangle_gcc(name[2..-1][/\S*/]) if name[0, 2] == '_Z'
        end
end
demangle_gcc(name)

from www.codesourcery.com/public/cxx-abi/abi.html

# File metasm/disassemble_api.rb, line 775
def demangle_gcc(name)
        subs = []
        ret = ''
        decode_tok = lambda {
                name ||= ''
                case name[0]
                when nil
                        ret = nil
                when ?N
                        name = name[1..-1]
                        decode_tok[]
                        until name[0] == ?E
                                break if not ret
                                ret << '::'
                                decode_tok[]
                        end
                        name = name[1..-1]
                when ?I
                        name = name[1..-1]
                        ret = ret[0..-3] if ret[-2, 2] == '::'
                        ret << '<'
                        decode_tok[]
                        until name[0] == ?E
                                break if not ret
                                ret << ', '
                                decode_tok[]
                        end
                        ret << ' ' if ret and ret[-1] == ?>
                        ret << '>' if ret
                        name = name[1..-1]
                when ?T
                        case name[1]
                        when ?T; ret << 'vtti('
                        when ?V; ret << 'vtable('
                        when ?I; ret << 'typeinfo('
                        when ?S; ret << 'typename('
                        else ret = nil
                        end
                        name = name[2..-1].to_s
                        decode_tok[] if ret
                        ret << ')' if ret
                        name = name[1..-1] if name[0] == ?E
                when ?C
                        name = name[2..-1]
                        base = ret[/([^:]*)(<.*|::)?$/, 1]
                        ret << base
                when ?D
                        name = name[2..-1]
                        base = ret[/([^:]*)(<.*|::)?$/, 1]
                        ret << '~' << base
                when ?0..?9
                        nr = name[/^[0-9]+/]
                        name = name[nr.length..-1].to_s
                        ret << name[0, nr.to_i]
                        name = name[nr.to_i..-1]
                        subs << ret[/[\w:]*$/]
                when ?S
                        name = name[1..-1]
                        case name[0]
                        when ?_, ?0..?9, ?A..?Z
                                case name[0]
                                when ?_; idx = 0 ; name = name[1..-1]
                                when ?0..?9; idx = name[0, 1].unpack('C')[0] - 0x30 + 1 ; name = name[2..-1]
                                when ?A..?Z; idx = name[0, 1].unpack('C')[0] - 0x41 + 11 ; name = name[2..-1]
                                end
                                if not subs[idx]
                                        ret = nil
                                else
                                        ret << subs[idx]
                                end
                        when ?t
                                ret << 'std::'
                                name = name[1..-1]
                                decode_tok[]
                        else
                                std = { ?a => 'std::allocator',
                                        ?b => 'std::basic_string',
                                        ?s => 'std::string', # 'std::basic_string < char, std::char_traits<char>, std::allocator<char> >',
                                        ?i => 'std::istream', # 'std::basic_istream<char,  std::char_traits<char> >',
                                        ?o => 'std::ostream', # 'std::basic_ostream<char,  std::char_traits<char> >',
                                        ?d => 'std::iostream', # 'std::basic_iostream<char, std::char_traits<char> >'
                                }[name[0]]
                                if not std
                                        ret = nil
                                else
                                        ret << std
                                end
                                name = name[1..-1]
                        end
                when ?P, ?R, ?r, ?V, ?K
                        attr = { ?P => '*', ?R => '&', ?r => ' restrict', ?V => ' volatile', ?K => ' const' }[name[0]]
                        name = name[1..-1]
                        rl = ret.length
                        decode_tok[]
                        if ret
                                ret << attr
                                subs << ret[rl..-1]
                        end
                else
                        if ret =~ /[(<]/ and ty = {
                ?v => 'void', ?w => 'wchar_t', ?b => 'bool', ?c => 'char', ?a => 'signed char',
                ?h => 'unsigned char', ?s => 'short', ?t => 'unsigned short', ?i => 'int',
                ?j => 'unsigned int', ?l => 'long', ?m => 'unsigned long', ?x => '__int64',
                ?y => 'unsigned __int64', ?n => '__int128', ?o => 'unsigned __int128', ?f => 'float',
                ?d => 'double', ?e => 'long double', ?g => '__float128', ?z => '...'
                        }[name[0]]
                                name = name[1..-1]
                                ret << ty
                        else
                                fu = name[0, 2]
                                name = name[2..-1]
                                if op = {
                'nw' => ' new', 'na' => ' new[]', 'dl' => ' delete', 'da' => ' delete[]',
                'ps' => '+', 'ng' => '-', 'ad' => '&', 'de' => '*', 'co' => '~', 'pl' => '+',
                'mi' => '-', 'ml' => '*', 'dv' => '/', 'rm' => '%', 'an' => '&', 'or' => '|',
                'eo' => '^', 'aS' => '=', 'pL' => '+=', 'mI' => '-=', 'mL' => '*=', 'dV' => '/=',
                'rM' => '%=', 'aN' => '&=', 'oR' => '|=', 'eO' => '^=', 'ls' => '<<', 'rs' => '>>',
                'lS' => '<<=', 'rS' => '>>=', 'eq' => '==', 'ne' => '!=', 'lt' => '<', 'gt' => '>',
                'le' => '<=', 'ge' => '>=', 'nt' => '!', 'aa' => '&&', 'oo' => '||', 'pp' => '++',
                'mm' => '--', 'cm' => ',', 'pm' => '->*', 'pt' => '->', 'cl' => '()', 'ix' => '[]',
                'qu' => '?', 'st' => ' sizeof', 'sz' => ' sizeof', 'at' => ' alignof', 'az' => ' alignof'
                                }[fu]
                                        ret << "operator#{op}"
                                elsif fu == 'cv'
                                        ret << "cast<"
                                        decode_tok[]
                                        ret << ">" if ret
                                else
                                        ret = nil
                                end
                        end
                end
                name ||= ''
        }

        decode_tok[]
        subs.pop
        if ret and name != ''
                ret << '('
                decode_tok[]
                while ret and name != ''
                        ret << ', '
                        decode_tok[]
                end
                ret << ')' if ret
        end
        ret
end
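
The core of the scheme handled above is the length-prefixed identifier, optionally wrapped in N ... E for nested names. A minimal standalone sketch of just that part follows, assuming no templates, substitutions or argument types; decode_nested_name is a hypothetical helper, not a Metasm method.

```ruby
# Hypothetical helper: decodes only N<len><id>...E nested names or a single
# <len><id>; any other input yields nil.
def decode_nested_name(mangled)
  s = mangled.dup
  nested = s.start_with?('N')
  s = s[1..-1] if nested
  parts = []
  while (len = s[/\A[0-9]+/])
    s = s[len.length..-1]
    ident = s[0, len.to_i]
    return nil if ident.length != len.to_i   # truncated input
    parts << ident
    s = s[len.to_i..-1]
    break unless nested
  end
  return nil if parts.empty?
  return nil if nested && !s.start_with?('E')
  parts.join('::')
end

p decode_nested_name('N3foo3barE')   # "foo::bar"
p decode_nested_name('6printf')      # "printf"
```

The real method additionally tracks substitution candidates in subs and appends an argument list, which this sketch leaves out.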
demangle_msvc(name)

from wgcc-2.2.2/undecorate.cpp TODO

# File metasm/disassemble_api.rb, line 754
def demangle_msvc(name)
        op = name[0, 1]
        op = name[0, 2] if op == '_'
        if op = {
'2' => "new", '3' => "delete", '4' => "=", '5' => ">>", '6' => "<<", '7' => "!", '8' => "==", '9' => "!=",
'A' => "[]", 'C' => "->", 'D' => "*", 'E' => "++", 'F' => "--", 'G' => "-", 'H' => "+", 'I' => "&",
'J' => "->*", 'K' => "/", 'L' => "%", 'M' => "<", 'N' => "<=", 'O' => ">", 'P' => ">=", 'Q' => ",",
'R' => "()", 'S' => "~", 'T' => "^", 'U' => "|", 'V' => "&&", 'W' => "||", 'X' => "*=", 'Y' => "+=",
'Z' => "-=", '_0' => "/=", '_1' => "%=", '_2' => ">>=", '_3' => "<<=", '_4' => "&=", '_5' => "|=", '_6' => "^=",
'_7' => "`vftable'", '_8' => "`vbtable'", '_9' => "`vcall'", '_A' => "`typeof'", '_B' => "`local static guard'",
'_C' => "`string'", '_D' => "`vbase destructor'", '_E' => "`vector deleting destructor'", '_F' => "`default constructor closure'",
'_G' => "`scalar deleting destructor'", '_H' => "`vector constructor iterator'", '_I' => "`vector destructor iterator'",
'_J' => "`vector vbase constructor iterator'", '_K' => "`virtual displacement map'", '_L' => "`eh vector constructor iterator'",
'_M' => "`eh vector destructor iterator'", '_N' => "`eh vector vbase constructor iterator'", '_O' => "`copy constructor closure'",
'_S' => "`local vftable'", '_T' => "`local vftable constructor closure'", '_U' => "new[]", '_V' => "delete[]",
'_X' => "`placement delete closure'", '_Y' => "`placement delete[] closure'"}[op]
                op[0] == ?` ? op[1..-2] : "op_#{op}"
        end
end
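
The lookup above uses a one-character code, or a two-character code when the first character is an underscore. The sketch below reproduces that logic with a small subset of the table; msvc_special_name is a hypothetical name used for illustration only.

```ruby
# Hypothetical helper with a subset of the operator table shown above.
def msvc_special_name(name)
  table = { '2' => 'new', '3' => 'delete', '_U' => 'new[]',
            '_7' => "`vftable'", '_G' => "`scalar deleting destructor'" }
  code = name[0, 1]
  code = name[0, 2] if code == '_'     # '_' introduces a two-character code
  op = table[code]
  return nil unless op
  # backquoted descriptions lose their quotes, operators get an op_ prefix
  op.start_with?('`') ? op[1..-2] : "op_#{op}"
end

p msvc_special_name('2@YAPAXI@Z')   # "op_new"
p msvc_special_name('_7Foo@@6B@')   # "vftable"
```

Only the leading code is decoded; the rest of the mangled name (class, calling convention, argument types) is ignored, as in the method above.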
detect_function_thunk(funcaddr)

checks if the function starting at funcaddr is an external function thunk (eg jmp [SomeExtFunc])

the argument must be the address of a DecodedInstruction that is the first of a function, which must not have return_addresses

returns the new thunk name if it was changed

# File metasm/disassemble.rb, line 1153
def detect_function_thunk(funcaddr)
        # check thunk linearity (no conditional branch etc)
        addr = funcaddr
        count = 0
        while b = block_at(addr)
                count += 1
                return if count > 5 or b.list.length > 5
                if b.to_subfuncret and not b.to_subfuncret.empty?
                        return if b.to_subfuncret.length != 1
                        addr = normalize(b.to_subfuncret.first)
                        return if not b.to_normal or b.to_normal.length != 1
                        # check that the subfunction is simple (eg get_eip)
                        return if not sf = @function[normalize(b.to_normal.first)]
                        return if not btb = sf.backtrace_binding
                        btb = btb.dup
                        btb.delete_if { |k, v| Expression[k] == Expression[v] }
                        return if btb.length > 2 or btb.values.include? Expression::Unknown
                else
                        return if not bt = b.to_normal
                        if bt.include? :default
                                addr = :default
                                break
                        elsif bt.length != 1
                                return
                        end
                        addr = normalize(bt.first)
                end
        end
        fname = Expression[addr].reduce_rec
        if funcaddr != addr and f = @function[funcaddr]
                # forward get_backtrace_binding to target
                f.backtrace_binding = { :thunk => addr }
                f.noreturn = true if @function[addr] and @function[addr].noreturn
        end
        return if not fname.kind_of?(::String)
        l = auto_label_at(funcaddr, 'sub', 'loc')
        return if l[0, 4] != 'sub_'
        puts "found thunk for #{fname} at #{Expression[funcaddr]}" if $DEBUG
        rename_label(l, @program.new_label("thunk_#{fname}"))
end
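
The linearity check above amounts to following a chain of blocks that each have exactly one successor, giving up after 5 blocks. A toy version over a plain hash is sketched below; thunk_target and the hash layout are assumptions made for illustration, and the subfunction-call case (to_subfuncret) is not modelled.

```ruby
# blocks: hash address => array of successor addresses (toy stand-in for
# InstructionBlock#to_normal); an address absent from the hash is external.
def thunk_target(blocks, funcaddr, max_blocks = 5)
  addr = funcaddr
  count = 0
  while (succ = blocks[addr])
    count += 1
    # too long, or a conditional branch: not a thunk
    return nil if count > max_blocks || succ.length != 1
    addr = succ.first
  end
  addr unless addr == funcaddr
end

p thunk_target({ 0x1000 => [0x1010], 0x1010 => ['puts'] }, 0x1000)   # "puts"
p thunk_target({ 0x1000 => [0x1010, 0x1020] }, 0x1000)               # nil
```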
detect_function_thunk_noreturn(addr)

this is called when reaching a noreturn function call, with the call address

it is responsible for detecting the actual 'call' instruction leading to this noreturn function, and possibly marking the call target as a thunk

# File metasm/disassemble.rb, line 1197
def detect_function_thunk_noreturn(addr)
        5.times {
                return if not di = di_at(addr)
                if di.opcode.props[:saveip] and not di.block.to_subfuncret
                        if di.block.to_normal.to_a.length == 1
                                taddr = normalize(di.block.to_normal.first)
                                if di_at(taddr)
                                        @function[taddr] ||= DecodedFunction.new
                                        return detect_function_thunk(taddr)
                                end
                        end
                        break
                else
                        from = di.block.from_normal.to_a + di.block.from_subfuncret.to_a
                        if from.length == 1
                                addr = from.first
                        else break
                        end
                end
        }
end
di_at(addr)

returns the DecodedInstruction at addr if it exists

# File metasm/disassemble_api.rb, line 153
def di_at(addr)
        di = @decoded[addr] || @decoded[normalize(addr)] if addr
        di if di.kind_of? DecodedInstruction
end
di_including(addr)

returns the DecodedInstruction covering addr

if several overlapping instructions cover addr, returns the one starting nearest to addr

# File metasm/disassemble_api.rb, line 172
def di_including(addr)
        return if not addr
        addr = normalize(addr)
        if off = (0...16).find { |o| @decoded[addr-o].kind_of? DecodedInstruction and @decoded[addr-o].bin_length > o }
                @decoded[addr-off]
        end
end
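
The backward scan above can be tried on a toy table that maps start addresses to instruction lengths. The helper name and the hash layout are assumptions for illustration; the real method works on @decoded and DecodedInstruction#bin_length.

```ruby
# decoded: hash start address => instruction length in bytes (toy stand-in
# for @decoded). Returns the start address of the instruction covering addr.
def start_of_instruction_covering(decoded, addr, max_len = 16)
  off = (0...max_len).find { |o| (len = decoded[addr - o]) && len > o }
  addr - off if off
end

decoded = { 0x1000 => 5, 0x1005 => 2, 0x1007 => 1 }
p start_of_instruction_covering(decoded, 0x1003) == 0x1000   # true
p start_of_instruction_covering(decoded, 0x1008)             # nil
```

Because the scan starts at offset 0 and walks backwards, the first match is the instruction starting closest to addr, which is how overlapping instructions are resolved.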
disassemble(*entrypoints)

decodes instructions from the given entrypoints, (tries to) follow the code flow

returns self

# File metasm/disassemble.rb, line 639
def disassemble(*entrypoints)
        nil while disassemble_mainiter(entrypoints)
        self
end
disassemble_block(block, cpu_context)

disassembles a new instruction block at block.address (must be normalized)

# File metasm/disassemble.rb, line 820
def disassemble_block(block, cpu_context)
        raise if not block.list.empty?
        di_addr = block.address
        delay_slot = nil
        di = nil

        # try not to run for too long
        # loop usage: break if the block continues to the following instruction, else return
        @disassemble_maxblocklength.times {
                # check collision into a known block
                break if @decoded[di_addr]

                # check self-modifying code
                if @check_smc
                        #(-7...di.bin_length).each { |off|  # uncomment to check for unaligned rewrites
                        waddr = di_addr             #di_addr + off
                        each_xref(waddr, :w) { |x|
                                #next if off + x.len < 0
                                puts "W: disasm: self-modifying code at #{Expression[waddr]}" if $VERBOSE
                                add_comment(di_addr, "overwritten by #{@decoded[x.origin]}")
                                @callback_selfmodifying[di_addr] if callback_selfmodifying
                                return
                        }
                        #}
                end

                # decode instruction
                block.edata.ptr = di_addr - block.address + block.edata_ptr
                cpu_context = cpu_context.dup if cpu_context
                if not di = @cpu.decode_instruction_context(self, block.edata, di_addr, cpu_context)
                        ed = block.edata
                        break if ed.ptr >= ed.length and get_section_at(di_addr) and di = block.list.last
                        puts "#{ed.ptr >= ed.length ? "end of section reached" : "unknown instruction #{ed.data[di_addr-block.address+block.edata_ptr, 4].to_s.unpack('H*').first}"} at #{Expression[di_addr]}" if $VERBOSE
                        return
                end

                @decoded[di_addr] = di
                block.add_di di
                puts di if $DEBUG

                if callback_newinstr
                        ndi = @callback_newinstr[di]
                        if not ndi or not ndi.block
                                block.list.delete di
                                if ndi
                                        block.add_di ndi
                                        ndi.bin_length = di.bin_length if ndi.bin_length == 0
                                        @decoded[di_addr] = ndi
                                end
                        end
                        di = ndi
                end
                return if not di
                block = di.block

                di_addr = di.next_addr

                backtrace_xrefs_di_rw(di)

                if not di_addr or di.opcode.props[:stopexec] or not @program.get_xrefs_x(self, di).empty?
                        # do not backtrace until delay slot is finished (eg MIPS: di is a
                        #  ret and the delay slot holds stack fixup needed to calc func_binding)
                        # XXX if the delay slot is also xref_x or :stopexec it is ignored
                        delay_slot ||= [di, @cpu.delay_slot(di)]
                end

                if delay_slot
                        di, delay = delay_slot
                        if delay == 0 or not di_addr
                                backtrace_xrefs_di_x(di, cpu_context)
                                if di.opcode.props[:stopexec] or not di_addr; return
                                else break
                                end
                        end
                        delay_slot[1] = delay - 1
                end

                if block.edata.inv_export[di_addr - block.address + block.edata_ptr]
                        # ensure there is a block split if we have a label defined
                        break
                end
        }

        ar = [di_addr]
        ar = @callback_newaddr[block.list.last.address, ar] || ar if callback_newaddr
        ar.each { |di_addr_| backtrace(di_addr_, di.address, :origin => di.address, :type => :x, :cpu_context => cpu_context) }

        block
end
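
The delay slot handling above keeps decoding after a branch until the slots of that branch are consumed. A toy model of the countdown is sketched below, assuming instructions are given as [mnemonic, is_branch] pairs; block_instructions is a hypothetical helper and does no backtracing.

```ruby
# instrs: array of [mnemonic, is_branch]; delay_slots: number of instructions
# executed after a branch (1 on MIPS, 0 on x86).
def block_instructions(instrs, delay_slots)
  block = []
  pending = nil                      # remaining delay slots once a branch is seen
  instrs.each do |name, is_branch|
    block << name
    pending ||= delay_slots if is_branch
    next unless pending
    break if pending == 0            # the branch and its delay slots are consumed
    pending -= 1
  end
  block
end

code = [['addiu', false], ['jr', true], ['nop', false], ['lw', false]]
p block_instructions(code, 1)   # ["addiu", "jr", "nop"]
p block_instructions(code, 0)   # ["addiu", "jr"]
```

As in the method above, a second branch found inside the delay slot does not restart the countdown, since pending is only set once.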
disassemble_fast(entrypoint, maxdepth=-1, &b)

disassembles fast from an entrypoint (an address, or a todo-style hash with an :addr key)

see disassemble_fast_step

# File metasm/disassemble.rb, line 940
def disassemble_fast(entrypoint, maxdepth=-1, &b)
        td = entrypoint
        td = { :addr => entrypoint } unless td.kind_of?(::Hash)
        td[:cpu_context] ||= get_initial_cpu_context(td[:addr])
        todo = [td]
        until todo.empty?
                disassemble_fast_step(todo, &b)
                maxdepth -= 1
                todo.delete_if { |a| not @decoded[normalize(a[:addr])] } if maxdepth == 0
        end
        check_noreturn_function(td[:addr])
end
disassemble_fast_block(block, cpu_context, &b)

disassembles fast a new instruction block at block.address (must be normalized)

does not recurse into subfunctions

assumes all :saveip returns, except those pointing to a subfunc with noreturn

yields subfunction addresses (targets of :saveip)

no backtrace for :x (change with backtrace_maxblocks_fast)

returns a todo-style ary

assumes @addrs_todo is empty

# File metasm/disassemble.rb, line 1013
def disassemble_fast_block(block, cpu_context, &b)
        block = InstructionBlock.new(normalize(block), get_section_at(block)[0]) if not block.kind_of?(InstructionBlock)
        di_addr = block.address
        delay_slot = nil
        di = nil
        ret = []

        return ret if @decoded[di_addr]

        @disassemble_maxblocklength.times {
                break if @decoded[di_addr]

                # decode instruction
                block.edata.ptr = di_addr - block.address + block.edata_ptr
                cpu_context = cpu_context.dup if cpu_context
                if not di = @cpu.decode_instruction_context(self, block.edata, di_addr, cpu_context)
                        break if block.edata.ptr >= block.edata.length and get_section_at(di_addr) and di = block.list.last
                        return ret
                end

                @decoded[di_addr] = di
                block.add_di di
                puts di if $DEBUG

                if callback_newinstr
                        ndi = @callback_newinstr[di]
                        if not ndi or not ndi.block
                                block.list.delete di
                                if ndi
                                        block.add_di ndi
                                        ndi.bin_length = di.bin_length if ndi.bin_length == 0
                                        @decoded[di_addr] = ndi
                                end
                        end
                        di = ndi
                end
                return ret if not di

                di_addr = di.next_addr

                if di.opcode.props[:stopexec] or di.opcode.props[:setip]
                        if di.opcode.props[:setip]
                                @addrs_todo = []
                                ar = @program.get_xrefs_x(self, di)
                                ar = @callback_newaddr[di.address, ar] || ar if callback_newaddr
                                ar.each { |expr|
                                        backtrace(expr, di.address, :origin => di.address, :type => :x, :maxdepth => @backtrace_maxblocks_fast, :cpu_context => cpu_context)
                                }
                        end
                        if di.opcode.props[:saveip]
                                @addrs_todo = []
                                ret.concat disassemble_fast_block_subfunc(di, cpu_context, &b)
                        else
                                ret.concat @addrs_todo
                                @addrs_todo = []
                        end
                        delay_slot ||= [di, @cpu.delay_slot(di)]
                end

                if delay_slot
                        if delay_slot[1] <= 0
                                return ret if delay_slot[0].opcode.props[:stopexec]
                                break
                        end
                        delay_slot[1] -= 1
                end
        }

        ar = [di_addr]
        ar = @callback_newaddr[block.list.last.address, ar] || ar if callback_newaddr
        ar.each { |a|
                di.block.add_to_normal(a)
                ret << { :addr => a, :from => di.address, :cpu_context => cpu_context }
        }
        ret
end
disassemble_fast_block_subfunc(di, cpu_context) { |fa, di| ... }

handles when disassemble_fast encounters a call to a subfunction

# File metasm/disassemble.rb, line 1091
def disassemble_fast_block_subfunc(di, cpu_context)
        funcs = di.block.to_normal.to_a
        do_ret = funcs.empty?
        ret = []
        na = di.next_addr + di.bin_length * @cpu.delay_slot(di)
        funcs.each { |fa|
                fa = normalize(fa)
                disassemble_fast_checkfunc(fa)
                yield fa, di if block_given?
                if f = @function[fa] and bf = f.get_backtracked_for(self, fa, di.address) and not bf.empty?
                        # this includes retaddr unless f is noreturn
                        bf.each { |btt|
                                next if btt.type != :x
                                bt = backtrace(btt.expr, di.address, :include_start => true, :origin => btt.origin, :maxdepth => [@backtrace_maxblocks_fast, 1].max, :cpu_context => cpu_context)
                                if btt.detached
                                        ret.concat bt.map { |a| { :addr => a } }  # callback argument
                                elsif not f.noreturn and bt.find { |a| normalize(a) == na }
                                        do_ret = true
                                end
                        }
                elsif not f or not f.noreturn
                        do_ret = true
                end
        }
        if do_ret
                di.block.add_to_subfuncret(na)
                ret << { :addr => na, :from => di.address, :from_subfuncret => true, :cpu_context => cpu_context }
                di.block.add_to_normal :default if not di.block.to_normal and @function[:default]
        end
        di.add_comment 'noreturn' if ret.empty?
        ret
end
disassemble_fast_checkfunc(addr)

checks if addr has an :x xref coming from a :saveip instruction, if so marks it as a function

# File metasm/disassemble.rb, line 990
def disassemble_fast_checkfunc(addr)
        if @decoded[addr].kind_of?(DecodedInstruction) and not @function[addr]
                func = false
                each_xref(addr, :x) { |x_|
                        func = true if odi = di_at(x_.origin) and odi.opcode.props[:saveip]
                }
                if func
                        auto_label_at(addr, 'sub', 'loc', 'xref')
                        @function[addr] = (@function[:default] || DecodedFunction.new).dup
                        @function[addr].finalized = true
                        detect_function_thunk(addr)
                        puts "found new function #{get_label_at(addr)} at #{Expression[addr]}" if $VERBOSE
                end
        end
end
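
The test above boils down to: an address is a function start if at least one code xref to it originates from a call-like (:saveip) instruction. A toy version over plain collections is sketched below; function_start? and both data layouts are assumptions for illustration.

```ruby
# xrefs:  hash target address => array of origin addresses (:x xrefs only)
# saveip: array of addresses holding call-like (:saveip) instructions
def function_start?(xrefs, saveip, addr)
  xrefs.fetch(addr, []).any? { |origin| saveip.include?(origin) }
end

xrefs  = { 0x2000 => [0x1004, 0x1010], 0x2040 => [0x2010] }
saveip = [0x1010]
p function_start?(xrefs, saveip, 0x2000)   # true, reached from the call at 0x1010
p function_start?(xrefs, saveip, 0x2040)   # false, only reached by a jump
```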
disassemble_fast_deep(*entrypoints)

disassembles fast from a list of entrypoints, also disassembles subfunctions

# File metasm/disassemble.rb, line 923
def disassemble_fast_deep(*entrypoints)
        @entrypoints ||= []
        @entrypoints |= entrypoints

        entrypoints.each { |ep| do_disassemble_fast_deep(:addr => normalize(ep)) }

        @callback_finished[] if callback_finished
end
disassemble_fast_step(todo, &b)

disassembles one block from the ary, see disassemble_fast_block

# File metasm/disassemble.rb, line 954
def disassemble_fast_step(todo, &b)
        return if not x = todo.pop

        addr = normalize(x[:addr])

        if di = @decoded[addr]
                if di.kind_of?(DecodedInstruction)
                        split_block(di.block, di.address) if not di.block_head?
                        di.block.add_from(x[:from], x[:from_subfuncret] ? :subfuncret : :normal) if x[:from] and x[:from] != :default
                end
        elsif @function[addr] and x[:from]
        elsif s = get_section_at(addr)
                if x[:from] and c_parser and not disassemble_known_functions and name = get_all_labels_at(addr).find { |n|
                                cs = c_parser.toplevel.symbol[n] and cs.type.untypedef.kind_of?(C::Function) }
                        # do not disassemble internal function for which we have a prototype (eg static library)
                        puts "found known function #{name} at #{Expression[addr]}" if $VERBOSE
                        @function[addr] = @cpu.decode_c_function_prototype(@c_parser, c_parser.toplevel.symbol[name])
                        detect_function_thunk_noreturn(x[:from]) if @function[addr].noreturn
                else
                        block = InstructionBlock.new(addr, s[0])
                        block.add_from(x[:from], x[:from_subfuncret] ? :subfuncret : :normal) if x[:from] and x[:from] != :default
                        todo.concat disassemble_fast_block(block, x[:cpu_context], &b)
                end
        elsif name = Expression[addr].reduce_rec and name.kind_of?(::String) and not @function[addr]
                if c_parser and cs = c_parser.toplevel.symbol[name] and cs.type.untypedef.kind_of?(C::Function)
                        @function[addr] = @cpu.decode_c_function_prototype(@c_parser, cs)
                        detect_function_thunk_noreturn(x[:from]) if @function[addr].noreturn
                elsif @function[:default]
                        @function[addr] = @function[:default].dup
                end
        end

        disassemble_fast_checkfunc(addr)
end
disassemble_from(addr, from_addr)

disassemble addr as if the code flow came from from_addr

# File metasm/disassemble_api.rb, line 268
def disassemble_from(addr, from_addr)
        from_addr = from_addr.address if from_addr.kind_of? DecodedInstruction
        from_addr = normalize(from_addr)
        if b = block_at(from_addr)
                b.add_to_normal(addr)
        end
        @addrs_todo << { :addr => addr, :from => from_addr }
        disassemble
end
disassemble_instruction(addr)

disassembles one instruction at address

returns nil if no instruction can be decoded there

does not update any internal state of the disassembler, nor reuse the @decoded cache

# File metasm/disassemble_api.rb, line 261
def disassemble_instruction(addr)
        if e = get_section_at(addr)
                @cpu.decode_instruction(e[0], normalize(addr))
        end
end
disassemble_mainiter(entrypoints=[])

does one operation relevant to disassembling

returns true while there is work left, false once done

# File metasm/disassemble.rb, line 648
def disassemble_mainiter(entrypoints=[])
        @entrypoints ||= []
        if @addrs_todo.empty? and entrypoints.empty?
                post_disassemble
                puts 'disassembly finished' if $VERBOSE
                @callback_finished[] if callback_finished
                return false
        elsif @addrs_todo.empty?
                ep = entrypoints.shift
                cpu_context = get_initial_cpu_context(ep)
                l = auto_label_at(normalize(ep), 'entrypoint') || normalize(ep)
                puts "start disassemble from #{l} (#{entrypoints.length})" if $VERBOSE and not entrypoints.empty?
                @entrypoints << l
                @addrs_todo << { :addr => ep, :cpu_context => cpu_context }
        else
                disassemble_step
        end
        true
end
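
The way disassemble drives this method (nil while disassemble_mainiter(entrypoints)) can be reproduced with a toy worklist. In the sketch below main_iteration is a hypothetical stand-in: moving an address to done replaces the call to disassemble_step, and nothing is decoded.

```ruby
# Toy stand-in for one main iteration; all names here are illustrative.
def main_iteration(todo, entrypoints, done)
  if todo.empty? && entrypoints.empty?
    return false                     # nothing left: signal completion
  elsif todo.empty?
    todo << entrypoints.shift        # seed the worklist with the next entrypoint
  else
    done << todo.pop                 # stand-in for disassemble_step
  end
  true
end

todo = []
done = []
eps  = [0x401000, 0x402000]
nil while main_iteration(todo, eps, done)
p done == [0x401000, 0x402000]   # true
p eps.empty?                     # true
```

An entrypoint is only taken once the worklist is empty, so each entrypoint is fully explored before the next one starts, and the caller's array is consumed by shift.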
disassemble_step()

disassembles one block from addrs_todo

adds next addresses to handle to addrs_todo

if @function[:default] exists, jumps to unknown locations are interpreted as to @function[:default]

# File metasm/disassemble.rb, line 703
        def disassemble_step
                return if not x = @addrs_todo.pop or @addrs_done.include?(x)
                @addrs_done << x if x[:from]

                addr = x[:addr]
                from = x[:from]
                # from_subfuncret is true if from is the address of a function call that returns to addr

                return if from == Expression::Unknown

                puts "disassemble_step #{Expression[addr]} #{Expression[from] if from} #{x[:from_subfuncret]}  (/#{@addrs_todo.length})" if $DEBUG

                addr = normalize(addr)

                if from and x[:from_subfuncret] and di_at(from)
                        @decoded[from].block.each_to_normal { |subfunc|
                                subfunc = normalize(subfunc)
                                next if not f = @function[subfunc] or f.finalized
                                f.finalized = true
                                puts "  finalize subfunc #{Expression[subfunc]}" if debug_backtrace
                                backtrace_update_function_binding(subfunc, f)
                                if not f.return_address
                                        detect_function_thunk(subfunc)
                                end
                        }
                end

                if di = @decoded[addr]
                        if di.kind_of?(DecodedInstruction)
                                split_block(di.block, di.address, true) if not di.block_head?       # this updates di.block
                                di.block.add_from(from, x[:from_subfuncret] ? :subfuncret : :normal) if from and from != :default
                                bf = di.block
                        elsif di == true
                                bf = @function[addr]
                        end
                elsif from and bf = @function[addr]
                        detect_function_thunk_noreturn(from) if bf.noreturn
                elsif s = get_section_at(addr)
                        if from and c_parser and not disassemble_known_functions and name = get_all_labels_at(addr).find { |n|
                                        cs = c_parser.toplevel.symbol[n] and cs.type.untypedef.kind_of?(C::Function) }
                                # do not disassemble internal function for which we have a prototype (eg static library)
                                puts "found known function #{name} at #{Expression[addr]}" if $VERBOSE
                                bf = @function[addr] = @cpu.decode_c_function_prototype(@c_parser, c_parser.toplevel.symbol[name])
                                detect_function_thunk_noreturn(from) if bf.noreturn
                        else
                                block = InstructionBlock.new(normalize(addr), s[0])
                                block.add_from(from, x[:from_subfuncret] ? :subfuncret : :normal) if from and from != :default
                                disassemble_block(block, x[:cpu_context])
                        end
                elsif from and c_parser and name = Expression[addr].reduce_rec and name.kind_of?(::String) and
                                cs = c_parser.toplevel.symbol[name] and cs.type.untypedef.kind_of?(C::Function)
                        # use C header prototype for external functions if available
                        bf = @function[addr] = @cpu.decode_c_function_prototype(@c_parser, cs)
                        detect_function_thunk_noreturn(from) if bf.noreturn
                elsif from and not @function[addr]
                        if bf = @function[:default]
                                puts "using default function for #{Expression[addr]} from #{Expression[from]}" if $DEBUG
                                if name = Expression[addr].reduce_rec and name.kind_of?(::String)
                                        @function[addr] = @function[:default].dup
                                else
                                        addr = :default
                                end
                                if @decoded[from]
                                        @decoded[from].block.add_to addr
                                end
                        else
                                puts "not disassembling unknown address #{Expression[addr]} from #{Expression[from]}" if $DEBUG
                        end
                        if from != :default
                                add_xref(addr, Xref.new(:x, from))
                                add_xref(Expression::Unknown, Xref.new(:x, from))
                        end
                else
                        puts "not disassembling unknown address #{Expression[addr]}" if $VERBOSE
                end

                if bf and from and from != :default
                        if bf.kind_of?(DecodedFunction)
                                bff = bf.get_backtracked_for(self, addr, from)
                        else
                                bff = bf.backtracked_for
                        end
                end
                bff.each { |btt|
                        next if btt.address
                        if @decoded[from].kind_of?(DecodedInstruction) and @decoded[from].opcode.props[:saveip] and not x[:from_subfuncret] and not @function[addr]
                                backtrace_check_found(btt.expr, @decoded[addr], btt.origin, btt.type, btt.len, btt.maxdepth, btt.detached, btt.cpu_context)
                        end
                        next if backtrace_check_funcret(btt, addr, from)
                        backtrace(btt.expr, from,
                                  :include_start => true, :from_subfuncret => x[:from_subfuncret],
                                  :origin => btt.origin, :orig_expr => btt.orig_expr, :type => btt.type,
                                  :len => btt.len, :detached => btt.detached, :maxdepth => btt.maxdepth, :cpu_context => btt.cpu_context)
                } if bff
        end
do_disassemble_fast_deep(ep)
# File metasm/disassemble.rb, line 932
def do_disassemble_fast_deep(ep)
        disassemble_fast(ep) { |fa, di|
                do_disassemble_fast_deep(:addr => normalize(fa), :from => di.address)
        }
end
dump(dump_data=true, &b)

dumps the source, optionally including data
yields each line (defaults to puts)

# File metasm/disassemble.rb, line 2076
def dump(dump_data=true, &b)
        b ||= lambda { |l| puts l }
        @sections.sort_by { |addr, edata| addr.kind_of?(::Integer) ? addr : 0 }.each { |addr, edata|
                addr = Expression[addr] if addr.kind_of?(::String)
                blockoffs = @decoded.values.grep(DecodedInstruction).map { |di| Expression[di.block.address, :-, addr].reduce if di.block_head? }.grep(::Integer).sort.reject { |o| o < 0 or o >= edata.length }
                b[@program.dump_section_header(addr, edata)]
                if not dump_data and edata.length > 16*1024 and blockoffs.empty?
                        b["// [#{edata.length} data bytes]"]
                        next
                end
                unk_off = 0  # last off displayed
                # blocks.sort_by { |b| b.addr }.each { |b|
                while unk_off < edata.length
                        if unk_off == blockoffs.first
                                blockoffs.shift
                                di = @decoded[addr+unk_off]
                                if unk_off != di.block.edata_ptr
                                        b["\n// ------ overlap (#{unk_off-di.block.edata_ptr}) ------"]
                                elsif di.block.from_normal.kind_of?(::Array)
                                        b["\n"]
                                end
                                dump_block(di.block, &b)
                                unk_off += [di.block.bin_length, 1].max
                                unk_off = blockoffs.first if blockoffs.first and unk_off > blockoffs.first
                        else
                                next_off = blockoffs.first || edata.length
                                if dump_data or next_off - unk_off < 16
                                        unk_off = dump_data(addr + unk_off, edata, unk_off, &b)
                                else
                                        b["// [#{next_off - unk_off} data bytes]"]
                                        unk_off = next_off
                                end
                        end
                end
        }
end
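The dump methods share one output convention: each line goes to the block when one is given, and to puts otherwise. A minimal sketch of that convention in plain Ruby follows; emit_lines is a hypothetical stand-in and not part of Metasm.

```ruby
# Hypothetical stand-in for the dump* methods, using the same `b ||= lambda` default.
def emit_lines(lines, &b)
  b ||= lambda { |l| puts l }   # no block given: print each line
  lines.each { |l| b[l] }
end

# Collect the lines into an array instead of printing them.
listing = []
emit_lines(['entrypoint:', '    nop']) { |l| listing << l }
```

Going by the signature above, and assuming a Disassembler instance named dasm, a listing could presumably be collected the same way with dasm.dump(false) { |l| listing << l }.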
dump_block(block, &b)

dumps a block of decoded instructions

# File metasm/disassemble.rb, line 2114
def dump_block(block, &b)
        b ||= lambda { |l| puts l }
        block = @decoded[block].block if @decoded[block]
        dump_block_header(block, &b)
        block.list.each { |di| b[di.show] }
end
dump_block_header(block, &b)

shows the xrefs/labels at block start

# File metasm/disassemble.rb, line 2122
def dump_block_header(block, &b)
        b ||= lambda { |l| puts l }
        xr = []
        each_xref(block.address) { |x|
                case x.type
                when :x; xr << Expression[x.origin]
                when :r, :w; xr << "#{x.type}#{x.len}:#{Expression[x.origin]}"
                end
        }
        if not xr.empty?
                b["\n// Xrefs: #{xr[0, 8].join(' ')}#{' ...' if xr.length > 8}"]
        end
        if block.edata.inv_export[block.edata_ptr] and label_alias[block.address]
                b["\n"] if xr.empty?
                label_alias[block.address].each { |name| b["#{name}:"] }
        end
        if c = @comment[block.address]
                c = c.join("\n") if c.kind_of?(::Array)
                c.each_line { |l| b["// #{l}"] }
        end
end
dump_data(addr, edata, off, &b)

dumps data/labels, honours @xrefs.len if it exists
dumps one line only
stops at the end of edata/@decoded/@xref
returns the next offset to display
TODO array-style data access

# File metasm/disassemble.rb, line 2149
def dump_data(addr, edata, off, &b)
        b ||= lambda { |l| puts l }
        if l = edata.inv_export[off] and label_alias[addr]
                l_list = label_alias[addr].sort
                l = l_list.pop || l
                l_list.each { |ll|
                        b["#{ll}:"]
                }
                l = (l + ' ').ljust(16)
        else l = ''
        end
        elemlen = 1   # size of each element we dump (db by default)
        dumplen = -off % 16   # number of octets to dump
        dumplen = 16 if dumplen == 0
        cmt = []
        each_xref(addr) { |x|
                dumplen = elemlen = x.len if x.len == 2 or x.len == 4
                cmt << " #{x.type}#{x.len}:#{Expression[x.origin]}"
        }
        cmt = " ; @#{Expression[addr]}" + cmt.sort[0, 6].join
        if r = edata.reloc[off]
                dumplen = elemlen = r.type.to_s[1..-1].to_i/8
        end
        dataspec = { 1 => 'db ', 2 => 'dw ', 4 => 'dd ', 8 => 'dq ' }[elemlen]
        if not dataspec
                dataspec = 'db '
                elemlen = 1
        end
        l << dataspec

        # dup(?)
        if off >= edata.data.length
                dups = edata.virtsize - off
                @prog_binding.each_value { |a|
                        tmp = Expression[a, :-, addr].reduce
                        dups = tmp if tmp.kind_of?(::Integer) and tmp > 0 and tmp < dups
                }
                @xrefs.each_key { |a|
                        tmp = Expression[a, :-, addr].reduce
                        dups = tmp if tmp.kind_of?(::Integer) and tmp > 0 and tmp < dups
                }
                dups /= elemlen
                dups = 1 if dups < 1
                b[(l + "#{dups} dup(?)").ljust(48) << cmt]
                return off + dups*elemlen
        end

        vals = []
        edata.ptr = off
        dups = dumplen/elemlen
        elemsym = "u#{elemlen*8}".to_sym
        while edata.ptr < edata.data.length
                if vals.length > dups and vals.last != vals.first
                        # we have a dup(), unread the last element which is different
                        vals.pop
                        addr = Expression[addr, :-, elemlen].reduce
                        edata.ptr -= elemlen
                        break
                end
                break if vals.length == dups and vals.uniq.length > 1
                vals << edata.decode_imm(elemsym, @cpu.endianness)
                addr += elemlen
                if i = (1-elemlen..0).find { |i_|
                        t = addr + i_
                        @xrefs[t] or @decoded[t] or edata.reloc[edata.ptr+i_] or edata.inv_export[edata.ptr+i_]
                }
                        # i < 0
                        edata.ptr += i
                        addr += i
                        break
                end
                break if edata.reloc[edata.ptr-elemlen]
        end

        # line of repeated value => dup()
        if vals.length > 8 and vals.uniq.length == 1
                b[(l << "#{vals.length} dup(#{Expression[vals.first]})").ljust(48) << cmt]
                return edata.ptr
        end

        # recognize strings
        vals = vals.inject([]) { |vals_, value|
                if (elemlen == 1 or elemlen == 2)
                        case value
                        when 0x20..0x7e, 0x0a, 0x0d
                                if vals_.last.kind_of?(::String); vals_.last << value ; vals_
                                else vals_ << value.chr
                                end
                        else vals_ << value
                        end
                else vals_ << value
                end
        }

        vals.map! { |value|
                if value.kind_of?(::String)
                        if value.length > 2 # or value == vals.first or value == vals.last # if there is no xref, don't care
                                value.inspect
                        else
                                value.unpack('C*').map { |c| Expression[c] }
                        end
                else
                        Expression[value]
                end
        }
        vals.flatten!

        b[(l << vals.join(', ')).ljust(48) << cmt]

        edata.ptr
end
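The line-length arithmetic at the top of dump_data can be checked in isolation. The sketch below is a standalone model: dump_len is a hypothetical helper that reproduces `dumplen = -off % 16` and the `== 0` fixup, which together make each default (byte-sized) data line end on a 16-byte boundary.

```ruby
# Hypothetical helper mirroring the first lines of dump_data:
# number of bytes to show so that the line ends on a 16-byte boundary.
def dump_len(off)
  len = -off % 16        # Ruby's % is non-negative for a positive divisor
  len = 16 if len == 0   # already aligned: dump a full line
  len
end

[0, 1, 5, 15, 16, 0x1234].each { |off|
  n = dump_len(off)
  puts "off=%#x -> %d bytes, next line at %#x" % [off, n, off + n]
}
```

In the real method this length is later overridden by an xref of length 2 or 4, or by a relocation at the offset.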
each_function_block(addr, incl_subfuncs = false, find_func_start = true) { |a| ... }

iterates over the blocks of a function, yields each func block address
returns the graph of blocks (block address => [list of samefunc blocks])

# File metasm/disassemble_api.rb, line 395
def each_function_block(addr, incl_subfuncs = false, find_func_start = true)
        addr = @function.index(addr) if addr.kind_of? DecodedFunction
        addr = addr.address if addr.kind_of? DecodedInstruction
        addr = find_function_start(addr) if not @function[addr] and find_func_start
        todo = [addr]
        ret = {}
        while a = todo.pop
                next if not di = di_at(a)
                a = di.block.address
                next if ret[a]
                ret[a] = []
                yield a if block_given?
                di.block.each_to_samefunc(self) { |f| ret[a] << f ; todo << f }
                di.block.each_to_otherfunc(self) { |f| ret[a] << f ; todo << f } if incl_subfuncs
        end
        ret
end
Also aliased as: function_blocks
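The traversal is a plain worklist walk. As a rough sketch, here is the same loop over a hash instead of a disassembler; walk_blocks and succ are hypothetical names, and succ stands in for what each_to_samefunc would yield.

```ruby
# Standalone model: `succ` is a hypothetical hash mapping a block address
# to the addresses of its same-function successors.
def walk_blocks(succ, entry)
  todo = [entry]
  ret = {}
  while a = todo.pop
    next if ret[a]            # block already visited
    ret[a] = []
    yield a if block_given?
    succ.fetch(a, []).each { |t| ret[a] << t ; todo << t }
  end
  ret
end

# Diamond-shaped function: 0x1000 branches to 0x1010 and 0x1020, both join at 0x1030.
succ = { 0x1000 => [0x1010, 0x1020], 0x1010 => [0x1030], 0x1020 => [0x1030], 0x1030 => [] }
seen = []
graph = walk_blocks(succ, 0x1000) { |a| seen << a }
```

Each block is yielded once even though 0x1030 is reachable by two paths. Unlike the real method, the model does not skip addresses that have no decoded instruction.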
each_instructionblock(&b)

yields every InstructionBlock
returns the list of IBlocks

# File metasm/disassemble_api.rb, line 196
def each_instructionblock(&b)
        ret = []
        @decoded.each { |addr, di|
                next if not di.kind_of? DecodedInstruction or not di.block_head?
                ret << di.block
                b.call(di.block) if b
        }
        ret
end
Also aliased as: instructionblocks
each_xref(addr, type=nil) { |x_| ... }

yields each xref to a given address, optionally restricted to a type

# File metasm/disassemble.rb, line 518
def each_xref(addr, type=nil)
        addr = normalize addr

        x = @xrefs[addr]
        x = case x
            when nil; []
            when ::Array; x.dup
            else [x]
            end

        x.delete_if { |x_| x_.type != type } if type

        # add pseudo-xrefs for exe relocs
        if (not type or type == :reloc) and l = get_label_at(addr) and a = @inv_section_reloc[l]
                x_more = []
                a.each { |b, e, o, r|
                        addr = Expression[b]+o
                        # ignore relocs embedded in an already-listed instr
                        x_more << Xref.new(:reloc, addr) if not x.find { |x_|
                                next if not x_.origin or not di_at(x_.origin)
                                (addr - x_.origin) < @decoded[x_.origin].bin_length rescue false
                        }
                }
                x.concat x_more
        end

        x.each { |x_| yield x_ }
end
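The first half of each_xref normalizes a table entry that may be absent, a single xref, or an array of xrefs. A minimal sketch of that step alone, with hypothetical names (FakeXref, xrefs_of) and without the relocation pseudo-xrefs:

```ruby
# Hypothetical stand-in for Metasm::Xref, reduced to two fields.
FakeXref = Struct.new(:type, :origin)

# Returns a fresh array of xrefs for addr, optionally filtered by type.
def xrefs_of(table, addr, type = nil)
  v = table[addr]
  x = case v
      when nil then []
      when ::Array then v.dup   # copy, so filtering does not alter the table
      else [v]
      end
  x.delete_if { |x_| x_.type != type } if type
  x
end

table = {
  0x10 => FakeXref.new(:x, 0x100),
  0x20 => [FakeXref.new(:r, 0x200), FakeXref.new(:x, 0x300)],
}
only_x = xrefs_of(table, 0x20, :x)
```

The dup matters: without it, delete_if would remove entries from the stored list itself.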
fileoff_to_addr(foff)

transform a file offset into an address

# File metasm/disassemble_api.rb, line 488
def fileoff_to_addr(foff)
        @program.fileoff_to_addr(foff)
end
find_function_start(addr)

finds the start of a function from the address of an instruction

# File metasm/disassemble_api.rb, line 375
def find_function_start(addr)
        addr = addr.address if addr.kind_of? DecodedInstruction
        todo = [addr]
        done = []
        while a = todo.pop
                a = normalize(a)
                di = @decoded[a]
                next if done.include? a or not di.kind_of? DecodedInstruction
                done << a
                a = di.block.address
                break a if @function[a]
                l = []
                di.block.each_from_samefunc(self) { |f| l << f }
                break a if l.empty?
                todo.concat l
        end
end
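The return value comes from the while loop itself: `break a` makes the loop evaluate to a, and a loop that runs out of work evaluates to nil. A standalone sketch of the backward walk, under the assumption that preds (hypothetical) maps a block address to its same-function predecessors and funcs lists known function entries:

```ruby
# Standalone model of the backward search; not Metasm code.
def find_start(preds, funcs, addr)
  todo = [addr]
  done = []
  while a = todo.pop
    next if done.include?(a) or not preds.key?(a)
    done << a
    break a if funcs.include?(a)   # known function entry
    l = preds[a]
    break a if l.empty?            # no predecessor: assume this is the start
    todo.concat l
  end
end

# 0x30 loops on itself and is reached from 0x20, which is reached from 0x10.
preds = { 0x10 => [], 0x20 => [0x10], 0x30 => [0x20, 0x30] }
start = find_start(preds, [], 0x30)
```

Callers such as function_graph_from guard against the nil case with `find_function_start(addr) || addr`.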
fix_noreturn(o)

call this function on a function entrypoint if the function is in fact a __noreturn
will cut the to_subfuncret of callers

# File metasm/disassemble_api.rb, line 1691
def fix_noreturn(o)
        each_xref(o, :x) { |a|
                a = normalize(a.origin)
                next if not di = di_at(a) or not di.opcode.props[:saveip]
                # XXX should check if caller also becomes __noreturn
                di.block.each_to_subfuncret { |to|
                        next if not tdi = di_at(to) or not tdi.block.from_subfuncret
                        tdi.block.from_subfuncret.delete_if { |aa| normalize(aa) == di.address }
                        tdi.block.from_subfuncret = nil if tdi.block.from_subfuncret.empty?
                }
                di.block.to_subfuncret = nil
        }
end
flatten_graph(entry, include_subfunc=true)

returns an array of instructions/labels that, once parsed and assembled, should give something equivalent to the code accessible from the (list of) entrypoints given, from the @decoded dasm graph
assumes all jump targets have a matching label in @prog_binding
may add unconditional jumps in the listing to preserve the code flow

# File metasm/disassemble_api.rb, line 689
def flatten_graph(entry, include_subfunc=true)
        ret = []
        entry = [entry] if not entry.kind_of? Array
        todo = entry.map { |a| normalize(a) }
        done = []
        inv_binding = @prog_binding.invert
        while addr = todo.pop
                next if done.include?(addr)
                done << addr

                ret << Label.new(inv_binding[addr]) if inv_binding[addr]
                if not di_at(addr)
                        ret << @cpu.instr_jump_stop
                        next
                end

                b = @decoded[addr].block
                ret.concat b.list.map { |di| di.instruction }

                b.each_to_otherfunc(self) { |to|
                        to = normalize to
                        todo.unshift to if include_subfunc
                }
                b.each_to_samefunc(self) { |to|
                        to = normalize to
                        todo << to
                }

                if not di = b.list[-1-@cpu.delay_slot] or not di.opcode.props[:stopexec] or di.opcode.props[:saveip]
                        to = b.list.last.next_addr
                        if todo.include?(to) and di_at(to)
                                if done.include?(to)
                                        if not to_l = inv_binding[to]
                                                to_l = auto_label_at(to, 'loc')
                                                if done.include? to and idx = ret.index(@decoded[to].block.list.first.instruction)
                                                        ret.insert(idx, Label.new(to_l))
                                                end
                                        end
                                        ret << @cpu.instr_uncond_jump_to(to_l)
                                else
                                        todo << to        # ensure it's next in the listing
                                end
                        else
                                ret << @cpu.instr_jump_stop
                        end
                end
        end

        ret
end
function_at(addr)

returns the DecodedFunction at addr if it exists

# File metasm/disassemble_api.rb, line 165
def function_at(addr)
        f = @function[addr] || @function[normalize(addr)] if addr
        f if f.kind_of? DecodedFunction
end
function_blocks(addr, incl_subfuncs = false, find_func_start = true)
Alias for: each_function_block
function_graph(funcs = @function.keys + @entrypoints.to_a, ret={})

returns a graph of function calls
for each func passed as arg (default: all), updates the 'ret' hash associating func => [list of direct subfuncs called]

# File metasm/disassemble_api.rb, line 417
def function_graph(funcs = @function.keys + @entrypoints.to_a, ret={})
        funcs = funcs.map { |f| normalize(f) }.uniq.find_all { |f| @decoded[f] }
        funcs.each { |f|
                next if ret[f]
                ret[f] = []
                each_function_block(f) { |b|
                        @decoded[b].block.each_to_otherfunc(self) { |sf|
                                ret[f] |= [sf]
                        }
                }
        }
        ret
end
function_graph_from(addr)

returns the graph of function => subfunction list
recurses from an entrypoint

# File metasm/disassemble_api.rb, line 433
def function_graph_from(addr)
        addr = normalize(addr)
        addr = find_function_start(addr) || addr
        ret = {}
        osz = ret.length-1
        while ret.length != osz
                osz = ret.length
                function_graph(ret.values.flatten + [addr], ret)
        end
        ret
end
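The loop is a fixed-point iteration: it keeps feeding the callees found so far back into the graph builder until the hash stops growing. A self-contained sketch of the same idea, where calls is a hypothetical table of direct calls that replaces function_graph:

```ruby
# Standalone model: `calls` maps a function to the functions it calls directly.
def call_graph_from(calls, entry)
  ret = {}
  osz = ret.length - 1              # differs from ret.length, forces a first pass
  while ret.length != osz
    osz = ret.length
    (ret.values.flatten + [entry]).uniq.each { |f|
      ret[f] ||= calls.fetch(f, [])
    }
  end
  ret
end

calls = { :main => [:a, :b], :a => [:c], :b => [], :c => [:a], :unused => [:a] }
graph = call_graph_from(calls, :main)
```

The recursion between :a and :c terminates because already-known functions are never added twice, and :unused is left out since nothing reachable from :main calls it.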
function_graph_to(addr)

return the graph of function => subfunction list for which a (sub-sub)function includes addr

# File metasm/disassemble_api.rb, line 447
def function_graph_to(addr)
        addr = normalize(addr)
        addr = find_function_start(addr) || addr
        full = function_graph
        ret = {}
        todo = [addr]
        done = []
        while a = todo.pop
                next if done.include? a
                done << a
                full.each { |f, sf|
                        next if not sf.include? a
                        ret[f] ||= []
                        ret[f] |= [a]
                        todo << f
                }
        end
        ret
end
function_including(addr)

returns the DecodedFunction including this byte
returns the one of find_function_start() if multiple are possible (block shared by multiple funcs)

# File metasm/disassemble_api.rb, line 189
def function_including(addr)
        return if not di = di_including(addr)
        function_at(find_function_start(di.address))
end
function_walk(addr_start, obj_start) { |:merge, addr, froms, values.first| ... }

iterates over all instructions of a function from a given entrypoint
carries an object while walking, the object is yielded for every instruction
every block is walked only once, after all previous blocks are done (if possible)
on a 'jz', a [:clone] event is yielded for every path besides the first
on a junction (eg a -> b -> d, a -> c -> d), a [:merge] event occurs if froms have different objs
event list:

[:di, <addr>, <decoded_instruction>, <object>]
[:clone, <newaddr>, <oldaddr>, <object>]
[:merge, <newaddr>, {<oldaddr1> => <object1>, <oldaddr2> => <object2>, ...}, <object1>]
[:subfunc, <subfunc_addr>, <call_addr>, <object>]

all events should return an object
:merge has a copy of object1 at the end so that uninterested callers can always return args
if an event returns false, the trace stops for the current branch

# File metasm/disassemble.rb, line 1426
def function_walk(addr_start, obj_start)
        # addresses of instrs already seen => obj
        done = {}
        todo = [[addr_start, obj_start]]

        while hop = todo.pop
                addr, obj = hop
                next if done.has_key?(addr)

                di = di_at(addr)
                next if not di

                if done.empty?
                        dilist = di.block.list[di.block.list.index(di)..-1]
                else
                        # new block, check all 'from' have been seen
                        if not hop[2]
                                # may retry later
                                all_ok = true
                                di.block.each_from_samefunc(self) { |fa| all_ok = false unless done.has_key?(fa) }
                                if not all_ok
                                        todo.unshift([addr, obj, true])
                                        next
                                end
                        end

                        froms = {}
                        di.block.each_from_samefunc(self) { |fa| froms[fa] = done[fa] if done[fa] }
                        if froms.values.uniq.length > 1
                                obj = yield([:merge, addr, froms, froms.values.first])
                                next if obj == false
                        end

                        dilist = di.block.list
                end

                if dilist.each { |_di|
                                break if done.has_key?(_di.address)        # looped back into addr_start
                                done[_di.address] = obj
                                obj = yield([:di, _di.address, _di, obj])
                                break if obj == false      # also return false for the previous 'if'
                        }

                        from = dilist.last.address

                        if di.block.to_normal and di.block.to_normal[0] and
                                        di.block.to_subfuncret and di.block.to_subfuncret[0]
                                # current instruction block calls into a subfunction
                                obj = di.block.to_normal.map { |subf|
                                        yield([:subfunc, subf, from, obj])
                                }.first            # propagate 1st subfunc result
                                next if obj == false
                        end

                        wantclone = false
                        di.block.each_to_samefunc(self) { |ta|
                                if wantclone
                                        nobj = yield([:clone, ta, from, obj])
                                        next if nobj == false
                                        todo << [ta, nobj]
                                else
                                        todo << [ta, obj]
                                        wantclone = true
                                end
                        }
                end
        end
end
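Every event is a four-element array whose last element is the carried object, so one handler can dispatch on the first element. The sketch below is a hypothetical handler, exercised here with hand-built events only; the carried object is an Integer counting the instructions on the path so far.

```ruby
# Hypothetical handler for the events listed above.
# Each event is [type, addr, info, obj]; the return value becomes the new obj.
handler = lambda { |ev|
  type, _addr, info, obj = ev
  case type
  when :di      then obj + 1            # one more instruction on this path
  when :clone   then obj                # Integers are immutable, no copy needed
  when :merge   then info.values.max    # info is {from_addr => obj}: keep the longest path
  when :subfunc then obj                # ignore the callee
  end
}

# Hand-built events in the documented layout.
after_di    = handler[[:di, 0x10, nil, 0]]
after_merge = handler[[:merge, 0x30, { 0x10 => 3, 0x20 => 5 }, 3]]
```

Since the source yields each event as a single array, such a handler would presumably be attached with something like dasm.function_walk(entry, 0) { |ev| handler[ev] }, assuming a Disassembler instance named dasm. A mutable carried object would need a real copy on :clone.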
get_all_labels_at(addr)

return the array of all labels associated to an addr

# File metasm/disassemble_api.rb, line 285
def get_all_labels_at(addr)
        addr = normalize(addr)
        label_alias[addr].to_a
end
get_edata_at(*a)

returns the 1st element of get_section_at (ie the edata at a given address) or nil

# File metasm/disassemble_api.rb, line 146
def get_edata_at(*a)
        if s = get_section_at(*a)
                s[0]
        end
end
get_fwdemu_binding(di, pc=nil, dbg_ctx=nil)

return a backtrace_binding reversed (akin to code emulation) (but not really)

# File metasm/disassemble_api.rb, line 208
def get_fwdemu_binding(di, pc=nil, dbg_ctx=nil)
        @cpu.get_fwdemu_binding(di, pc, dbg_ctx)
end
get_initial_cpu_context(addr)
# File metasm/disassemble.rb, line 668
def get_initial_cpu_context(addr)
        @cpu.disassemble_init_context(self, addr)
end
get_label_at(addr)

returns the label associated to an addr, or nil if none exist

# File metasm/disassemble_api.rb, line 279
def get_label_at(addr)
        e = get_edata_at(addr, false)
        e.inv_export[e.ptr] if e
end
get_section_at(addr, memcheck=true)

returns [edata, edata_base] or nil
edata.ptr points to addr

# File metasm/disassemble.rb, line 577
def get_section_at(addr, memcheck=true)
        case addr = normalize(addr)
        when ::Integer
                if s =  @sections.find { |b, e| b.kind_of?(::Integer) and addr >= b and addr < b + e.length } ||
                        @sections.find { |b, e| b.kind_of?(::Integer) and addr == b + e.length }            # end label
                        s[1].ptr = addr - s[0]
                        return if memcheck and s[1].data.respond_to?(:page_invalid?) and s[1].data.page_invalid?(s[1].ptr)
                        [s[1], s[0]]
                end
        when Expression
                if addr.op == :+ and addr.rexpr.kind_of?(::Integer) and addr.rexpr >= 0 and addr.lexpr.kind_of?(::String) and e = @sections[addr.lexpr]
                        e.ptr = addr.rexpr
                        return if memcheck and e.data.respond_to?(:page_invalid?) and e.data.page_invalid?(e.ptr)
                        [e, Expression[addr.lexpr]]
                elsif addr.op == :+ and addr.rexpr.kind_of?(::String) and not addr.lexpr and e = @sections[addr.rexpr]
                        e.ptr = 0
                        return if memcheck and e.data.respond_to?(:page_invalid?) and e.data.page_invalid?(e.ptr)
                        [e, addr.rexpr]
                end
        end
end
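For integer addresses the lookup runs in two passes: first a section that contains the address, then a section that ends exactly at it (the "end label" case). A standalone sketch of that ordering, where sections is a hypothetical hash of base address => length instead of base => EncodedData:

```ruby
# Standalone model of the Integer branch; returns [base, offset] or nil.
def section_at(sections, addr)
  s = sections.find { |b, len| addr >= b and addr < b + len } ||
      sections.find { |b, len| addr == b + len }   # one past the end: end label
  [s[0], addr - s[0]] if s
end

apart    = { 0x1000 => 0x100, 0x2000 => 0x10 }
adjacent = { 0x1000 => 0x100, 0x1100 => 0x10 }

inside    = section_at(apart, 0x1004)      # inside the first section
end_label = section_at(apart, 0x1100)      # one past its last byte
boundary  = section_at(adjacent, 0x1100)   # start of the next section wins
```

Running the containing-section pass first means that an address shared by two adjacent sections resolves to the start of the second one, not to the end of the first.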
get_xrefs_rw(di)

retrieves the list of data r/w crossrefs due to the decodedinstruction
returns a list of [type, symbolic expression, length]

# File metasm/disassemble.rb, line 918
def get_xrefs_rw(di)
        @program.get_xrefs_rw(self, di)
end
get_xrefs_x(di)

retrieves the list of execution crossrefs due to the decodedinstruction
returns a list of symbolic expressions

# File metasm/disassemble.rb, line 912
def get_xrefs_x(di)
        @program.get_xrefs_x(self, di)
end
gui_hilight_word_regexp(word)
# File metasm/disassemble_api.rb, line 1833
def gui_hilight_word_regexp(word)
        @cpu.gui_hilight_word_regexp(word)
end
inspect()
# File metasm/disassemble.rb, line 2064
def inspect
        "<Metasm::Disassembler @%x>" % object_id
end
instructionblocks(&b)
Alias for: each_instructionblock
label_alias()

returns a hash associating addr => list of labels at this addr
label_alias may be nil if a new label is created elsewhere in the edata with the same name

# File metasm/disassemble.rb, line 628
def label_alias
        if not @label_alias_cache
                @label_alias_cache = {}
                @prog_binding.each { |k, v|
                        (@label_alias_cache[v] ||= []) << k
                }
        end
        @label_alias_cache
end
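The cache is a grouped inversion of the label => address binding. As a small sketch with made-up labels, the following builds the same structure and contrasts it with Hash#invert, which keeps only one label per address:

```ruby
# Made-up binding: two labels share address 0x1000.
prog_binding = { 'entrypoint' => 0x1000, 'start' => 0x1000, 'loc_1010' => 0x1010 }

# Same grouping as label_alias: addr => [all labels at addr].
label_alias = {}
prog_binding.each { |name, addr| (label_alias[addr] ||= []) << name }

# Hash#invert keeps a single label per address (the later key wins).
inverted = prog_binding.invert
```

An address without labels has no entry, which is why get_all_labels_at calls to_a on the lookup result to get an empty array instead of nil.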
load(str) { |type, data| ... }

loads the dasm state from a savefile content
will yield unknown segments / binarypath notfound

# File metasm/disassemble_api.rb, line 1113
def load(str)
        raise 'Not a metasm save file' if str[0, 12].chomp != 'Metasm.dasm'
        off = 12
        pp = Preprocessor.new
        app = AsmPreprocessor.new
        while off < str.length
                i = str.index("\n", off) || str.length
                type, len = str[off..i].chomp.split
                off = i+1
                data = str[off, len.to_i]
                off += len.to_i
                case type
                when nil, ''
                when 'binarypath'
                        data = yield(type, data) if not File.exist? data and block_given?
                        reinitialize AutoExe.decode_file(data)
                        @program.disassembler = self
                        @program.init_disassembler
                when 'cpu'
                        cpuname, size, endianness = data.split
                        cpu = Metasm.const_get(cpuname)
                        raise 'invalid cpu' if not cpu < CPU
                        cpu = cpu.new
                        cpu.size = size.to_i
                        cpu.endianness = endianness.to_sym
                        reinitialize Shellcode.new(cpu)
                        @program.disassembler = self
                        @program.init_disassembler
                        @sections.delete(0) # rm empty section at 0, other real 'section' follow
                when 'section'
                        info = data[0, data.index("\n") || data.length]
                        data = data[info.length, data.length]
                        pp.feed!(info)
                        addr = Expression.parse(pp).reduce
                        len = Expression.parse(pp).reduce
                        edata = EncodedData.new(data.unpack('m*').first, :virtsize => len)
                        # check for an existing section, eg from binarypath
                        existing_section = get_section_at(addr)
                        if not existing_section or existing_section[0].data.to_str != edata.data.to_str
                                add_section(addr, edata)
                        end
                when 'map'
                        load_map data
                when 'decoded'
                        data.each_line { |l|
                                begin
                                        next if l !~ /^([^,]*),(\d*) ([^;]*)(?:; (.*))?/
                                        a, len, instr, cmt = $1, $2, $3, $4
                                        a = Expression.parse(pp.feed!(a)).reduce
                                        instr = @cpu.parse_instruction(app.feed!(instr))
                                        di = DecodedInstruction.new(instr, a)
                                        di.bin_length = len.to_i
                                        di.add_comment cmt if cmt
                                        @decoded[a] = di
                                rescue
                                        puts "load: bad di #{l.inspect}" if $VERBOSE
                                end
                        }
                when 'blocks'
                        data.each_line { |l|
                                bla = l.chomp.split(';').map { |sl| sl.split(',') }
                                begin
                                        a = Expression.parse(pp.feed!(bla.shift[0])).reduce
                                        b = InstructionBlock.new(a, get_section_at(a).to_a[0])
                                        bla.shift.each { |e|
                                                a = Expression.parse(pp.feed!(e)).reduce
                                                b.add_di(@decoded[a])
                                        }
                                        bla.zip([:to_normal, :to_subfuncret, :to_indirect, :from_normal, :from_subfuncret, :from_indirect]).each { |l_, s|
                                                b.send("#{s}=", l_.map { |e| Expression.parse(pp.feed!(e)).reduce }) if not l_.empty?
                                        }
                                rescue
                                        puts "load: bad block #{l.inspect}" if $VERBOSE
                                end
                        }
                when 'funcs'
                        data.each_line { |l|
                                begin
                                        a, *r = l.split(',').map { |e| Expression.parse(pp.feed!(e)).reduce }
                                        @function[a] = DecodedFunction.new
                                        @function[a].return_address = r if not r.empty?
                                        @function[a].finalized = true
                                        # TODO
                                rescue
                                        puts "load: bad function #{l.inspect} #$!" if $VERBOSE
                                end
                        }
                when 'comment'
                        data.each_line { |l|
                                begin
                                        a, c = l.split(' ', 2)
                                        a = Expression.parse(pp.feed!(a)).reduce
                                        @comment[a] ||= []
                                        @comment[a] |= [c]
                                rescue
                                        puts "load: bad comment #{l.inspect} #$!" if $VERBOSE
                                end
                        }
                when 'c'
                        begin
                                # TODO parse_invalid_c, split per function, whatever
                                parse_c('')
                                @c_parser.allow_bad_c = true
                                parse_c(data, 'savefile#c')
                        rescue
                                puts "load: bad C: #$!", $!.backtrace if $VERBOSE
                        end
                        @c_parser.readtok until @c_parser.eos? if @c_parser
                when 'xrefs'
                        data.each_line { |l|
                                begin
                                        a, t, len, o = l.chomp.split(',')
                                        case a
                                        when ':default'; a = :default
                                        when ':unknown'; a = Expression::Unknown
                                        else a = Expression.parse(pp.feed!(a)).reduce
                                        end
                                        t = (t.empty? ? nil : t.to_sym)
                                        len = (len != '' ? len.to_i : nil)
                                        o = (o.to_s != '' ? Expression.parse(pp.feed!(o)).reduce : nil)   # :default/:unknown ?
                                        add_xref(a, Xref.new(t, o, len))
                                rescue
                                        puts "load: bad xref #{l.inspect} #$!" if $VERBOSE
                                end
                        }
                #when 'trace'
                else
                        if block_given?
                                yield(type, data)
                        else
                                puts "load: unsupported section #{type.inspect}" if $VERBOSE
                        end
                end
        end
end
load_map(str, off=0)

loads a map file (addr => symbol)
off is an optional offset added to every address found (e.g. for rebased binaries)
understands:

standard map files (e.g. linux-kernel.map: <addr> <type> <name>, as in 'c01001ba t setup_idt')
IDA map files (<sectionidx>:<sectionoffset> <name>)

str is either the map content itself or, if it contains no newline, the filename of the map

# File metasm/disassemble_api.rb, line 994
def load_map(str, off=0)
        str = File.read(str) rescue nil if not str.index("\n")
        sks = @sections.keys.sort
        seen = {}
        str.each_line { |l|
                case l.strip
                when /^([0-9A-F]+)\s+(\w+)\s+(\w+)/i # kernel.map style
                        addr = $1.to_i(16)+off
                        set_label_at(addr, $3, false, !seen[addr])
                        seen[addr] = true
                when /^([0-9A-F]+):([0-9A-F]+)\s+([a-z_]\w+)/i       # IDA style
                        # we do not have section load order, let's just hope that the addresses are sorted (and sortable..)
                        #  could check the 1st part of the file, with section sizes, but it is not very convenient
                        # the regexp is so that we skip the 1st part with section descriptions
                        # in the file, section 1 is the 1st section ; we have an additional section (exe header) which fixes the 0-index
                        # XXX this is PE-specific, TODO fix it for ELF (ida references sections, we reference segments...)
                        addr = sks[$1.to_i(16)] + $2.to_i(16) + off
                        set_label_at(addr, $3, false, !seen[addr])
                        seen[addr] = true
                end
        }
end
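The two line formats can be tried in isolation. The sketch below is a hypothetical standalone helper (parse_map_line is not part of Metasm) that reuses the two regexps from load_map and only reports what each one captures; it does not translate IDA section indexes to addresses or set any label.

```ruby
# Hypothetical helper, not part of Metasm: classifies one map-file line
# with the same two regexps that load_map uses.
def parse_map_line(line)
  case line.strip
  when /^([0-9A-F]+)\s+(\w+)\s+(\w+)/i            # kernel.map style
    { style: :kernel, addr: $1.to_i(16), type: $2, name: $3 }
  when /^([0-9A-F]+):([0-9A-F]+)\s+([a-z_]\w+)/i  # IDA style
    { style: :ida, section: $1.to_i(16), offset: $2.to_i(16), name: $3 }
  end
end

# Sample lines; the IDA one is made up for illustration.
KERNEL_LINE = parse_map_line('c01001ba t setup_idt')
IDA_LINE    = parse_map_line(' 0001:00000010       start')
```

Lines that match neither pattern yield nil, which corresponds to load_map silently skipping them.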
load_plugin(plugin_filename)

loads a disassembler plugin script
this is simply a ruby script that is instance_eval()ed in the disassembler
the filename argument is autocompleted with a '.rb' suffix, and is also searched for in the Metasmdir/samples/dasm-plugins subdirectory if not found in cwd
# File metasm/disassemble_api.rb, line 1738
def load_plugin(plugin_filename)
        if not File.exist?(plugin_filename)
                if File.exist?(plugin_filename+'.rb')
                        plugin_filename += '.rb'
                elsif defined? Metasmdir
                        # try autocomplete
                        pf = File.join(Metasmdir, 'samples', 'dasm-plugins', plugin_filename)
                        if File.exist? pf
                                plugin_filename = pf
                        elsif File.exist? pf + '.rb'
                                plugin_filename = pf + '.rb'
                        end
                end
        end

        instance_eval File.read(plugin_filename)
end
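The lookup order can be sketched without Metasm. In the example below, resolve_plugin_path is a hypothetical helper and plugin_dir stands in for Metasmdir/samples/dasm-plugins; the plugin names are made up, and the demo assumes no file with those names exists in the current directory.

```ruby
require 'tmpdir'

# Hypothetical helper mirroring the lookup order of load_plugin.
# plugin_dir is an assumption standing in for Metasmdir/samples/dasm-plugins.
def resolve_plugin_path(name, plugin_dir)
  return name if File.exist?(name)
  return name + '.rb' if File.exist?(name + '.rb')
  pf = File.join(plugin_dir, name)
  return pf if File.exist?(pf)
  return pf + '.rb' if File.exist?(pf + '.rb')
  name # unresolved: reading this path would raise Errno::ENOENT
end

# Demo in a throwaway directory: the plugin only exists in plugin_dir.
RESOLVED = Dir.mktmpdir { |dir|
  expected = File.join(dir, 'sketch_plugin.rb')
  File.write(expected, "# empty plugin\n")
  [resolve_plugin_path('sketch_plugin', dir), expected,
   resolve_plugin_path('no_such_plugin', dir)]
}
```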
load_plugin_nogui(plugin_filename)

same as load_plugin, but hides the @gui attribute while loading, preventing the plugin from popping up any GUI
this is useful when you want to load a plugin from another plugin to enhance the plugin's functionality
XXX this also prevents setting up kbd_callbacks etc.

# File metasm/disassemble_api.rb, line 1759
def load_plugin_nogui(plugin_filename)
        oldgui = gui
        @gui = nil
        load_plugin(plugin_filename)
ensure
        @gui = oldgui
end
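The hide-and-restore idiom relies on a method-level ensure clause, so the attribute is restored even if the plugin raises. A minimal sketch, using a hypothetical PluginHost class in place of the disassembler:

```ruby
# Hypothetical stand-in for the disassembler, only to show the ensure idiom.
class PluginHost
  attr_accessor :gui

  def initialize(gui)
    @gui = gui
  end

  # Runs the block with @gui hidden, then restores it even if the block raises.
  def without_gui
    oldgui = gui
    @gui = nil
    yield
  ensure
    @gui = oldgui
  end
end

HOST = PluginHost.new(:main_window)
SEEN = []
HOST.without_gui { SEEN << HOST.gui }   # the block sees no gui
begin
  HOST.without_gui { raise 'plugin failed' }
rescue RuntimeError
  # the gui has been restored despite the exception
end
```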
merge_blocks(b1, b2, allow_nonadjacent = false)

merges two instruction blocks if they form a simple chain and are adjacent (adjacency is not required when allow_nonadjacent is true)
b1 and b2 may be InstructionBlocks or block addresses
returns true if merged

# File metasm/disassemble_api.rb, line 657
def merge_blocks(b1, b2, allow_nonadjacent = false)
        if b1 and not b1.kind_of? InstructionBlock
                return if not b1 = block_at(b1)
        end
        if b2 and not b2.kind_of? InstructionBlock
                return if not b2 = block_at(b2)
        end
        if b1 and b2 and (allow_nonadjacent or b1.list.last.next_addr == b2.address) and
                        b1.to_normal.to_a == [b2.address] and b2.from_normal.to_a.length == 1 and   # that handles delay_slot
                        b1.to_subfuncret.to_a == [] and b2.from_subfuncret.to_a == [] and
                        b1.to_indirect.to_a == [] and b2.from_indirect.to_a == []
                b2.list.each { |di| b1.add_di di }
                b1.to_normal = b2.to_normal
                b1.to_subfuncret = b2.to_subfuncret
                b1.to_indirect = b2.to_indirect
                b2.list.clear
                @addrs_done.delete_if { |ad| normalize(ad[0]) == b2.address }
                true
        end
end
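The "simple chain" test can be illustrated on plain data. The sketch below uses a hypothetical SketchBlock struct rather than InstructionBlock, with next_addr standing in for b1.list.last.next_addr; it only evaluates the condition and does not move any instruction.

```ruby
# Hypothetical stand-in for InstructionBlock; unset fields default to nil.
SketchBlock = Struct.new(:address, :next_addr, :to_normal, :from_normal,
                         :to_subfuncret, :from_subfuncret,
                         :to_indirect, :from_indirect, keyword_init: true)

# True when b1 flows only into b2 and b2 is reached only from one block.
def simple_chain?(b1, b2, allow_nonadjacent = false)
  (allow_nonadjacent || b1.next_addr == b2.address) &&
    b1.to_normal.to_a == [b2.address] &&
    b2.from_normal.to_a.length == 1 &&
    b1.to_subfuncret.to_a.empty? && b2.from_subfuncret.to_a.empty? &&
    b1.to_indirect.to_a.empty? && b2.from_indirect.to_a.empty?
end

BLOCK_A = SketchBlock.new(address: 0x1000, next_addr: 0x1004, to_normal: [0x1004])
BLOCK_B = SketchBlock.new(address: 0x1004, next_addr: 0x1008, from_normal: [0x1000])
# Same address as BLOCK_B, but also reached from 0x2000: not a simple chain.
BLOCK_C = SketchBlock.new(address: 0x1004, next_addr: 0x1008, from_normal: [0x1000, 0x2000])
```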
name_local_vars(addr)

find the function containing addr, and find & rename stack vars in it

# File metasm/disassemble_api.rb, line 1852
def name_local_vars(addr)
        if @cpu.respond_to?(:name_local_vars) and faddr = find_function_start(addr)
                @function[faddr] ||= DecodedFunction.new     # XXX
                @cpu.name_local_vars(self, faddr)
        end
end
need_backtrace(expr, terminals=[])

returns true if the expression needs more backtrace
it checks for the presence of a symbol (other than :unknown or one listed in terminals), which means the expression depends on some register value

# File metasm/disassemble.rb, line 1811
def need_backtrace(expr, terminals=[])
        return if expr.kind_of?(::Integer)
        !(expr.externals.grep(::Symbol) - [:unknown] - terminals).empty?
end
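The set arithmetic is easy to check on its own. The sketch below is a hypothetical helper that takes the externals list directly instead of an Expression; it assumes, as the method above implies, that register dependencies show up as Symbols while labels are other objects (Strings here).

```ruby
# Hypothetical helper: same test as need_backtrace, applied to a plain array
# standing in for expr.externals.
def needs_backtrace?(externals, terminals = [])
  !(externals.grep(::Symbol) - [:unknown] - terminals).empty?
end

DEPENDS_ON_REG  = needs_backtrace?([:eax, 'some_label'])       # a register remains
ONLY_UNKNOWN    = needs_backtrace?([:unknown, 'some_label'])   # :unknown is ignored
TERMINAL_REG    = needs_backtrace?([:esp], [:esp])             # terminals are ignored
```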
normalize(addr)

returns the canonical form of addr (absolute address integer or label of start of section + section offset)

# File metasm/disassemble.rb, line 569
def normalize(addr)
        return addr if not addr or addr == :default
        addr = Expression[addr].bind(@old_prog_binding).reduce if not addr.kind_of?(Integer)
        addr
end
parse_c(str, filename=nil, lineno=1)

parses a C string for function prototypes

# File metasm/disassemble.rb, line 553
def parse_c(str, filename=nil, lineno=1)
        @c_parser_constcache = nil
        @c_parser ||= @cpu.new_cparser
        @c_parser.lexer.define_weak('__METASM__DECODE__')
        @c_parser.parse(str, filename, lineno)
rescue ParseError
        @c_parser.lexer.feed! ''
        raise
end
parse_c_file(file)

parses a C header file, from which function prototypes will be converted to DecodedFunction when found in the code flow

# File metasm/disassemble.rb, line 548
def parse_c_file(file)
        parse_c File.read(file), file
end
pattern_scan(pat, addr_start=nil, length=nil, chunksz=nil, margin=nil, &b)

scans all the sections raw for a given regexp (a String pattern is matched literally)
returns the array of all matching addresses; each address is also yielded to the block if one is given
if the block returns nil/false, the address is not included in the final result
sections are scanned chunk by chunk (4MB by default), so this should work (slowly) on 4GB sections (e.g. debugger VM)
with addr_start/length, symbol-based sections are skipped

# File metasm/disassemble_api.rb, line 929
def pattern_scan(pat, addr_start=nil, length=nil, chunksz=nil, margin=nil, &b)
        chunksz ||= 4*1024*1024       # scan 4MB at a time
        margin ||= 65536      # add this many bytes to each chunk to find /pat/ over chunk boundaries

        pat = Regexp.new(Regexp.escape(pat)) if pat.kind_of? ::String

        found = []
        @sections.each { |sec_addr, e|
                if addr_start
                        length ||= 0x1000_0000
                        begin
                                if sec_addr < addr_start
                                        next if sec_addr+e.length <= addr_start
                                        e = e[addr_start-sec_addr, e.length]
                                        sec_addr = addr_start
                                end
                                if sec_addr+e.length > addr_start+length
                                        next if sec_addr > addr_start+length
                                        e = e[0, sec_addr+e.length-(addr_start+length)]
                                end
                        rescue
                                puts $!, $!.message, $!.backtrace if $DEBUG
                                # catch arithmetic error with symbol-based section
                                next
                        end
                end
                e.pattern_scan(pat, chunksz, margin) { |eo|
                        match_addr = sec_addr + eo
                        found << match_addr if not b or b.call(match_addr)
                        false
                }
        }
        found
end
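The role of chunksz and margin can be shown on a plain String. The sketch below is a hypothetical chunked_scan, not Metasm's EncodedData#pattern_scan (whose internals are not shown here): each chunk is read with margin extra bytes so that a match straddling a chunk boundary is still seen, and matches starting inside the margin are left to the next chunk.

```ruby
# Hypothetical sketch of chunked scanning with an overlap margin.
# Matches longer than margin that cross a boundary would still be missed.
def chunked_scan(data, pat, chunksz, margin)
  found = []
  off = 0
  while off < data.length
    chunk = data[off, chunksz + margin]
    pos = 0
    while (idx = chunk.index(pat, pos))
      break if idx >= chunksz   # starts in the margin: the next chunk reports it
      found << off + idx
      pos = idx + 1
    end
    off += chunksz
  end
  found
end

# 'XY' at offset 6 straddles the boundary between chunks 0..6 and 7..13.
SAMPLE = 'aaaaaaXYaaXY'
WITH_MARGIN    = chunked_scan(SAMPLE, /XY/, 7, 2)
WITHOUT_MARGIN = chunked_scan(SAMPLE, /XY/, 7, 0)
```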
post_disassemble()
# File metasm/disassemble.rb, line 672
        def post_disassemble
                @decoded.each_value { |di|
                        next if not di.kind_of?(DecodedInstruction)
                        next if not di.opcode or not di.opcode.props[:saveip]
                        if not di.block.to_subfuncret
                                di.add_comment 'noreturn'
                                # there is no need to re-loop on all :saveip as check_noret is transitive
                                di.block.each_to_normal { |fa| check_noreturn_function(fa) }
                        end
                }
                @function.each { |addr, f|
                        next if not @decoded[addr]
                        if not f.finalized
                                f.finalized = true
puts "  finalize subfunc #{Expression[addr]}" if debug_backtrace
                                backtrace_update_function_binding(addr, f)
                                if not f.return_address
                                        detect_function_thunk(addr)
                                end
                        end
                        bd = f.backtrace_binding.reject { |k, v| Expression[k] == Expression[v] or Expression[v] == Expression::Unknown }
                        unk = f.backtrace_binding.map { |k, v| k if v == Expression::Unknown }.compact
                        bd[unk.map { |u| Expression[u].to_s }.sort.join(',')] = Expression::Unknown if not unk.empty?
                        add_comment(addr, "function binding: " + bd.map { |k, v| "#{k} -> #{v}" }.sort.join(', '))
                        add_comment(addr, "function ends at " + f.return_address.map { |ra| Expression[ra] }.join(', ')) if f.return_address
                }
        end
read_raw_data(addr, len)

reads len raw bytes from the mmaped address space

# File metasm/disassemble_api.rb, line 213
def read_raw_data(addr, len)
        if e = get_section_at(addr)
                e[0].read(len)
        end
end
rebase(newaddr)

changes the base address of the loaded binary
better done early (before disassembling anything)
returns the delta

# File metasm/disassemble_api.rb, line 1252
def rebase(newaddr)
        rebase_delta(newaddr - @sections.keys.min)
end
rebase_delta(delta)
# File metasm/disassemble_api.rb, line 1256
def rebase_delta(delta)
        fix = lambda { |a|
                case a
                when Array
                        a.map! { |e| fix[e] }
                when Hash
                        tmp = {}
                        a.each { |k, v| tmp[fix[k]] = v }
                        a.replace tmp
                when Integer
                        a += delta
                when BacktraceTrace
                        a.origin = fix[a.origin]
                        a.address = fix[a.address]
                end
                a
        }

        fix[@sections]
        fix[@decoded]
        fix[@xrefs]
        fix[@function]
        fix[@addrs_todo]
        fix[@addrs_done]
        fix[@comment]
        @prog_binding.each_key { |k| @prog_binding[k] = fix[@prog_binding[k]] }
        @old_prog_binding.each_key { |k| @old_prog_binding[k] = fix[@old_prog_binding[k]] }
        @label_alias_cache = nil

        @decoded.values.grep(DecodedInstruction).each { |di|
                if di.block_head?
                        b = di.block
                        b.address += delta
                        fix[b.to_normal]
                        fix[b.to_subfuncret]
                        fix[b.to_indirect]
                        fix[b.from_normal]
                        fix[b.from_subfuncret]
                        fix[b.from_indirect]
                        fix[b.backtracked_for]
                end
                di.address = fix[di.address]
                di.next_addr = fix[di.next_addr]
        }
        @function.each_value { |f|
                f.return_address = fix[f.return_address]
                fix[f.backtracked_for]
        }
        @xrefs.values.flatten.compact.each { |x| x.origin = fix[x.origin] }
        delta
end
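The recursive fix lambda can be sketched on plain containers. The example below is a hypothetical shift_addresses function with sample data: integers are shifted, arrays are updated in place, and hashes get their keys shifted while values are kept as they are, which matches what the Hash branch above does. BacktraceTrace objects are left out.

```ruby
# Hypothetical standalone version of the 'fix' lambda (no BacktraceTrace case).
def shift_addresses(obj, delta)
  case obj
  when Array
    obj.map! { |e| shift_addresses(e, delta) }
  when Hash
    tmp = {}
    obj.each { |k, v| tmp[shift_addresses(k, delta)] = v }  # keys only
    obj.replace tmp
  when Integer
    obj + delta
  else
    obj   # nil, symbols, strings... are returned untouched
  end
end

# Sample data shaped like @decoded and @addrs_todo.
DECODED_SKETCH = { 0x1000 => 'nop', 0x1001 => 'ret' }
TODO_SKETCH    = [[0x1000, nil]]
shift_addresses(DECODED_SKETCH, 0x400000)
shift_addresses(TODO_SKETCH, 0x400000)
```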
reinitialize(program, cpu=program.cpu)

resets the program

# File metasm/disassemble.rb, line 453
def reinitialize(program, cpu=program.cpu)
        @program = program
        @cpu = cpu
        @sections = {}
        @decoded = {}
        @xrefs = {}
        @function = {}
        @check_smc = true
        @prog_binding = {}
        @old_prog_binding = {}        # same as prog_binding, but keep old var names
        @addrs_todo = []
        @addrs_done = []
        @address_binding = {}
        @backtrace_maxblocks = @@backtrace_maxblocks
        @backtrace_maxblocks_fast = 0
        @backtrace_maxcomplexity = 40
        @backtrace_maxcomplexity_data = 5
        @disassemble_maxblocklength = 100
        @comment = {}
        @funcs_stdabi = true
end
rename_label(old, new)

changes a label to another, updates referring instructions etc.
returns the new label
the new label must be program-unique (see @program.new_label)

# File metasm/disassemble_api.rb, line 332
def rename_label(old, new)
        return new if old == new
        raise "label #{new.inspect} exists" if @prog_binding[new]
        each_xref(normalize(old)) { |x|
                next if not di = @decoded[x.origin]
                @cpu.replace_instr_arg_immediate(di.instruction, old, new)
                di.comment.to_a.each { |c| c.gsub!(old, new) }
        }
        e = get_edata_at(old, false)
        if e
                e.add_export new, e.export.delete(old), true
        end
        raise "cant rename nonexisting label #{old}" if not @prog_binding[old]
        @label_alias_cache = nil
        @old_prog_binding[new] = @prog_binding[new] = @prog_binding.delete(old)
        @addrs_todo.each { |at|
                case at[0]
                when old; at[0] = new
                when Expression; at[0] = at[0].bind(old => new)
                end
        }

        if @inv_section_reloc[old]
                @inv_section_reloc[old].each { |b, e_, o, r|
                        (0..16).each { |off|
                                if di = @decoded[Expression[b]+o-off] and di.bin_length > off
                                        @cpu.replace_instr_arg_immediate(di.instruction, old, new)
                                end
                        }
                        r.target = r.target.bind(old => new)
                }
                @inv_section_reloc[new] = @inv_section_reloc.delete(old)
        end

        if c_parser and @c_parser.toplevel.symbol[old]
                @c_parser.toplevel.symbol[new] = @c_parser.toplevel.symbol.delete(old)
                @c_parser.toplevel.symbol[new].name = new
        end

        new
end
replace_instrs(from, to, by, patch_by=false)

removes the DecodedInstructions from..to, and replaces them by the new Instructions in 'by'
this updates the block list structure; the old di will still be visible in @decoded, except those from the original block (those are deleted)
if from..to spans multiple blocks:

to.block is split after to
all paths from 'from' are replaced by a single link to after 'to', be careful!
 (e.g. a->b->... & a->c ; from in a, to in c => a->b is lost)
all instructions are stuffed in the first block
paths are only walked using from/to_normal

'by' may be empty
returns the block containing the new instrs (nil if empty)

# File metasm/disassemble_api.rb, line 502
        def replace_instrs(from, to, by, patch_by=false)
                raise 'bad from' if not fdi = di_at(from) or not fdi.block.list.index(fdi)
                raise 'bad to' if not tdi = di_at(to) or not tdi.block.list.index(tdi)

                # create DecodedInstruction from Instructions in 'by' if needed
                split_block(fdi.block, fdi.address)
                split_block(tdi.block, tdi.block.list[tdi.block.list.index(tdi)+1].address) if tdi != tdi.block.list.last
                fb = fdi.block
                tb = tdi.block

                # generate DecodedInstr from Instrs
                # try to keep the bin_length of original block
                wantlen = tdi.address + tdi.bin_length - fb.address
                wantlen -= by.grep(DecodedInstruction).inject(0) { |len, di| len + di.bin_length }
                ldi = by.last
                ldi = DecodedInstruction.new(ldi) if ldi.kind_of? Instruction
                nb_i = by.grep(Instruction).length
                wantlen = nb_i if wantlen < 0 or (ldi and ldi.opcode.props[:setip])
                if patch_by
                        by.map! { |di|
                                if di.kind_of? Instruction
                                        di = DecodedInstruction.new(di)
                                        wantlen -= di.bin_length = wantlen / by.grep(Instruction).length
                                        nb_i -= 1
                                end
                                di
                        }
                else
                        by = by.map { |di|
                                if di.kind_of? Instruction
                                        di = DecodedInstruction.new(di)
                                        wantlen -= (di.bin_length = wantlen / nb_i)
                                        nb_i -= 1
                                end
                                di
                        }
                end


#puts "  ** patch next_addr to #{Expression[tb.list.last.next_addr]}" if not by.empty? and by.last.opcode.props[:saveip]
                by.last.next_addr = tb.list.last.next_addr if not by.empty? and by.last.opcode.props[:saveip]
                fb.list.each { |di| @decoded.delete di.address }
                fb.list.clear
                tb.list.each { |di| @decoded.delete di.address }
                tb.list.clear
                by.each { |di| fb.add_di di }
                by.each_with_index { |di, i|
                        if odi = di_at(di.address)
                                # collision, hopefully with another deobfuscation run ?
                                if by[i..-1].all? { |mydi| mydi.to_s == @decoded[mydi.address].to_s }
                                        puts "replace_instrs: merge at  #{di}" if $DEBUG
                                        by[i..-1] = by[i..-1].map { |xdi| @decoded[xdi.address] }
                                        by[i..-1].each { fb.list.pop }
                                        split_block(odi.block, odi.address)
                                        tb.to_normal = [di.address]
                                        (odi.block.from_normal ||= []) << to
                                        odi.block.from_normal.uniq!
                                        break
                                else
                                        #raise "replace_instrs: collision  #{di}  vs  #{odi}"
                                        puts "replace_instrs: collision  #{di}  vs  #{odi}" if $VERBOSE
                                        while @decoded[di.address].kind_of? DecodedInstruction     # find free space.. raise ?
                                                di.address += 1   # XXX use floats ?
                                                di.bin_length -= 1
                                        end
                                end
                        end
                        @decoded[di.address] = di
                }
                @addrs_done.delete_if { |ad| normalize(ad[0]) == tb.address or ad[1] == tb.address }
                @addrs_done.delete_if { |ad| normalize(ad[0]) == fb.address or ad[1] == fb.address } if by.empty? and tb.address != fb.address

                # update to_normal/from_normal
                fb.to_normal = tb.to_normal
                fb.to_normal.to_a.each { |newto|
                        # other paths may already point to newto, we must only update the relevant entry
                        if ndi = di_at(newto) and idx = ndi.block.from_normal.to_a.index(to)
                                if by.empty?
                                        ndi.block.from_normal[idx,1] = fb.from_normal.to_a
                                else
                                        ndi.block.from_normal[idx] = fb.list.last.address
                                end
                        end
                }

                fb.to_subfuncret = tb.to_subfuncret
                fb.to_subfuncret.to_a.each { |newto|
                        if ndi = di_at(newto) and idx = ndi.block.from_subfuncret.to_a.index(to)
                                if by.empty?
                                        ndi.block.from_subfuncret[idx,1] = fb.from_subfuncret.to_a
                                else
                                        ndi.block.from_subfuncret[idx] = fb.list.last.address
                                end
                        end
                }

                if by.empty?
                        tb.to_subfuncret = nil if tb.to_subfuncret == []
                        tolist = tb.to_subfuncret || tb.to_normal.to_a
                        if lfrom = get_label_at(fb.address) and tolist.length == 1
                                lto = auto_label_at(tolist.first)
                                each_xref(fb.address, :x) { |x|
                                        next if not di = @decoded[x.origin]
                                        @cpu.replace_instr_arg_immediate(di.instruction, lfrom, lto)
                                        di.comment.to_a.each { |c| c.gsub!(lfrom, lto) }
                                }
                        end
                        fb.from_normal.to_a.each { |newfrom|
                                if ndi = di_at(newfrom) and idx = ndi.block.to_normal.to_a.index(from)
                                        ndi.block.to_normal[idx..idx] = tolist
                                end
                        }
                        fb.from_subfuncret.to_a.each { |newfrom|
                                if ndi = di_at(newfrom) and idx = ndi.block.to_subfuncret.to_a.index(from)
                                        ndi.block.to_subfuncret[idx..idx] = tolist
                                end
                        }
                else
                        # merge with adjacent blocks
                        merge_blocks(fb, fb.to_normal.first) if fb.to_normal.to_a.length == 1 and di_at(fb.to_normal.first)
                        merge_blocks(fb.from_normal.first, fb) if fb.from_normal.to_a.length == 1 and di_at(fb.from_normal.first)
                end

                fb if not by.empty?
        end
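The way bin_length is spread over the new instructions (the non-patch_by branch above) is plain integer division with the remainder pushed to the later entries. A hypothetical standalone sketch, where wantlen is the byte length to cover and count is the number of new Instructions:

```ruby
# Hypothetical helper reproducing the 'wantlen / nb_i' distribution.
def distribute_length(wantlen, count)
  remaining = count
  Array.new(count) {
    len = wantlen / remaining   # integer division
    wantlen -= len
    remaining -= 1
    len
  }
end

LENGTHS = distribute_length(10, 3)
```

The lengths always sum to the requested total, so the replacement covers the same address range as the instructions it replaces.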
resolve(expr)

static resolution of indirections

# File metasm/disassemble.rb, line 1800
def resolve(expr)
        binding = Expression[expr].expr_indirections.inject(@old_prog_binding) { |binding_, ind|
                e = get_edata_at(resolve(ind.target))
                return expr if not e
                binding_.merge ind => Expression[ e.decode_imm("u#{8*ind.len}".to_sym, @cpu.endianness) ]
        }
        Expression[expr].bind(binding).reduce
end
save_file(file)

saves the dasm state in a file

# File metasm/disassemble_api.rb, line 1018
def save_file(file)
        tmpfile = file + '.tmp'
        File.open(tmpfile, 'wb') { |fd| save_io(fd) }
        File.rename tmpfile, file
end
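Writing to a temporary file and renaming it afterwards means an interrupted save does not leave a truncated file under the final name. A minimal sketch of the same idiom, with a hypothetical atomic_write helper and a throwaway directory:

```ruby
require 'tmpdir'

# Hypothetical helper: same write-then-rename idiom as save_file.
def atomic_write(file)
  tmpfile = file + '.tmp'
  File.open(tmpfile, 'wb') { |fd| yield fd }
  File.rename tmpfile, file
end

SAVE_RESULT = Dir.mktmpdir { |dir|
  path = File.join(dir, 'state.dasm')   # made-up filename
  atomic_write(path) { |fd| fd.puts 'Metasm.dasm' }
  [File.read(path), File.exist?(path + '.tmp')]
}
```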
save_io(fd)

saves the dasm state to an IO

# File metasm/disassemble_api.rb, line 1025
def save_io(fd)
        fd.puts 'Metasm.dasm'

        if @program.filename and not @program.kind_of?(Shellcode)
                t = @program.filename.to_s
                fd.puts "binarypath #{t.length}", t
        else
                t = "#{@cpu.class.name.sub(/.*::/, '')} #{@cpu.size} #{@cpu.endianness}"
                fd.puts "cpu #{t.length}", t
                # XXX will be reloaded as a Shellcode with this CPU, but it may be a custom EXE
                # do not output binarypath, we'll be loaded as a Shellcode, 'section' will suffice
        end

        @sections.each { |a, e|
                # forget edata exports/relocs
                # dump at most 16MB per section
                t = "#{Expression[a]} #{e.length}\n" +
                        [e.data[0, 2**24].to_str].pack('m*')
                fd.puts "section #{t.length}", t
        }

        t = save_map.join("\n")
        fd.puts "map #{t.length}", t

        t = @decoded.map { |a, d|
                next if not d.kind_of? DecodedInstruction
                "#{Expression[a]},#{d.bin_length} #{d.instruction}#{" ; #{d.comment.join(' ')}" if d.comment}"
        }.compact.sort.join("\n")
        fd.puts "decoded #{t.length}", t

        t = @comment.map { |a, c|
                c.to_a.map { |l| l.chomp }.join("\n").split("\n").map { |lc| "#{Expression[a]} #{lc.chomp}" }
        }.join("\n")
        fd.puts "comment #{t.length}", t

        bl = @decoded.values.map { |d|
                d.block if d.kind_of? DecodedInstruction and d.block_head?
        }.compact
        t = bl.map { |b|
                [Expression[b.address],
                 b.list.map { |d| Expression[d.address] }.join(','),
                 b.to_normal.to_a.map { |t_| Expression[t_] }.join(','),
                 b.to_subfuncret.to_a.map { |t_| Expression[t_] }.join(','),
                 b.to_indirect.to_a.map { |t_| Expression[t_] }.join(','),
                 b.from_normal.to_a.map { |t_| Expression[t_] }.join(','),
                 b.from_subfuncret.to_a.map { |t_| Expression[t_] }.join(','),
                 b.from_indirect.to_a.map { |t_| Expression[t_] }.join(','),
                ].join(';')
        }.sort.join("\n")
        fd.puts "blocks #{t.length}", t

        t = @function.map { |a, f|
                next if not @decoded[a]
                [a, *f.return_address.to_a].map { |e| Expression[e] }.join(',')
        }.compact.sort.join("\n")
        # TODO binding ?
        fd.puts "funcs #{t.length}", t

        t = @xrefs.map { |a, x|
                a = ':default' if a == :default
                a = ':unknown' if a == Expression::Unknown
                # XXX origin
                case x
                when nil
                when Xref
                        [Expression[a], x.type, x.len, (Expression[x.origin] if x.origin)].join(',')
                when Array
                        x.map { |x_| [Expression[a], x_.type, x_.len, (Expression[x_.origin] if x_.origin)].join(',') }
                end
        }.compact.join("\n")
        fd.puts "xrefs #{t.length}", t

        t = @c_parser.to_s
        fd.puts "c #{t.length}", t

        #t = bl.map { |b| b.backtracked_for }
        #fd.puts "trace #{t.length}" , t
end
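Every part of the savefile is written as a header line "<type> <length>" followed by the payload, so a reader can skip over payloads that contain newlines. The sketch below shows that framing with hypothetical write_chunk/read_chunks helpers; the reader is an assumption for illustration and is not Metasm's own loader. Lengths are character counts, as in save_io.

```ruby
require 'stringio'

# Hypothetical writer: same framing as the fd.puts calls in save_io.
def write_chunk(fd, type, payload)
  fd.puts "#{type} #{payload.length}", payload
end

# Hypothetical reader: returns [[type, payload], ...].
def read_chunks(str)
  chunks = []
  off = 0
  while off < str.length
    eol = str.index("\n", off) || str.length
    type, len = str[off...eol].split
    off = eol + 1
    chunks << [type, str[off, len.to_i]]
    off += len.to_i
    off += 1 if str[off] == "\n"   # newline appended by puts after the payload
  end
  chunks
end

io = StringIO.new
write_chunk(io, 'cpu', 'Ia32 32 little')
write_chunk(io, 'comment', "401000 entry point\n401010 loop head")
CHUNKS = read_chunks(io.string)
```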
save_map()

exports the addr => symbol map (see load_map)

# File metasm/disassemble_api.rb, line 980
def save_map
        @prog_binding.map { |l, o|
                type = di_at(o) ? 'c' : 'd'  # XXX
                o = o.to_s(16).rjust(8, '0') if o.kind_of? ::Integer
                "#{o} #{type} #{l}"
        }
end
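To make the line format concrete, here is a minimal standalone sketch that does not require Metasm. The `map_lines` helper, the sample labels and the `code_addrs` list are hypothetical; in the real method the code/data decision comes from `di_at`.

```ruby
# Standalone sketch (no Metasm needed): mimic the "addr type label" lines
# built by save_map. map_lines and its arguments are hypothetical names.
def map_lines(binding, code_addrs)
  binding.map { |label, addr|
    # 'c' for code, 'd' for data; membership in code_addrs stands in for di_at
    type = code_addrs.include?(addr) ? 'c' : 'd'
    # integer addresses are rendered as 8 zero-padded hex digits
    addr = addr.to_s(16).rjust(8, '0') if addr.kind_of?(::Integer)
    "#{addr} #{type} #{label}"
  }
end

puts map_lines({ 'entrypoint' => 0x401000, 'some_string' => 0x402010 }, [0x401000])
```

Non-integer addresses (for example symbolic expressions) are interpolated as-is, as in the original method.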
section_info()

returns info on sections, from @program if supported
returns an array of [name, addr, length, info]

# File metasm/disassemble_api.rb, line 469
def section_info
        if @program.respond_to? :section_info
                @program.section_info
        else
                list = []
                @sections.each { |k, v|
                        list << [get_label_at(k), normalize(k), v.length, nil]
                }
                list
        end
end
set_label_at(addr, name, memcheck=true, overwrite=true)

sets the label for the specified address
returns nil if the address is not mapped
memcheck is passed to get_section_at to validate that the address is mapped
keep existing label if 'overwrite' is false

# File metasm/disassemble_api.rb, line 294
def set_label_at(addr, name, memcheck=true, overwrite=true)
        addr = Expression[addr].reduce
        e, b = get_section_at(addr, memcheck)
        if not e
        elsif not l = e.inv_export[e.ptr] or (!overwrite and l != name)
                split_block(addr)
                l = @program.new_label(name)
                e.add_export l, e.ptr
                @label_alias_cache = nil
                @old_prog_binding[l] = @prog_binding[l] = b + e.ptr
        elsif l != name
                l = rename_label l, @program.new_label(name)
        end
        l
end
split_block(block, address=nil, rebacktrace=false)

splits an InstructionBlock, updates the blocks' backtracked_for
may also be invoked with a single address, eg split_block(0x401012)

# File metasm/disassemble.rb, line 800
def split_block(block, address=nil, rebacktrace=false)
        if not address        # invoked as split_block(0x401012)
                return if not @decoded[block].kind_of?(DecodedInstruction)
                block, address = @decoded[block].block, block
        end
        return block if address == block.address
        new_b = block.split address
        if rebacktrace
                new_b.backtracked_for.dup.each { |btt|
                        backtrace(btt.expr, btt.address,
                                  :only_upto => block.list.last.address,
                                  :include_start => !btt.exclude_instr, :from_subfuncret => btt.from_subfuncret,
                                  :origin => btt.origin, :orig_expr => btt.orig_expr, :type => btt.type, :len => btt.len,
                                  :detached => btt.detached, :maxdepth => btt.maxdepth, :cpu_context => btt.cpu_context)
                }
        end
        new_b
end
strings_scan(minlen=6, &b)

returns/yields [addr, string] found using pattern_scan /[\x20-\x7e]/

# File metasm/disassemble_api.rb, line 965
def strings_scan(minlen=6, &b)
        ret = []
        nexto = 0
        pattern_scan(/[\x20-\x7e]{#{minlen},}/m, nil, 1024) { |o|
                if o - nexto > 0
                        next unless e = get_edata_at(o)
                        str = e.data[e.ptr, 1024][/[\x20-\x7e]{#{minlen},}/m]
                        ret << [o, str] if not b or b.call(o, str)
                        nexto = o + str.length
                end
        }
        ret
end
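The same idea can be tried outside Metasm on a plain binary string. The sketch below is an illustration only: `ascii_strings` is a hypothetical helper that uses String#scan instead of pattern_scan, and it reports offsets within the string rather than mapped addresses.

```ruby
# Standalone sketch: find runs of printable ASCII in a binary string.
# ascii_strings is a hypothetical helper, not part of Metasm.
def ascii_strings(data, minlen = 6)
  ret = []
  # work on raw bytes so that offsets are byte offsets
  data = data.b
  data.scan(/[\x20-\x7e]{#{minlen},}/) {
    m = Regexp.last_match
    # record [offset of the run, the run itself]
    ret << [m.begin(0), m[0]]
  }
  ret
end

sample = "\x00\x01hello world\x00abc\x00printable!\xff".b
ascii_strings(sample).each { |off, str| puts "#{off}: #{str}" }
```

The short run "abc" is skipped because it is below the default minimum length of 6.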
to_s()
# File metasm/disassemble.rb, line 2068
def to_s
        a = ''
        dump { |l| a << l << "\n" }
        a
end
toggle_expr_char(o)

change Expression display mode for current object o to display integers as char constants

# File metasm/disassemble_api.rb, line 1607
def toggle_expr_char(o)
        return if not o.kind_of?(Renderable)
        tochars = lambda { |v|
                if v.kind_of?(::Integer)
                        a = []
                        vv = v.abs
                        a << (vv & 0xff)
                        vv >>= 8
                        while vv > 0
                                a << (vv & 0xff)
                                vv >>= 8
                        end
                        if a.all? { |b| b < 0x7f }
                                s = a.pack('C*').inspect.gsub("'") { '\\\'' }[1...-1]
                                ExpressionString.new(v, (v > 0 ? "'#{s}'" : "-'#{s}'"), :char)
                        end
                end
        }
        o.each_expr { |e|
                if e.kind_of?(Expression)
                        if nr = tochars[e.rexpr]
                                e.rexpr = nr
                        elsif e.rexpr.kind_of?(ExpressionString) and e.rexpr.type == :char
                                e.rexpr = e.rexpr.expr
                        end
                        if nl = tochars[e.lexpr]
                                e.lexpr = nl
                        elsif e.lexpr.kind_of?(ExpressionString) and e.lexpr.type == :char
                                e.lexpr = e.lexpr.expr
                        end
                end
        }
end
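The integer-to-char conversion done by the `tochars` lambda can be studied in isolation. The following sketch reproduces only that step and returns a plain String instead of an ExpressionString; `int_to_char_const` is a hypothetical name.

```ruby
# Standalone sketch of the integer => char constant rendering.
# int_to_char_const is a hypothetical helper; it returns nil when the
# value cannot be shown as a char constant.
def int_to_char_const(v)
  return unless v.kind_of?(::Integer)
  bytes = []
  vv = v.abs
  # collect bytes, least significant first (at least one byte)
  loop do
    bytes << (vv & 0xff)
    vv >>= 8
    break if vv == 0
  end
  # only values whose bytes are all below 0x7f are converted
  return unless bytes.all? { |b| b < 0x7f }
  # inspect escapes control characters; strip the surrounding double quotes
  s = bytes.pack('C*').inspect.gsub("'") { "\\'" }[1...-1]
  v > 0 ? "'#{s}'" : "-'#{s}'"
end

puts int_to_char_const(0x41)
puts int_to_char_const(0x6162)
```

Because the bytes are collected least significant first, a multi-byte value such as 0x6162 is rendered with its low byte first.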
toggle_expr_dec(o)
# File metasm/disassemble_api.rb, line 1641
def toggle_expr_dec(o)
        return if not o.kind_of?(Renderable)
        o.each_expr { |e|
                if e.kind_of?(Expression)
                        if e.rexpr.kind_of?(::Integer)
                                e.rexpr = ExpressionString.new(Expression[e.rexpr], e.rexpr.to_s, :decimal)
                        elsif e.rexpr.kind_of?(ExpressionString) and e.rexpr.type == :decimal
                                e.rexpr = e.rexpr.reduce
                        end
                        if e.lexpr.kind_of?(::Integer)
                                e.lexpr = ExpressionString.new(Expression[e.lexpr], e.lexpr.to_s, :decimal)
                        elsif e.lexpr.kind_of?(ExpressionString) and e.lexpr.type == :decimal
                                e.lexpr = e.lexpr.reduce
                        end
                end
        }
end
toggle_expr_offset(o)

patch Expressions in current object to include label names when available
XXX should we also create labels ?

# File metasm/disassemble_api.rb, line 1661
def toggle_expr_offset(o)
        return if not o.kind_of? Renderable
        o.each_expr { |e|
                next unless e.kind_of?(Expression)
                if n = @prog_binding[e.lexpr]
                        e.lexpr = n
                elsif e.lexpr.kind_of? ::Integer and n = get_label_at(e.lexpr)
                        add_xref(normalize(e.lexpr), Xref.new(:addr, o.address)) if o.respond_to? :address
                        e.lexpr = n
                end
                if n = @prog_binding[e.rexpr]
                        e.rexpr = n
                elsif e.rexpr.kind_of? ::Integer and n = get_label_at(e.rexpr)
                        add_xref(normalize(e.rexpr), Xref.new(:addr, o.address)) if o.respond_to? :address
                        e.rexpr = n
                end
        }
end
toggle_expr_str(o)

toggle all ExpressionStrings

# File metasm/disassemble_api.rb, line 1681
def toggle_expr_str(o)
        return if not o.kind_of?(Renderable)
        o.each_expr { |e|
                next unless e.kind_of?(ExpressionString)
                e.hide_str = !e.hide_str
        }
end
trace_function_register(start_addr, init_state) { |di, r, newv, trace_state| ... }

dataflow method
walks a function, starting at start_addr
follows the usage of registers, computing the evolution from the value they had at start_addr
whenever an instruction references the register (or anything derived from it),
yield [di, used_register, reg_value, trace_state] where reg_value is the Expression holding the value of
the register wrt the initial value at start_addr, and trace_state the value of all registers (reg_value
not yet applied)
reg_value may be nil if used_register is not modified by the function (eg call [eax])
the yield return value is propagated, unless it is nil/false

init_state is a hash { :reg => initial value }

# File metasm/disassemble_api.rb, line 1318
def trace_function_register(start_addr, init_state)
        function_walk(start_addr, init_state) { |args|
                trace_state = args.last
                case args.first
                when :di
                        di = args[2]
                        update = {}
                        get_fwdemu_binding(di).each { |r, v|
                                if v.kind_of?(Expression) and v.externals.find { |e| trace_state[e] }
                                        # XXX may mix old (from trace) and current (from v) registers
                                        newv = v.bind(trace_state)
                                        update[r] = yield(di, r, newv, trace_state)
                                elsif r.kind_of?(ExpressionType) and rr = r.externals.find { |e| trace_state[e] }
                                        # reg dereferenced in a write (eg mov [esp], 42)
                                        next if update.has_key?(rr)       # already yielded
                                        if yield(di, rr, trace_state[rr], trace_state) == false
                                                update[rr] = false
                                        end
                                elsif trace_state[r]
                                        # started on mov reg, foo
                                        next if di.address == start_addr
                                        update[r] = false
                                end
                        }

                        # directly walk the instruction argument list for registers not appearing in the binding
                        @cpu.instr_args_memoryptr(di).each { |ind|
                                b = @cpu.instr_args_memoryptr_getbase(ind)
                                if b and b = b.symbolic and not update.has_key?(b)
                                        yield(di, b, nil, trace_state)
                                end
                        }
                        @cpu.instr_args_regs(di).each { |r|
                                r = r.symbolic
                                if not update.has_key?(r)
                                        yield(di, r, nil, trace_state)
                                end
                        }

                        update.each { |r, v|
                                trace_state = trace_state.dup
                                if v
                                        # cannot follow non-registers, or we would have to emulate every single
                                        # instruction (try following [esp+4] across a __stdcall..)
                                        trace_state[r] = v if r.kind_of?(::Symbol)
                                else
                                        trace_state.delete r
                                end
                        }
                when :subfunc
                        faddr = args[1]
                        f = @function[faddr]
                        f = @function[f.backtrace_binding[:thunk]] if f and f.backtrace_binding[:thunk]
                        if f
                                binding = f.backtrace_binding
                                if binding.empty?
                                        backtrace_update_function_binding(faddr)
                                        binding = f.backtrace_binding
                                end
                                # XXX fwdemu_binding ?
                                binding.each { |r, v|
                                        if v.externals.find { |e| trace_state[e] }
                                                if r.kind_of?(::Symbol)
                                                        trace_state = trace_state.dup
                                                        trace_state[r] = Expression[v.bind(trace_state)].reduce
                                                end
                                        elsif trace_state[r]
                                                trace_state = trace_state.dup
                                                trace_state.delete r
                                        end
                                }
                        end
                when :merge
                        # when merging paths, keep the smallest common state subset
                        # XXX may have unexplored froms
                        conflicts = args[2]
                        trace_state = trace_state.dup
                        conflicts.each { |addr, st|
                                trace_state.delete_if { |k, v| st[k] != v }
                        }
                end
                trace_state = false if trace_state.empty?
                trace_state
        }
end
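The :merge case keeps only the register values on which all incoming paths agree. A minimal standalone sketch of that rule, using plain hashes and strings in place of Expressions (the `merge_states` name and the sample values are hypothetical):

```ruby
# Standalone sketch of the :merge rule: keep the smallest common subset
# of the traced state. merge_states is a hypothetical helper.
def merge_states(state, conflicts)
  merged = state.dup
  conflicts.each { |_addr, st|
    # drop every register whose value differs on the other path
    merged.delete_if { |k, v| st[k] != v }
  }
  # an empty state means there is nothing left to trace
  merged.empty? ? false : merged
end

current = { :eax => 'struct+0', :ebx => 'struct+4' }
other   = { 0x401020 => { :eax => 'struct+0', :ebx => 'unrelated' } }
p merge_states(current, other)
```

As in the method above, a state that ends up empty is replaced by false.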
trace_update_reg_structptr(addr, reg, structname, structoff=0)

define a register as a pointer to a structure
rename all [reg+off] as [reg+struct.member] in current function
also trace assignments of pointer members

# File metasm/disassemble_api.rb, line 1407
def trace_update_reg_structptr(addr, reg, structname, structoff=0)
        sname = soff = ctx = nil
        expr_to_sname = lambda { |expr|
                if not expr.kind_of?(Expression) or expr.op != :+
                        sname = nil
                        next
                end

                sname = expr.lexpr || expr.rexpr
                soff = (expr.lexpr ? expr.rexpr : 0)

                if soff.kind_of?(Expression)
                        # ignore index in ptr array
                        if soff.op == :* and soff.lexpr == @cpu.size/8
                                soff = 0
                        elsif soff.rexpr.kind_of?(Expression) and soff.rexpr.op == :* and soff.rexpr.lexpr == @cpu.size/8
                                soff = soff.lexpr
                        elsif soff.lexpr.kind_of?(Expression) and soff.lexpr.op == :* and soff.lexpr.lexpr == @cpu.size/8
                                soff = soff.rexpr
                        end
                elsif soff.kind_of?(::Symbol)
                        # array with 1 byte elements / pre-scaled idx?
                        if not ctx[soff]
                                soff = 0
                        end
                end
        }

        lastdi = nil
        trace_function_register(addr, reg => Expression[structname, :+, structoff]) { |di, r, val, trace|

                next if r.to_s =~ /flag/     # XXX maybe too ia32-specific?

                ctx = trace
                @cpu.instr_args_memoryptr(di).each { |ind|
                        # find the structure dereference in di
                        b = @cpu.instr_args_memoryptr_getbase(ind)
                        b = b.symbolic if b
                        next unless trace[b]
                        imm = @cpu.instr_args_memoryptr_getoffset(ind) || 0

                        # check expr has the form 'traced_struct_reg + off'
                        expr_to_sname[trace[b] + imm]       # Expr#+ calls Expr#reduce
                        next unless sname.kind_of?(::String) and soff.kind_of?(::Integer)
                        next if not st = c_parser.toplevel.struct[sname] or not st.kind_of?(C::Union)

                        # ignore lea esi, [esi+0]
                        next if soff == 0 and not di.backtrace_binding.find { |k, v| v-k != 0 }

                        # TODO if trace[b] offset != 0, we had a lea reg, [struct+substruct_off], tweak str accordingly

                        # resolve struct + off into struct.membername
                        str = st.name.dup
                        mb = st.expand_member_offset(c_parser, soff, str)
                        # patch di
                        imm = imm.rexpr if imm.kind_of?(Expression) and not imm.lexpr and imm.rexpr.kind_of?(ExpressionString)
                        imm = imm.expr if imm.kind_of?(ExpressionString)
                        @cpu.instr_args_memoryptr_setoffset(ind, ExpressionString.new(imm, str, :structoff))

                        # check if the type is an enum/bitfield, patch instruction immediates
                        trace_update_reg_structptr_arg_enum(di, ind, mb, str) if mb
                } if lastdi != di.address
                lastdi = di.address

                next Expression[structname, :+, structoff] if di.address == addr and r == reg

                # check if we need to trace 'r' further
                val = val.reduce_rec if val.kind_of?(Expression)
                val = Expression[val] if val.kind_of?(::String)
                case val
                when Expression
                        # only trace trivial structptr+off expressions
                        expr_to_sname[val]
                        if sname.kind_of?(::String) and soff.kind_of?(::Integer)
                                Expression[sname, :+, soff]
                        end

                when Indirection
                        # di is mov reg, [ptr+struct.offset]
                        # check if the target member is a pointer to a struct, if so, trace it
                        expr_to_sname[val.pointer.reduce]

                        next unless sname.kind_of?(::String) and soff.kind_of?(::Integer)

                        if st = c_parser.toplevel.struct[sname] and st.kind_of?(C::Union)
                                pt = st.expand_member_offset(c_parser, soff, '')
                                pt = pt.untypedef if pt
                                if pt.kind_of?(C::Pointer)
                                        tt = pt.type.untypedef
                                        stars = ''
                                        while tt.kind_of?(C::Pointer)
                                                stars << '*'
                                                tt = tt.type.untypedef
                                        end
                                        if tt.kind_of?(C::Union) and tt.name
                                                Expression[tt.name + stars]
                                        end
                                end

                        elsif soff == 0 and sname[-1] == ?*
                                # XXX pointer to pointer to struct
                                # full C type support would be better, but harder to fit in an Expr
                                Expression[sname[0...-1]]
                        end
                # in other cases, stop trace
                end
        }
end
trace_update_reg_structptr_arg_enum(di, ind, mb, str)

found a special member of a struct, check if we can apply bitfield/enum name to other constants in the di

# File metasm/disassemble_api.rb, line 1518
def trace_update_reg_structptr_arg_enum(di, ind, mb, str)
        if ename = mb.has_attribute_var('enum') and enum = c_parser.toplevel.struct[ename] and enum.kind_of?(C::Enum)
                # handle enums: struct moo { int __attribute__((enum(bla))) fld; };
                doit = lambda { |_di|
                        if num = _di.instruction.args.grep(Expression).first and num_i = num.reduce and num_i.kind_of?(::Integer)
                                # handle enum values on tagged structs
                                if enum.members and name = enum.members.index(num_i)
                                        num.lexpr = nil
                                        num.op = :+
                                        num.rexpr = ExpressionString.new(Expression[num_i], name, :enum)
                                        _di.add_comment "enum::#{ename}" if _di.address != di.address
                                end
                        end
                }

                doit[di]

                # mov eax, [ptr+struct.enumfield]  =>  trace eax
                if reg = @cpu.instr_args_regs(di).find { |r| v = di.backtrace_binding[r.symbolic] and (v - ind.symbolic) == 0 }
                        reg = reg.symbolic
                        trace_function_register(di.address, reg => Expression[0]) { |_di, r, val, trace|
                                next if r != reg and val != Expression[reg]
                                doit[_di]
                                val
                        }
                end

        elsif mb.untypedef.kind_of?(C::Struct)
                # handle bitfields

                byte_off = 0
                if str =~ /\+(\d+)$/
                        # test byte [bitfield+1], 0x1  =>  test dword [bitfield], 0x100
                        # XXX little-endian only
                        byte_off = $1.to_i
                        str[/\+\d+$/] = ''
                end
                cmt = str.split('.')[-2, 2].join('.') if str.count('.') > 1

                doit = lambda { |_di, add|
                        if num = _di.instruction.args.grep(Expression).first and num_i = num.reduce and num_i.kind_of?(::Integer)
                                # TODO handle ~num_i
                                num_left = num_i << add
                                s_or = []
                                mb.untypedef.members.each { |mm|
                                        if bo = mb.bitoffsetof(c_parser, mm)
                                                boff, blen = bo
                                                if mm.name && blen == 1 && ((num_left >> boff) & 1) > 0
                                                        s_or << mm.name
                                                        num_left &= ~(1 << boff)
                                                end
                                        end
                                }
                                if s_or.first
                                        if num_left != 0
                                                s_or << ('0x%X' % num_left)
                                        end
                                        s = s_or.join('|')
                                        num.lexpr = nil
                                        num.op = :+
                                        num.rexpr = ExpressionString.new(Expression[num_i], s, :bitfield)
                                        _di.add_comment cmt if _di.address != di.address
                                end
                        end
                }

                doit[di, byte_off*8]

                if reg = @cpu.instr_args_regs(di).find { |r| v = di.backtrace_binding[r.symbolic] and (v - ind.symbolic) == 0 }
                        reg = reg.symbolic
                        trace_function_register(di.address, reg => Expression[0]) { |_di, r, val, trace|
                                if r.kind_of?(Expression) and r.op == :&
                                        if r.lexpr == reg
                                                # test al, 42
                                                doit[_di, byte_off*8]
                                        elsif r.lexpr.kind_of?(Expression) and r.lexpr.op == :>> and r.lexpr.lexpr == reg
                                                # test ah, 42
                                                doit[_di, byte_off*8+r.lexpr.rexpr]
                                        end
                                end
                                next if r != reg and val != Expression[reg]
                                doit[_di, byte_off*8]
                                _di.address == di.address && r == reg ? Expression[0] : val
                        }
                end
        end
end
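The bitfield branch turns an immediate into an or-list of one-bit member names, leaving any unnamed bits as a hex remainder. The sketch below isolates that step; `name_bits` and the `flags` table (bit offset => member name) are hypothetical stand-ins for the information the method obtains from the C parser.

```ruby
# Standalone sketch of the bitfield naming step.
# name_bits is a hypothetical helper; flags maps bit offset => member name.
def name_bits(num, flags, shift = 0)
  # shift accounts for accesses at a byte offset inside the bitfield
  num_left = num << shift
  names = []
  flags.each { |boff, name|
    if ((num_left >> boff) & 1) > 0
      names << name
      # clear the bit once it has been named
      num_left &= ~(1 << boff)
    end
  }
  return if names.empty?
  # bits that match no member are kept as a hex remainder
  names << ('0x%X' % num_left) if num_left != 0
  names.join('|')
end

flags = { 0 => 'readable', 1 => 'writable', 8 => 'dirty' }
puts name_bits(0x3, flags)
puts name_bits(0x1, flags, 8)
```

The second call models the `test byte [bitfield+1], 0x1` case from the comments above: a byte offset of 1 becomes a shift of 8 bits.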
undefine_from(addr)

undefine a sequence of decodedinstructions from an address
stops at first non-linear branch
removes @decoded, @comment, @xrefs, @addrs_done
does not update @prog_binding (does not undefine labels)

# File metasm/disassemble_api.rb, line 632
def undefine_from(addr)
        return if not di_at(addr)
        @comment.delete addr if @function.delete addr
        split_block(addr)
        addrs = []
        while di = di_at(addr)
                di.block.list.each { |ddi| addrs << ddi.address }
                break if di.block.to_subfuncret.to_a != [] or di.block.to_normal.to_a.length != 1
                addr = di.block.to_normal.first
                break if ndi = di_at(addr) and ndi.block.from_normal.to_a.length != 1
        end
        addrs.each { |a| @decoded.delete a }
        @xrefs.delete_if { |a, x|
                if not x.kind_of? Array
                        true if x and addrs.include? x.origin
                else
                        x.delete_if { |xx| addrs.include? xx.origin }
                        true if x.empty?
                end
        }
        @addrs_done.delete_if { |ad| !(addrs & [normalize(ad[0]), normalize(ad[1])]).empty? }
end
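The xref cleanup has to cope with hash values that are nil, a single xref, or an array of xrefs. A minimal standalone sketch of that pruning, using a hypothetical `SketchXref` struct in place of Metasm's Xref and a hypothetical `prune_xrefs` helper:

```ruby
# Standalone sketch of the @xrefs pruning done by undefine_from.
# SketchXref and prune_xrefs are hypothetical names.
SketchXref = Struct.new(:type, :origin)

def prune_xrefs(xrefs, addrs)
  xrefs.delete_if { |_a, x|
    if x.kind_of?(Array)
      # drop the xrefs created by undefined instructions, then the entry if empty
      x.delete_if { |xx| addrs.include?(xx.origin) }
      x.empty?
    else
      # single xref (or nil): drop it only if its origin was undefined
      x and addrs.include?(x.origin)
    end
  }
end

xrefs = {
  0x10 => SketchXref.new(:x, 0x401000),
  0x20 => [SketchXref.new(:r, 0x401000), SketchXref.new(:w, 0x500000)],
  0x30 => nil
}
p prune_xrefs(xrefs, [0x401000]).keys
```

Entries whose value is nil are left in place, matching the behavior of the original block.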