module BioDSL::Dynamic

Module containing code to locate nucleotide patterns in sequences, allowing for ambiguity codes and a given maximum edit distance. Insertions are nucleotides found in the pattern but not in the sequence. Deletions are nucleotides found in the sequence but not in the pattern.

Inspired by the paper by Bruno Woltzenlogel Paleo (page 197): www.logic.at/people/bruno/Papers/2007-GATE-ESSLLI.pdf
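
The definitions of insertions and deletions above can be illustrated with a small pure-Ruby dynamic programming sketch. This is not the module's implementation, which delegates to a compiled routine. The names `IUPAC`, `bases_match?`, `distance_at` and `approx_index` are hypothetical, and treating two ambiguity codes as equal when they share at least one nucleotide is an assumption.

```ruby
# Illustrative sketch only; not the BioDSL implementation.
# IUPAC ambiguity codes mapped to the nucleotides they stand for.
IUPAC = {
  'A' => 'A', 'C' => 'C', 'G' => 'G', 'T' => 'T',
  'R' => 'AG', 'Y' => 'CT', 'S' => 'CG', 'W' => 'AT',
  'K' => 'GT', 'M' => 'AC', 'B' => 'CGT', 'D' => 'AGT',
  'H' => 'ACT', 'V' => 'ACG', 'N' => 'ACGT'
}.freeze

# Assumption: two codes match when their nucleotide sets overlap.
def bases_match?(a, b)
  set_a = IUPAC.fetch(a.upcase, '')
  set_b = IUPAC.fetch(b.upcase, '')
  set_a.each_char.any? { |c| set_b.include?(c) }
end

# Smallest edit distance between +pattern+ and any stretch of +seq+ that
# begins at +start+, or nil if it exceeds +max_edit_distance+.
def distance_at(seq, pattern, start, max_edit_distance)
  window = seq[start, pattern.length + max_edit_distance] || ''
  prev = (0..window.length).to_a   # row for the empty pattern prefix

  pattern.each_char.with_index(1) do |p, i|
    curr = [i]                     # i pattern nucleotides with no sequence left
    window.each_char.with_index(1) do |s, j|
      cost = bases_match?(p, s) ? 0 : 1
      curr << [prev[j - 1] + cost, # match or mismatch
               prev[j] + 1,        # insertion: in pattern, not in sequence
               curr[j - 1] + 1].min # deletion: in sequence, not in pattern
    end
    prev = curr
  end

  best = prev.min
  best <= max_edit_distance ? best : nil
end

# Index of the first position at or after +pos+ where the pattern matches.
def approx_index(seq, pattern, pos = 0, max_edit_distance = 0)
  (pos..seq.length - 1).find { |i| distance_at(seq, pattern, i, max_edit_distance) }
end

approx_index('TTACGTTT', 'ACGT')        # => 2 (exact match)
approx_index('TTACCTTT', 'ACGT', 0, 1)  # => 2 (one mismatch)
distance_at('ACT', 'ACGT', 0, 1)        # => 1 (one insertion: G)
```

The sketch checks every start position separately, which is simple to follow but slower than scanning the sequence in a single pass.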

Constants

Match

Public Instance Methods

patmatch(pattern, pos = 0, max_edit_distance = 0)

str.patmatch(pattern[, pos[, max_edit_distance]])
-> Match or nil

Method to locate the first pattern match in a sequence, starting from a given position and allowing for a maximum edit distance. Returns the Match object, or nil if no match is found.

# File lib/BioDSL/seq/dynamic.rb, line 53
def patmatch(pattern, pos = 0, max_edit_distance = 0)
  patscan(pattern, pos, max_edit_distance) do |m|
    return m
  end
end
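
The early exit used above relies on `return` inside a block leaving the enclosing method, so the scan stops at the first hit. A minimal stand-alone sketch of that idiom follows; `each_hit` and `first_hit` are hypothetical stand-ins that use exact matching via `String#index` in place of the module's approximate matcher.

```ruby
# Hypothetical stand-in for patscan: yields the start index of every exact
# occurrence of +pattern+ in +seq+ at or after +pos+.
def each_hit(seq, pattern, pos = 0)
  while (hit = seq.index(pattern, pos))
    yield hit
    pos = hit + 1
  end
end

# Hypothetical stand-in for patmatch: returns the first hit, or nil.
def first_hit(seq, pattern, pos = 0)
  each_hit(seq, pattern, pos) do |hit|
    return hit # leaves first_hit itself, which ends the scan
  end
end

first_hit('GATTACA', 'TA')     # => 3
first_hit('GATTACA', 'TA', 4)  # => nil
```

When nothing is found the block never runs, and the method returns the value of the scan itself, which is nil.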

patscan(pattern, pos = 0, max_edit_distance = 0) { |match| ... }

str.patscan(pattern[, pos[, max_edit_distance]])
-> Array
str.patscan(pattern[, pos[, max_edit_distance]]) { |match|
  block
}
-> nil

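Method to iterate through a sequence to locate pattern matches, starting from a given position and allowing for a maximum edit distance. When a block is given, each Match object is yielded to the block as it is found. Otherwise all matches are returned in an Array, which is empty if nothing matched. Because the scan resumes one position after the start of the previous match, overlapping matches are reported.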

# File lib/BioDSL/seq/dynamic.rb, line 72
def patscan(pattern, pos = 0, max_edit_distance = 0)
  matches = []

  while (result = match_C(@seq, length, pattern, pattern.length, pos,
                          max_edit_distance))
    match = Match.new(*result, @seq[result[0]...result[0] + result[1]])

    if block_given?
      yield match
    else
      matches << match
    end

    pos = match.beg + 1
  end

  return matches unless block_given?
end
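
The two calling styles, and the effect of resuming at `match.beg + 1`, can be seen in a small stand-alone sketch that mirrors the loop above. `Hit` and `scan_hits` are hypothetical stand-ins, the fields of `Hit` are assumed for illustration, and exact matching via `String#index` replaces the approximate matcher.

```ruby
# Hypothetical stand-in for Match; only +beg+ is taken from the source.
Hit = Struct.new(:beg, :length, :match)

# Mirrors the control flow of patscan with exact matching.
def scan_hits(seq, pattern, pos = 0)
  hits = []

  while (beg = seq.index(pattern, pos))
    hit = Hit.new(beg, pattern.length, seq[beg, pattern.length])

    if block_given?
      yield hit
    else
      hits << hit
    end

    pos = hit.beg + 1 # resume one position on, so overlaps are found
  end

  return hits unless block_given?
end

# Array form: overlapping hits are all reported.
scan_hits('AAAA', 'AA').map(&:beg)  # => [0, 1, 2]

# Block form: hits are yielded one at a time and the call returns nil.
starts = []
scan_hits('AAAA', 'AA') { |hit| starts << hit.beg }
```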