module BioDSL::Dynamic
Module containing code to locate nucleotide patterns in sequences allowing for ambiguity codes and a given maximum edit distance. Insertions are nucleotides found in the pattern but not in the sequence. Deletions are nucleotides found in the sequence but not in the pattern.
Inspired by the paper by Bruno Woltzenlogel Paleo (page 197): www.logic.at/people/bruno/Papers/2007-GATE-ESSLLI.pdf
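The matching rules described above can be illustrated with a small pure-Ruby sketch. This is not the module's implementation (the module delegates to a native `match_C` routine that is not shown here); the names `IUPAC`, `nt_match?`, `match_at?`, and `first_match` are hypothetical, and the sketch assumes that two codes match whenever their sets of possible nucleotides overlap.

```ruby
# Illustrative sketch only; not BioDSL's implementation.
# Bit masks for IUPAC nucleotide codes: A=1, C=2, G=4, T/U=8.
IUPAC = {
  'A' => 1, 'C' => 2, 'G' => 4, 'T' => 8, 'U' => 8,
  'R' => 5, 'Y' => 10, 'S' => 6, 'W' => 9, 'K' => 12, 'M' => 3,
  'B' => 14, 'D' => 13, 'H' => 11, 'V' => 7, 'N' => 15
}.freeze

# Assumption: two codes match if they share at least one nucleotide.
def nt_match?(a, b)
  (IUPAC.fetch(a.upcase, 0) & IUPAC.fetch(b.upcase, 0)) != 0
end

# True if pat[p_pos..] can be aligned to seq starting at s_pos
# using at most `edits` mismatches, insertions, or deletions.
def match_at?(seq, s_pos, pat, p_pos, edits)
  return true if p_pos == pat.length # whole pattern consumed

  in_seq = s_pos < seq.length
  if in_seq && nt_match?(seq[s_pos], pat[p_pos])
    return true if match_at?(seq, s_pos + 1, pat, p_pos + 1, edits)
  end
  return false if edits.zero?

  # Mismatch: consume one nucleotide from both.
  return true if in_seq && match_at?(seq, s_pos + 1, pat, p_pos + 1, edits - 1)
  # Insertion: nucleotide in the pattern but not in the sequence.
  return true if match_at?(seq, s_pos, pat, p_pos + 1, edits - 1)

  # Deletion: nucleotide in the sequence but not in the pattern.
  in_seq && match_at?(seq, s_pos + 1, pat, p_pos, edits - 1)
end

# Start position of the first match at or after pos, or nil.
def first_match(seq, pat, max_edit_distance = 0, pos = 0)
  (pos...seq.length).find { |i| match_at?(seq, i, pat, 0, max_edit_distance) }
end

first_match('ACGTACGT', 'GNA')    # ambiguity code N matches T
first_match('ACGTACGT', 'GGA', 1) # one mismatch allowed
first_match('ACGGTA', 'CGTA', 1)  # one deletion allowed
```

The recursion explores every edit at every position, so it is exponential in the edit budget; it is meant to make the definitions concrete, not to be fast.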
Constants
- Match
Public Instance Methods
str.patmatch(pattern[, pos[, max_edit_distance]]) -> Match or nil
Method to iterate through a sequence to locate the first pattern match starting from a given position and allowing for a maximum edit distance.
# File lib/BioDSL/seq/dynamic.rb, line 53
def patmatch(pattern, pos = 0, max_edit_distance = 0)
  patscan(pattern, pos, max_edit_distance) do |m|
    return m
  end
end
str.patscan(pattern[, pos[, max_edit_distance]]) -> Array
str.patscan(pattern[, pos[, max_edit_distance]]) { |match| block } -> nil
Method to iterate through a sequence to locate pattern matches starting from a given position and allowing for a maximum edit distance. With a block, each Match object is yielded to the block as it is found. Without a block, the matches are returned in an Array, which is empty if nothing matched.
# File lib/BioDSL/seq/dynamic.rb, line 72
def patscan(pattern, pos = 0, max_edit_distance = 0)
  matches = []

  while (result = match_C(@seq, length, pattern, pattern.length, pos, max_edit_distance))
    match = Match.new(*result, @seq[result[0]...result[0] + result[1]])

    if block_given?
      yield match
    else
      matches << match
    end

    pos = match.beg + 1
  end

  return matches unless block_given?
end
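The control flow of the two methods can be tried out without the native matcher. The sketch below is a stand-in, not the library's code: `ToySeq` and `ToyMatch` are hypothetical names, the `ToyMatch` fields other than `beg` are simplifications, and `String#index` (exact matching only) replaces `match_C`. It shows that the scan restarts one position after each hit, so overlapping matches are reported, and that the first-match method stops the scan by returning from inside the block.

```ruby
# Illustrative stand-in; exact matching via String#index replaces match_C.
ToyMatch = Struct.new(:beg, :length, :match)

class ToySeq
  def initialize(seq)
    @seq = seq
  end

  # Yields each match when a block is given, otherwise collects an Array.
  def patscan(pattern, pos = 0)
    matches = []

    while (beg = @seq.index(pattern, pos))
      match = ToyMatch.new(beg, pattern.length, @seq[beg, pattern.length])

      if block_given?
        yield match
      else
        matches << match
      end

      pos = match.beg + 1 # resume one past the hit, so overlaps are found
    end

    return matches unless block_given?
  end

  # `return` inside the block exits patmatch itself at the first hit.
  # If nothing matches, the block never runs and the result is nil.
  def patmatch(pattern, pos = 0)
    patscan(pattern, pos) { |m| return m }
  end
end

seq = ToySeq.new('ATATAT')
all_begs = seq.patscan('ATA').map(&:beg) # overlapping hits
first    = seq.patmatch('ATA')
none     = seq.patmatch('GGG')
```

Building the first-match method on top of the scanning method keeps the scanning loop in one place, at the cost of relying on a non-local `return` from the block.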