module RGFA::Sequence

Extensions of the String class to handle nucleotide sequences.

Constants

WCC

Watson-Crick Complements
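As a rough illustration of what a complement table of this kind can look like, the sketch below defines a small hash and looks characters up with `Hash#fetch`. The constant `COMPLEMENT_SKETCH` and the helper `complement_char` are hypothetical names introduced here; the table is an abbreviated, upper-case-only assumption and does not reproduce the actual contents of WCC.

```ruby
# Hypothetical, abbreviated complement table; NOT the actual contents of WCC.
COMPLEMENT_SKETCH = {
  "A" => "T", "T" => "A", "C" => "G", "G" => "C",  # standard bases
  "R" => "Y", "Y" => "R",                          # purine <-> pyrimidine
  "K" => "M", "M" => "K",                          # keto <-> amino
  "B" => "V", "V" => "B", "D" => "H", "H" => "D",  # three-base IUPAC codes
  "S" => "S", "W" => "W", "N" => "N"               # self-complementary codes
}.freeze

# Hash#fetch with a default separates known characters from unknown ones:
# unknown characters yield nil, or themselves when tolerant is true.
def complement_char(char, tolerant: false)
  COMPLEMENT_SKETCH.fetch(char, tolerant ? char : nil)
end

complement_char("R")                  # => "Y"
complement_char("?")                  # => nil
complement_char("?", tolerant: true)  # => "?"
```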

Public Instance Methods

rc(tolerant: false, rnasequence: false)

Computes the reverse complement of a nucleotide sequence.

@return [String] reverse complement, without newlines and spaces

@return [String] “*” if string is “*”

@param tolerant [Boolean] (defaults to: false)

if true, anything non-sequence is complemented to itself

@param rnasequence [Boolean] (defaults to: false)

if true, any A and a is complemented into U and u, respectively;
if false, this happens only if a U or u is found in the sequence;
otherwise DNA is assumed

@raise [RuntimeError] if not tolerant and chars are found for which no Watson-Crick complement is defined

@raise [RuntimeError] if sequence contains both U and T
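The interaction of the two options and the two error conditions can be tried out without the library using a minimal standalone sketch such as the one below. It is not the library method: `reverse_complement_sketch` and `SKETCH_COMPLEMENTS` are hypothetical names, the table is an assumption covering only upper-case A, C, G, T and U, and the mixed U/T check is done up front rather than while iterating.

```ruby
# Hypothetical stand-in table (upper-case bases only); the real WCC
# constant is not reproduced here.
SKETCH_COMPLEMENTS = {
  "A" => "T", "C" => "G", "G" => "C", "T" => "A", "U" => "A"
}.freeze

# Standalone sketch of the documented contract; not the library method.
def reverse_complement_sketch(seq, tolerant: false, rnasequence: false)
  return "*" if seq == "*"                 # undefined sequence stays "*"
  rna = rnasequence || seq.include?("U")   # a U switches to RNA output
  raise "sequence contains both U and T" if rna && seq.include?("T")
  complemented = seq.each_char.map do |c|
    # unknown characters map to themselves only in tolerant mode
    comp = SKETCH_COMPLEMENTS.fetch(c, tolerant ? c : nil)
    raise "no complement defined for #{c.inspect}" if comp.nil?
    comp
  end
  result = complemented.reverse.join
  rna ? result.tr("T", "U") : result       # RNA output uses U instead of T
end

reverse_complement_sketch("ACTG")                  # => "CAGT"
reverse_complement_sketch("ACUG")                  # => "CAGU"
reverse_complement_sketch("AC-G", tolerant: true)  # => "C-GT"

begin
  reverse_complement_sketch("AC-G")                # strict mode: "-" is unknown
rescue RuntimeError => e
  puts "rejected: #{e.message}"
end
```

Callers that process untrusted input may prefer the strict default and rescue the RuntimeError, as in the last lines; tolerant mode is more convenient when placeholder characters are expected in the sequence.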

@example

"ACTG".rc  # => "CAGT"
"acGT".rc  # => "ACgt"

@example Undefined sequence is represented by “*”:

"*".rc     # => "*"

@example Extended IUPAC Alphabet:

"ARBN".rc  # => "NVYT"

@example Usage with RNA sequences:

"ACUG".rc                    # => "CAGU"
"ACG".rc(rnasequence: true)  # => "CGU"
"ACUT".rc                    # (raises RuntimeError, both U and T)
# File lib/rgfa/sequence.rb, line 32
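def rc(tolerant: false, rnasequence: false)
  # "*" is the placeholder for an undefined sequence; return it unchanged
  return "*" if self == "*"
  retval = each_char.map do |c|
    if c == "U" or c == "u"
      # the first U/u switches the rest of the computation to RNA mode
      rnasequence = true
    elsif rnasequence and (c == "T" or c == "t")
      # T/t in RNA mode; a T/t read before the first U/u is not detected here
      raise "String contains both U/u and T/t"
    end
    # look up the complement; in tolerant mode unknown chars map to themselves
    wcc = WCC.fetch(c, tolerant ? c : nil)
    raise "#{self}: no Watson-Crick complement for #{c}" if wcc.nil?
    wcc
  end.reverse.join
  # for RNA, replace any T/t in the result with U/u
  if rnasequence
    retval.tr!("tT","uU")
  end
  retval
end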