class JgiGffRecord

Fixes up JGI-to-GFF problems. I don’t mean to blame anyone, but the two just don’t seem to go together.

Constants

ATTRIBUTES_COL
END_COL
FEATURE_COL
FRAME_COL
SCORE_COL
SEQNAME_COL
SOURCE_COL
START_COL
STRAND_COL
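
The constant values are not shown on this page. The following is a minimal sketch, assuming they are zero-based indices in the standard GFF column order. The module name `AssumedGffColumns` and the sample line are hypothetical, and the real values in lib/jgi_genes.rb may differ.

```ruby
# Assumed values: zero-based indices in standard GFF column order.
# These are NOT copied from lib/jgi_genes.rb; treat them as an illustration.
module AssumedGffColumns
  SEQNAME_COL    = 0
  SOURCE_COL     = 1
  FEATURE_COL    = 2
  START_COL      = 3
  END_COL        = 4
  SCORE_COL      = 5
  STRAND_COL     = 6
  FRAME_COL      = 7
  ATTRIBUTES_COL = 8
end

# Hypothetical sample line, split the same way the constructor splits it
parts = "scaffold_1\tJGI\texon\t100\t200\t.\t+\t.\tname \"gene1\"".split("\t")

puts parts[AssumedGffColumns::FEATURE_COL] # prints "exon"
puts parts[AssumedGffColumns::START_COL]   # prints "100"
```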

Public Class Methods

new(line)
# File lib/jgi_genes.rb, line 179
def initialize(line)
  @line = line

  # A GFF line has 8 mandatory tab-separated columns plus an optional 9th (attributes)
  parts = line.split("\t")
  if parts.length != 9 && parts.length != 8
    raise Exception, "Badly formatted GFF line - doesn't have correct number of components '#{line}'"
  end

  parse_mandatory_columns(parts)
  parse_attributes(parts[ATTRIBUTES_COL])
end
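
The constructor accepts a line only if it splits into 8 or 9 tab-separated parts. A minimal sketch of that check in isolation follows. `valid_gff_column_count?` is a hypothetical helper, not part of JgiGffRecord, and the sample lines are made up.

```ruby
# Hypothetical helper mirroring the constructor's column-count check.
def valid_gff_column_count?(line)
  count = line.split("\t").length
  count == 8 || count == 9
end

# Eight mandatory columns, no attributes column
eight = %w[scaffold_1 JGI exon 100 200 . + .].join("\t")
# Same line with a ninth (attributes) column appended
nine = eight + "\tname \"gene1\""

puts valid_gff_column_count?(eight)                 # prints "true"
puts valid_gff_column_count?(nine)                  # prints "true"
puts valid_gff_column_count?("scaffold_1 JGI exon") # prints "false" (spaces, not tabs)
```

Note that `String#split` drops trailing empty fields, so a line that ends in a tab after the eighth column still yields 8 parts.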

Public Instance Methods

parse_attributes(attribute_string)

Parses the last column of a line (the attributes) into a hash stored in the @attributes instance variable.

# File lib/jgi_genes.rb, line 211
def parse_attributes(attribute_string)
  @attributes = {} # define empty attributes even if there are none

  if attribute_string
    # attributes are separated by '; ', each one a key and a value separated by a space
    aparts = attribute_string.split('; ')

    aparts.each do |bit|
      hbits = bit.split(' ')
      if hbits.length != 2
        # use @line here: the local variable `line` from initialize is not in scope
        raise Exception, "Failed to parse attributes in line: #{@line}"
      end
      # strip double quotes and surrounding whitespace from the value
      @attributes[hbits[0]] = hbits[1].gsub(/"/, '').strip
    end
  end
end
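
To illustrate the attribute format this method expects, here is a minimal standalone sketch of the same parsing logic. `parse_attribute_string` and the sample attribute string are hypothetical, and the sketch raises ArgumentError rather than Exception, which is a deliberate difference from the source.

```ruby
# Hypothetical standalone version of the parsing logic, for illustration only.
def parse_attribute_string(attribute_string)
  attributes = {}
  return attributes unless attribute_string

  # Pairs are separated by '; ', key and value by whitespace
  attribute_string.split('; ').each do |pair|
    key, value, *rest = pair.split(' ')
    # Exactly two tokens are required, as in parse_attributes
    raise ArgumentError, "Cannot parse attribute: #{pair}" if value.nil? || !rest.empty?

    # Remove double quotes and surrounding whitespace from the value
    attributes[key] = value.delete('"').strip
  end
  attributes
end

attrs = parse_attribute_string('name "gene1"; transcriptId 1234')
puts attrs['name']         # prints "gene1"
puts attrs['transcriptId'] # prints "1234"
```

Because each pair must split into exactly two tokens, a quoted value that contains a space (for example `name "gene 1"`) is rejected by this logic.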

parse_mandatory_columns(parts)

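Given an array of at least 8 strings (the tab-separated parts of a GFF line), stores the eight mandatory columns (seqname, source, feature, start, end, score, strand and frame) in instance variables. The values are kept as strings; no numeric conversion is done.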

# File lib/jgi_genes.rb, line 197
def parse_mandatory_columns(parts)
  @seqname = parts[SEQNAME_COL]
  @source = parts[SOURCE_COL]
  @feature = parts[FEATURE_COL]
  @start = parts[START_COL]
  @end = parts[END_COL]
  @score = parts[SCORE_COL]
  @strand = parts[STRAND_COL]
  @frame = parts[FRAME_COL]
end
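
Since the start and end columns are stored as strings, any arithmetic on them needs an explicit conversion. A minimal sketch, assuming the caller already has the two column values in hand, follows. `feature_length` is a hypothetical helper, not part of JgiGffRecord.

```ruby
# Hypothetical helper: GFF coordinates are 1-based and inclusive,
# so the length of a feature is end - start + 1.
def feature_length(start_str, end_str)
  # Integer() raises ArgumentError on non-numeric input instead of returning 0
  Integer(end_str) - Integer(start_str) + 1
end

puts feature_length("100", "200") # prints "101"
```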

to_s()
# File lib/jgi_genes.rb, line 230
def to_s
  @line
end