class EuPathDBGeneInformationTable

A class for parsing the ‘gene information table’ files from EuPathDB, such as cryptodb.org/common/downloads/release-4.3/Cmuris/txt/CmurisGene_CryptoDB-4.3.txt

The usual way to work with these files is the each method, which yields one EuPathDBGeneInformation object per gene, holding all of the information recorded for that gene.

Public Class Methods

new(io)

Initialise using an IO object, e.g. File.open('/path/to/CmurisGene_CryptoDB-4.3.txt'). Once the table has been created, the each method can be used to iterate over the genes present in the file.

# File lib/eupathdb_gene_information_table.rb, line 54
def initialize(io)
  @io = io
end

Public Instance Methods

each() { |g| ... }

Yields each gene in the file, one at a time, as a EuPathDBGeneInformation object. Iteration stops when next_gene returns nil.

# File lib/eupathdb_gene_information_table.rb, line 60
def each
  while g = next_gene
    yield g
  end
end
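
The iteration pattern used by each can be sketched in isolation. The LineTable class below is a hypothetical stand-in, not the real parser: it has the same initialize/each/next shape but simply returns one non-blank line at a time, and the gene IDs are made-up sample data. It also shows that to_enum(:each) gives access to map, select and similar methods even though nothing here includes Enumerable.

```ruby
require 'stringio'

# Hypothetical stand-in with the same shape as EuPathDBGeneInformationTable.
# It wraps an IO-like object and hands back one stripped, non-blank line at
# a time instead of a parsed gene record.
class LineTable
  def initialize(io)
    @io = io
  end

  # Yield items until next_item signals the end of input with nil
  def each
    while item = next_item
      yield item
    end
  end

  # Return the next non-blank line, or nil once the input is exhausted
  def next_item
    until @io.eof?
      line = @io.readline.strip
      return line unless line.empty?
    end
    nil
  end
end

# Made-up sample input; StringIO works because only readline and eof? are used
sample = "CMU_000010\n\nCMU_000020\n"

# Block form, as with EuPathDBGeneInformationTable#each
ids = []
LineTable.new(StringIO.new(sample)).each { |id| ids << id }

# to_enum wraps each in an Enumerator, so map and friends become available
lowercased = LineTable.new(StringIO.new(sample)).to_enum(:each).map { |id| id.downcase }
```

Because the reader consumes its IO object as it goes, a second pass needs a fresh IO object (or a rewind), which is why the sketch builds a new StringIO for each pass.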

next_gene()

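Reads the next gene record from the file and returns it as a EuPathDBGeneInformation object holding that gene's fields and tables. Returns nil when only blank lines remain before the end of the file.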

# File lib/eupathdb_gene_information_table.rb, line 68
def next_gene
  info = EuPathDBGeneInformation.new
  
  # first, read the table, which should start with the ID column
  # (return nil instead of letting readline raise EOFError at end of file)
  return nil if @io.eof?
  line = @io.readline.strip
  while line == ''
    return nil if @io.eof?
    line = @io.readline.strip
  end
  
  while line != ''
    if matches = line.match(/^(.*?)\: (.*)$/)
      info.add_information(matches[1], matches[2])
    else
      # RuntimeError (a StandardError) so that a plain rescue can catch it
      raise RuntimeError, "EuPathDBGeneInformationTable Couldn't parse this line: #{line}"
    end
    
    line = @io.readline.strip
  end
  
  # now read each of the tables, which should start with the
  # 'TABLE: <name>' entry
  line = @io.readline.strip
  table_name = nil
  headers = nil
  data = []
  while line != '------------------------------------------------------------'
    if line == ''
      # add it to the stack unless we are just starting out
      info.add_table(table_name, headers, data) unless table_name.nil?
      
      # reset things
      table_name = nil
      headers = nil
      data = []
    elsif matches = line.match(/^TABLE\: (.*)$/)
      # name of a table
      table_name = matches[1]
    elsif line.match(/^\[.*\]/)
      # headings of the table
      headers = line.split("\t").collect do |header|
        header.gsub(/^\[/,'').gsub(/\]$/,'')
      end
    else
      # a proper data row
      data.push line.split("\t")
    end
    line = @io.readline.strip
  end
  
  # return the object that has been created
  return info
end
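
The kinds of line that next_gene distinguishes can be illustrated with a small standalone sketch. The classify helper and the sample record below are hypothetical and exist only for illustration; the regular expressions and the separator string are copied from the source above. Unlike next_gene, which reads the "Key: value" fields first and the tables second, this sketch classifies each line on its own, so it is a simplification rather than a replacement for the parser.

```ruby
# Separator that ends a gene record, copied from next_gene
SEPARATOR = '------------------------------------------------------------'

# Hypothetical helper: report what kind of line this is, using the same
# patterns as next_gene
def classify(line)
  if line == SEPARATOR
    [:end_of_record]
  elsif line == ''
    [:blank]
  elsif (m = line.match(/^TABLE\: (.*)$/))
    [:table_name, m[1]]
  elsif line.match(/^\[.*\]/)
    # header cells look like [Name]; strip the surrounding brackets
    headers = line.split("\t").collect { |h| h.gsub(/^\[/, '').gsub(/\]$/, '') }
    [:table_headers, headers]
  elsif (m = line.match(/^(.*?)\: (.*)$/))
    [:field, m[1], m[2]]
  else
    [:table_row, line.split("\t")]
  end
end

# Made-up record; real files contain many more fields and tables
record = [
  'Gene ID: CMU_000010',
  'Organism: Cryptosporidium muris',
  '',
  'TABLE: Notes',
  "[Source]\t[Comment]",
  "curated\texample comment",
  '',
  SEPARATOR
]

classified = record.map { |line| classify(line) }
classified.each { |c| p c }
```

The TABLE check has to come before the generic "Key: value" check in this sketch, because a line such as "TABLE: Notes" matches both patterns.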