class GeneValidator::OpenReadingFrameValidation

This class contains the methods necessary for checking whether there is a main Open Reading Frame in the predicted sequence

Public Class Methods

new(prediction, hits) click to toggle source

Initilizes the object Params: prediction: a Sequence object representing the blast query hits: a vector of Sequence objects (representing blast hits)

Calls superclass method
# File lib/genevalidator/validation_open_reading_frame.rb, line 76
def initialize(prediction, hits)
  super
  @short_header = 'MainORF'
  @header       = 'Main ORF'
  @description  = 'Check whether there is a single main Open Reading' \
                  ' Frame in the predicted gene. This validation does not' \
                  ' require any BLAST hits.'
  @cli_name     = 'orf'
  @type         = config[:type]
end

Public Instance Methods

get_orfs(_orf_length = 100, prediction = @prediction) click to toggle source

Find open reading frames in the original sequence Applied only to nucleotide sequences Params: orf_length: minimimum ORF length, default 100 prediction: Sequence object Output: Hash containing the data on ORFs

# File lib/genevalidator/validation_open_reading_frame.rb, line 130
def get_orfs(_orf_length = 100, prediction = @prediction)
  '-' if prediction.type != 'nucleotide'

  seq = Bio::Sequence::NA.new(prediction.raw_sequence)

  result = {}
  key = 0
  (1..6).each do |f|
    s = seq.translate(f)
    f = -1 if f == 4
    f = -2 if f == 5
    f = -3 if f == 6
    s.scan(/(\w{30,})/) do |_orf|
      orf_start = $LAST_MATCH_INFO.offset(0)[0] + 1
      orf_end   = $LAST_MATCH_INFO.offset(0)[1] + 1
      coverage = (((orf_end - orf_start) / s.length.to_f) * 100).ceil
      # reduce the orf_end and the translated length by 2% to increase the
      #   width between ORFs on the plot
      chopping = s.length * 0.02
      orf_end = (orf_end.to_f - chopping).floor
      translated_length = (s.length - chopping).ceil
      key += 1
      result[key] = { frame: f, orf_start: orf_start, orf_end: orf_end,
                      coverage: coverage,
                      translated_length: translated_length }
    end
  end
  result
end
plot_orfs(orfs, translated_length) click to toggle source

Plots the resions corresponding to open reading frames Param orfs: Hash containing the open reading frame output: location where the plot will be saved in jped file format prediction: Sequence objects

# File lib/genevalidator/validation_open_reading_frame.rb, line 166
def plot_orfs(orfs, translated_length)
  raise QueryError unless orfs.is_a? Hash

  data = []

  # Create hashes for the Background
  (-3..3).each do |frame|
    next if frame == 0
    data << { 'y' => frame, 'start' => 1, 'stop' => translated_length,
              'color' => 'gray' }
  end

  # Create the hashes for the ORFs...
  orfs.each do |_key, h|
    data << { 'y' => h[:frame], 'start' => h[:orf_start],
              'stop' => h[:orf_end], 'color' => 'red' }
  end

  Plot.new(data,
           :lines,
           'Open Reading Frames in all 6 Frames',
           'Open Reading Frame (Minimimum Length: 30 amino acids),red',
           'Offset in the Prediction',
           'Reading Frame',
           14)
end
run() click to toggle source

Check whether there is a main reading frame Output: ORFValidationOutput object

# File lib/genevalidator/validation_open_reading_frame.rb, line 91
def run
  if type.to_s != 'nucleotide'
    @validation_report = ValidationReport.new('', :unapplicable)
    return @validation_report
  end

  raise unless prediction.is_a?(Query)

  start = Time.new
  orfs = get_orfs

  longest_orf       = orfs.max_by { |_key, hash| hash[:coverage] }
  longest_orf_frame = longest_orf[1][:frame]
  coverage          = longest_orf[1][:coverage]
  translated_length = longest_orf[1][:translated_length]
  plot1             = plot_orfs(orfs, translated_length)

  @validation_report = ORFValidationOutput.new(@short_header, @header,
                                               @description, orfs,
                                               coverage, longest_orf_frame)
  @validation_report.run_time = Time.now - start

  @validation_report.plot_files.push(plot1)
  @validation_report
rescue StandardError
  @validation_report = ValidationReport.new('Unexpected error', :error,
                                            @short_header, @header,
                                            @description)
  @validation_report.errors.push 'Unexpected Error'
end