class BioDSL::WriteFasta

Write sequences from stream in FASTA format.

Description

write_fasta writes sequences from the data stream in FASTA format. A FASTA entry is only written for records that contain both a SEQ key and a SEQ_NAME key. An example FASTA entry:

>test1
TATGACGCGCATCGACAGCAGCACGAGCATGCATCGACTG
TGCACTGACTACGAGCATCACTATATCATCATCATAATCT
TACGACATCTAGGGACTAC

For more about the FASTA format:

en.wikipedia.org/wiki/FASTA_format
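
The entry layout described above can be sketched in plain Ruby. This is a minimal illustration, not BioDSL's implementation: the helper name `fasta_entry` and its `wrap:` keyword are hypothetical, and only the SEQ_NAME/SEQ key requirement and the header-plus-sequence layout are taken from the description.

```ruby
# Hypothetical helper (not part of BioDSL): format one record hash as a
# FASTA entry, or return nil when SEQ_NAME or SEQ is missing.
def fasta_entry(record, wrap: nil)
  return nil unless record.key?(:SEQ_NAME) && record.key?(:SEQ)

  seq = record[:SEQ]
  # Split the sequence into lines of at most `wrap` residues when requested.
  seq = seq.scan(/.{1,#{wrap}}/).join("\n") if wrap

  ">#{record[:SEQ_NAME]}\n#{seq}"
end

record = { SEQ_NAME: 'test1', SEQ: 'TATGACGCGCATCGACAGCA' }

puts fasta_entry(record, wrap: 8)
puts fasta_entry({ SEQ: 'ACGT' }).inspect # record without SEQ_NAME is skipped
```

Returning nil for incomplete records lets the caller skip them with a simple conditional while still passing the record on.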

Usage

write_fasta([wrap: <uint>[, output: <file>[, force: <bool>
            [, gzip: <bool> | bzip2: <bool>]]]])

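Options

wrap: <uint> - Wrap sequences at this length (default: no wrap).
output: <file> - Output file path (default: STDOUT).
force: <bool> - Allow overwriting an existing output file.
gzip: <bool> - Write gzipped output.
bzip2: <bool> - Write bzipped output.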

Examples

To write FASTA entries to STDOUT:

write_fasta

To write FASTA entries to STDOUT, wrapped in lines of 80 residues:

write_fasta(wrap: 80)

To write FASTA entries to the file ‘test.fna’:

write_fasta(output: "test.fna")

To overwrite the output file if it already exists, use the force option:

write_fasta(output: "test.fna", force: true)

To write gzipped FASTA entries to the file ‘test.fna.gz’:

write_fasta(output: "test.fna.gz", gzip: true)

To write bzipped FASTA entries to the file ‘test.fna.bz2’:

write_fasta(output: "test.fna.bz2", bzip2: true)
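
For readers unfamiliar with compressed output, the sketch below uses Ruby's standard Zlib library to write a gzipped FASTA entry and read it back. It only illustrates what a file such as ‘test.fna.gz’ contains. That BioDSL produces an equivalent stream is an assumption, and the method name `gzip_round_trip` and the temporary path are hypothetical.

```ruby
require 'zlib'
require 'tmpdir'

# Hypothetical helper (not part of BioDSL): write text to a gzipped file in a
# temporary directory, then decompress it again and return the result.
def gzip_round_trip(text)
  Dir.mktmpdir do |dir|
    path = File.join(dir, 'test.fna.gz')

    # Write the text through a gzip stream.
    Zlib::GzipWriter.open(path) { |gz| gz.write(text) }

    # Read the file back through a gzip reader.
    Zlib::GzipReader.open(path) { |gz| gz.read }
  end
end

puts gzip_round_trip(">test1\nACGTACGT\n")
```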

Constants

STATS

Public Class Methods

new(options)

Constructor for the WriteFasta class.

@param [Hash] options Options hash.
@option options [Bool] :force Flag allowing overwriting files.
@option options [String] :output Output file path.
@option options [Integer] :wrap Wrap sequences at this length (default no wrap).
@option options [Bool] :gzip Output will be gzip’ed.
@option options [Bool] :bzip2 Output will be bzip2’ed.

@return [WriteFasta] Returns an instance of the class.

# File lib/BioDSL/commands/write_fasta.rb, line 97
def initialize(options)
  @options = options
  check_options
  @options[:output] ||= $stdout
end

Public Instance Methods

lmb()

Return a lambda for the write_fasta command.

@return [Proc] Returns the write_fasta command lambda.

# File lib/BioDSL/commands/write_fasta.rb, line 106
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    if @options[:output] == $stdout
      write_stdout(input, output)
    else
      write_file(input, output)
    end
  end
end

Private Instance Methods

check_options()

Check the options.

# File lib/BioDSL/commands/write_fasta.rb, line 121
def check_options
  options_allowed(@options, :force, :output, :wrap, :gzip, :bzip2)
  options_unique(@options, :gzip, :bzip2)
  options_tie(@options, gzip: :output, bzip2: :output)
  options_files_exist_force(@options, :output)
end
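
The option helpers called in check_options are defined elsewhere in BioDSL and are not shown in this section. Judging only by their names and arguments, they appear to enforce rules like the ones sketched below in plain Ruby. The method `check_write_options`, the `ALLOWED` constant, and the error messages are hypothetical, and the real helpers may behave differently.

```ruby
ALLOWED = %i[force output wrap gzip bzip2].freeze

# Hypothetical stand-in for the BioDSL option helpers.
def check_write_options(options)
  unknown = options.keys - ALLOWED
  raise ArgumentError, "unknown options: #{unknown.join(', ')}" unless unknown.empty?

  # gzip and bzip2 are mutually exclusive.
  raise ArgumentError, 'gzip and bzip2 cannot be combined' if options[:gzip] && options[:bzip2]

  # Compression is tied to the presence of an output file.
  if (options[:gzip] || options[:bzip2]) && !options[:output]
    raise ArgumentError, 'gzip/bzip2 require the output option'
  end

  # Refuse to overwrite an existing file unless force is set.
  if options[:output] && File.exist?(options[:output]) && !options[:force]
    raise ArgumentError, "file exists: #{options[:output]} (use force: true)"
  end

  options
end

check_write_options(wrap: 80)

begin
  check_write_options(gzip: true)
rescue ArgumentError => e
  puts e.message
end
```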
compress()

Determine what compression should be used for output.

@return [Symbol, nil] Compression flag or nil if no compression.

# File lib/BioDSL/commands/write_fasta.rb, line 200
def compress
  return :gzip  if @options[:gzip]
  return :bzip2 if @options[:bzip2]
end
record2entry(record)

Creates a Seq object from a given record if both SEQ_NAME and SEQ are present.

@param record [Hash] Biopieces record to convert.

@return [BioDSL::Seq, nil] Sequence entry, or nil if either key is missing.

# File lib/BioDSL/commands/write_fasta.rb, line 190
def record2entry(record)
  return unless record.key? :SEQ_NAME
  return unless record.key? :SEQ

  BioDSL::Seq.new_bp(record)
end
write_file(input, output)

Write all sequence entries to a specified file.

@param input [Enumerator] The input stream.
@param output [Enumerator::Yielder] The output stream.

# File lib/BioDSL/commands/write_fasta.rb, line 154
def write_file(input, output)
  Fasta.open(@options[:output], 'w', compress: compress) do |ios|
    input.each do |record|
      @status[:records_in] += 1

      if (entry = record2entry(record))
        ios.puts entry.to_fasta(@options[:wrap])
        @status[:sequences_in] += 1
        @status[:sequences_out] += 1
        @status[:residues_in] += entry.length
        @status[:residues_out] += entry.length
      end

      write_output(output, record)
    end
  end
end
write_output(output, record)

Write a given record to the output stream if it exists.

@param output [Enumerator::Yielder, nil] Output stream.
@param record [Hash] Biopieces record to write.

# File lib/BioDSL/commands/write_fasta.rb, line 178
def write_output(output, record)
  return unless output

  output << record
  @status[:records_out] += 1
end
write_stdout(input, output)

Write all sequence entries to stdout.

@param input [Enumerator] The input stream.
@param output [Enumerator::Yielder] The output stream.

# File lib/BioDSL/commands/write_fasta.rb, line 132
def write_stdout(input, output)
  wrap = @options[:wrap]

  input.each do |record|
    @status[:records_in] += 1

    if (entry = record2entry(record))
      $stdout.puts entry.to_fasta(wrap)
      @status[:sequences_in] += 1
      @status[:sequences_out] += 1
      @status[:residues_in] += entry.length
      @status[:residues_out] += entry.length
    end

    write_output(output, record)
  end
end
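
Both write methods follow the same pattern: count each incoming record, emit a FASTA entry when the record qualifies, then pass the record downstream unchanged. The sketch below reproduces that pattern with standard Ruby only. The `passthrough_writer` method, the StringIO sink, and the reduced status hash are assumptions made for illustration, not BioDSL code.

```ruby
require 'stringio'

# Hypothetical stand-in for the write step: writes qualifying records to `io`
# and forwards every record to `output` when a downstream consumer exists.
def passthrough_writer(input, output, status, io)
  input.each do |record|
    status[:records_in] += 1

    if record.key?(:SEQ_NAME) && record.key?(:SEQ)
      io.puts ">#{record[:SEQ_NAME]}", record[:SEQ]
      status[:sequences_out] += 1
    end

    next unless output

    output << record
    status[:records_out] += 1
  end
end

records = [
  { SEQ_NAME: 'test1', SEQ: 'ACGT' },
  { FOO: 'not a sequence' }
]

status = Hash.new(0)
sink   = StringIO.new

# The Enumerator supplies the yielder that plays the role of the output stream.
downstream = Enumerator.new do |yielder|
  passthrough_writer(records.each, yielder, status, sink)
end

forwarded = downstream.to_a
puts sink.string
puts "forwarded #{forwarded.size} records, wrote #{status[:sequences_out]} entry"
```

Because non-sequence records are forwarded as well, a write step of this shape can sit in the middle of a pipeline without dropping data.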