class BioDSL::WriteFastq

Write sequences from stream in FASTQ format.

Description

write_fastq writes sequences from the data stream in FASTQ format. A FASTQ entry is only written for records that contain the SEQ, SEQ_NAME and SCORES keys. An example FASTQ entry:

@test1
TATGACGCGCATCGACAGCAGCACGAGCATGCATCGACTG
+
IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII

The quality line here is an illustrative placeholder with one character per residue.

For more about the FASTQ format:

en.wikipedia.org/wiki/FASTQ_format
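
As a rough illustration of which records end up as FASTQ entries, the sketch below formats a record hash as a four-line entry only when the SEQ, SEQ_NAME and SCORES keys checked in process_input are all present. It is plain Ruby and does not use BioDSL; the helper name fastq_entry is hypothetical.

```ruby
# Hypothetical helper, not part of BioDSL: build a four-line FASTQ entry
# from a record hash, or return nil when a required key is missing.
def fastq_entry(record)
  return nil unless record[:SEQ] && record[:SEQ_NAME] && record[:SCORES]

  [
    "@#{record[:SEQ_NAME]}", # header line
    record[:SEQ],            # sequence
    '+',                     # separator line
    record[:SCORES]          # quality string, one character per residue
  ].join("\n")
end

complete   = { SEQ_NAME: 'test1', SEQ: 'ATCG', SCORES: 'IIII' }
incomplete = { SEQ_NAME: 'test2', SEQ: 'ATCG' } # no SCORES key

puts fastq_entry(complete)
puts fastq_entry(incomplete).inspect
```

The second record is skipped as a FASTQ entry here; in write_fastq itself such a record is still passed on to the output stream.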

Usage

write_fastq([encoding: <:base_33|:base_64>[, output: <file>
            [, force: <bool>[, gzip: <bool> | bzip2: <bool>]]]])

Options

* encoding: <base> - Encode quality scores as :base_33 (default) or :base_64.
* output: <file>   - Output file (default: STDOUT).
* force: <bool>    - Overwrite the output file if it exists.
* gzip: <bool>     - Write gzipped output; requires output and excludes bzip2.
* bzip2: <bool>    - Write bzipped output; requires output and excludes gzip.

Examples

To write FASTQ entries to STDOUT:

write_fastq

To write FASTQ entries to a file ‘test.fq’:

write_fastq(output: "test.fq")

To overwrite the output file if it exists, use the force option:

write_fastq(output: "test.fq", force: true)

To write gzipped FASTQ entries to file ‘test.fq.gz’:

write_fastq(output: "test.fq.gz", gzip: true)

To write bzipped FASTQ entries to file ‘test.fq.bz2’:

write_fastq(output: "test.fq.bz2", bzip2: true)

Constants

STATS

Public Class Methods

new(options)

Constructor for WriteFastq.

@param options [Hash] Options hash.
@option options [String,Symbol] :encoding
@option options [Boolean] :force
@option options [String] :output
@option options [Boolean] :gzip
@option options [Boolean] :bzip2

@return [WriteFastq] Class instance.

# File lib/BioDSL/commands/write_fastq.rb, line 93
def initialize(options)
  @options = options
  check_options
  @options[:output] ||= $stdout
  @compress           = choose_compression
  @encoding           = choose_encoding
end

Public Instance Methods

lmb()

Return command lambda for write_fastq.

@return [Proc] Command lambda.

# File lib/BioDSL/commands/write_fastq.rb, line 104
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    if @options[:output] == $stdout
      process_input(input, output, $stdout)
    else
      Fastq.open(@options[:output], 'w', compress: @compress) do |ios|
        process_input(input, output, ios)
      end
    end
  end
end

Private Instance Methods

check_options()

Check options.

# File lib/BioDSL/commands/write_fastq.rb, line 121
def check_options
  options_allowed(@options, :encoding, :force, :output, :gzip, :bzip2)
  options_allowed_values(@options, encoding: [:base_33, :base_64, 'base_33',
                                              'base_64'])
  options_unique(@options, :gzip, :bzip2)
  options_tie(@options, gzip: :output, bzip2: :output)
  options_files_exist_force(@options, :output)
end
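
The two structural checks above (gzip and bzip2 are mutually exclusive, and each requires an output file) can be sketched in plain Ruby as follows. The helper names check_unique, check_tie and valid? and the use of ArgumentError are assumptions for this sketch; BioDSL's own options_unique and options_tie helpers may behave differently.

```ruby
# Hypothetical stand-ins for the BioDSL option helpers.

# Raise if more than one of the given keys is set.
def check_unique(options, *keys)
  set = keys.select { |key| options[key] }
  raise ArgumentError, "mutually exclusive options: #{set.join(', ')}" if set.size > 1
end

# Raise if a key is set without the option it depends on.
def check_tie(options, ties)
  ties.each do |key, required|
    next unless options[key]
    raise ArgumentError, "option #{key} requires #{required}" unless options[required]
  end
end

# Run both checks and report the result as a boolean.
def valid?(options)
  check_unique(options, :gzip, :bzip2)
  check_tie(options, gzip: :output, bzip2: :output)
  true
rescue ArgumentError
  false
end

p valid?(output: 'test.fq.gz', gzip: true)
p valid?(gzip: true)
p valid?(output: 'test.fq', gzip: true, bzip2: true)
```

Only the first call passes: the second lacks an output file and the third sets both compression options.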

choose_compression()

Choose the compression to use: gzip, bzip2, or none (nil).

@return [Symbol,nil] Compression.

# File lib/BioDSL/commands/write_fastq.rb, line 172
def choose_compression
  if @options[:gzip]
    :gzip
  elsif @options[:bzip2]
    :bzip2
  end
end
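
To show what a compression symbol such as :gzip could translate to when writing, here is a minimal sketch using Ruby's standard Zlib library. The helper write_compressed is hypothetical and is not how BioDSL's Fastq.open is known to work; bzip2 is left out because the standard library has no bzip2 writer.

```ruby
require 'stringio'
require 'zlib'

# Hypothetical helper: write data to io, gzipped when compress is :gzip
# and unchanged when compress is nil.
def write_compressed(io, data, compress)
  case compress
  when :gzip
    gz = Zlib::GzipWriter.new(io)
    gz.write(data)
    gz.finish # write the gzip trailer but leave io open
  when nil
    io.write(data)
  else
    raise ArgumentError, "unsupported compression in this sketch: #{compress}"
  end
  io
end

entry  = "@test1\nATCG\n+\nIIII\n"
buffer = write_compressed(StringIO.new(String.new), entry, :gzip)

# Round trip: decompress the buffer and compare with the original entry.
restored = Zlib::GzipReader.new(StringIO.new(buffer.string)).read
puts restored == entry
```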

choose_encoding()

Choose the quality score encoding; defaults to :base_33.

@return [Symbol] Encoding.

# File lib/BioDSL/commands/write_fastq.rb, line 183
def choose_encoding
  if @options[:encoding]
    @options[:encoding].to_sym
  else
    :base_33
  end
end

process_input(input, output, ios)

Process all records in the input stream, write FASTQ data for eligible records to the given ios, and pass each record on to the output stream if one is given.

@param input [Enumerable] Input stream.
@param output [Enumerable::Yielder] Output stream.
@param ios [BioDSL::Fastq::IO,STDOUT] Output IO.

# File lib/BioDSL/commands/write_fastq.rb, line 136
def process_input(input, output, ios)
  input.each do |record|
    @status[:records_in] += 1

    if record[:SEQ]
      @status[:sequences_in] += 1
      @status[:residues_in] += record[:SEQ].length

      write_fastq(record, ios) if record[:SEQ_NAME] && record[:SCORES]
    end

    if output
      output << record
      @status[:records_out] += 1
    end
  end
end
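
The pass-through pattern in process_input can be tried out with plain Ruby objects. The sketch below is a stand-in, not BioDSL code: pass_through is a hypothetical name, an Array takes the place of the output stream, and a second Array collecting sequence names takes the place of the output IO.

```ruby
# Stand-in for process_input: count records, "write" eligible sequences,
# and pass every record on when an output is given.
def pass_through(input, output, written)
  status = Hash.new(0)

  input.each do |record|
    status[:records_in] += 1

    if record[:SEQ]
      status[:sequences_in] += 1
      status[:residues_in]  += record[:SEQ].length

      # Only records with a name and scores become FASTQ entries.
      if record[:SEQ_NAME] && record[:SCORES]
        written << record[:SEQ_NAME]
        status[:sequences_out] += 1
        status[:residues_out]  += record[:SEQ].length
      end
    end

    next unless output

    output << record
    status[:records_out] += 1
  end

  status
end

records = [
  { SEQ_NAME: 'test1', SEQ: 'ATCG', SCORES: 'IIII' },
  { SEQ_NAME: 'test2', SEQ: 'ATCGA' }, # no SCORES: counted, not written
  { FOO: 'bar' }                       # no SEQ: passed through only
]

written = []
output  = []
p pass_through(records, output, written)
p written
```

All three records reach the output, two count as sequences, and only test1 is written.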

write_fastq(record, ios)

Given a Biopieces record, convert it to a sequence entry and output it in FASTQ format to the specified IO.

@param record [Hash] BioDSL record.
@param ios [BioDSL::Fastq::IO,STDOUT] Output IO.

# File lib/BioDSL/commands/write_fastq.rb, line 159
def write_fastq(record, ios)
  entry = BioDSL::Seq.new_bp(record)
  entry.qual_convert!(:base_33, @encoding)

  ios.puts entry.to_fastq
  @status[:sequences_out] += 1
  @status[:residues_out] += entry.length
end
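
The qual_convert! call re-encodes quality scores from base 33 to the chosen encoding. As a minimal sketch of what such a conversion involves, assuming the usual Phred offsets of 33 and 64, the hypothetical helper below shifts each quality character by the difference between the two offsets. It is not the BioDSL implementation.

```ruby
# Hypothetical helper: re-encode a quality string between ASCII offsets.
def convert_quality(scores, from, to)
  offsets = { base_33: 33, base_64: 64 }
  shift   = offsets.fetch(to) - offsets.fetch(from)

  # Shift every character code by the offset difference.
  scores.each_char.map { |char| (char.ord + shift).chr }.join
end

base_64 = convert_quality('II5+', :base_33, :base_64)
puts base_64
puts convert_quality(base_64, :base_64, :base_33)
```

Converting to :base_33 from :base_33 gives a shift of zero, which is why the default encoding leaves the scores unchanged.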