class BioDSL::Count

Count the number of records in the stream.

count counts the number of records in the stream and emits the count as an additional record at the end of the stream. The count record itself is not included in the count. Using the output option also writes the count to a file as a table with a header.

Usage

count([output: <file>[, force: <bool>]])

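Options

output: <file> - Path to the file the count table is written to.
force: <bool> - Force overwrite of an existing output file.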

Examples

To count the number of records in the file 'test.fq':

BD.new.read_fastq(input: "test.fq").count(output: "count.txt").dump.run

{:SEQ_NAME=>"ILLUMINA-52179E_0004:2:1:1040:5263#TTAGGC/1",
 :SEQ=>"TTCGGCATCGGCGGCGACGTTGGCGGCGGGGCCGGGCGGGTCGANNNCAT",
 :SEQ_LEN=>50,
 :SCORES=>"GGFBGGEADFAFFDDD,-5AC5?>C:)7?#####################"}
{:SEQ_NAME=>"ILLUMINA-52179E_0004:2:1:1041:14486#TTAGGC/1",
 :SEQ=>"CATGGCGTATGCCAGACGGCCAGAACGATGGCCGCCGGGCTTCANNNAAG",
 :SEQ_LEN=>50,
 :SCORES=>"FFFFDBD?EEEEEEEFGGFAGAGEFDF=BFGFFGGDDDD=ABAA######"}
{:SEQ_NAME=>"ILLUMINA-52179E_0004:2:1:1043:19446#TTAGGC/1",
 :SEQ=>"CGGTACTGATCGAGTGTCAGGCTGTTGATCGCCGCGGGCGGGGGTNNGAC",
 :SEQ_LEN=>50,
 :SCORES=>"ECAEBEEEEEFFFFFEFFFFDDEEEGGGGGDEBEECBDAE@#########"}
{:RECORD_TYPE=>"count", :COUNT=>3}

The count is also saved in the file 'count.txt':

#RECORD_TYPE COUNT
count  3

Constants

STATS

Public Class Methods

new(options)

Constructor for the count command.

@param options [Hash] Options hash.
@option options [String] :output Path to output file.
@option options [Boolean] :force Force overwrite of output file.

@return [Count] Instance of class Count.

# File lib/BioDSL/commands/count.rb, line 78
def initialize(options)
  @options = options

  check_options
end

Public Instance Methods

lmb()

Return the command lambda for count.

@return [Proc] Command lambda.

# File lib/BioDSL/commands/count.rb, line 87
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    process_input(input, output)

    new_record = {
      RECORD_TYPE: 'count',
      COUNT: @status[:records_in]
    }

    output << new_record
    @status[:records_out] += 1

    write_output if @options[:output]
  end
end
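
The stream behaviour of the lambda above can be sketched in plain Ruby without BioDSL. This is a minimal sketch, not the library's implementation: count_stream is a hypothetical stand-in, and the status hash only mimics the :records_in and :records_out counters used above. It shows that every input record is passed through unchanged and that the count record is appended last without being counted itself.

```ruby
# Hypothetical stand-in for the count command; not part of BioDSL.
# Returns a lazy output stream and the status hash it updates.
def count_stream(input)
  status = { records_in: 0, records_out: 0 }

  output = Enumerator.new do |yielder|
    # Pass every record through unchanged while counting it.
    input.each do |record|
      status[:records_in] += 1
      yielder << record
      status[:records_out] += 1
    end

    # Append the count record. It is emitted downstream, but it is
    # not part of :records_in, so it does not count itself.
    yielder << { RECORD_TYPE: 'count', COUNT: status[:records_in] }
    status[:records_out] += 1
  end

  [output, status]
end

records = [{ SEQ: 'ATCG' }, { SEQ: 'GGCC' }, { SEQ: 'TTAA' }]
stream, status = count_stream(records)

stream.each { |record| p record } # three records, then the count record
p status
```

Because the Enumerator is lazy, the status hash is only filled in once the stream has been consumed.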

Private Instance Methods

check_options()

Check options.

# File lib/BioDSL/commands/count.rb, line 108
def check_options
  options_allowed(@options, :output, :force)
  options_allowed_values(@options, force: [true, false, nil])
  options_files_exist_force(@options, :output)
end
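
Judging only from the helper names used above, the three checks amount to: reject unknown option keys, restrict force to true, false or nil, and refuse to overwrite an existing output file unless force is set. The sketch below illustrates that reading in plain Ruby. check_count_options and OptionError are hypothetical names introduced here, and the error messages are invented; the real BioDSL helpers may behave differently.

```ruby
# Hypothetical error class for this sketch; not BioDSL's own.
class OptionError < StandardError; end

# Hypothetical stand-in for the three checks in check_options.
def check_count_options(options)
  # Only :output and :force are accepted.
  unknown = options.keys - %i[output force]
  raise OptionError, "Disallowed option(s): #{unknown.join(', ')}" unless unknown.empty?

  # :force must be true, false or absent.
  unless [true, false, nil].include?(options[:force])
    raise OptionError, "Disallowed value for force: #{options[:force].inspect}"
  end

  # Assumed behaviour: an existing output file requires force: true.
  path = options[:output]
  if path && File.exist?(path) && !options[:force]
    raise OptionError, "File exists: #{path} (set force: true to overwrite)"
  end

  true
end

check_count_options(output: 'count.txt', force: true)
```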

process_input(input, output)

Process the input stream and emit all records to the output stream.

@param input [Enumerator] Input stream.
@param output [Enumerator::Yielder] Output stream.

# File lib/BioDSL/commands/count.rb, line 118
def process_input(input, output)
  input.each do |record|
    @status[:records_in] += 1

    output << record
    @status[:records_out] += 1
  end
end

write_output()

Write output table to file.

# File lib/BioDSL/commands/count.rb, line 128
def write_output
  Filesys.open(@options[:output], 'w') do |ios|
    ios.puts "#RECORD_TYPE\tCOUNT"
    ios.puts "count\t#{@status[:records_in]}"
  end
end
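
The table layout written above (a '#'-prefixed header line followed by one tab-separated row) is easy to read back. The following is a minimal sketch that uses Ruby's standard File class in place of Filesys.open, which is an assumption made to keep the example self-contained. write_count_table and read_count_table are hypothetical helpers, not part of BioDSL.

```ruby
require 'tmpdir'

# Hypothetical helper that writes the same two-line table as write_output.
def write_count_table(path, count)
  File.open(path, 'w') do |ios|
    ios.puts "#RECORD_TYPE\tCOUNT"
    ios.puts "count\t#{count}"
  end
end

# Hypothetical reader: turn the table back into a record hash.
def read_count_table(path)
  header, row = File.readlines(path, chomp: true)

  # Strip the leading '#' and use the column names as symbol keys.
  keys = header.delete_prefix('#').split("\t").map(&:to_sym)
  record = keys.zip(row.split("\t")).to_h

  record[:COUNT] = Integer(record[:COUNT])
  record
end

Dir.mktmpdir do |dir|
  path = File.join(dir, 'count.txt')
  write_count_table(path, 3)
  p read_count_table(path)
end
```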