class BioDSL::CollapseOtus
Collapse OTUs based on identical taxonomy strings.
collapse_otus collapses OTUs in OTU-style records when their TAXONOMY strings are redundant. At the same time, the sample counts (the _COUNT fields) of the collapsed OTUs are added together.
Usage
collapse_otus
Options
Examples
Here is an OTU table with four rows, one of which has a redundant Taxonomy
string:
BD.new.read_table(input: "otu_table.txt").dump.run

{:OTU=>"OTU_1",
 :CM1_COUNT=>881,
 :CM10_COUNT=>234,
 :TAXONOMY=>
  "Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100); \
   Leuconostocaceae(100);Leuconostoc(100)"}
{:OTU=>"OTU_0",
 :CM1_COUNT=>3352,
 :CM10_COUNT=>4329,
 :TAXONOMY=>
  "Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100); \
   Streptococcaceae(100);Lactococcus(100)"}
{:OTU=>"OTU_5",
 :CM1_COUNT=>5,
 :CM10_COUNT=>0,
 :TAXONOMY=>
  "Bacteria(100);Proteobacteria(100);Gammaproteobacteria(100); \
   Pseudomonadales(100);Pseudomonadaceae(100);Pseudomonas(100)"}
{:OTU=>"OTU_3",
 :CM1_COUNT=>228,
 :CM10_COUNT=>200,
 :TAXONOMY=>
  "Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100); \
   Streptococcaceae(100);Lactococcus(100)"}
To collapse the redundant OTU, simply run the stream through collapse_otus:
BD.new.read_table(input: "otu_table.txt").collapse_otus.dump.run

{:OTU=>"OTU_1",
 :CM1_COUNT=>881,
 :CM10_COUNT=>234,
 :TAXONOMY=>
  "Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100); \
   Leuconostocaceae(100);Leuconostoc(100)"}
{:OTU=>"OTU_0",
 :CM1_COUNT=>3580,
 :CM10_COUNT=>4529,
 :TAXONOMY=>
  "Bacteria(100);Firmicutes(100);Bacilli(100);Lactobacillales(100); \
   Streptococcaceae(100);Lactococcus(100)"}
{:OTU=>"OTU_5",
 :CM1_COUNT=>5,
 :CM10_COUNT=>0,
 :TAXONOMY=>
  "Bacteria(100);Proteobacteria(100);Gammaproteobacteria(100); \
   Pseudomonadales(100);Pseudomonadaceae(100);Pseudomonas(100)"}
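The effect of this example can be reproduced without BioDSL. What follows is a minimal plain-Ruby sketch, not the library's implementation: collapse_by_taxonomy is a hypothetical helper, it works on an array of hashes rather than a BioDSL stream, and it compares TAXONOMY strings verbatim.

```ruby
# Hypothetical helper (not part of BioDSL): merge records that share a
# TAXONOMY string and sum every field whose name ends in _COUNT.
def collapse_by_taxonomy(records)
  collapsed = {}

  records.each do |record|
    key = record[:TAXONOMY]

    if collapsed.key?(key)
      # Later duplicates only contribute their counts.
      record.each do |field, value|
        collapsed[key][field] += value if field.to_s.end_with?('_COUNT')
      end
    else
      # The first record seen for a taxonomy is kept (copied so the
      # caller's data is not modified).
      collapsed[key] = record.dup
    end
  end

  collapsed.values
end

lactococcus = 'Bacteria(100);Firmicutes(100);Bacilli(100);' \
              'Lactobacillales(100);Streptococcaceae(100);Lactococcus(100)'
leuconostoc = 'Bacteria(100);Firmicutes(100);Bacilli(100);' \
              'Lactobacillales(100);Leuconostocaceae(100);Leuconostoc(100)'

table = [
  { OTU: 'OTU_1', CM1_COUNT: 881,  CM10_COUNT: 234,  TAXONOMY: leuconostoc },
  { OTU: 'OTU_0', CM1_COUNT: 3352, CM10_COUNT: 4329, TAXONOMY: lactococcus },
  { OTU: 'OTU_3', CM1_COUNT: 228,  CM10_COUNT: 200,  TAXONOMY: lactococcus }
]

collapse_by_taxonomy(table).each do |record|
  puts "#{record[:OTU]} #{record[:CM1_COUNT]} #{record[:CM10_COUNT]}"
end
```

In this sketch OTU_3 is folded into OTU_0, so the merged record keeps the name OTU_0 and carries the counts 3352 + 228 = 3580 and 4329 + 200 = 4529, matching the output above.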
Constants
- STATS
Public Class Methods
Constructor for CollapseOtus.
@param options [Hash] Options Hash.
# File lib/BioDSL/commands/collapse_otus.rb, line 102
def initialize(options)
  @options = options
  check_options
end
Public Instance Methods
Return the CollapseOtus command lambda.
@return [Proc] Lambda for the command.
# File lib/BioDSL/commands/collapse_otus.rb, line 111
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)
    hash = {}
    input.each do |record|
      @status[:records_in] += 1
      if record[:TAXONOMY]
        @status[:otus_in] += 1
        collapse_tax(hash, record)
      else
        output << record
        @status[:records_out] += 1
      end
    end
    write_tax(hash, output)
  end
end
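One consequence of the listing above is the output order: records without a TAXONOMY field are passed through as they arrive, while OTU records are buffered and only written once the input is exhausted. The sketch below imitates that flow with plain arrays. run_collapse is a hypothetical stand-in, not the BioDSL lambda; the counter names are taken from the listing, and taxonomy strings are compared verbatim for brevity.

```ruby
# Hypothetical stand-in for the command lambda: arrays replace the
# BioDSL input/output streams, and a local hash replaces @status.
def run_collapse(input)
  status = { records_in: 0, otus_in: 0, otus_out: 0, records_out: 0 }
  output = []
  buffered = {}

  input.each do |record|
    status[:records_in] += 1

    if record[:TAXONOMY]
      status[:otus_in] += 1
      key = record[:TAXONOMY]

      if buffered.key?(key)
        record.each do |k, v|
          buffered[key][k] += v if k.to_s.end_with?('_COUNT')
        end
      else
        buffered[key] = record.dup
      end
    else
      # Non-OTU records are emitted immediately.
      output << record
      status[:records_out] += 1
    end
  end

  # Collapsed OTUs are emitted only after all input has been read.
  buffered.each_value do |record|
    output << record
    status[:otus_out] += 1
    status[:records_out] += 1
  end

  [output, status]
end

records, status = run_collapse([
  { OTU: 'OTU_0', CM1_COUNT: 1, TAXONOMY: 'Bacteria;Firmicutes' },
  { NOTE: 'not an OTU record' },
  { OTU: 'OTU_3', CM1_COUNT: 2, TAXONOMY: 'Bacteria;Firmicutes' }
])

records.each { |record| puts record.inspect }
puts status.inspect
```

Here the NOTE record comes out first even though it was second in the input, followed by a single collapsed OTU with CM1_COUNT 3.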
Private Instance Methods
Check options.
# File lib/BioDSL/commands/collapse_otus.rb, line 137
def check_options
  options_allowed(@options, nil)
end
Collapse identical taxonomies by removing duplicates and adding their counts.
@param hash [Hash] Hash with taxonomy records.
@param record [Hash] BioDSL record with taxonomy info.
# File lib/BioDSL/commands/collapse_otus.rb, line 146
def collapse_tax(hash, record)
  key = record[:TAXONOMY].gsub(/\(\d+\)/, '').to_sym
  if hash.key? key
    record.each do |k, v|
      hash[key][k] += v if k[-6..-1] == '_COUNT'
    end
  else
    hash[key] = record
  end
end
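Two details of collapse_tax are easy to miss: the lookup key is the taxonomy string with its parenthesised integer values removed, and a field is treated as a count when its last six characters are _COUNT. The sketch below isolates both steps. taxonomy_key and count_field? are hypothetical names introduced here for illustration; the expressions inside them are copied from the listing.

```ruby
# Hypothetical helper wrapping the key expression from collapse_tax:
# strip "(digits)" groups, then convert to a Symbol.
def taxonomy_key(taxonomy)
  taxonomy.gsub(/\(\d+\)/, '').to_sym
end

# Hypothetical helper wrapping the suffix test from collapse_tax.
# Symbol#[] slices the symbol's name and returns a String (or nil
# when the name is shorter than six characters).
def count_field?(field)
  field[-6..-1] == '_COUNT'
end

a = taxonomy_key('Bacteria(100);Firmicutes(98)')
b = taxonomy_key('Bacteria(100);Firmicutes(75)')

puts a.inspect             # :"Bacteria;Firmicutes"
puts a == b                # true
puts count_field?(:CM1_COUNT)  # true
puts count_field?(:OTU)        # false
```

So two OTUs whose taxonomy strings differ only in these integer values share a key and are collapsed, and the retained record keeps the TAXONOMY string of the first OTU seen. Because the pattern is \d+, a non-integer value such as (99.5) would not be stripped.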
Output collapsed taxonomy records.
@param hash [Hash] Hash with taxonomy records.
@param output [Enumerator::Yielder] Output stream.
# File lib/BioDSL/commands/collapse_otus.rb, line 162
def write_tax(hash, output)
  hash.each_value do |record|
    output << record
    @status[:otus_out] += 1
    @status[:records_out] += 1
  end
end