class BioDSL::AnalyzeResidueDistribution
Analyze the residue distribution from sequences in the stream.
analyze_residue_distribution determines the per-position distribution of residues in the sequences and outputs one record per observed residue, holding that residue's count at each position. With the percent option, the counts are output as percentages of the residues observed at each position.
The output records look like this:

    {:RECORD_TYPE=>"residue distribution", :V0=>"A", :V1=>5, :V2=>0, :V3=>0, :V4=>0}

These records are ready for write_table. See the examples.
Usage
analyze_residue_distribution([percent: <bool>])
Options
- percent: <bool> - Output distributions in percent (default=false).
Examples
Consider the following entries in the file `test.fna`:

    >DNA
    AGCT
    >RNA
    AGCU
    >Protein
    FLS*
    >Gaps
    -.~
Now we run the data through the following pipeline and get the resulting table:
    BD.new.
    read_fasta(input: "test.fna").
    analyze_residue_distribution.
    grab(select: "residue").
    write_table(skip: [:RECORD_TYPE]).
    run

    A  2  0  0  0
    G  0  2  0  0
    C  0  0  2  0
    T  0  0  0  1
    U  0  0  0  1
    F  1  0  0  0
    L  0  1  0  0
    S  0  0  1  0
    *  0  0  0  1
    -  1  0  0  0
    .  0  1  0  0
    ~  0  0  1  0
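The counts in the table above can be reproduced without BioDSL. The following is a minimal plain-Ruby sketch of the per-position counting, not the library's implementation; the method name `residue_counts` is hypothetical, and residues are listed in first-seen order, which is an assumption about the row ordering.

```ruby
# Count residues per position for a list of sequences.
# Returns one row per residue: [residue, count at pos 1, count at pos 2, ...].
def residue_counts(seqs)
  counts   = Hash.new { |h, k| h[k] = Hash.new(0) } # position => residue => count
  residues = []                                      # residues in first-seen order

  seqs.each do |seq|
    seq.upcase.each_char.with_index do |char, i|
      counts[i][char] += 1
      residues << char unless residues.include?(char)
    end
  end

  positions = counts.keys.sort
  residues.map { |res| [res] + positions.map { |pos| counts[pos][res] } }
end

# The four sequences from test.fna.
rows = residue_counts(%w[AGCT AGCU FLS* -.~])
rows.each { |row| puts row.join("\t") }
```

The Gaps entry has only three residues, so position 4 is observed in just three of the four sequences.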
Here we do the same as above, but output percentages instead of absolute counts:
    BD.new.
    read_fasta(input: "test.fna").
    analyze_residue_distribution(percent: true).
    grab(select: "residue").
    write_table(skip: [:RECORD_TYPE]).
    run

    A  50   0   0   0
    G   0  50   0   0
    C   0   0  50   0
    T   0   0   0  33
    U   0   0   0  33
    F  25   0   0   0
    L   0  25   0   0
    S   0   0  25   0
    *   0   0   0  33
    -  25   0   0   0
    .   0  25   0   0
    ~   0   0  25   0
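The value 33 in the last column follows from integer arithmetic: three residues are observed at position 4, each once, and the percentage is truncated rather than rounded. The sketch below mirrors the expression used in calc_dist_percent; the method names `percent_of` and `percent_of_float` are hypothetical, and the float variant is an illustration only, not something the command offers.

```ruby
# Integer percentage: 100 * count / total with Integer operands truncates.
def percent_of(count, total)
  total.zero? ? 0 : 100 * count / total
end

# Hypothetical alternative using floats, shown only for comparison.
def percent_of_float(count, total)
  total.zero? ? 0.0 : (100.0 * count / total).round(2)
end

puts percent_of(1, 3)       # => 33 (T, U and * at position 4)
puts percent_of(2, 4)       # => 50 (A at position 1)
puts percent_of(1, 3) * 3   # => 99 (a column need not sum to 100)
puts percent_of_float(1, 3) # => 33.33
```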
Constants
- STATS
Public Class Methods
Constructor for the AnalyzeResidueDistribution class.
@param [Hash] options Options hash.
@option options [Boolean] :percent Output distribution in percent.
@return [AnalyzeResidueDistribution] Returns an instance of the class.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 123
    def initialize(options)
      @options = options

      check_options

      @counts   = Hash.new { |h, k| h[k] = Hash.new(0) }
      @total    = Hash.new(0)
      @residues = Set.new
    end
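The `@counts` structure relies on a nested default: an unseen position gets a fresh inner hash, and an unseen residue within a position reads as 0. A small standalone sketch of that behavior follows; the helper name `new_counts` is hypothetical.

```ruby
# Build a position => residue => count table with nested defaults.
def new_counts
  Hash.new { |hash, pos| hash[pos] = Hash.new(0) }
end

counts = new_counts
counts[0][:A] += 1
counts[0][:A] += 1
counts[3][:T] += 1

puts counts[0][:A]       # => 2
puts counts[0][:G]       # => 0
puts counts[0].key?(:G)  # => false (the inner default does not store the key)

# The outer default block does store the key, so merely reading an
# unseen position creates an empty entry for it.
puts counts.key?(7)      # => false
counts[7]
puts counts.key?(7)      # => true
```

Because reading an unseen position adds it to the table, iterating over the existing keys (as the calc_dist_* methods do with `@counts.each`) avoids creating spurious columns.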
Public Instance Methods
Return a lambda for the analyze_residue_distribution command.
@return [Proc] Returns the analyze_residue_distribution command lambda.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 136
    def lmb
      require 'set'

      lambda do |input, output, status|
        status_init(status, STATS)

        input.each do |record|
          @status[:records_in] += 1

          analyze_residues(record[:SEQ]) if record[:SEQ]

          if output
            output << record
            @status[:records_out] += 1
          end
        end

        calc_dist(output)
      end
    end
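The lambda passes every input record through unchanged and emits the distribution records only after the whole input has been read, which would explain why the examples filter with grab before write_table. The sketch below imitates that flow with plain arrays standing in for the BioDSL streams; `run_sketch` is a hypothetical name and status bookkeeping is omitted.

```ruby
# Arrays play the role of the input and output streams.
def run_sketch(input)
  output   = []
  counts   = Hash.new { |h, k| h[k] = Hash.new(0) } # position => residue => count
  residues = []

  input.each do |record|
    if record[:SEQ]
      record[:SEQ].upcase.each_char.with_index do |char, i|
        counts[i][char] += 1
        residues << char unless residues.include?(char)
      end
    end

    output << record # every record is passed through, with or without :SEQ
  end

  # Distribution records follow once all input has been consumed.
  residues.each do |res|
    rec = { RECORD_TYPE: 'residue distribution', V0: res }
    counts.each { |pos, dist| rec[:"V#{pos + 1}"] = dist[res] }
    output << rec
  end

  output
end

input = [{ SEQ_NAME: 'test', SEQ: 'ac' }, { NOTE: 'no sequence here' }]
run_sketch(input).each { |rec| p rec }
```

Position 0 maps to key :V1 because :V0 is reserved for the residue itself.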
Private Instance Methods
Analyze the residue distribution of a given sequence.
@param seq [String] - Sequence to analyze.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 168
    def analyze_residues(seq)
      @status[:sequences_in]  += 1
      @status[:sequences_out] += 1
      @status[:residues_in]   += seq.length
      @status[:residues_out]  += seq.length

      seq.upcase.chars.each_with_index do |char, i|
        c = char.to_sym
        @counts[i][c] += 1
        @total[i]     += 1
        @residues.add(c)
      end
    end
Calculate the residue distribution.
@param output [BioDSL::Stream] Output stream.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 185
    def calc_dist(output)
      @residues.each do |res|
        record               = {}
        record[:RECORD_TYPE] = 'residue distribution'
        record[:V0]          = res.to_s

        if @options[:percent]
          calc_dist_percent(record, res)
        else
          calc_dist_count(record, res)
        end

        output << record
      end
    end
Calculate the residue distribution for a given residue.
@param record [Hash] BioDSL record.
@param res [Symbol] Residue.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 216
    def calc_dist_count(record, res)
      @counts.each do |pos, dist|
        record["V#{pos + 1}".to_sym] = dist[res]
      end
    end
Calculate the residue distribution in percent for a given residue.
@param record [Hash] BioDSL record.
@param res [Symbol] Residue.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 205
    def calc_dist_percent(record, res)
      @counts.each do |pos, dist|
        # Integer division: the percentage is truncated, not rounded.
        value = (@total[pos] == 0) ? 0 : 100 * dist[res] / @total[pos]
        record["V#{pos + 1}".to_sym] = value
      end
    end
Check the options.
    # File lib/BioDSL/commands/analyze_residue_distribution.rb, line 160
    def check_options
      options_allowed(@options, :percent)
      options_allowed_values(@options, percent: [nil, true, false])
    end
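The two checks above restrict the option keys to :percent and its value to nil, true or false. A rough standalone equivalent is sketched here; `check_percent_option` is a hypothetical stand-in, and raising ArgumentError is an assumption, since this page does not show how options_allowed and options_allowed_values report failures.

```ruby
# Hypothetical stand-in for the two BioDSL option helpers.
def check_percent_option(options)
  unknown = options.keys - [:percent]
  unless unknown.empty?
    raise ArgumentError, "Disallowed option(s): #{unknown.join(', ')}"
  end

  unless [nil, true, false].include?(options[:percent])
    raise ArgumentError, "Bad value for percent: #{options[:percent].inspect}"
  end

  options
end

check_percent_option({})              # passes: percent defaults to nil
check_percent_option(percent: true)   # passes

begin
  check_percent_option(percent: 'yes')
rescue ArgumentError => e
  puts e.message
end
```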