class BioDSL::SliceSeq
Slice sequences in the stream and obtain subsequences.
Slice subsequences from sequences using either an index position, which selects a single residue, or a range, which selects a stretch of residues.
All positions are 0-based. As the examples show, negative positions count from the end of the sequence, so -1 is the last residue.
If a record also contains quality SCORES, these are sliced in the same way.
Usage
slice_seq(<slice: <index>|<range>>)
Options
- slice: <index> - Slice a single-residue subsequence.
- slice: <range> - Slice a range from the sequence.
Examples
Consider the following FASTQ entry in the file test.fq:
@HWI-EAS157_20FFGAAXX:2:1:888:434
TTGGTCGCTCGCTCCGCGACCTCAGATCAGACGTGGGCGAT
+
!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHI
To slice the third residue from the beginning (index 2, since positions are 0-based) do:
BD.new.read_fastq(input: "test.fq").slice_seq(slice: 2).dump.run
{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434", :SEQ=>"G", :SEQ_LEN=>1, :SCORES=>"#"}
To slice the last residue do:
BD.new.read_fastq(input: "test.fq").slice_seq(slice: -1).dump.run
{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434", :SEQ=>"T", :SEQ_LEN=>1, :SCORES=>"I"}
To slice the first 5 residues do:
BD.new.read_fastq(input: "test.fq").slice_seq(slice: 0 ... 5).dump.run
{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434", :SEQ=>"TTGGT", :SEQ_LEN=>5, :SCORES=>"!\"\#$%"}
To slice the last 5 residues do:
BD.new.read_fastq(input: "test.fq").slice_seq(slice: -5 .. -1).dump.run
{:SEQ_NAME=>"HWI-EAS157_20FFGAAXX:2:1:888:434", :SEQ=>"GCGAT", :SEQ_LEN=>5, :SCORES=>"EFGHI"}
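The outputs above are consistent with the index and range behaviour of Ruby's own String#[]. The following is a minimal sketch in plain Ruby, assuming that the command slices SEQ and SCORES with the same index or range; slice_entry is a hypothetical helper for illustration and is not part of BioDSL.

```ruby
# Plain Ruby only; no BioDSL classes are loaded here.
SEQ    = 'TTGGTCGCTCGCTCCGCGACCTCAGATCAGACGTGGGCGAT'
SCORES = %q{!"#$%&'()*+,-./0123456789:;<=>?@ABCDEFGHI}

# Hypothetical helper: apply one index or range to both strings.
# Assumes the slice falls inside the sequence (String#[] returns nil otherwise).
def slice_entry(seq, scores, slice)
  sub = seq[slice]
  { SEQ: sub, SEQ_LEN: sub.length, SCORES: scores[slice] }
end

p slice_entry(SEQ, SCORES, 2)       # single residue at 0-based index 2
p slice_entry(SEQ, SCORES, -1)      # last residue
p slice_entry(SEQ, SCORES, 0...5)   # first 5 residues (exclusive range)
p slice_entry(SEQ, SCORES, -5..-1)  # last 5 residues (inclusive range)
```

Each call yields the same SEQ, SEQ_LEN and SCORES values as the corresponding slice_seq example above.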
Constants
- STATS
Public Class Methods
Constructor for SliceSeq.
@param options [Hash] Options hash.
@option options [Range,Integer] :slice
@return [SliceSeq] Class instance.
# File lib/BioDSL/commands/slice_seq.rb, line 101
def initialize(options)
  @options = options

  check_options
end
Public Instance Methods
Return lambda for command.
@return [Proc] Command lambda.
# File lib/BioDSL/commands/slice_seq.rb, line 110
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    input.each do |record|
      @status[:records_in] += 1

      slice_seq(record) if record.key? :SEQ

      output << record
      @status[:records_out] += 1
    end
  end
end
Private Instance Methods
Check options.
# File lib/BioDSL/commands/slice_seq.rb, line 129
def check_options
  options_allowed(@options, :slice)
  options_required(@options, :slice)
end
Slice sequence in given record.
@param record [Hash] BioDSL record.
# File lib/BioDSL/commands/slice_seq.rb, line 137
def slice_seq(record)
  entry = BioDSL::Seq.new_bp(record)

  @status[:sequences_in] += 1
  @status[:residues_in] += entry.length

  entry = entry[@options[:slice]]

  @status[:sequences_out] += 1
  @status[:residues_out] += entry.length

  record.merge! entry.to_bp
end
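The flow of lmb and slice_seq can be sketched without BioDSL to show how records are counted and how records lacking :SEQ pass through unchanged. This is a rough stand-in under stated assumptions: plain Hashes and Arrays replace the BioDSL stream, String#[] replaces BioDSL::Seq slicing, and slice_record and run_slice are hypothetical names that do not exist in the library.

```ruby
# Hypothetical stand-in for slice_seq(record): slices SEQ (and SCORES if
# present) and updates the sequence/residue counters.
# Assumes the slice falls inside the sequence.
def slice_record(record, slice, stats)
  stats[:sequences_in] += 1
  stats[:residues_in]  += record[:SEQ].length

  seq    = record[:SEQ][slice]
  sliced = { SEQ: seq, SEQ_LEN: seq.length }
  sliced[:SCORES] = record[:SCORES][slice] if record.key?(:SCORES)

  stats[:sequences_out] += 1
  stats[:residues_out]  += seq.length

  record.merge(sliced)
end

# Hypothetical stand-in for the command lambda: every record is emitted,
# but only records with a :SEQ key are sliced.
def run_slice(input, slice)
  stats  = Hash.new(0)
  output = input.map do |record|
    stats[:records_in] += 1
    record = slice_record(record, slice, stats) if record.key?(:SEQ)
    stats[:records_out] += 1
    record
  end

  [output, stats]
end

RECORDS = [
  { SEQ_NAME: 'read1', SEQ: 'TTGGTCGC', SCORES: '!"#$%&\'(' },
  { FOO: 'no sequence here' }
].freeze

output, stats = run_slice(RECORDS, 0...5)
p output
p stats
```

With the two sample records, both are counted in records_in and records_out, while only the first contributes to the sequence and residue counters.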