class BioDSL::WriteFasta
Write sequences from stream in FASTA format.
Description
write_fasta writes sequences from the data stream in FASTA format. A FASTA entry is only written for records that contain both a SEQ key and a SEQ_NAME key. An example FASTA entry:
>test1
TATGACGCGCATCGACAGCAGCACGAGCATGCATCGACTG
TGCACTGACTACGAGCATCACTATATCATCATCATAATCT
TACGACATCTAGGGACTAC
For more about the FASTA format:
en.wikipedia.org/wiki/FASTA_format
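As a rough illustration of the rule described above (an entry is only produced when both SEQ_NAME and SEQ are present, optionally wrapped), here is a minimal sketch in plain Ruby. The helper name format_fasta is hypothetical and is not part of BioDSL; the library itself builds a BioDSL::Seq object and calls to_fasta on it.

```ruby
# Hypothetical helper, not part of BioDSL: format one record as a FASTA entry.
def format_fasta(record, wrap = nil)
  # Mirror the documented rule: both keys must be present.
  return nil unless record.key?(:SEQ_NAME) && record.key?(:SEQ)

  seq = record[:SEQ]
  # Split the sequence into chunks of at most `wrap` residues when wrapping is requested.
  body = wrap ? seq.scan(/.{1,#{wrap}}/).join("\n") : seq
  ">#{record[:SEQ_NAME]}\n#{body}"
end

record = { SEQ_NAME: 'test1', SEQ: 'TATGACGCGCATCGACAGCA' }
puts format_fasta(record, 8)
```

Records lacking either key yield nil here, which corresponds to the command skipping them when writing.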
Usage
write_fasta([wrap: <uint>[, output: <file>[, force: <bool> [, gzip: <bool> | bzip2: <bool>]]]])
Options
- output <file> - Output file.
- force <bool> - Force overwrite existing output file.
- wrap <uint> - Wrap sequence into lines of wrap length.
- gzip <bool> - Write gzipped output file.
- bzip2 <bool> - Write bzipped output file.
Examples
To write FASTA entries to STDOUT.
write_fasta
To write FASTA entries wrapped into lines of 80 residues to STDOUT.
write_fasta(wrap: 80)
To write FASTA entries to a file ‘test.fna’.
write_fasta(output: "test.fna")
To overwrite the output file if it already exists, use the force option:
write_fasta(output: "test.fna", force: true)
To write gzipped FASTA entries to file ‘test.fna.gz’.
write_fasta(output: "test.fna.gz", gzip: true)
To write bzipped FASTA entries to file ‘test.fna.bz2’.
write_fasta(output: "test.fna.bz2", bzip2: true)
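For readers curious what gzipped FASTA output amounts to, the following sketch compresses a FASTA string in memory with Ruby's standard Zlib library and reads it back. It is only an illustration: the helper name gzip_fasta is hypothetical, and BioDSL itself delegates compression to Fasta.open with a compress flag, whose internals are not shown in this document. bzip2 is left out because Ruby's standard library has no bzip2 writer.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper, not part of BioDSL: gzip a FASTA string in memory.
def gzip_fasta(fasta_text)
  buffer = StringIO.new(String.new) # binary-encoded, writable buffer
  gz = Zlib::GzipWriter.new(buffer)
  gz.write(fasta_text)
  gz.finish # flush the gzip trailer without closing the buffer
  buffer.string
end

compressed = gzip_fasta(">test1\nATCG\n")
# Decompress again to confirm the round trip.
restored = Zlib::GzipReader.new(StringIO.new(compressed)).read
```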
Constants
- STATS
Public Class Methods
Constructor for the WriteFasta class.
@param [Hash] options Options hash.
@option options [Bool] :force Flag allowing overwriting files.
@option options [String] :output Output file path.
@option options [Integer] :wrap Wrap sequences at this length (default no wrap)
@option options [Bool] :gzip Output will be gzip’ed.
@option options [Bool] :bzip2 Output will be bzip2’ed.
@return [WriteFasta] Returns an instance of the class.
# File lib/BioDSL/commands/write_fasta.rb, line 97
def initialize(options)
  @options = options
  check_options
  @options[:output] ||= $stdout
end
Public Instance Methods
Return a lambda for the write_fasta command.
@return [Proc] Returns the write_fasta command lambda.
# File lib/BioDSL/commands/write_fasta.rb, line 106
def lmb
  lambda do |input, output, status|
    status_init(status, STATS)

    if @options[:output] == $stdout
      write_stdout(input, output)
    else
      write_file(input, output)
    end
  end
end
Private Instance Methods
Check the options.
# File lib/BioDSL/commands/write_fasta.rb, line 121
def check_options
  options_allowed(@options, :force, :output, :wrap, :gzip, :bzip2)
  options_unique(@options, :gzip, :bzip2)
  options_tie(@options, gzip: :output, bzip2: :output)
  options_files_exist_force(@options, :output)
end
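The four checks above can be read as: only known options, gzip and bzip2 mutually exclusive, compression only together with an output file, and no overwriting without force. The sketch below restates that reading in plain Ruby. It is an assumption about the intent of the option helpers, not their implementation; the method name validate_write_fasta_options, the use of ArgumentError, and the error messages are all hypothetical.

```ruby
# Hypothetical stand-in for the option checks; not BioDSL's implementation.
ALLOWED_OPTIONS = %i[force output wrap gzip bzip2].freeze

def validate_write_fasta_options(options)
  # Only known option keys are accepted.
  unknown = options.keys - ALLOWED_OPTIONS
  raise ArgumentError, "Disallowed option(s): #{unknown.join(', ')}" unless unknown.empty?

  # gzip and bzip2 are mutually exclusive.
  raise ArgumentError, 'gzip and bzip2 cannot be combined' if options[:gzip] && options[:bzip2]

  # Compression is tied to an output file.
  %i[gzip bzip2].each do |key|
    raise ArgumentError, "#{key} requires output" if options[key] && !options[:output]
  end

  # Refuse to overwrite an existing file unless force is set.
  if options[:output] && File.exist?(options[:output]) && !options[:force]
    raise ArgumentError, "File exists: #{options[:output]} (use force: true)"
  end

  true
end
```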
Determine what compression should be used for output.
@return [Symbol, nil] Compression flag or nil if no compression.
# File lib/BioDSL/commands/write_fasta.rb, line 200
def compress
  return :gzip if @options[:gzip]
  return :bzip2 if @options[:bzip2]
end
Creates a Seq object from a given record if both SEQ_NAME and SEQ are present.
@param record [Hash] Biopieces record to convert.
@return [BioDSL::Seq, nil] Sequence entry, or nil if SEQ_NAME or SEQ is missing.
# File lib/BioDSL/commands/write_fasta.rb, line 190
def record2entry(record)
  return unless record.key? :SEQ_NAME
  return unless record.key? :SEQ

  BioDSL::Seq.new_bp(record)
end
Write all sequence entries to a specified file.
@param input [Enumerator] The input stream.
@param output [Enumerator::Yielder] The output stream.
# File lib/BioDSL/commands/write_fasta.rb, line 154
def write_file(input, output)
  Fasta.open(@options[:output], 'w', compress: compress) do |ios|
    input.each do |record|
      @status[:records_in] += 1

      if (entry = record2entry(record))
        ios.puts entry.to_fasta(@options[:wrap])
        @status[:sequences_in] += 1
        @status[:sequences_out] += 1
        @status[:residues_in] += entry.length
        @status[:residues_out] += entry.length
      end

      write_output(output, record)
    end
  end
end
Write a given record to the output stream if one exists.
@param output [Enumerator::Yielder, nil] Output stream.
@param record [Hash] Biopieces record to write.
# File lib/BioDSL/commands/write_fasta.rb, line 178
def write_output(output, record)
  return unless output

  output << record
  @status[:records_out] += 1
end
Write all sequence entries to stdout.
@param input [Enumerator] The input stream.
@param output [Enumerator::Yielder] The output stream.
# File lib/BioDSL/commands/write_fasta.rb, line 132
def write_stdout(input, output)
  wrap = @options[:wrap]

  input.each do |record|
    @status[:records_in] += 1

    if (entry = record2entry(record))
      $stdout.puts entry.to_fasta(wrap)
      @status[:sequences_in] += 1
      @status[:sequences_out] += 1
      @status[:residues_in] += entry.length
      @status[:residues_out] += entry.length
    end

    write_output(output, record)
  end
end
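Both writer methods follow the same pattern: every record is counted and forwarded downstream, while only records carrying SEQ_NAME and SEQ are written as FASTA. The sketch below reproduces that pattern without BioDSL so it can be run on its own. The method name write_and_forward is hypothetical, a StringIO stands in for $stdout or the output file, a plain Array stands in for the Enumerator::Yielder (both respond to <<), and wrapping is omitted for brevity.

```ruby
require 'stringio'

# Hypothetical stand-in for the write/forward loop; not BioDSL code.
def write_and_forward(input, sink, downstream = nil)
  status = Hash.new(0)

  input.each do |record|
    status[:records_in] += 1

    # Only records with both keys are written as FASTA.
    if record.key?(:SEQ_NAME) && record.key?(:SEQ)
      sink.puts ">#{record[:SEQ_NAME]}", record[:SEQ]
      status[:sequences_out] += 1
      status[:residues_out] += record[:SEQ].length
    end

    # Every record is passed on, whether or not it was written.
    next unless downstream

    downstream << record
    status[:records_out] += 1
  end

  status
end

records = [{ SEQ_NAME: 'test1', SEQ: 'ATCG' }, { FOO: 'bar' }]
sink = StringIO.new
forwarded = []
status = write_and_forward(records.each, sink, forwarded)
```

Forwarding non-sequence records unchanged is what lets write_fasta sit in the middle of a pipeline without dropping data for later commands.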