read_fasta {castor}	R Documentation

Load a fasta file.

Description

Efficiently load headers and sequences from a fasta file.

Usage

read_fasta(file,
		   include_headers		= TRUE,
		   include_sequences	= TRUE,
		   truncate_headers_at	= NULL)

Arguments

file

A character, path to the input fasta file. This may be gzipped (with extension .gz).

include_headers

Logical, whether to load the headers. If you don't need the headers you can set this to FALSE for efficiency.

include_sequences

Logical, whether to load the sequences. If you don't need the sequences you can set this to FALSE for efficiency.

truncate_headers_at

Optional character, the substring ("needle") at which to truncate headers. Everything from the first occurrence of the needle onward, including the needle itself, is removed from each header. If NULL, headers are not truncated.

Details

This function is a fast and simple fasta loader. Note that all sequences and headers are loaded into memory at once.

Value

A named list with the following elements:

headers

Character vector, listing the loaded headers in the order encountered. Only included if include_headers was TRUE.

sequences

Character vector, listing the loaded sequences in the order encountered. Only included if include_sequences was TRUE.

Nlines

Integer, number of lines encountered.

Nsequences

Integer, number of sequences encountered.

Author(s)

Stilianos Louca

See Also

read_tree

Examples

## Not run: 
# load a gzipped fasta file
fasta = read_fasta(file="myfasta.fasta.gz")

# print the first sequence
cat(fasta$sequences[1])

## End(Not run)
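
The following is a minimal, self-contained sketch of how the optional arguments might be combined. It assumes the castor package is installed; the temporary file and the two toy records are made up purely for illustration. It loads only the headers and truncates each one at the first space.

```r
# write a tiny two-record fasta file to a temporary location
# (toy data, for illustration only)
fasta_path = tempfile(fileext = ".fasta")
writeLines(c(">seq1 first toy record",
             "ACGTACGT",
             ">seq2 second toy record",
             "GGCCTTAA"),
           fasta_path)

# only attempt the load if castor is available
if (requireNamespace("castor", quietly = TRUE)) {
    # skip the sequences and cut each header at the first space
    fasta = castor::read_fasta(file                = fasta_path,
                               include_headers     = TRUE,
                               include_sequences   = FALSE,
                               truncate_headers_at = " ")

    # inspect the truncated headers and the record count
    print(fasta$headers)
    print(fasta$Nsequences)
}
```

Setting include_sequences to FALSE is useful when only the record identifiers are needed, since the sequences are then not kept in the returned list.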

[Package castor version 1.8.3 Index]