VhgGetSubject {Virusparies} | R Documentation |
VhgGetSubject: Process and Count Viral Subjects within Groups
Description
Counts how often each viral subject in the ViralRefSeq_subject column occurs within each group defined by the groupby argument.
Usage
VhgGetSubject(
file,
groupby = "best_query",
remove_identifiers = TRUE,
include_run_ids = FALSE,
extract_brackets = FALSE,
group_unwanted_phyla = NULL
)
Arguments
file |
A data frame containing VirusHunter or VirusGatherer hittable results. |
groupby |
(optional): A character specifying the column containing the groups (default: "best_query"). Note: Gatherer hittables do not have a "best_query" column. Please provide an appropriate column for grouping. |
remove_identifiers |
(optional): if |
include_run_ids |
(optional): If |
extract_brackets |
(optional): extract content within square brackets []. |
group_unwanted_phyla |
(optional): A character string specifying which group of viral phyla to retain in the analysis. Valid values are:
All other phyla not in the specified group will be grouped into a single category:
"Non-RNA-virus" for |
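As an illustration of what extract_brackets refers to, the following is a minimal base R sketch of pulling the text inside square brackets out of a subject string. It is not the package's implementation: the subject strings are made up, and leaving entries without brackets unchanged is an assumption of this sketch.

```r
# Hypothetical subject strings; real ViralRefSeq_subject values may look different.
subjects <- c(
  "hypothetical protein [Example virus A]",
  "RNA-dependent RNA polymerase [Example virus B]",
  "no brackets here"
)

# Detect entries that contain a non-empty [...] part.
has_brackets <- grepl("\\[[^]]+\\]", subjects)

# Keep only the bracket content; entries without brackets are left unchanged
# (an assumption of this sketch, not documented behaviour).
extracted <- ifelse(
  has_brackets,
  sub(".*\\[([^]]+)\\].*", "\\1", subjects),
  subjects
)

print(extracted)
```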
Details
The function VhgGetSubject counts the number of viral subjects in the ViralRefSeq_subject column for each group specified by the groupby argument.
It returns a tibble with three columns: the first column contains the viral group specified by the groupby argument, the second column lists the viral subjects found in that group, and the third column shows how many times each viral subject appears in that group.
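The counting step described above can be sketched in base R. This is a rough illustration under stated assumptions, not the code VhgGetSubject runs internally: the toy data frame, its values, and the count column name n are hypothetical, and the result is a plain data frame rather than a tibble.

```r
# Toy hittable: the column names follow the ones named in this help page,
# the values are made up for illustration.
hits <- data.frame(
  best_query = c("Group_1", "Group_1", "Group_1", "Group_2"),
  ViralRefSeq_subject = c("Virus A", "Virus A", "Virus B", "Virus C"),
  stringsAsFactors = FALSE
)

# Count the rows for every (group, subject) pair.
counts <- aggregate(
  n ~ best_query + ViralRefSeq_subject,
  data = transform(hits, n = 1L),
  FUN = sum
)

# Sort by group, then subject, for a stable and readable layout.
counts <- counts[order(counts$best_query, counts$ViralRefSeq_subject), ]
rownames(counts) <- NULL

print(counts)
```

The result has the three-column shape described above: group, subject, and the number of times that subject occurs in the group.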
Value
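A tibble with three columns: the group given by groupby, the viral subject, and the number of times that subject appears in the group.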
Author(s)
Sergej Ruff
See Also
VirusHunterGatherer is available here: https://github.com/lauberlab/VirusHunterGatherer.
Examples
# import data
path <- system.file("extdata", "virushunter.tsv", package = "Virusparies")
file <- ImportVirusTable(path)
# process column and filter for significant groups
file <- VhgPreprocessTaxa(file, taxa_rank = "Family")
file_filtered <- VhgSubsetHittable(file, ViralRefSeq_E_criteria = 1e-5)
# count viral subjects within each taxonomic group
subject_df <- VhgGetSubject(
  file_filtered,
  groupby = "ViralRefSeq_taxonomy",
  remove_identifiers = TRUE
)
print(subject_df)