read_grm {genio} | R Documentation |
This function reads a set of GCTA Genetic Relatedness Matrix (GRM, i.e. kinship) files in binary format. It returns the kinship matrix and, if the corresponding files are available, the matrix of pair sample sizes (non-trivial under missingness) and the table of individuals. Setting some options allows reading plink2 binary kinship formats such as "king" (see examples).
read_grm(
  name,
  n_ind = NA,
  verbose = TRUE,
  ext = "grm",
  shape = c("triangle", "strict_triangle", "square"),
  size_bytes = 4,
  comment = "#"
)
name |
The base name of the input files.
Files with that base, plus shared extension (default "grm", see |
n_ind |
The number of individuals, required if the file with the extension |
verbose |
If |
ext |
Shared extension for all three inputs (see |
shape |
The shape of the information to read (may be abbreviated).
Default "triangle" assumes there are |
size_bytes |
The number of bytes per number encoded. Default 4 corresponds to GCTA GRM and plink2 "bin4", whereas plink2 "bin" requires a value of 8. |
comment |
Character to start comments in |
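The shape and size_bytes arguments together determine how many numbers, and therefore how many bytes, a binary file for n individuals should contain. The sketch below is only an illustration: expected_values is a hypothetical helper, not part of genio, and it assumes that "triangle" stores the lower triangle including the diagonal, "strict_triangle" excludes the diagonal, and "square" stores the full matrix.

```r
# Hypothetical helper (not a genio function): number of values stored
# in a binary kinship file for n individuals, under the assumed layouts.
expected_values <- function(n, shape = c("triangle", "strict_triangle", "square")) {
  shape <- match.arg(shape)
  switch(shape,
    triangle        = n * (n + 1) / 2,  # lower triangle with diagonal
    strict_triangle = n * (n - 1) / 2,  # lower triangle without diagonal
    square          = n^2               # full n-by-n matrix
  )
}

# Value counts for 4 individuals under each shape
shapes <- c("triangle", "strict_triangle", "square")
counts <- sapply(shapes, function(s) expected_values(4, s))

# Expected file sizes in bytes for size_bytes = 4 and size_bytes = 8
bytes4 <- counts * 4
bytes8 <- counts * 8
```

Comparing such an expected byte count against file.size() of the actual file is one way to check that n_ind, shape and size_bytes are mutually consistent before reading.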
A list with named elements:
kinship: The symmetric n-times-n kinship matrix (GRM). Has IDs as row and column names if the file with extension .<ext>.id exists. If shape='strict_triangle', diagonal will have missing values.
M: The symmetric n-times-n matrix of pair sample sizes (number of non-missing loci pairs), if the file with extension .<ext>.N.bin exists. Has IDs as row and column names if the file with extension .<ext>.id was available. If shape='strict_triangle', diagonal will have missing values.
fam: A tibble with two columns: fam and id, same as in Plink FAM files. Returned if the file with extension .<ext>.id exists.
Greatly adapted from sample code from GCTA: https://cnsgenomics.com/software/gcta/#MakingaGRM
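To make the default "triangle" layout concrete, here is a minimal base R sketch that writes a tiny 3-individual lower triangle as 4-byte floats and reads it back into a symmetric matrix. This is not the genio implementation. It assumes the values are stored row by row along the lower triangle, diagonal included, and the temporary file and toy values are made up for illustration.

```r
# Toy values in assumed row-wise lower-triangle order:
# (1,1), (2,1), (2,2), (3,1), (3,2), (3,3)
vals <- c(1, 0.5, 1, 0.25, 0.125, 1)
n <- 3

# Write the values as 4-byte floats to a temporary file
tmp <- tempfile(fileext = ".grm.bin")
writeBin(vals, tmp, size = 4)

# Read back n * (n + 1) / 2 values of 4 bytes each
x <- readBin(tmp, what = numeric(), n = n * (n + 1) / 2, size = 4)
unlink(tmp)

# Filling the upper triangle in R's column-major order reproduces the
# row-wise lower-triangle order of the file, transposed
kinship <- matrix(0, n, n)
kinship[upper.tri(kinship, diag = TRUE)] <- x

# Copy the upper triangle onto the lower triangle to make it symmetric
kinship[lower.tri(kinship)] <- t(kinship)[lower.tri(kinship)]
```

For a strict triangle, the same idea would apply with upper.tri(kinship, diag = FALSE), n * (n - 1) / 2 values, and a matrix initialized to NA so that the diagonal stays missing.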
# to read "data.grm.bin" and etc, run like this:
# obj <- read_grm("data")
# obj$kinship # the kinship matrix
# obj$M # the pair sample sizes matrix
# obj$fam # the fam and ID tibble
# The following example is more awkward
# because package sample data has to be specified in this weird way:
# read an existing set of GRM files
file <- system.file("extdata", 'sample.grm.bin', package = "genio", mustWork = TRUE)
file <- sub('\\.grm\\.bin$', '', file) # remove extension from this path on purpose
obj <- read_grm(file)
obj$kinship # the kinship matrix
obj$M # the pair sample sizes matrix
obj$fam # the fam and ID tibble
# Read sample plink2 KING-robust files (several variants).
# Read both base.king.bin and base.king.id files.
# All generated with "plink2 <input> --make-king <options> --out base"
# (replace "base" with actual base name) with these options:
# #1) "triangle bin"
# data <- read_grm( 'base', ext = 'king', shape = 'strict', size_bytes = 8 )
# #2) "triangle bin4"
# data <- read_grm( 'base', ext = 'king', shape = 'strict' )
# #3) "square bin"
# data <- read_grm( 'base', ext = 'king', shape = 'square', size_bytes = 8 )
# #4) "square bin4"
# data <- read_grm( 'base', ext = 'king', shape = 'square' )
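
The plink2 examples above pass shape = 'strict', relying on the fact that shape may be abbreviated. The sketch below shows how such abbreviations resolve with base R's match.arg. That read_grm resolves shape this way is an assumption here, suggested by the usage signature but not confirmed by this page.

```r
# Choices as listed in the usage of read_grm
shapes <- c("triangle", "strict_triangle", "square")

# Unique prefixes resolve to the full choice
resolved_strict <- match.arg("strict", shapes)  # "strict_triangle"
resolved_square <- match.arg("sq", shapes)      # "square"

# "s" is a prefix of both "strict_triangle" and "square",
# so match.arg raises an error instead of guessing
ambiguous <- tryCatch(
  match.arg("s", shapes),
  error = function(e) "error"
)
```

Abbreviations should therefore be long enough to be unambiguous. Spelling out the full value is the safest option in scripts.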