omics {TensorTest2D} | R Documentation |
The omics data are a subset of the dataset provided by the Cancer Cell Line Encyclopedia (CCLE) project (Barretina et al., 2012; https://sites.broadinstitute.org/ccle/).
data(omics)
A list containing two objects:
a 3-dimensional array of size (3, 10, 68), that is, 3 platforms by 10 genes by 68 samples
a 68-length vector representing the response variable
The data consist of one response variable and ten genes, each measured on three different platforms.
The response variable is the log-transformed activity area for Vandetanib, a drug that targets the EGFR gene in lung cancer.
The three platforms are DNA copy number variation (CNV), methylation and mRNA expression.
Of the 10 genes, 7 (EGFR, EREG, HRAS, KRAS, PTPN11, STAT3, and TGFA) are involved in the protein-protein interaction network of EGFR (https://string-db.org); the remaining 3 (ACTB, GAPDH, and PPIA) are arbitrarily chosen housekeeping genes that serve as negative controls.
The detailed pre-processing procedure is described in Chang et al. (2021).
Barretina, J., Caponigro, G., Stransky, N. et al. The Cancer Cell Line Encyclopedia enables predictive modelling of anticancer drug sensitivity. Nature 483, 603–607 (2012).
Chang, S.-M., Yang, M., Lu, W., Huang, Y.-J., Huang, Y., Hung, H., Miecznikowski, J. C., Lu, T.-P., Tzeng, J.-Y. Gene-set integrative analysis of multi-omics data using tensor-based association test. Bioinformatics, btab125 (2021).
data(omics)
names(omics)
dim(omics$omics)
# 3 10 68
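
The array layout described above (platforms by genes by samples) can be explored with ordinary array indexing. The following is a minimal sketch that runs without the package installed: `X` and `y` are simulated stand-ins with the documented dimensions, not the real data, and the marginal correlation screen is only an illustration, not the tensor-based test of Chang et al. (2021). The order of the three platforms along the first dimension is not stated here, so no platform labels are assumed.

```r
# Simulated stand-ins (assumption): random values with the documented shapes.
# With the package loaded, X would be omics$omics and y the response vector.
set.seed(1)
X <- array(rnorm(3 * 10 * 68), dim = c(3, 10, 68))  # platforms x genes x samples
y <- rnorm(68)                                      # one response per sample

# 3 x 10 platform-by-gene matrix for the first sample
sample1 <- X[, , 1]
dim(sample1)

# 10 x 68 gene-by-sample matrix for the first platform
platform1 <- X[1, , ]
dim(platform1)

# Correlation between each (platform, gene) cell and the response,
# computed across the 68 samples; the result is a 3 x 10 matrix
cors <- apply(X, c(1, 2), function(v) cor(v, y))
dim(cors)
```

On the real data, a matrix like `cors` gives a quick first look at which gene-platform combinations move with the response, with the housekeeping genes expected to behave as negative controls.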