hsic.var.rcv {GSelection} | R Documentation |
Error Variance Estimation in Genomic Prediction
Description
Estimation of error variance using Refitted Cross Validation in HSIC LASSO.
Usage
hsic.var.rcv(x,y,d)
Arguments
x |
a matrix of markers or explanatory variables, each column contains one marker and each row represents an individual. |
y |
a column vector of response variable. |
d |
number of variables to be selected from x. |
Details
Refitted cross validation method (RCV) which is a two step method, is used to get the estimate of the error variance. In first step, dataset is divided into two sub-datasets and with the help of HSIC LASSO most significant markers(variables) are selected from the two sub-datasets. This results in two small sets of selected variables. Then using the set selected from 1st sub-dataset error variance is estimated from the 2nd sub-dataset with ordinary least square method and using the set selected from the 2nd sub-dataset error variance is estimated from the 1st sub-dataset with ordinary least square method. Finally the average of those two error variances are taken as the final estimator of error variance with RCV method.
Value
Error variance |
Author(s)
Sayanti Guha Majumdar <sayanti23gm@gmail.com>, Anil Rai, Dwijesh Chandra Mishra
References
Fan, J., Guo, S., Hao, N. (2012). Variance estimation using refitted cross-validation in ultrahigh dimensional regression. Journal of the Royal Statistical Society, 74(1), 37-65.
Yamada, M., Jitkrittum, W., Sigal, L., Xing, E. P. and Sugiyama, M. (2014). High-Dimensional Feature Selection by Feature-Wise Kernelized Lasso. Neural Computation, 26(1):185-207. doi:10.1162/NECO_a_00537
Examples
library(GSelection)
data(GS)
x_trn <- GS[1:40,1:110]
y_trn <- GS[1:40,111]
x_tst <- GS[41:60,1:110]
y_tst <- GS[41:60,111]
hsic_var <- hsic.var.rcv(x_trn,y_trn,10)