cv.learner {learner}    R Documentation

Cross-validation for LEARNER

Description

This function performs k-fold cross-validation to select the nuisance parameters (\lambda_1, \lambda_2) for learner.

Usage

cv.learner(
  Y_source,
  Y_target,
  r,
  lambda_1_all,
  lambda_2_all,
  step_size,
  n_folds = 4,
  n_cores = 1,
  control = list()
)

Arguments

Y_source

matrix containing the source population data, as in learner

Y_target

matrix containing the target population data, as in learner

r

(optional) integer specifying the rank of the knowledge graphs, as in learner

lambda_1_all

vector of numerics specifying the candidate values of \lambda_1 (see Details)

lambda_2_all

vector of numerics specifying the candidate values of \lambda_2 (see Details)

step_size

numeric scalar specifying the step size for the Newton steps in the numerical optimization algorithm, as in learner

n_folds

an integer specifying the number of cross-validation folds. The default is 4.

n_cores

an integer specifying the number of CPU cores to use for parallelization. Parallelization is performed across the different candidate (\lambda_1, \lambda_2) pairs. The default is 1, i.e., no parallelization.

control

a list of parameters for controlling the stopping criteria for the numerical optimization algorithm, as in learner.
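The way lambda_1_all, lambda_2_all and n_cores interact can be pictured with the following sketch. It is not the package's internal code: toy_cv_error is a hypothetical stand-in for the cross-validated error of a single pair, and parallel::mclapply is only one possible way to distribute the pairs across cores (on Windows it supports mc.cores = 1 only).

```r
# Candidate values, as they would be passed to lambda_1_all and lambda_2_all
lambda_1_all <- c(1, 10, 100)
lambda_2_all <- c(1, 10, 100)

# All candidate (lambda_1, lambda_2) pairs; lambda_1 varies fastest
grid <- expand.grid(lambda_1 = lambda_1_all, lambda_2 = lambda_2_all)

# Hypothetical stand-in for the held-out error of one pair (NOT the LEARNER error)
toy_cv_error <- function(lambda_1, lambda_2) {
  (log10(lambda_1) - 1)^2 + (log10(lambda_2) - 2)^2
}

# One task per pair; values of n_cores above 1 need a Unix-alike system here
n_cores <- 1
errors <- parallel::mclapply(
  seq_len(nrow(grid)),
  function(i) toy_cv_error(grid$lambda_1[i], grid$lambda_2[i]),
  mc.cores = n_cores
)

# Rows correspond to lambda_1 values, columns to lambda_2 values
mse_grid <- matrix(unlist(errors),
                   nrow = length(lambda_1_all),
                   ncol = length(lambda_2_all),
                   dimnames = list(as.character(lambda_1_all),
                                   as.character(lambda_2_all)))
```

Because each pair is evaluated independently, the number of useful cores is bounded by the number of candidate pairs (nine in this sketch).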

Details

Given sets of candidate values of \lambda_1 and \lambda_2, this function performs k-fold cross-validation (with k = n_folds) to select the pair (\lambda_1, \lambda_2) with the smallest held-out error. The entries of Y_target are randomly partitioned into k (approximately) equally sized subsamples. Each training data set is obtained by removing one of the k subsamples, and the corresponding test data set consists of the held-out subsample. The learner function is applied to each training data set. The held-out error is the mean squared error (MSE) between the entries in the test data sets and the values imputed for those entries from the LEARNER estimates. See McGrath et al. (2024) for further details.
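The entrywise partitioning described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the package's implementation: the data are simulated, toy_estimator is a hypothetical placeholder (column-mean imputation) standing in for the LEARNER estimate, and averaging the per-fold errors is one possible way of aggregating them.

```r
set.seed(1)

# Toy target matrix (simulated, for illustration only)
Y_target <- matrix(rnorm(20 * 10), nrow = 20, ncol = 10)
n_folds <- 4

# Randomly assign every entry of Y_target to one of n_folds subsamples
n_entries <- length(Y_target)
fold_id <- sample(rep(seq_len(n_folds), length.out = n_entries))

# Hypothetical placeholder estimator: fills a matrix with the column means
# of the observed entries. In cv.learner, the LEARNER estimate plays this role.
toy_estimator <- function(Y_train) {
  col_means <- colMeans(Y_train, na.rm = TRUE)
  matrix(col_means, nrow = nrow(Y_train), ncol = ncol(Y_train), byrow = TRUE)
}

fold_mse <- numeric(n_folds)
for (k in seq_len(n_folds)) {
  held_out <- which(fold_id == k)
  Y_train <- Y_target
  Y_train[held_out] <- NA              # remove the k-th subsample
  Y_hat <- toy_estimator(Y_train)      # estimate from the training entries
  # Compare the held-out entries with their imputed values
  fold_mse[k] <- mean((Y_target[held_out] - Y_hat[held_out])^2)
}

# Assumed aggregation: average the held-out MSE over the folds
cv_error <- mean(fold_mse)
```

Repeating this computation for every candidate pair and keeping the pair with the smallest error gives the selection rule described in this section.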

Value

A list with the following elements:

lambda_1_min

value of \lambda_1 with the smallest MSE

lambda_2_min

value of \lambda_2 with the smallest MSE

mse_all

matrix containing the MSE value for each (\lambda_1, \lambda_2) pair. The rows correspond to the \lambda_1 values, and the columns correspond to the \lambda_2 values.

r

rank value used.
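The relationship between mse_all and the selected values can be illustrated with a small sketch. The MSE values below are made up for illustration, and the lookup shown is an assumption about how such a matrix can be read, not the package's own code.

```r
lambda_1_all <- c(1, 10, 100)
lambda_2_all <- c(1, 10, 100)

# Illustrative MSE values only; rows index lambda_1, columns index lambda_2
mse_all <- matrix(c(0.90, 0.70, 0.80,
                    0.60, 0.40, 0.50,
                    0.95, 0.75, 0.85),
                  nrow = 3, byrow = TRUE)

# Row and column position of the smallest MSE (first one if there are ties)
best <- which(mse_all == min(mse_all), arr.ind = TRUE)[1, ]

lambda_1_min <- lambda_1_all[unname(best["row"])]
lambda_2_min <- lambda_2_all[unname(best["col"])]
```

In this toy matrix the smallest entry sits in the second row and second column, so both selected values equal 10.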

References

McGrath, S., Zhu, C., Guo, M. and Duan, R. (2024). LEARNER: A transfer learning method for low-rank matrix estimation. arXiv preprint arXiv:2412.20605.

Examples

res <- cv.learner(Y_source = dat_highsim$Y_source,
                  Y_target = dat_highsim$Y_target,
                  lambda_1_all = c(1, 10, 100),
                  lambda_2_all = c(1, 10, 100),
                  step_size = 0.003)



[Package learner version 0.1.0 Index]