simdat {ipd}    R Documentation

Data generation function for various underlying models

Description

Simulates data under one of several underlying models ("mean", "quantile", "ols", or "logistic"), split into training, labeled, and unlabeled data sets.

Usage

simdat(
  n = c(300, 300, 300),
  effect = 1,
  sigma_Y = 1,
  model = "ols",
  shift = 0,
  scale = 1
)

Arguments

n

Integer vector of length 3 giving the sample sizes of the training, labeled, and unlabeled data sets, respectively. Defaults to c(300, 300, 300).

effect

Regression coefficient for the first variable of interest for inference. Defaults to 1.

sigma_Y

Residual variance for the generated outcome. Defaults to 1.

model

The type of model to be generated. Must be one of "mean", "quantile", "ols", or "logistic". Defaults to "ols".

shift

Scalar shift of the predictions for continuous outcomes (i.e., "mean", "quantile", and "ols"). Defaults to 0.

scale

Scaling factor for the predictions for continuous outcomes (i.e., "mean", "quantile", and "ols"). Defaults to 1.

Value

A data.frame with one row per simulated observation (sum(n) rows in total) and columns corresponding to the labeled outcome (Y), the predicted outcome (f), a character variable (set) indicating which data set the observation belongs to (training, labeled, or unlabeled), and four independent, normally distributed predictors (X1, X2, X3, and X4), where applicable.

Examples


#-- Mean

dat_mean <- simdat(c(100, 100, 100), effect = 1, sigma_Y = 1,
  model = "mean")

head(dat_mean)

#-- Linear Regression

dat_ols <- simdat(c(100, 100, 100), effect = 1, sigma_Y = 1,
  model = "ols")

head(dat_ols)
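
#-- Logistic Regression

# The sketches below are illustrative additions, not part of the original
# examples. They use only the arguments and columns documented on this page.
# library(ipd) is needed only when running them outside the package examples.

library(ipd)

dat_logit <- simdat(c(100, 100, 100), effect = 1, sigma_Y = 1,
  model = "logistic")

head(dat_logit)

#-- Inspecting the Three Data Sets

# The set column records which data set each row belongs to.

table(dat_logit$set)

# Assumption: the values of set are spelled exactly "labeled" and
# "unlabeled", as the Value section suggests. Check unique(dat_logit$set)
# before relying on these subsets.

dat_labeled <- dat_logit[dat_logit$set == "labeled", ]

dat_unlabeled <- dat_logit[dat_logit$set == "unlabeled", ]

#-- Shifted and Scaled Predictions

# shift = 1 and scale = 2 are arbitrary illustrative values. How they are
# applied to the predictions is not stated on this page, so compare f
# against Y in the result rather than assuming a particular formula.

dat_shift <- simdat(c(100, 100, 100), effect = 1, sigma_Y = 1,
  model = "ols", shift = 1, scale = 2)

summary(dat_shift[dat_shift$set == "labeled", c("Y", "f")])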


[Package ipd version 0.1.3 Index]