standardize_glm {stdReg2}R Documentation

Get regression standardized estimates from a glm

Description

Get regression standardized estimates from a glm

Usage

standardize_glm(
  formula,
  data,
  values,
  clusterid,
  matched_density_cases,
  matched_density_controls,
  matching_variable,
  p_population,
  case_control = FALSE,
  ci_level = 0.95,
  ci_type = "plain",
  contrasts = NULL,
  family = "gaussian",
  reference = NULL,
  transforms = NULL
)

Arguments

formula

The formula which is used to fit the model for the outcome.

data

The data.

values

A named list or data.frame specifying the variables and values at which marginal means of the outcome will be estimated.

clusterid

An optional string containing the name of a cluster identification variable when data are clustered.

matched_density_cases

A function of the matching variable. The probability (or density) of the matched variable among the cases.

matched_density_controls

A function of the matching variable. The probability (or density) of the matched variable among the controls.

matching_variable

The matching variable extracted from the data set.

p_population

Specifies the incidence in the population when case_control=TRUE.

case_control

Whether the data comes from a case-control study.

ci_level

Coverage probability of confidence intervals.

ci_type

A string, indicating the type of confidence intervals. Either "plain", which gives untransformed intervals, or "log", which gives log-transformed intervals.

contrasts

A vector of contrasts in the following format: If set to "difference" or "ratio", then \psi(x)-\psi(x_0) or \psi(x) / \psi(x_0) are constructed, where x_0 is a reference level specified by the reference argument. Has to be NULL if no references are specified.

family

The family argument which is used to fit the glm model for the outcome.

reference

A vector of reference levels in the following format: If contrasts is not NULL, the desired reference level(s). This must be a vector or list the same length as contrasts, and if not named, it is assumed that the order is as specified in contrasts.

transforms

A vector of transforms in the following format: If set to "log", "logit", or "odds", the standardized mean \theta(x) is transformed into \psi(x)=\log\{\theta(x)\}, \psi(x)=\log[\theta(x)/\{1-\theta(x)\}], or \psi(x)=\theta(x)/\{1-\theta(x)\}, respectively. If the vector is NULL, then \psi(x)=\theta(x).

Details

standardize_glm performs regression standardization in generalized linear models, at specified values of the exposure, over the sample covariate distribution. Let Y, X, and Z be the outcome, the exposure, and a vector of covariates, respectively. standardize_glm uses a fitted generalized linear model to estimate the standardized mean \theta(x)=E\{E(Y|X=x,Z)\}, where x is a specific value of X, and the outer expectation is over the marginal distribution of Z.

Value

An object of class std_glm. This is basically a list with components estimates and covariance estimates in res. Results for transformations, contrasts, references are stored in res_contrasts. Obtain numeric results in a data frame with the tidy function.

References

Rothman K.J., Greenland S., Lash T.L. (2008). Modern Epidemiology, 3rd edition. Lippincott, Williams & Wilkins.

Sjölander A. (2016). Regression standardization with the R-package stdReg. European Journal of Epidemiology 31(6), 563-574.

Sjölander A. (2016). Estimation of causal effect measures with the R-package stdReg. European Journal of Epidemiology 33(9), 847-858.

Examples


# basic example
# needs to correctly specify the outcome model and no unmeasered confounders
# (+ standard causal assunmptions)
set.seed(6)
n <- 100
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rbinom(n, 1, prob = (1 + exp(X + Z))^(-1))
dd <- data.frame(Z, X, Y)
x <- standardize_glm(
  formula = Y ~ X * Z,
  family = "binomial",
  data = dd,
  values = list(X = 0:1),
  contrasts = c("difference", "ratio"),
  reference = 0
)
x
# different transformations of causal effects

# example from Sjölander (2016) with case-control data
# here the matching variable needs to be passed as an argument
singapore <- AF::singapore
Mi <- singapore$Age
m <- mean(Mi)
s <- sd(Mi)
d <- 5
standardize_glm(
  formula = Oesophagealcancer ~ (Everhotbev + Age + Dial + Samsu + Cigs)^2,
  family = binomial, data = singapore,
  values = list(Everhotbev = 0:1), clusterid = "Set",
  case_control = TRUE,
  matched_density_cases = function(x) dnorm(x, m, s),
  matched_density_controls = function(x) dnorm(x, m - d, s),
  matching_variable = Mi,
  p_population = 19.3 / 100000
)

# multiple exposures
set.seed(7)
n <- 100
Z <- rnorm(n)
X1 <- rnorm(n, mean = Z)
X2 <- rnorm(n)
Y <- rbinom(n, 1, prob = (1 + exp(X1 + X2 + Z))^(-1))
dd <- data.frame(Z, X1, X2, Y)
x <- standardize_glm(
  formula = Y ~ X1 + X2 + Z,
  family = "binomial",
  data = dd, values = list(X1 = 0:1, X2 = 0:1),
  contrasts = c("difference", "ratio"),
  reference = c(X1 = 0, X2 = 0)
)
x
tidy(x)

# continuous exposure
set.seed(2)
n <- 100
Z <- rnorm(n)
X <- rnorm(n, mean = Z)
Y <- rnorm(n, mean = X + Z + 0.1 * X^2)
dd <- data.frame(Z, X, Y)
x <- standardize_glm(
  formula = Y ~ X * Z,
  family = "gaussian",
  data = dd,
  values = list(X = seq(-1, 1, 0.1))
)

# plot standardized mean as a function of x
plot(x)
# plot standardized mean - standardized mean at x = 0 as a function of x
plot(x, contrast = "difference", reference = 0)


[Package stdReg2 version 1.0.1 Index]