output_estimates {banditsCI} | R Documentation |
Calculates average response and differences in average response under counterfactual treatment policies.
Estimates are produced using provided inverse probability weighted (IPW) or augmented inverse probability weighted (AIPW) scores paired with various adaptive weighting schemes, as proposed in Hadad et al. (2021) and Zhan et al. (2021).
We briefly outline the target quantities:
For observations indexed by t \in \{1,\dots,A\} and treatments w \in \{1,\dots,K\}, we denote as Y_t(w) the potential outcome for the unit at time t under treatment w.
A policy \pi is the treatment assignment procedure under evaluation, described in terms of the probability that each subject receives each counterfactual treatment.
We target estimation of average response under a specified policy:
Q(\pi) := \sum_{w = 1}^{K}\textrm{E}\left[\pi(w)Y_t(w)\right]
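As a rough illustration of this target, the sketch below computes a simple plug-in estimate of Q(\pi) by averaging the policy-weighted scores over observations. It assumes uniform weights and that the scores and the policy are both A x K matrices; policy_value_uniform is a hypothetical helper written for this illustration, not a function exported by banditsCI, and it does not reproduce the adaptive weighting schemes.

```r
# Hypothetical helper (not part of banditsCI): uniform-weight plug-in
# estimate of Q(pi) from an A x K score matrix and an A x K policy matrix.
policy_value_uniform <- function(gammahat, policy) {
  stopifnot(all(dim(gammahat) == dim(policy)))
  # Per-observation policy-weighted score: sum over w of pi(w) * gammahat[t, w]
  per_obs <- rowSums(policy * gammahat)
  # Uniform weighting: plain average over the A observations
  mean(per_obs)
}

# Illustrative scores for A = 4 observations and K = 3 treatments
gammahat <- matrix(c(0.5, 0.8, 0.6,
                     0.3, 0.9, 0.2,
                     0.5, 0.7, 0.4,
                     0.8, 0.2, 0.6), ncol = 3, byrow = TRUE)
# Policy that always assigns treatment 2
policy <- matrix(rep(c(0, 1, 0), times = 4), ncol = 3, byrow = TRUE)

q_hat <- policy_value_uniform(gammahat, policy)
```

Because the policy puts all its mass on treatment 2, q_hat here is simply the mean of the second column of gammahat.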
The user may specify a list of policies to be evaluated via policy1.
Alternatively, if policy0 is provided, they may estimate policy contrasts:
\Delta(\pi^1,\pi^2) := Q(\pi^1) - Q(\pi^2)
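By linearity of expectation, the contrast can equivalently be written as a single sum over treatments. The expansion below only restates the two definitions above and makes no claim about how the package computes the contrast internally:

```latex
\begin{aligned}
\Delta(\pi^1,\pi^2) &= Q(\pi^1) - Q(\pi^2) \\
&= \sum_{w = 1}^{K}\textrm{E}\left[\pi^1(w)Y_t(w)\right] - \sum_{w = 1}^{K}\textrm{E}\left[\pi^2(w)Y_t(w)\right] \\
&= \sum_{w = 1}^{K}\textrm{E}\left[\left(\pi^1(w) - \pi^2(w)\right)Y_t(w)\right]
\end{aligned}
```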
output_estimates(
policy0 = NULL,
policy1,
contrasts = "combined",
gammahat,
probs_array,
uniform = TRUE,
non_contextual_minvar = TRUE,
contextual_minvar = TRUE,
non_contextual_stablevar = TRUE,
contextual_stablevar = TRUE,
non_contextual_twopoint = TRUE,
floor_decay = 0
)
policy0 |
Optional matrix. Single policy probability matrix for contrast evaluation, dimensions |
policy1 |
List of matrices. List of counterfactual policy matrices for evaluation, dimensions |
contrasts |
Character. The method to estimate policy contrasts, either |
gammahat |
(A)IPW scores matrix with dimensions |
probs_array |
Numeric array. Probability matrix or array with dimensions |
uniform |
Logical. Estimate uniform weights. |
non_contextual_minvar |
Logical. Estimate non-contextual |
contextual_minvar |
Logical. Estimate contextual |
non_contextual_stablevar |
Logical. Estimate non-contextual |
contextual_stablevar |
Logical. Estimate contextual |
non_contextual_twopoint |
Logical. Estimate |
floor_decay |
Numeric. Floor decay parameter used in the calculation. Default is 0. |
A list of treatment effect estimates under different weighting schemes.
Hadad V, Hirshberg DA, Zhan R, Wager S, Athey S (2021). “Confidence intervals for policy evaluation in adaptive experiments.” Proceedings of the National Academy of Sciences, 118(15), e2014602118.
Zhan R, Hadad V, Hirshberg DA, Athey S (2021). “Off-policy evaluation via adaptive weighting with data from contextual bandits.” In Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining, 2125–2135.
set.seed(123)
# In a non-contextual setting, generate example values for policy1, gammahat, and probs_array
gammahat <- matrix(c(0.5, 0.8, 0.6,
                     0.3, 0.9, 0.2,
                     0.5, 0.7, 0.4,
                     0.8, 0.2, 0.6), ncol = 3, byrow = TRUE)
policy0 <- matrix(c(1, 0, 0,
                    1, 0, 0,
                    1, 0, 0,
                    1, 0, 0), ncol = 3, byrow = TRUE)
policy1 <- list(matrix(c(0, 1, 0,
                         0, 1, 0,
                         0, 1, 0,
                         0, 1, 0), ncol = 3, byrow = TRUE))
probs_array <- array(0, dim = c(4, 4, 3))
for (i in 1:4) {
  temp_vector <- runif(3)
  normalized_vector <- temp_vector / sum(temp_vector)
  probs_array[i, 1, ] <- normalized_vector
}
for (k in 1:3) {
  for (i in 1:4) {
    temp_vector <- runif(3)
    normalized_vector <- temp_vector / sum(temp_vector)
    probs_array[i, 2:4, k] <- normalized_vector
  }
}
estimates <- output_estimates(policy1 = policy1,
                              policy0 = policy0,
                              gammahat = gammahat,
                              probs_array = probs_array)
# plot
plot_results <- function(result) {
  estimates <- result[, "estimate"]
  std.errors <- result[, "std.error"]
  labels <- rownames(result)
  # Define the limits for the x-axis based on estimates and std.errors
  xlims <- c(min(estimates - 2*std.errors), max(estimates + 2*std.errors))
  # Create the basic error bar plot using base R
  invisible(
    plot(estimates, 1:length(estimates), xlim = xlims, xaxt = "n",
         xlab = "Coefficient Estimate", ylab = "",
         yaxt = "n", pch = 16, las = 1, main = "Coefficients and CIs")
  )
  # Add y-axis labels
  invisible(
    axis(2, at = 1:length(estimates), labels = labels, las = 1, tick = FALSE,
         line = 0.5)
  )
  # Add the x-axis values, with tick positions rounded to the nearest 0.5
  x_ticks <- seq(from = round(xlims[1] / 0.5) * 0.5,
                 to = round(xlims[2] / 0.5) * 0.5, by = 0.5)
  invisible(
    axis(1,
         at = x_ticks,
         labels = x_ticks)
  )
  # Add error bars spanning one standard error on each side of the estimate
  invisible(
    segments(estimates - std.errors,
             1:length(estimates),
             estimates + std.errors,
             1:length(estimates))
  )
}
sample_result <- estimates[[1]]
op <- par(no.readonly = TRUE)
par(mar=c(5, 12, 4, 2))
plot_results(sample_result)
par(op)
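
The plotting helper above draws bars of one standard error. If interval bounds are wanted instead, a minimal sketch is shown below, assuming a result matrix with "estimate" and "std.error" columns (as used in plot_results) and assuming a normal approximation is appropriate for the chosen weighting scheme. The numbers and row names are made up purely for illustration and are not output from output_estimates.

```r
# Hypothetical result matrix with the same column names as used above;
# the values and row names are illustrative only.
result <- matrix(c(0.65, 0.10,
                   0.60, 0.08),
                 ncol = 2, byrow = TRUE,
                 dimnames = list(c("scheme_a", "scheme_b"),
                                 c("estimate", "std.error")))

# Two-sided 95% normal-approximation interval: estimate +/- z * std.error
z <- qnorm(0.975)
ci <- cbind(lower = result[, "estimate"] - z * result[, "std.error"],
            upper = result[, "estimate"] + z * result[, "std.error"])
ci
```

The same two lines can be applied to any element of the returned list that has these two columns, for example sample_result in the example above.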