egf {epigrowthfit} | R Documentation |
Fit Nonlinear Mixed Effects Models of Epidemic Growth
Description
Fits nonlinear mixed effects models of epidemic growth
to collections of one or more disease incidence time series.
Usage
egf(model, ...)
## S3 method for class 'egf_model'
egf(model,
formula_ts,
formula_windows,
formula_parameters = list(),
formula_priors = list(),
data_ts,
data_windows,
subset_ts = NULL,
subset_windows = NULL,
select_windows = NULL,
na_action_ts = c("fail", "pass"),
na_action_windows = c("fail", "omit"),
control = egf_control(),
init = list(),
map = list(),
fit = TRUE,
se = FALSE,
...)
Arguments
model |
an R object specifying a top level nonlinear model,
typically of class egf_model .
|
formula_ts |
a formula of the form cbind(time, x) ~ ts
specifying one or more disease incidence time series in long format.
ts must evaluate to a factor (insofar as as.factor(ts)
is a factor) grouping the data by time series.
time must evaluate to a numeric vector that is increasing
within levels of ts .
Date , POSIXct , and POSIXlt
vectors are supported and coerced to numeric with
julian(time) .
Finally, x must evaluate to a non-negative numeric vector with
x[i] equal to the number of cases observed over the interval
(time[i-1], time[i]] .
Edge cases like x[1] are ignored internally.
Elements of x that are not integer-valued are rounded with a
warning.
formula_ts = cbind(time, x) ~ 1 can be supplied
when there is only one time series; it is equivalent
to formula_ts = cbind(time, x) ~ ts with ts
evaluating to rep(factor(1), length(x)) .
|
formula_windows |
a formula of the form cbind(start, end) ~ ts
specifying disjoint fitting windows (start, end] in long format.
If formula_ts = cbind(time, x) ~ ts1 and
formula_windows = cbind(start, end) ~ ts2 ,
then observation x[i] is associated with window
(start[j], end[j]] if and only if time[i-1] >= start[j] ,
time[i] <= end[j] , and ts1[i] == ts2[j] .
|
formula_parameters |
a list of formulae of the form parameter ~ terms
specifying mixed effects models for top level
nonlinear model parameters using lme4-like syntax
(see, e.g., help("lmer", package = "lme4") ).
Alternatively, a formula of the form ~terms to be recycled
for all parameters.
A list of parameters for which formulae may be specified can be
retrieved with egf_top .
Specifically, deparse(parameter) must be an element of
egf_top(model) .
The default for parameters not assigned a formula is ~1 .
|
formula_priors |
a list of formulae of the form parameter ~ prior
defining priors on:
(i) top level nonlinear model parameters,
(ii) fixed effect coefficients and random effect covariance parameters
(elements of segments beta and theta of the bottom level
parameter vector), or
(iii) random effect covariance matrices
(elements of a list Sigma containing the matrices).
prior must be a call to a prior function
with arguments specifying suitable hyperparameters.
In case (i),
deparse(parameter) must be an element of
egf_top(model) ,
and hyperparameters supplied on the right hand side must have length 1.
In cases (ii) and (iii),
parameter must be beta , theta , or Sigma
or a call to [ or [[ referring to a subset or element
of beta , theta , or Sigma
(e.g., beta[index] , where index is a valid index vector
for beta ),
and hyperparameters are recycled to the length of the indicated
subset.
Expressions prior and index are evaluated in the
corresponding formula environment.
|
data_ts , data_windows |
data frames, lists, or environments to be searched for variables
named in the corresponding formula_* and subset_*
arguments. (formula_parameters uses data_windows .)
Formula environments are searched for variables not found here.
|
subset_ts , subset_windows |
expressions to be evaluated in the corresponding data_*
data frames.
The value should be a valid index vector for the rows of the data frame.
Rows that are not indexed are discarded.
Rows that are indexed are filtered further (e.g., time series
with zero associated fitting windows are discarded regardless of
subset_ts ).
The default is to preserve all rows for further filtering.
|
select_windows |
an expression indicating additional variables in data_windows
(if it is a data frame) to be preserved in the returned object for use
by methods.
The default is to preserve nothing.
A dot ‘.’ is to preserve all variables not occurring in
formula_windows or formula_parameters .
Outside of these two special cases, evaluation of select
follows the implementation of subset.data.frame .
|
na_action_ts |
a character string affecting the handling of NA in x
if formula_ts = cbind(time, x) ~ ts .
"fail" is to throw an error.
"pass" is to ignore NA when fitting and replace
NA when predicting.
NA in time and ts are always an error.
|
na_action_windows |
a character string affecting the handling of NA in
formula_windows and formula_parameters variables.
"fail" is to throw an error.
"omit" is to discard incomplete rows of data.
|
control |
an egf_control object specifying control parameters.
|
init |
a named list of numeric vectors with possible elements beta ,
theta , and b , specifying values to be used in the first
likelihood evaluation for the so-named segments of the bottom level
parameter vector. The default value of each segment is a zero vector,
with the exception that "(Intercept)" coefficients in
beta have default values computed from supplied time series.
Use NA to indicate elements that should retain their default
value.
|
map |
a named list of factors with possible elements beta ,
theta , and b , each as long as the so-named segment
of the bottom level parameter.
Elements of a segment <name> indexed by
is.na(map[["<name>"]])
are fixed at their initial values, rather than estimated,
and elements corresponding to a common factor level are constrained
to have a common value during estimation.
map[["<name>"]] can be an index vector for segment
<name> , instead of a factor.
In this case, the indexed elements of that segment are fixed at their
initial values.
|
fit |
a logical. If FALSE , then egf returns early
(before fitting) with a partial model object.
The details of the partial result are subject to change and
therefore sparsely documented, on purpose ...
|
se |
a logical. If TRUE , then the Hessian matrix of the negative
log marginal likelihood function is computed and inverted to
approximate the joint covariance matrix of segments beta and
theta of the bottom level parameter vector.
Standard errors on the fitted values of all top level nonlinear model
parameters are computed approximately using the delta method.
Computations are preserved in the model object for reuse by methods.
|
... |
additional arguments passed from or to other methods.
|
Details
Users attempting to set arguments formula_priors
, init
,
and map
should know the structure of the bottom level parameter
vector. It is described under topic egf-class
.
If
formula_ts = cbind(time, x) ~ ts1
formula_windows = cbind(start, end) ~ ts2
then it is expected that time
, start
, and end
(after coercion to numeric) measure time on the same scale.
To be precise, numeric times should have a common unit of measure
and, at least within time series, represent displacements from a
common reference time.
These conditions will always hold if time
, start
, and
end
all evaluate to Date
, POSIXct
,
or POSIXlt
vectors.
When day of week effects are estimated, numeric times are interpreted
as numbers of days since midnight on January 1, 1970, so that time
points can be mapped unambiguously to days of week.
Furthermore, in this case, time
(after coercion to numeric) is
required to be integer-valued with one day spacing in all time series.
This means that
isTRUE(all.equal(time, round(time))) &&
all(range(diff(round(time))) == 1)
must be TRUE
in each level of ts1
.
These conditions ensure that intervals between successive time points
represent exactly one day of week.
Value
A list inheriting from class egf
.
See topic egf-class
for class documentation.
See Also
The many methods for class egf
,
listed by methods(class = "egf")
.
Examples
## Simulate 'N' incidence time series exhibiting exponential growth
set.seed(180149L)
N <- 10L
f <- function(time, r, c0) {
lambda <- diff(exp(log(c0) + r * time))
c(NA, rpois(lambda, lambda))
}
time <- seq.int(0, 40, 1)
r <- rlnorm(N, -3.2, 0.2)
c0 <- rlnorm(N, 6, 0.2)
data_ts <-
data.frame(country = gl(N, length(time), labels = LETTERS[1:N]),
time = rep.int(time, N),
x = unlist(Map(f, time = list(time), r = r, c0 = c0)))
rm(f, time)
## Define fitting windows (here, two per time series)
data_windows <-
data.frame(country = gl(N, 1L, 2L * N, labels = LETTERS[1:N]),
wave = gl(2L, 10L),
start = c(sample(seq.int(0, 5, 1), N, TRUE),
sample(seq.int(20, 25, 1), N, TRUE)),
end = c(sample(seq.int(15, 20, 1), N, TRUE),
sample(seq.int(35, 40, 1), N, TRUE)))
## Estimate the generative model
m1 <-
egf(model = egf_model(curve = "exponential", family = "pois"),
formula_ts = cbind(time, x) ~ country,
formula_windows = cbind(start, end) ~ country,
formula_parameters = ~(1 | country:wave),
data_ts = data_ts,
data_windows = data_windows,
se = TRUE)
## Re-estimate the generative model with:
## * Gaussian prior on beta[1L]
## * LKJ prior on all random effect covariance matrices
## (here there happens to be just one)
## * initial value of 'theta' set explicitly
## * theta[3L] fixed at initial value
m2 <-
update(m1,
formula_priors = list(beta[1L] ~ Normal(mu = -3, sigma = 1),
Sigma ~ LKJ(eta = 2)),
init = list(theta = c(log(0.5), log(0.5), 0)),
map = list(theta = 3L))
[Package
epigrowthfit version 0.15.3
Index]