TS_on_LDA {LDATS} | R Documentation |
Conduct a set of Time Series analyses on a set of LDA models
Description
This is a wrapper function that expands the main Time Series
analyses function (TS) across the LDA models (estimated using LDA or
LDA_set) and the Time Series models, with respect to both continuous
time formulas and the number of discrete changepoints. This function
allows direct passage of the control parameters for the parallel
tempering MCMC through to the main Time Series function, TS, via the
control argument.
check_TS_on_LDA_inputs checks that the inputs to TS_on_LDA are of
proper classes for a full analysis.
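The expansion described above can be pictured as a full cross of LDA models, formulas, and changepoint counts. The following is a minimal base R sketch of that idea only, not the package's internal code; the model labels in lda_names are hypothetical placeholders.

```r
# Hypothetical labels standing in for two selected LDA models
lda_names <- c("lda_model_A", "lda_model_B")
formulas <- c(~ 1, ~ newmoon)
nchangepoints <- 0:1

# Store each formula as text so it can sit in a data frame column
formula_text <- vapply(
  formulas,
  function(f) paste(deparse(f), collapse = ""),
  character(1)
)

# One row per (LDA model, formula, number of changepoints) combination
combos <- expand.grid(
  LDA = lda_names,
  formula = formula_text,
  nchangepoints = nchangepoints,
  stringsAsFactors = FALSE
)
nrow(combos)
```

With two LDA models, two formulas, and two changepoint counts, the grid has 2 x 2 x 2 rows, so the number of Time Series models to fit grows multiplicatively with each input.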
Usage
TS_on_LDA(
LDA_models,
document_covariate_table,
formulas = ~1,
nchangepoints = 0,
timename = "time",
weights = NULL,
control = list()
)
check_TS_on_LDA_inputs(
LDA_models,
document_covariate_table,
formulas = ~1,
nchangepoints = 0,
timename = "time",
weights = NULL,
control = list()
)
Arguments
LDA_models |
List of LDA models (class LDA_set, produced by LDA_set) or a
singular LDA model (class LDA, produced by LDA).
|
document_covariate_table |
Document covariate table (rows: documents,
columns: time index and covariate options). Every model needs a
covariate to describe the time value for each document (in whatever
units and whose name in the table is input in timename)
that dictates the application of the change points.
In addition, all covariates named within specific models in
formulas must be included. Must be conformable to a data table,
as verified by check_document_covariate_table.
|
formulas |
Vector of formula(s) for the
continuous (non-change point) component of the time series models. Any
predictor variable included in a formula must also be a column in the
document_covariate_table. Each element (formula) in the vector
is evaluated for each number of change points and each LDA model.
|
nchangepoints |
Vector of integers corresponding to the number
of change points to include in the time series models. 0 is a valid input
corresponding to no change points (i.e., a singular time series
model), and the current implementation can reasonably include up to 6
change points. Each element in the vector is the number of change points
used to segment the data for each formula (entry in formulas)
component of the TS model, for each selected LDA model.
|
timename |
Character element indicating the time variable
used in the time series. Defaults to "time". The variable must be
integer-conformable or a Date. If the variable named
is a Date, the input is converted to an integer, resulting in the
timestep being 1 day, which is often not desired behavior.
|
weights |
Optional numeric vector of weights for each
document. Defaults to NULL, translating to an equal weight for
each document. When using multinom_TS in a standard LDATS
analysis, it is advisable to weight the documents by their total size,
as the result of LDA is a matrix of
proportions, which does not account for size differences among documents.
For most models, a scaling of the weights (so that the average is 1) is
most appropriate, and this is accomplished using document_weights.
|
control |
A list of parameters to control the fitting of the
Time Series model, including the parallel tempering Markov Chain
Monte Carlo (ptMCMC) controls. Values not input assume defaults set by
TS_control.
|
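Two of the argument notes above (the weight scaling and the Date conversion) can be illustrated in base R. This is a sketch under assumed toy data; document_weights itself may differ in details such as rounding, and the small document_term_table below is hypothetical.

```r
# Hypothetical document-term table: 3 documents (rows), 2 terms (columns)
document_term_table <- matrix(c(10, 30, 20, 40, 60, 80), nrow = 3)

# Weight each document by its total size, scaled so the average weight is 1
sizes <- rowSums(document_term_table)
weights <- sizes / mean(sizes)
mean(weights)

# A Date time variable becomes a count of days when converted to integer,
# so monthly samples end up 28 to 31 timesteps apart rather than 1
dates <- as.Date(c("2020-01-01", "2020-02-01", "2020-03-01"))
time_as_integer <- as.integer(dates)
diff(time_as_integer)
```

If a monthly or per-census timestep is wanted, one option is to supply an integer index column (such as a sampling-period number) and name it in timename instead of passing the Date column.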
Value
TS_on_LDA: TS_on_LDA-class list of results from TS applied for each
model on each LDA model input.
check_TS_on_LDA_inputs: An error message is thrown if any input
is not proper, else NULL.
Examples
data(rodents)
document_term_table <- rodents$document_term_table
document_covariate_table <- rodents$document_covariate_table
LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
LDA_models <- select_LDA(LDAs)
weights <- document_weights(document_term_table)
formulas <- c(~ 1, ~ newmoon)
mods <- TS_on_LDA(LDA_models, document_covariate_table, formulas,
                  nchangepoints = 0:1, timename = "newmoon",
                  weights = weights)
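As a further illustration of the control argument, the sketch below repeats the example with a shortened ptMCMC run. It assumes that nit is the iteration-count entry recognised by TS_control; the value 1000 is an arbitrary choice for a quick run, not a recommended setting.

```r
library(LDATS)

data(rodents)
document_term_table <- rodents$document_term_table
document_covariate_table <- rodents$document_covariate_table

# Fit and select LDA models as in the example above
LDAs <- LDA_set(document_term_table, topics = 2:3, nseeds = 2)
LDA_models <- select_LDA(LDAs)
weights <- document_weights(document_term_table)

# Entries not given in the control list fall back to the TS_control defaults;
# nit = 1000 is assumed here to shorten the ptMCMC run for illustration
mods_short <- TS_on_LDA(
  LDA_models,
  document_covariate_table,
  formulas = c(~ 1, ~ newmoon),
  nchangepoints = 0:1,
  timename = "newmoon",
  weights = weights,
  control = list(nit = 1000)
)

class(mods_short)
```

Shorter runs are convenient for checking that inputs are wired up correctly, but the resulting estimates may not be reliable enough for inference.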
[Package LDATS version 0.3.0]