BayesECM {ezECM}                                        R Documentation
Training a Bayesian ECM (B-ECM) model

Usage:

BayesECM(
  Y,
  BT = c(100, 1000),
  priors = "default",
  verb = FALSE,
  transform = "logit"
)
Arguments:

Y: matrix or data frame of training data. Each row is an event
   observation, each column holds p-values for one discriminant, and
   the final column indicates the event category.

BT: integer vector of length 2, stipulating the number of burn-in
   samples (first element) and the total number of Markov chain Monte
   Carlo samples (second element).

priors: list of parameters to be used in prior distributions. See
   details.

verb: logical. A setting of verb = TRUE prints progress during model
   training.

transform: character string specifying the transform to use on the
   elements of Y, either "logit" (the default) or "arcsin".
Details:

The output of BayesECM() provides a trained Bayesian Event
Categorization Matrix (B-ECM) model, utilizing the data and prior
parameter settings. If there are missing values in Y, these values are
imputed. A trained BayesECM model is then used with the
predict.BayesECM() function to calculate expected category
probabilities.
Before the data in Y is used with the model, the p-values \in (0,1]
are transformed in an effort to better align the data with some
properties of the normal distribution. When transform == "logit", the
inverse of the logistic function

Y_{N \times p} = \log\left(\texttt{Y}\right) - \log\left(1 - \texttt{Y}\right)

maps the values to the real number line. Values of Y exactly equal to
0 or 1 cannot be used when transform == "logit". Setting the argument
transform == "arcsin" uses the transformation

Y_{N \times p} = (2/\pi) \times \arcsin\sqrt{\texttt{Y}},

further described in Anderson et al. (2007). From here forward, the
variable Y_{N \times p} should be understood to be the transformation
of Y, where N is the total number of rows in Y and p is the number of
discriminant columns in Y.
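For intuition, both transforms can be reproduced in a few lines of
base R. This is a minimal sketch; the p-value matrix pv and its column
names are invented for illustration.

# A small matrix of p-values standing in for the discriminant columns of Y
pv <- matrix(c(0.03, 0.40, 0.75, 0.12), nrow = 2,
             dimnames = list(NULL, c("disc1", "disc2")))

# transform = "logit": log(Y) - log(1 - Y); requires values strictly in (0, 1)
logit_pv <- log(pv) - log(1 - pv)

# transform = "arcsin": (2 / pi) * asin(sqrt(Y))
arcsin_pv <- (2 / pi) * asin(sqrt(pv))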
The B-ECM model structure can be found in a future publication, and
some details from that publication are reproduced here.
B-ECM assumes that all data are generated from a mixture of K normal
distributions, where K is equal to the number of unique event
categories. Each component of the mixture has a unique mean \mu_k and
covariance \Sigma_k, where k \in \{1, \dots, K\} indexes the mixture
component. The likelihood of the i^{\mathrm{th}} event observation
y^i_p of p discriminants can be written as the sum

\sum_{k = 1}^K \pi_k \mathcal{N}(y^i_p; \mu_k, \Sigma_k).

Each Gaussian distribution in the sum is weighted by the scalar
\pi_k, where \sum_{k=1}^K \pi_k = 1 so that the density integrates
to 1.
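This mixture likelihood is straightforward to evaluate directly. The
sketch below uses the mvtnorm package; every parameter value (K, the
means, covariances, and weights) is invented for illustration.

library(mvtnorm)

K    <- 3                        # number of event categories
p    <- 2                        # number of discriminants
y    <- c(-1.2, 0.4)             # one transformed observation y^i_p
mu   <- list(c(-2, 0), c(0, 1), c(2, -1))       # component means mu_k
Sig  <- replicate(K, diag(p), simplify = FALSE) # covariances Sigma_k
pi_k <- c(0.5, 0.3, 0.2)         # mixture weights, summing to 1

# sum_k pi_k * N(y; mu_k, Sigma_k)
lik <- sum(sapply(1:K, function(k) {
  pi_k[k] * dmvnorm(y, mean = mu[[k]], sigma = Sig[[k]])
}))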
There are prior distributions on each \mu_k, \Sigma_k, and \pi, where
\pi is the vector of mixture weights \{\pi_1, \dots, \pi_K\}. These
prior distributions are detailed below. The parameters are important
for understanding the model; however, they are integrated out
analytically to reduce computation time, resulting in a marginal
likelihood p(Y_{N_k \times p} | \eta_k, \Psi_k, \nu_k), which is a
mixture of matrix t-distributions. Y_{N_k \times p} is the matrix of
all training data for the k^{\mathrm{th}} event category, containing
N_k event observations. The totality of the training data can be
written as Y_{N \times p}, where N = N_1 + \dots + N_K.
BayesECM() can handle observations where some of the p discriminants
are missing. The properties of the conditional matrix t-distribution
are used to impute the missing values, thereby accounting for the
uncertainty related to the missing data.
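In practice this means missing discriminants can simply be supplied as
NA entries in Y. A hedged sketch follows; the column layout of Y
(p-value columns plus a final category column) and all values are
assumptions of this illustration, not confirmed by this page.

# Hypothetical training data with one missing discriminant value;
# the NA entry is imputed via MCMC during training.
Y <- data.frame(disc1 = c(0.02, 0.61, NA,   0.33),
                disc2 = c(0.45, 0.07, 0.12, 0.88),
                event = c("explosion", "earthquake",
                          "explosion", "earthquake"))
fit <- BayesECM(Y = Y, BT = c(100, 1000))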
The posterior distributions p(\mu_k | Y_{N_k \times p}, \eta_k),
p(\Sigma_k | Y_{N_k \times p}, \Psi_k, \nu_k), and
p(\pi | Y_{N \times p}, \alpha) depend on the specifications of the
prior distributions p(\mu_k | \Sigma_k, \eta_k),
p(\Sigma_k | \Psi_k, \nu_k), and p(\pi | \alpha).

p(\mu_k | \Sigma_k, \eta_k) is a multivariate normal distribution with
mean vector \eta_k, conditional on the covariance \Sigma_k.
p(\Sigma_k | \Psi_k, \nu_k) is an inverse Wishart distribution with
degrees of freedom parameter \nu_k, or nu, and scale matrix \Psi_k, or
Psi. p(\pi | \alpha) is a Dirichlet distribution with parameter vector
\alpha of length K.
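To make this prior structure concrete, a single draw from each prior
can be sketched with base R plus MASS. All hyperparameter values below
are invented for illustration, and any scaling of \Sigma_k in the mean
prior is omitted.

library(MASS)   # for mvrnorm()

p     <- 2
K     <- 3
eta   <- rep(0, p)   # prior mean vector eta_k
Psi   <- diag(p)     # inverse Wishart scale matrix Psi_k
nu    <- p + 2       # inverse Wishart degrees of freedom nu_k
alpha <- rep(10, K)  # Dirichlet concentration vector

# Sigma_k ~ Inverse-Wishart(nu, Psi), via inverting a Wishart draw
Sigma_k <- solve(rWishart(1, df = nu, Sigma = solve(Psi))[, , 1])

# mu_k | Sigma_k ~ Normal(eta, Sigma_k)
mu_k <- mvrnorm(1, mu = eta, Sigma = Sigma_k)

# pi ~ Dirichlet(alpha), via normalized gamma draws
g      <- rgamma(K, shape = alpha)
pi_vec <- g / sum(g)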
The ability to use "default" priors has been included for ease of use,
through various settings of the priors function argument. The default
prior hyperparameter values differ depending on the transform argument
used, and the values can be inspected by examining the output of the
BayesECM() function. Simply setting priors = "default" provides the
same default values for all \eta_k, \Psi_k, \nu_k in the mixture. If
all prior parameters are to be shared between all event categories,
but some non-default values are desirable, then a list of a similar
structure as priors = list(eta = rep(0, times = ncol(Y) - 1), Psi =
"default", nu = "default", alpha = 10) can be supplied, where any list
element set to "default" can be exchanged for a value of the
appropriate data structure.

If one wishes to use some default values but not share all parameter
values between event categories, or wishes to specify each parameter
value individually with no defaults, we suggest running and saving the
output of BayesECM(Y = Y, BT = c(1, 2))$priors, as sketched below.
Note that when specifying eta or Psi, it is necessary that the row and
column order of the supplied values corresponds to the column order
of Y.
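A hedged sketch of that workflow, assuming training data Y is already
loaded; the element name alpha modified below follows the list
structure quoted above, while any other element names should be taken
from what str() reveals.

# Run a throwaway fit (1 burn-in, 2 total samples) just to obtain the
# default prior structure for this data and transform.
prior_template <- BayesECM(Y = Y, BT = c(1, 2))$priors

# Inspect the structure, then modify the desired elements in place.
str(prior_template)
prior_template$alpha <- 10

# Refit with the customized priors.
fit <- BayesECM(Y = Y, priors = prior_template)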
Value:

Returns an object of class "BayesECM". If there are missing data in
the supplied argument Y, the object contains Markov chain Monte Carlo
samples of the imputed missing data. The prior distribution parameters
used are always included in the output. The primary use of an object
returned from BayesECM() is to later categorize unlabeled data with
the predict.BayesECM() function.
Examples:

# Locate the example training data shipped with the package
csv_use <- "good_training.csv"
file_path <- system.file("extdata", csv_use, package = "ezECM")

# Import labeled p-values for training
training_data <- import_pvals(file = file_path, header = TRUE,
                              sep = ",", training = TRUE)

# Fit the B-ECM model with default priors and transform
trained_model <- BayesECM(Y = training_data)
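The trained model would then be passed to predict.BayesECM() to
categorize new events. A hedged sketch only: new_data is a placeholder
for imported, unlabeled p-values, and the argument name Ytilde is an
assumption of this sketch rather than a documented signature.

# Hypothetical follow-up: expected category probabilities for
# unlabeled events.
category_probs <- predict(trained_model, Ytilde = new_data)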