trajectories {clustra} | R Documentation |
Performs k-means clustering on a continuous response measured over time,
where each cluster mean is defined by a thin plate spline fit to all points
in the cluster. Typically, this function is called by clustra.
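The idea described above can be sketched as an alternation between fitting one smooth curve per cluster and reassigning each id to the curve it matches best. The following is a minimal illustration on simulated data, not the package's implementation: the column names id, time and response, the simulated curves, and the fixed cap of 10 iterations are all assumptions made for this sketch, and it does not handle clusters that shrink too far to fit.

```r
library(mgcv)  # provides bam() and s()

set.seed(1)

# Simulated long-format data. Column names are assumptions for this sketch.
n_id <- 40
n_obs <- 15
true_group <- rep(1:2, each = n_id / 2)
dat <- data.frame(
  id = rep(seq_len(n_id), each = n_obs),
  time = rep(seq(0, 1, length.out = n_obs), times = n_id)
)
dat$response <- ifelse(true_group[dat$id] == 1,
                       sin(2 * pi * dat$time),
                       2 * dat$time) + rnorm(nrow(dat), sd = 0.2)

k <- 2       # number of clusters
maxdf <- 8   # basis dimension of the smooth term
group <- sample(rep(seq_len(k), length.out = n_id))  # random start, one label per id

for (iter in 1:10) {
  # Step 1: fit one thin plate regression spline per cluster.
  fits <- lapply(seq_len(k), function(g) {
    bam(response ~ s(time, bs = "tp", k = maxdf),
        data = dat[group[dat$id] == g, ])
  })
  # Step 2: mean squared loss of each id against each cluster model
  # (rows are ids, columns are clusters).
  loss <- sapply(fits, function(f) {
    pred <- as.numeric(predict(f, newdata = dat))
    tapply((dat$response - pred)^2, dat$id, mean)
  })
  # Step 3: reassign each id to its lowest-loss cluster.
  new_group <- apply(loss, 1, which.min)
  changes <- sum(new_group != group)
  group <- new_group
  if (changes == 0) break  # no id moved, so the assignment has converged
}
```

After the loop, group holds one cluster label per id, loss holds the per-id losses from the last iteration, and changes records how many ids moved in that iteration.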
trajectories(
data,
k,
group,
maxdf,
conv = c(10, 0),
mccores = 1,
verbose = FALSE,
...
)
data |
Data table or data frame with response measurements, one per observation.
Column names are |
k |
Number of clusters (groups) |
group |
Vector of initial group numbers corresponding to |
maxdf |
Integer. Basis dimension of smooth term. See |
conv |
A vector of length two, |
mccores |
Integer number of cores to use by |
verbose |
Logical, whether to produce debug output. A value > 1 will plot tps fit lines in each iteration. |
... |
See |
A list with components
deviance
- The final deviance, summed across clusters.
group
- Integer vector of group assignments corresponding to unique ids.
loss
- Numeric matrix with rows corresponding to unique ids and one
column for each cluster. Each entry is the mean squared loss for the data in
the id relative to the cluster model.
k
- An integer giving the requested number of clusters.
k_cl
- An integer giving the converged number of clusters. Can be
smaller than k when some clusters become too small for degrees of freedom
during convergence.
data_group
- An integer vector, giving group assignment as expanded into
all id time points.
tps
- A list with k_cl elements, each an object returned by the
mgcv::bam fit of a cluster thin plate spline model.
iterations
- An integer giving the number of iterations taken.
counts
- An integer vector giving the number of ids in each cluster.
counts_df
- An integer vector giving the total number of observations in
each cluster (sum of the number of observations for ids belonging to the
cluster).
changes
- An integer, giving the number of ids that changed clusters in
the last iteration. This is zero if converged.
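How several of these components relate to one another can be illustrated with a small base R sketch. The loss values, the id labels, and the previous_group vector below are made up for illustration and are not output from trajectories; the sketch only mirrors the descriptions given in the list.

```r
# Toy inputs (made up for illustration; not output from trajectories).
loss <- matrix(c(0.2, 1.5,
                 1.1, 0.3,
                 0.4, 0.9,
                 2.0, 0.1),
               ncol = 2, byrow = TRUE,
               dimnames = list(id = c("a", "b", "c", "d"), cluster = 1:2))

# One entry per observation, naming the id it belongs to.
obs_id <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "d")

# group: each id goes to the cluster with the smallest mean squared loss.
group <- apply(loss, 1, which.min)

# data_group: the per-id assignment expanded to every observation.
data_group <- unname(group[obs_id])

# counts: number of ids per cluster.
counts <- as.integer(table(factor(group, levels = 1:2)))

# counts_df: number of observations per cluster.
counts_df <- as.integer(table(factor(data_group, levels = 1:2)))

# changes: ids whose assignment differs from a hypothetical previous iteration.
previous_group <- c(a = 1L, b = 2L, c = 2L, d = 2L)
changes <- sum(group != previous_group)
```

Here ids a and c fall in cluster 1 and ids b and d in cluster 2, so counts is 2 and 2 while counts_df is 7 and 3. Only id c differs from previous_group, so changes is 1.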
George Ostrouchov and David Gagnon