summary.JANE {JANE}    R Documentation

Summarizing JANE fits

Description

S3 summary method for objects of class "JANE".

Usage

## S3 method for class 'JANE'
summary(object, true_labels = NULL, initial_values = FALSE, ...)

Arguments

object

An object of S3 class "JANE", the result of a call to JANE.

true_labels

(optional) A numeric, character, or factor vector of known true cluster labels. Must have the same length as the number of actors in the fitted network (default is NULL).

initial_values

A logical; if TRUE, the fit is summarized using the starting values of the EM algorithm (default is FALSE, i.e., the results obtained after running the EM algorithm are summarized).

...

Unused.

Value

A list of S3 class "summary.JANE" containing the following components (Note: N is the number of actors in the network, K is the number of clusters, and D is the dimension of the latent space):

coefficients

A numeric vector representing the estimated coefficients from the logistic regression model.

p

A numeric vector of length K representing the estimated mixture weights of the finite multivariate normal mixture distribution for the latent positions.

U

A numeric N \times D matrix with each row representing an actor's estimated latent position in a D-dimensional social space.

mus

A numeric K \times D matrix representing the estimated mean vectors of the multivariate normal distributions for the latent positions of the K clusters.

omegas

A numeric D \times D \times K array representing the estimated precision matrices of the multivariate normal distributions for the latent positions of the K clusters.

Z

A numeric N \times K matrix whose (i, k) entry Z_{ik} is the estimated conditional probability that the i^{th} actor belongs to cluster k, for k = 1,\ldots,K.

uncertainty

A numeric vector of length N representing the uncertainty of the i^{th} actor's classification, derived as 1 - max_k Z_{ik}.

cluster_labels

A numeric vector of length N representing the cluster assignment of each actor based on a hard clustering rule of \{h | Z_{ih} = max_k Z_{ik}\}.

input_params

A list with the following components:

  • model: A character string representing the specific model used (i.e., 'NDH', 'RS', or 'RSR')

  • IC_selection: A character string representing the specific information criterion used to select the optimal fit (i.e., 'BIC_logit', 'BIC_mbc', 'ICL_mbc', 'Total_BIC', or 'Total_ICL')

  • case_control: A logical; if TRUE then the case/control approach was utilized

  • DA_type: A character string representing the specific deterministic annealing approach utilized (i.e., 'none', 'cooling', 'heating', or 'hybrid')

  • priors: A list of the prior hyperparameters used. See specify_priors for definitions.

clustering_performance

(only if true_labels is not NULL) A list with the following components:

  • CER: A list with two components: (i) misclassified: The indexes of the misclassified data points in a minimum error mapping between the cluster labels and the known true cluster labels (i.e., true_labels) and (ii) errorRate: The error rate corresponding to a minimum error mapping between the cluster labels and the known true cluster labels (see classError for details)

  • ARI: A numeric value representing the adjusted Rand index comparing the cluster labels and the known true cluster labels (see adjustedRandIndex for details)

  • NMI: A numeric value representing the normalized mutual information comparing the cluster labels and the known true cluster labels (see NMI for details)

  • confusion_matrix: A numeric table representing the confusion matrix comparing the cluster labels and the known true cluster labels.
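
As a rough illustration of what confusion_matrix and ARI measure, the sketch below builds both in base R from two small label vectors. The data are made up, the helper ari_from_table is hypothetical (it is not part of JANE; the package refers to adjustedRandIndex for its own computation), and placing the estimated clusters on the rows of the table is an assumption rather than the documented layout.

```r
# Hypothetical label vectors for six actors (illustrative data, not JANE output)
true_labels    <- c(1, 1, 2, 2, 3, 3)
cluster_labels <- c(2, 2, 1, 1, 3, 1)

# Confusion matrix: rows are estimated clusters, columns are true clusters
# (this orientation is an assumption made for the sketch)
confusion_matrix <- table(estimated = cluster_labels, truth = true_labels)

# Hypothetical helper: adjusted Rand index computed from a contingency table
ari_from_table <- function(tab) {
  n         <- sum(tab)
  sum_cells <- sum(choose(tab, 2))            # pairs grouped together in both labelings
  sum_rows  <- sum(choose(rowSums(tab), 2))   # pairs grouped together in the row labeling
  sum_cols  <- sum(choose(colSums(tab), 2))   # pairs grouped together in the column labeling
  expected  <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2
  (sum_cells - expected) / (max_index - expected)
}

ari <- ari_from_table(confusion_matrix)
print(confusion_matrix)
print(ari)
```

Because the index depends only on which actors are grouped together, relabeling the clusters (for example, swapping labels 1 and 2) leaves it unchanged, which is why no label matching is needed before the comparison.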

Examples


# Simulate network
mus <- matrix(c(-1,-1,1,-1,1,1), 
              nrow = 3,
              ncol = 2, 
              byrow = TRUE)
omegas <- array(c(diag(rep(7,2)),
                  diag(rep(7,2)), 
                  diag(rep(7,2))), 
                  dim = c(2,2,3))
p <- rep(1/3, 3)
beta0 <- 1.0
sim_data <- JANE::sim_A(N = 100L, 
                        model = "NDH",
                        mus = mus, 
                        omegas = omegas, 
                        p = p, 
                        beta0 = beta0, 
                        remove_isolates = TRUE)
                        
# Run JANE on simulated data
res <- JANE::JANE(A = sim_data$A,
                  D = 2L,
                  K = 3L,
                  initialization = "GNN", 
                  model = "NDH",
                  case_control = FALSE,
                  DA_type = "none")
                  
# Summarize fit 
summary(res)

# Summarize fit and compare to true cluster labels
summary(res, true_labels = apply(sim_data$Z, 1, which.max))

# Summarize fit using starting values of EM algorithm
summary(res, initial_values = TRUE)
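
The uncertainty and cluster_labels components described under Value are both functions of Z. The following base R sketch reproduces the two formulas on a small, made-up membership matrix; it is meant only to illustrate the definitions and is not taken from the package source.

```r
# Hypothetical 4 x 3 matrix of membership probabilities (each row sums to 1);
# in practice this would be the Z component of the summary object
Z <- matrix(c(0.90, 0.05, 0.05,
              0.20, 0.70, 0.10,
              0.40, 0.35, 0.25,
              0.10, 0.10, 0.80),
            nrow = 4, ncol = 3, byrow = TRUE)

# Hard clustering rule: assign each actor to its most probable cluster
# (which.max returns the first maximum if there is a tie)
cluster_labels <- apply(Z, 1, which.max)

# Classification uncertainty: 1 minus the largest membership probability
uncertainty <- 1 - apply(Z, 1, max)

print(cluster_labels)
print(uncertainty)
```

An actor with a membership probability close to 1 for a single cluster has uncertainty near 0, whereas the third row above, whose probabilities are spread across clusters, has the largest uncertainty.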


[Package JANE version 0.2.1 Index]