get_nn_importance {ClustAssess} | R Documentation |
Evaluates clustering stability when changing the values of different parameters involved in the graph building step, namely the base embedding, the graph type and the number of neighbours.
get_nn_importance(
  object,
  n_neigh_sequence,
  n_repetitions = 100,
  seed_sequence = NULL,
  graph_reduction_type = "PCA",
  ecs_thresh = 1,
  ncores = 1,
  transpose = (graph_reduction_type == "PCA"),
  graph_type = 2,
  algorithm = 4,
  ...
)
object |
The data matrix. If the graph reduction type is PCA, the object should be an expression matrix, with features on rows and observations on columns; in the case of UMAP, the user could also provide a matrix associated to a PCA embedding. See also the transpose argument. |
n_neigh_sequence |
A sequence of the number of nearest neighbours. |
n_repetitions |
The number of repetitions of applying the pipeline with different seeds; ignored if seed_sequence is provided by the user. |
seed_sequence |
A custom seed sequence; if the value is NULL, the sequence will be built starting from 1 with a step of 100. |
graph_reduction_type |
The graph reduction type, denoting if the graph should be built on either the PCA or the UMAP embedding. |
ecs_thresh |
The ECS threshold used for merging similar clusterings. |
ncores |
The number of parallel R instances that will run the code. If the value is set to 1, the code will be run sequentially. |
transpose |
Logical: whether the input object will be transposed or not. Set to FALSE if the input is an observations X features matrix, and set to TRUE if the input is a features X observations matrix. |
graph_type |
Argument indicating whether the graph should be unweighted (0), weighted (1) or both (2). |
algorithm |
An index indicating which community detection algorithm will
be used: Louvain (1), Louvain refined (2), SLM (3) or Leiden (4). More details
can be found in the Seurat's |
... |
Additional arguments passed to the |
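Some of the argument descriptions above can be illustrated in base R. This is a minimal sketch, not the package's internal code: the variable names (`default_seeds`, `feat_by_obs`, `obs_by_feat`, `algorithm_names`, `graph_type_names`) are hypothetical, and the seed formula simply follows the wording of the `seed_sequence` description ("starting from 1 with a step of 100").

```r
# Sketch only: mirrors the argument descriptions, not ClustAssess internals.
n_repetitions <- 10

# Seeds as described for seed_sequence = NULL: start at 1, step of 100
default_seeds <- seq(from = 1, by = 100, length.out = n_repetitions)

# Orientation: a features x observations matrix (here 10 x 200) would be
# passed with transpose = TRUE; its transpose (200 x 10, observations x
# features) would be passed with transpose = FALSE.
feat_by_obs <- matrix(runif(10 * 200), nrow = 10)
obs_by_feat <- t(feat_by_obs)

# Integer codes documented for `algorithm` and `graph_type`
# (the lookup vectors themselves are illustrative helpers)
algorithm_names <- c("Louvain", "Louvain refined", "SLM", "Leiden")
graph_type_names <- c("0" = "unweighted", "1" = "weighted", "2" = "both")

print(default_seeds)
print(dim(feat_by_obs))
print(dim(obs_by_feat))
print(algorithm_names[4])          # label for the default algorithm = 4
print(graph_type_names[["2"]])     # label for the default graph_type = 2
```

Passing an explicit `seed_sequence` makes the number of runs equal to the length of that sequence, since `n_repetitions` is then ignored.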
A list having three fields:
n_neigh_k_corresp - list containing the number of clusters obtained by running the pipeline multiple times with different seeds, numbers of neighbors and graph types (weighted vs unweighted)
n_neigh_ec_consistency - list containing the EC consistency of the partitions obtained over multiple runs when changing the number of neighbors or the graph type
n_different_partitions - the number of different partitions obtained for each number of neighbors
set.seed(2021)
# create an artificial expression matrix
expr_matrix = matrix(c(runif(100*10), runif(100*10, min=5, max=6)), nrow = 200)
rownames(expr_matrix) = as.character(1:200)
nn_importance_obj = get_nn_importance(
    object = expr_matrix,
    n_neigh_sequence = c(10, 15, 20),
    n_repetitions = 10,
    graph_reduction_type = "PCA",
    algorithm = 1,
    # the matrix is already observations x features, so we won't transpose it
    transpose = FALSE,
    # the following parameter is used by the irlba function and is not mandatory
    nv = 2)
plot_n_neigh_ecs(nn_importance_obj)
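A variation on the example above, sketched under assumptions: the same data supplied as a features x observations matrix with `transpose = TRUE`, and an explicit `seed_sequence` in place of `n_repetitions`. The names `expr_fxo`, `custom_seeds` and `nn_importance_fxo` are hypothetical, and the seed values are arbitrary. The call is only attempted when ClustAssess is installed and exports `get_nn_importance`; it assumes the signature shown in the usage section.

```r
set.seed(2021)
# same artificial matrix as above: 200 observations x 10 features
expr_matrix <- matrix(c(runif(100 * 10), runif(100 * 10, min = 5, max = 6)), nrow = 200)
rownames(expr_matrix) <- as.character(1:200)

# features x observations layout (10 x 200), to be used with transpose = TRUE
expr_fxo <- t(expr_matrix)

# arbitrary, user-chosen seeds; n_repetitions is ignored when these are given
custom_seeds <- c(7, 42, 2021)

# Guarded call: skipped if the package (or this function) is unavailable
has_fn <- requireNamespace("ClustAssess", quietly = TRUE) &&
  "get_nn_importance" %in% getNamespaceExports("ClustAssess")

if (has_fn) {
  nn_importance_fxo <- ClustAssess::get_nn_importance(
    object = expr_fxo,
    n_neigh_sequence = c(10, 15, 20),
    seed_sequence = custom_seeds,
    graph_reduction_type = "PCA",
    algorithm = 1,
    transpose = TRUE,   # input is features x observations
    nv = 2              # optional, forwarded to irlba as in the example above
  )
  print(names(nn_importance_fxo))
}
```

Fixing the seeds explicitly makes it easier to compare runs across different values of `n_neigh_sequence`, since every configuration is evaluated on the same set of seeds.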