predictLayerHVT {HVT} | R Documentation |
Predict which cell and what level each point in the test dataset belongs to
Description
Predict which cell and what level each point in the test dataset belongs to
Usage
predictLayerHVT(
  data,
  hvt_mapA,
  hvt_mapB,
  hvt_mapC,
  mad.threshold = 0.2,
  normalize = TRUE,
  seed = 300,
  distance_metric = "L1_Norm",
  error_metric = "max",
  child.level = 1,
  line.width = c(0.6, 0.4, 0.2),
  color.vec = c("#141B41", "#6369D1", "#D8D2E1"),
  yVar = NULL,
  ...
)
Arguments
data |
Data Frame. A dataframe containing the test dataset. The dataframe should have at least one variable that was used in training. Variables from this dataset can also be overlaid as a heatmap. |
hvt_mapA |
A list of hvt.results.model returned by the HVT function after performing hierarchical vector quantization on the training data |
hvt_mapB |
A list of hvt.results.model returned by the HVT function after performing hierarchical vector quantization on the training data with novelties |
hvt_mapC |
A list of hvt.results.model returned by the HVT function after performing hierarchical vector quantization on the training data without novelties |
mad.threshold |
A numeric value indicating the permissible Mean Absolute Deviation. Default value is 0.2. |
normalize |
Logical. A logical value indicating if the columns in your dataset should be normalized. Default value is TRUE. |
seed |
Numeric. Random Seed. |
distance_metric |
character. The distance metric can be "L1_Norm" (Manhattan) or "L2_Norm" (Euclidean). "L1_Norm" is selected by default. |
error_metric |
character. The error metric can be "mean" or "max". "max" is selected by default. |
child.level |
A number indicating the level for which the heat map is to be plotted. (Only used if hmap.cols is not NULL.) |
line.width |
Vector. Line widths used when drawing the map. Default value is c(0.6, 0.4, 0.2). |
color.vec |
Vector. Colors used when drawing the map. Default value is c("#141B41", "#6369D1", "#D8D2E1"). |
yVar |
character. Name of the dependent variable(s) |
... |
color.vec and line.width can be passed from here |
Value
Dataframe containing scored predicted layer output
Author(s)
Shubhra Prakash <shubhra.prakash@mu-sigma.com>, Sangeet Moy Das <sangeet.das@mu-sigma.com>, Shantanu Vaidya <shantanu.vaidya@mu-sigma.com>, Somya Shambhawi <somya.shambhawi@mu-sigma.com>
See Also
Examples
data(USArrests)
library("dplyr")

# Split into train and test sets
train <- USArrests[1:40, ]
test <- USArrests[41:50, ]

# Map A: built on the full training data
hvt_mapA <- HVT(train,
  min_compression_perc = 70, quant.err = 0.2,
  distance_metric = "L1_Norm", error_metric = "mean",
  projection.scale = 10, normalize = TRUE,
  quant_method = "kmeans"
)

# Cells of map A to treat as novelties (chosen here for illustration)
identified_Novelty_cells <- c(2, 10)
output_list <- removeNovelty(identified_Novelty_cells, hvt_mapA)

# Map B: built on the data with novelties, after dropping the cell ID columns
data_with_novelty <- output_list[[1]] %>%
  dplyr::select(!c("Cell.ID", "Cell.Number"))
hvt_mapB <- HVT(data_with_novelty,
  n_cells = 3, quant.err = 0.2,
  distance_metric = "L1_Norm", error_metric = "mean",
  projection.scale = 10, normalize = TRUE,
  quant_method = "kmeans"
)

# Map C: built on the data without novelties, reusing map A's scale summary
dataset_without_novelty <- output_list[[2]]
mapA_scale_summary <- hvt_mapA[[3]]$scale_summary
hvt_mapC <- HVT(dataset_without_novelty,
  n_cells = 15,
  depth = 2, quant.err = 0.2, distance_metric = "L1_Norm",
  error_metric = "max", quant_method = "kmeans",
  projection.scale = 10, normalize = FALSE, scale_summary = mapA_scale_summary
)

# Score the test set against all three maps
predictions <- predictLayerHVT(test, hvt_mapA, hvt_mapB, hvt_mapC)
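
# Illustration only (base R, not HVT internals): a minimal sketch of what the
# distance_metric and error_metric options mean for a single cell.
# Assumptions: the centroid and member points below are made up, and the
# error for a cell is taken as the mean or max of point-to-centroid distances.
centroid <- c(0, 0)
members <- rbind(c(3, 4), c(1, 0), c(0, 2))

# Subtract the centroid from every member point (column-wise)
diffs <- sweep(members, 2, centroid)

# "L1_Norm": sum of absolute coordinate differences (Manhattan distance)
l1 <- rowSums(abs(diffs))

# "L2_Norm": square root of the summed squared differences (Euclidean distance)
l2 <- sqrt(rowSums(diffs^2))

# "mean" averages the distances in the cell; "max" takes the largest one
l1_errors <- c(mean = mean(l1), max = max(l1))
l2_errors <- c(mean = mean(l2), max = max(l2))
print(l1_errors)
print(l2_errors)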