SFCMeans {geocmeans} | R Documentation |
Spatial version of the fuzzy c-means algorithm (SFCMeans, FCM_S1)
SFCMeans( data, nblistw, k, m, alpha, lag_method = "mean", maxiter = 500, tol = 0.01, standardize = TRUE, verbose = TRUE, init = "random", seed = NULL )
data |
A dataframe containing only numeric variables |
nblistw |
A listw object describing the neighbours, typically produced by the spdep package |
k |
An integer giving the number of clusters to find |
m |
A float for the fuzziness degree |
alpha |
A float giving the weight of the spatial dimension in the analysis (0 gives a classical fuzzy c-means algorithm, 1 balances the two dimensions, 2 gives space twice the weight) |
lag_method |
A string indicating if a classical lag must be used ("mean") or if a weighted median must be used ("median") |
maxiter |
An integer for the maximum number of iterations |
tol |
The tolerance criterion used in the evaluateMatrices function for convergence assessment |
standardize |
A boolean to specify if the variables must be centered and scaled (default = TRUE) |
verbose |
A boolean to specify if the progress bar should be displayed |
init |
A string indicating how the initial centers must be selected. "random" indicates that random observations are used as centers. "kpp" uses a distance-based method that yields more dispersed initial centers. Both are heuristics. |
seed |
An integer used for random number generation. It ensures that the start centers will be the same if the same integer is selected. |
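As a rough illustration of what the standardize option and the "mean" setting of lag_method refer to, here is a minimal base R sketch. The toy data and the weight matrix W are made up for the example, and the code shows the general technique (averaging neighbours with row-standardized weights) under that assumption. It is not taken from the package's internal code.

```r
# Toy data: 4 observations, 2 numeric variables (made-up values)
X <- matrix(c(1, 2, 3, 4,
              10, 20, 30, 40), ncol = 2)

# Center and scale each column, as described for standardize = TRUE
Xs <- scale(X)

# Hypothetical row-standardized weight matrix for a chain 1-2-3-4
# (each row sums to 1; this stands in for a listw object with style "W")
W <- matrix(c(0,   1,   0,   0,
              0.5, 0,   0.5, 0,
              0,   0.5, 0,   0.5,
              0,   0,   1,   0), nrow = 4, byrow = TRUE)

# Mean spatial lag: each row becomes the weighted average of its neighbours' rows
X_lag <- W %*% Xs
```

With a real listw object, spdep::lag.listw computes the same kind of lagged variable one column at a time.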
The implementation is based on the following article: doi:10.1016/j.patcog.2006.07.011.
The membership matrix (u) is calculated as follows:
u_{ik} = \frac{(||x_{k} - v_{i}||^2 + \alpha||\bar{x_{k}} - v_{i}||^2)^{-1/(m-1)}}{\sum_{j=1}^{c}(||x_{k} - v_{j}||^2 + \alpha||\bar{x_{k}} - v_{j}||^2)^{-1/(m-1)}}
The centers of the groups are updated with the following formula:
v_{i} = \frac{\sum_{k=1}^{N} u_{ik}^m(x_{k} + \alpha\bar{x_{k}})}{(1 + \alpha)\sum_{k=1}^{N} u_{ik}^m}
with
vi the center of the group i
xk the data point k
xk_bar the spatially lagged data point k
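The two formulas above can be sketched in base R as one update step. This is an illustrative sketch, not the package's implementation: the helper names (sq_dist, update_memberships, update_centers) and the toy data are hypothetical, the membership matrix is stored with observations in rows and groups in columns, and the case of an observation lying exactly on a center (zero distance) is not handled.

```r
# N x c matrix of squared Euclidean distances between rows of a and rows of v
sq_dist <- function(a, v) {
  sapply(seq_len(nrow(v)), function(i) rowSums(sweep(a, 2, v[i, ])^2))
}

# Membership update: x and x_lag are N x p matrices, v is a c x p matrix of centers
update_memberships <- function(x, x_lag, v, m, alpha) {
  d <- sq_dist(x, v) + alpha * sq_dist(x_lag, v)
  w <- d^(-1 / (m - 1))
  w / rowSums(w)  # each observation's memberships sum to 1
}

# Center update: u is the N x c membership matrix
update_centers <- function(x, x_lag, u, m, alpha) {
  um <- u^m                              # N x c
  num <- t(um) %*% (x + alpha * x_lag)   # c x p
  num / ((1 + alpha) * colSums(um))      # row i divided by the weight sum of group i
}

# Toy usage with made-up values
set.seed(1)
x <- matrix(rnorm(20), ncol = 2)
x_lag <- x[c(2:10, 1), ]                       # placeholder for spatially lagged data
v <- matrix(c(-1, 0, 1, -1, 0, 1), ncol = 2)   # three arbitrary starting centers
u <- update_memberships(x, x_lag, v, m = 1.5, alpha = 1.5)
v_new <- update_centers(x, x_lag, u, m = 1.5, alpha = 1.5)
```

Setting alpha to 0 in this sketch removes the lagged terms, which leaves the usual fuzzy c-means updates.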
A named list with
Centers: a dataframe describing the final centers of the groups
Belongings: the final membership matrix
Groups: a vector with the names of the most likely group for each observation
Data: the dataset used to perform the clustering (might be standardized)
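To show how a vector like Groups relates to a membership matrix like Belongings, here is a small base R sketch. The matrix values and the group names "V1" and "V2" are assumptions made for the example. The actual names returned by the function may differ.

```r
# Hypothetical membership matrix: 3 observations (rows), 2 groups (columns)
belongings <- matrix(c(0.80, 0.20,
                       0.30, 0.70,
                       0.55, 0.45), ncol = 2, byrow = TRUE)
colnames(belongings) <- c("V1", "V2")  # assumed group names

# Most likely group for each observation: the column with the largest membership
groups <- colnames(belongings)[max.col(belongings, ties.method = "first")]
```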
data(LyonIris)
AnalysisFields <- c("Lden", "NO2", "PM25", "VegHautPrt", "Pct0_14", "Pct_65", "Pct_Img",
                    "TxChom1564", "Pct_brevet", "NivVieMed")
dataset <- LyonIris@data[AnalysisFields]
queen <- spdep::poly2nb(LyonIris, queen = TRUE)
Wqueen <- spdep::nb2listw(queen, style = "W")
result <- SFCMeans(dataset, Wqueen, k = 5, m = 1.5, alpha = 1.5, standardize = TRUE)