SPClustering: Solution Path Clustering via Concave Penalization

SPClustering is an R package for clustering multidimensional data. It returns a small solution path containing the cluster assignments, the cluster center estimates, and the estimated number of clusters in the dataset. SPClustering can be applied to noisy data without any pre-filtering. It includes a solution selection function and a function for simulating simple clustered datasets with or without noise and overlap. SPClustering also provides an optional fast clustering implementation based on subsampling, which is recommended for large datasets (more than 10,000 data points). When the fast implementation is chosen, only a single clustering solution is returned, with the number of clusters estimated automatically.
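To make the idea of subsampling-based clustering on a simulated noisy dataset concrete, here is a minimal sketch in base R. It does not use the SPClustering API, whose function names are not given here. k-means stands in for the package's concave-penalization method, and unlike SPClustering it needs the number of clusters supplied in advance. All variable names, sample sizes, and cluster locations are assumptions chosen for illustration.

```r
# Illustrative sketch only: base R functions, not the SPClustering API.
set.seed(1)

# Simulate three Gaussian clusters in 2-D plus uniform background noise
# (cluster locations and sizes are arbitrary choices).
true_centers <- matrix(c(0, 0,
                         5, 5,
                         0, 5), ncol = 2, byrow = TRUE)
n_per_cluster <- 4000
n_noise <- 600

clustered <- do.call(rbind, lapply(seq_len(nrow(true_centers)), function(k) {
  cbind(rnorm(n_per_cluster, mean = true_centers[k, 1], sd = 0.5),
        rnorm(n_per_cluster, mean = true_centers[k, 2], sd = 0.5))
}))
noise <- cbind(runif(n_noise, -3, 8), runif(n_noise, -3, 8))
x <- rbind(clustered, noise)  # 12,600 points in total

# Cluster a random subsample only. k-means is a stand-in method here,
# so the number of clusters (3) is supplied rather than estimated.
sub_idx <- sample(nrow(x), 1000)
fit <- kmeans(x[sub_idx, ], centers = 3, nstart = 10)

# Assign every point in the full dataset to its nearest estimated center
# (squared Euclidean distance); noise points are not treated specially.
d2 <- sapply(seq_len(nrow(fit$centers)), function(k) {
  rowSums(sweep(x, 2, fit$centers[k, ], "-")^2)
})
labels <- max.col(-d2, ties.method = "first")

print(fit$centers)
print(table(labels))
```

The expensive clustering step touches only 1,000 points, while the assignment step is a single pass over the full dataset. This is the general reason a subsampling approach scales to large datasets. The package's own procedure is described in reference [2].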

Download R package SPClustering (Source) and readme file.

References

[1] Marchetti, Y. and Zhou, Q. (2014). Solution path clustering with adaptive concave penalty. Electronic Journal of Statistics, 8(1): 1569-1603.

[2] Marchetti, Y. and Zhou, Q. (2016). Iterative subsampling in solution path clustering of noisy big data. Statistics and Its Interface, 9: 415-431.

SPClustering is a free R package. For more information, please contact zhou@stat.ucla.edu.
Copyright, 2015 - UCLA. All rights reserved.