Pairwise Nearest Neighbor Clustering Method Revisited : Speed-up methods, Best clustering results in respect of the minimization of intra cluster variance, Optimal clustering

Book by Olli Virmajoki
Clustering is an important problem that must be solved as part of more complicated tasks in pattern recognition, image analysis, and many other fields of science and engineering. The pairwise nearest neighbor (PNN) method, also known as Ward's method, belongs to the class of agglomerative clustering methods. The PNN method generates a hierarchical clustering using a sequence of merge operations until the desired number of clusters is obtained. At each step, it selects the cluster pair whose merge increases the given objective function value the least. We consider several speed-up methods for the PNN method; for example, we utilize a k-neighborhood graph to reduce the number of distance calculations. The PNN method can also be adapted for multilevel thresholding, which can be seen as a one-dimensional special case of the clustering problem. The merge philosophy is also extended by using the iterative shrinking method, which gives better clustering results. The proposed method is also used as a crossover method in a genetic algorithm, which produces the best clustering results. The PNN algorithm can also be applied to generate an optimal clustering.
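The merge step described above can be illustrated with a minimal sketch. It assumes the objective is the total squared error (intra-cluster variance), for which the cost of merging clusters a and b is n_a * n_b / (n_a + n_b) times the squared distance between their centroids. The function names `merge_cost` and `pnn` are hypothetical, and the exhaustive pair search shown here is the naive baseline only; it does not include the k-neighborhood graph or the other speed-up methods mentioned in the abstract.

```python
from itertools import combinations


def merge_cost(size_a, centroid_a, size_b, centroid_b):
    """Increase in total squared error caused by merging two clusters."""
    squared_distance = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
    return size_a * size_b / (size_a + size_b) * squared_distance


def pnn(points, target_clusters):
    """Naive PNN: merge the cheapest cluster pair until target_clusters remain.

    Returns a list of (size, centroid, member_indices) tuples.
    """
    if target_clusters < 1:
        raise ValueError("target_clusters must be at least 1")

    # Start with every point in its own cluster.
    clusters = [(1, tuple(float(v) for v in p), [i]) for i, p in enumerate(points)]

    while len(clusters) > target_clusters:
        # Exhaustive search over all pairs (the step the speed-up methods target).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: merge_cost(
                clusters[pair[0]][0], clusters[pair[0]][1],
                clusters[pair[1]][0], clusters[pair[1]][1],
            ),
        )
        size_a, centroid_a, members_a = clusters[a]
        size_b, centroid_b, members_b = clusters[b]
        size = size_a + size_b
        # The merged centroid is the size-weighted mean of the two centroids.
        centroid = tuple(
            (size_a * x + size_b * y) / size
            for x, y in zip(centroid_a, centroid_b)
        )
        # combinations() guarantees a < b, so deleting b leaves index a valid.
        del clusters[b]
        clusters[a] = (size, centroid, members_a + members_b)

    return clusters


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (10, 10), (10, 11), (5, 50)]
    for size, centroid, members in pnn(data, 3):
        print(size, centroid, members)
```

Each pass of the loop examines every cluster pair, so this baseline is cubic in the number of points overall. Restricting the search to each cluster's nearest neighbors, as the abstract suggests, is one way to avoid most of those distance calculations.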