Jiangsu University of Science and Technology, Master of Engineering Thesis

Abstract

With the rapid development of information technology, large-capacity data management systems have appeared in many fields. Data mining technology arose to help users extract valuable knowledge from massive text, image, video, and audio data. As one of the main methods of data mining, clustering has received extensive attention. Clustering algorithms are useful tools for mining data and have been widely applied in data mining, text retrieval and classification, image segmentation and processing, and pattern classification. Clustering algorithms mainly include K-means, K-medoids, K-prototype, PAM, CLARANS, DBSCAN, CURE, ROCK, etc. These classic clustering algorithms use static models and work well only on numeric data or on categorical data; they perform poorly on mixed-attribute data. In practical applications, however, a great deal of data is mixed-attribute data described by both categorical and numeric variables. Therefore, the study of mixed-attribute data clustering has important theoretical significance and application value.

Existing mixed-attribute data clustering algorithms are few in number. They are also sensitive to the choice of the initial cluster centers and to the input order of the data, and they are vulnerable to noise points and outliers. They easily fall into a local optimal solution, or obtain the global optimal solution only at a high cost. Their clustering results are random and unstable and their accuracy is not high enough, so clustering performance and clustering quality are not ideal. These algorithms therefore need continued improvement in: (1) algorithm efficiency; (2) the ability to deal with noisy data; (3) the discovery of clusters of arbitrary shape; (4) the initial points; (5) the degree of clustering. In this paper, we conduct in-