A Full-Sample Clustering Model Considering Whole Process Optimization of Data
As data volume and dimensionality continue to grow, it becomes increasingly difficult to improve accuracy and interpretability by refining the clustering algorithm alone. To improve both the accuracy of clustering and the interpretability of its results, we propose an improved feature selection and combined clustering model that considers optimization of the whole data mining process. The model processes the data at every stage of data mining before carrying out the clustering analysis. First, we preprocess the data and then apply a feature selection algorithm that combines text weight with principal component analysis (PCA) to reduce the feature dimension and obtain the important features and data sets for clustering. Second, we use a combined model of an improved self-organizing map (SOM) neural network and K-means clustering to perform the clustering analysis, and we establish evaluation indicators for the clustering algorithm. Third, we use collaborative filtering to cluster the data sets that contain missing data, so that every sample obtains a result. Finally, a case analysis verifies that the proposed model achieves high clustering accuracy and interpretability.
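The PCA, SOM, and K-means stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the text-weight feature selection, the SOM improvements, or the collaborative filtering step, so this sketch assumes plain PCA, a plain SOM, and standard K-means on complete numeric data. All function names and parameters (`pca_reduce`, `train_som`, `kmeans`, `som_kmeans`, the grid size, the learning-rate schedule) are hypothetical.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project centered data onto its top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:n_components].T


def nearest(X, centers):
    """Index of the closest center (squared Euclidean distance) for each row."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d2, axis=1)


def train_som(X, grid=(4, 4), n_iter=500, lr0=0.5, sigma0=1.5, seed=0):
    """Train a plain SOM (assumed schedule); returns prototype vectors."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    coords = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Initialise prototypes from randomly chosen samples.
    weights = X[rng.integers(0, len(X), size=rows * cols)].astype(float)
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                 # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.1     # shrinking neighbourhood radius
        x = X[rng.integers(0, len(X))]
        bmu = np.argmin(((weights - x) ** 2).sum(axis=1))  # best-matching unit
        grid_d2 = ((coords - coords[bmu]) ** 2).sum(axis=1)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[:, None] * (x - weights)
    return weights


def kmeans(X, k, n_iter=50, seed=0):
    """Standard Lloyd's K-means; returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        labels = nearest(X, centers)
        for j in range(k):
            if np.any(labels == j):             # keep old center if cluster is empty
                centers[j] = X[labels == j].mean(axis=0)
    return nearest(X, centers), centers


def som_kmeans(X, k, n_components=2):
    """PCA, then SOM prototypes, then K-means on the prototypes."""
    Z = pca_reduce(X, n_components)
    prototypes = train_som(Z)
    proto_labels, _ = kmeans(prototypes, k)
    # Each sample inherits the cluster of its best-matching prototype.
    return proto_labels[nearest(Z, prototypes)]
```

Running K-means on the SOM prototypes rather than on the raw samples is one common way to combine the two methods: the SOM compresses the data into a small set of representative vectors, and K-means then groups those vectors into the final clusters. Samples with missing values would need the separate collaborative filtering step, which this sketch leaves out.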