R: To how many dimensions should high-dimensional data be reduced?

Date: 2019-04-27 00:28:29

Tags: r unsupervised-learning

The dataset I am using for unsupervised learning has 1000 observations, each with 700 variables.

I want to use PCA to reduce the dimensionality. The plot of the cumulative proportion of variance is: [image: cumulative proportion of variance plot]

The cumulative proportion of variance values are:

  [1] 0.08623735 0.13935632 0.18369306 0.21956945 0.25264076 0.27880897 0.30215856 0.32387425 0.34277313 0.36033330 0.37668481
 [12] 0.39155112 0.40527023 0.41807841 0.43071074 0.44201376 0.45287415 0.46302489 0.47251882 0.48187588 0.49078894 0.49958320
 [23] 0.50774742 0.51554012 0.52300842 0.53007253 0.53688780 0.54334857 0.54968434 0.55579985 0.56173150 0.56744288 0.57301079
 [34] 0.57819947 0.58326686 0.58821438 0.59306662 0.59771578 0.60232132 0.60683615 0.61121305 0.61553964 0.61978395 0.62390998
 [45] 0.62793909 0.63190950 0.63572562 0.63946402 0.64315290 0.64674709 0.65017158 0.65354003 0.65685139 0.66006706 0.66325798
 [56] 0.66640711 0.66948412 0.67249356 0.67547163 0.67840912 0.68128147 0.68410312 0.68687411 0.68957872 0.69224856 0.69490511
 [67] 0.69753686 0.70014095 0.70269422 0.70518963 0.70765978 0.71009371 0.71250267 0.71488029 0.71722103 0.71953559 0.72183730
 [78] 0.72411943 0.72639247 0.72864736 0.73086034 0.73306189 0.73525796 0.73740785 0.73953397 0.74164280 0.74374776 0.74583163
 [89] 0.74790902 0.74995830 0.75198588 0.75400385 0.75600680 0.75800060 0.75997048 0.76192182 0.76387023 0.76580761 0.76773984
[100] 0.76964833 0.77154930 0.77342520 0.77528380 0.77712442 0.77895885 0.78078186 0.78258876 0.78437196 0.78614765 0.78791422 
.....
[201] 0.94121339 0.94197088 0.94272432 0.94347233 0.94421210 0.94494421 0.94567311 0.94639042 0.94710482 0.94780717

With 200 components, the cumulative proportion of variance is approximately 0.941.
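For reference, here is a minimal sketch of how such cumulative proportions can be computed with base R's `prcomp`, and how the number of components for a given threshold can be read off. It makes several assumptions: the matrix `X` is random placeholder data (not the dataset from the question), scaling the variables is a choice the question does not state, the 0.94 threshold and the 3 cluster centers are purely illustrative, and `n_components` is a hypothetical helper name.

```r
# Placeholder data: 1000 observations x 700 variables (random, NOT the real dataset)
set.seed(1)
X <- matrix(rnorm(1000 * 700), nrow = 1000, ncol = 700)

# PCA with centering and scaling (scaling is an assumption, not stated in the question)
pca <- prcomp(X, center = TRUE, scale. = TRUE)

# Cumulative proportion of variance explained by the first k components
cum_var <- cumsum(pca$sdev^2) / sum(pca$sdev^2)

# Hypothetical helper: smallest number of components reaching a threshold
n_components <- function(cum_var, threshold) which(cum_var >= threshold)[1]

k <- n_components(cum_var, 0.94)  # 0.94 is an illustrative threshold

# Scores on the first k principal components
scores <- pca$x[, 1:k, drop = FALSE]

# k-means on the reduced data (3 centers is an arbitrary illustrative choice)
km <- kmeans(scores, centers = 3, nstart = 5, iter.max = 50)
```

Applied to the values listed above, the same lookup shows that a 0.50 threshold is first reached at component 23 (0.50774742), since component 22 is still at 0.49958320.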

Is 200-dimensional data suitable for k-means clustering? If not, what can I do to further reduce the dimensionality of the data?

0 answers:

No answers yet.