R: Converting a cluster summary object to a data frame

Time: 2017-07-24 15:06:27

Tags: r validation cluster-analysis

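I am trying to extract the validation measures from an R cluster validation object created with clValid.

I create the object and print the full summary with the following: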

library(clValid)

x <- clValid(iris[, -5], nClust=2:10,
        clMethods=c('hierarchical'), validation='internal')
summary(x)

This outputs:

Clustering Methods:
 hierarchical 

Cluster sizes:
 2 3 4 5 6 7 8 9 10 

Validation Measures:
                                 2       3       4       5       6       7       8       9      10

hierarchical Connectivity   0.0000  4.4770  8.9929 15.4893 18.4183 24.8464 29.8425 36.8567 39.5607
             Dunn           0.3389  0.1378  0.1540  0.1540  0.1668  0.1624  0.1624  0.1915  0.1915
             Silhouette     0.6867  0.5542  0.4720  0.4307  0.3420  0.3707  0.3659  0.3167  0.3083

Optimal Scores:

             Score  Method       Clusters
Connectivity 0.0000 hierarchical 2       
Dunn         0.3389 hierarchical 2       
Silhouette   0.6867 hierarchical 2       

Required output

I am trying to get the Validation Measures as a data frame like this:

                                2       3       4       5       6       7       8       9      10

hierarchical Connectivity   0.0000  4.4770  8.9929 15.4893 18.4183 24.8464 29.8425 36.8567 39.5607
             Dunn           0.3389  0.1378  0.1540  0.1540  0.1668  0.1624  0.1624  0.1915  0.1915
             Silhouette     0.6867  0.5542  0.4720  0.4307  0.3420  0.3707  0.3659  0.3167  0.3083

Attempts

When I use:

names(summary(x))
attributes(summary(x))

both return

NULL

I can get the optimal scores with optimalScores(x), but the analogous call validationMeasures(x) does not work.

Question

Is there a way to extract the Validation Measures from this summary object as a data.frame?

1 Answer:

Answer 0 (score: 3)

First, you should always try inspecting the object with str():

str(x)
Formal class 'clValid' [package "clValid"] with 14 slots
  ..@ clusterObjs:List of 1
  .. ..$ hierarchical:List of 7
  .. .. ..$ merge      : int [1:149, 1:2] -102 -8 -1 -10 -129 -11 -5 -20 -30 -58 ...
  .. .. ..$ height     : num [1:149] 0 0.1 0.1 0.1 0.1 ...
  .. .. ..$ order      : int [1:150] 42 15 16 33 34 37 21 32 44 24 ...
  .. .. ..$ labels     : NULL
  .. .. ..$ method     : chr "average"
  .. .. ..$ call       : language hclust(d = Dist, method = method)
  .. .. ..$ dist.method: chr "euclidean"
  .. .. ..- attr(*, "class")= chr "hclust"
  ..@ measures   : num [1:3, 1:9, 1] 0 0.339 0.687 4.477 0.138 ...
  .. ..- attr(*, "dimnames")=List of 3
  .. .. ..$ : chr [1:3] "Connectivity" "Dunn" "Silhouette"
  .. .. ..$ : chr [1:9] "2" "3" "4" "5" ...
  .. .. ..$ : chr "hierarchical"
  ..@ measNames  : chr [1:3] "Connectivity" "Dunn" "Silhouette"
  ..@ clMethods  : chr "hierarchical"
  ..@ labels     : chr [1:150] "1" "2" "3" "4" ...
  ..@ nClust     : num [1:9] 2 3 4 5 6 7 8 9 10
  ..@ validation : chr "internal"
  ..@ metric     : chr "euclidean"
  ..@ method     : chr "average"
  ..@ neighbSize : num 10
  ..@ annotation : NULL
  ..@ GOcategory : chr "all"
  ..@ goTermFreq : num 0.05
  ..@ call       : language clValid(obj = iris[, -5], nClust = 2:10, clMethods = c("hierarchical"),      validation = "internal")

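So we can see that this package uses and returns S4 objects, and one of the slots, measures, appears to be the one you want. Slots are accessed with @, and the third dimension of the array selects the clustering method: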

x@measures[,,"hierarchical"]
                     2         3         4          5          6          7
Connectivity 0.0000000 4.4769841 8.9928571 15.4892857 18.4182540 24.8464286
Dunn         0.3389087 0.1378257 0.1540416  0.1540416  0.1668323  0.1624158
Silhouette   0.6867351 0.5541609 0.4719936  0.4306700  0.3419904  0.3707424
                      8          9         10
Connectivity 29.8424603 36.8567460 39.5607143
Dunn          0.1624158  0.1914854  0.1914854
Silhouette    0.3658753  0.3166807  0.3082851
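
The result above is a numeric matrix, while the question asks for a data.frame. The final conversion could look something like the sketch below. To keep it runnable without clValid installed, it uses a hypothetical stand-in class, ToyValid (an invented name, not part of the package), whose measures slot has the same layout as the str() output above, filled with the first two columns of the printed values.

```r
library(methods)

# Hypothetical stand-in for a clValid result: a toy S4 class with a single
# "measures" slot. "ToyValid" is invented for illustration only.
setClass("ToyValid", representation(measures = "array"))

# Values copied (rounded) from the 2- and 3-cluster columns shown above.
# The array is measures x cluster counts x clustering methods.
vals <- array(
  c(0.0000, 0.3389, 0.6867,
    4.4770, 0.1378, 0.5542),
  dim = c(3, 2, 1),
  dimnames = list(c("Connectivity", "Dunn", "Silhouette"),
                  c("2", "3"),
                  "hierarchical")
)
toy <- new("ToyValid", measures = vals)

# S4 slots are listed with slotNames() and read with @
slotNames(toy)

# Pick one clustering method; the result is a measures-by-clusters matrix
m <- toy@measures[, , "hierarchical"]

# as.data.frame() keeps the column names "2" and "3" as they are;
# data.frame(m) would rename them to X2 and X3 unless check.names = FALSE
df <- as.data.frame(m)
df
```

On the real object the analogous call would be as.data.frame(x@measures[, , "hierarchical"]). If nClust holds a single value, the subsetting drops to a plain vector, so drop = FALSE plus a reshape would be needed in that case. I believe the package also documents a measures() accessor for this slot, which is worth checking in the clValid help pages before relying on @ directly.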