R kohonen - Is the input data automatically scaled and centred?

Time: 2017-10-26 05:41:26

Tags: r

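I have been following an online example for R Kohonen self-organising maps (SOM), which suggests that the data should be centred and scaled before computing the SOM.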

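However, I noticed that the created object appears to have centre and scale attributes. In that case, am I actually applying a redundant step by centring and scaling first? Example script below: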

# Load package
require(kohonen)

# Set data
data(iris)

# Scale and centre
dt <- scale(iris[, 1:4],center=TRUE)

# Prepare SOM
set.seed(590507)
som1 <- som(dt,
            somgrid(6, 6, "hexagonal"),
            rlen = 500,
            keep.data = TRUE)

str(som1)

The output of the last line of the script is:

List of 13
 $ data            :List of 1
  ..$ : num [1:150, 1:4] -0.898 -1.139 -1.381 -1.501 -1.018 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" 
"Petal.Width"
  .. ..- attr(*, "scaled:center")= Named num [1:4] 5.84 3.06 3.76 1.2
  .. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" 
"Petal.Length" "Petal.Width"
  .. ..- attr(*, "scaled:scale")= Named num [1:4] 0.828 0.436 1.765 0.762
  .. .. ..- attr(*, "names")= chr [1:4] "Sepal.Length" "Sepal.Width" 
"Petal.Length" "Petal.Width"
 $ unit.classif    : num [1:150] 3 5 5 5 4 2 4 4 6 5 ...
 $ distances       : num [1:150] 0.0426 0.0663 0.0768 0.0744 0.1346 ...
 $ grid            :List of 6
  ..$ pts              : num [1:36, 1:2] 1.5 2.5 3.5 4.5 5.5 6.5 1 2 3 4 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : NULL
  .. .. ..$ : chr [1:2] "x" "y"
  ..$ xdim             : num 6
  ..$ ydim             : num 6
  ..$ topo             : chr "hexagonal"
  ..$ neighbourhood.fct: Factor w/ 2 levels "bubble","gaussian": 1
  ..$ toroidal         : logi FALSE
  ..- attr(*, "class")= chr "somgrid"
 $ codes           :List of 1
  ..$ : num [1:36, 1:4] -0.376 -0.683 -0.734 -1.158 -1.231 ...
  .. ..- attr(*, "dimnames")=List of 2
  .. .. ..$ : chr [1:36] "V1" "V2" "V3" "V4" ...
  .. .. ..$ : chr [1:4] "Sepal.Length" "Sepal.Width" "Petal.Length" 
"Petal.Width"
 $ changes         : num [1:500, 1] 0.0445 0.0413 0.0347 0.0373 0.0337 ...
 $ alpha           : num [1:2] 0.05 0.01
 $ radius          : Named num [1:2] 3.61 0
  ..- attr(*, "names")= chr [1:2] "66.66667%" ""
 $ user.weights    : num 1
 $ distance.weights: num 1
 $ whatmap         : int 1
 $ maxNA.fraction  : int 0
 $ dist.fcts       : chr "sumofsquares"
 - attr(*, "class")= chr "kohonen"

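Note that lines 7 and 10 of the output contain references to centring and scaling (the "scaled:center" and "scaled:scale" attributes). I would appreciate an explanation of what is going on here.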

1 answer:

Answer 0 (score: 1)

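Your scaling step is not redundant, because the source code of som() does no scaling. The attributes you see on lines 7 and 10 are attributes of the training dataset: scale() attaches them to the matrix it returns, and som() simply stores that matrix when keep.data=TRUE. To check this, just run the code below and compare the results: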

# Load package
require(kohonen)

# Set data
data(iris)

# Scale and centre
dt <- scale(iris[, 1:4],center=TRUE)
# Compare the training datasets (scaled vs. unscaled)
str(dt)
str(as.matrix(iris[, 1:4]))

# Prepare SOM
set.seed(590507)
som1 <- kohonen::som(dt,
                     kohonen::somgrid(6, 6, "hexagonal"),
                     rlen = 500,
                     keep.data = TRUE)
# Same SOM, but trained on the unscaled data
som2 <- kohonen::som(as.matrix(iris[, 1:4]),
                     kohonen::somgrid(6,6, "hexagonal"),
                     rlen=500,
                     keep.data=TRUE)
# Compare the results of the som function
str(som1)
str(som2)
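
A more direct way to see where the attributes come from is to look at scale() on its own, without fitting a SOM at all. The following is only a minimal sketch using base R; the variable names raw and scaled are mine and do not appear in the original script.

```r
# Base R only: scale() itself attaches the centring and scaling attributes.
data(iris)

raw    <- as.matrix(iris[, 1:4])                   # unscaled training data
scaled <- scale(raw, center = TRUE, scale = TRUE)  # centred and scaled copy

# The scaled matrix carries "scaled:center" and "scaled:scale";
# the raw matrix only has its dimensions and dimnames.
names(attributes(scaled))
names(attributes(raw))

# The stored values are simply the column means and standard deviations
# that scale() used, so both comparisons should be TRUE.
all.equal(attr(scaled, "scaled:center"), colMeans(raw))
all.equal(attr(scaled, "scaled:scale"), apply(raw, 2, sd))
```

Assuming the som1 and som2 objects from the script above, attr(som1$data[[1]], "scaled:center") should then return the same column means, while the same call on som2 should return NULL, since its training matrix was never passed through scale().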