R SOM package kohonen: updating an example to version 3

Time: 2019-07-06 21:03:07

Tags: r som

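I am trying to get this example to work with version 3 of the kohonen R library: https://clarkdatalabs.github.io/soms/SOM_NBA

I tried to update the code from that page, but the result is not right. My output is roughly the same as the example's, but in the last plot I cannot see any misclassified players, so I must be doing something wrong. I think I know which line the error is in, but I am not sure what the problem is.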

# https://clarkdatalabs.github.io/soms/SOM_NBA
# https://github.com/clarkdatalabs/soms/issues?q=is%3Aopen+is%3Aissue


library(kohonen)
library(RColorBrewer)
library(RCurl)

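# stringsAsFactors = TRUE keeps Pos a factor, which levels(NBA$Pos) below relies on
# (R >= 4.0 reads strings as character by default)
NBA <- read.csv(text = getURL("https://raw.githubusercontent.com/clarkdatalabs/soms/master/NBA_2016_player_stats_cleaned.csv"),
                sep = ",", header = TRUE, check.names = FALSE, stringsAsFactors = TRUE)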

colnames(NBA)

NBA.measures1 <- c("FTA", "2PA", "3PA")
NBA.SOM1 <- som(scale(NBA[NBA.measures1]), grid = somgrid(6, 4, "rectangular"))
plot(NBA.SOM1)

colors <- function(n, alpha = 1) {
  rev(heat.colors(n, alpha))
}

plot(NBA.SOM1, type = "counts", palette.name = colors, heatkey = TRUE)

par(mfrow = c(1, 2))
plot(NBA.SOM1, type = "mapping", pchs = 20, main = "Mapping Type SOM")
plot(NBA.SOM1, main = "Default SOM Plot")

NBA.SOM2 <- som(scale(NBA[NBA.measures1]), grid = somgrid(6, 6, "hexagonal", toroidal = TRUE))

par(mfrow = c(1, 2))
plot(NBA.SOM2, type = "mapping", pchs = 20, main = "Mapping Type SOM")
plot(NBA.SOM2, main = "Default SOM Plot")
plot(NBA.SOM2, type = "dist.neighbours", palette.name = terrain.colors)

NBA.measures2 <- c("FTA", "FT", "2PA", "2P", "3PA", "3P", "AST", "ORB", "DRB", 
               "TRB", "STL", "BLK", "TOV")

training_indices <- sample(nrow(NBA), 200)
NBA.training <- scale(NBA[training_indices, NBA.measures2])
NBA.testing <- scale(NBA[-training_indices, NBA.measures2],
                     center = attr(NBA.training, "scaled:center"),
                     scale = attr(NBA.training, "scaled:scale"))

NBA.SOM3 <- xyf(NBA.training, classvec2classmat(NBA$Pos[training_indices]),
                grid = somgrid(13, 13, "hexagonal", toroidal = TRUE), rlen = 100,
                user.weights = 0.5)

pos.prediction <- predict(NBA.SOM3, newdata = NBA.testing, whatmap = 1)
table(NBA[-training_indices, "Pos"], pos.prediction$prediction[[2]])

NBA.SOM4 <- xyf(scale(NBA[, NBA.measures2]), classvec2classmat(NBA[, "Pos"]),
                grid = somgrid(13, 13, "hexagonal", toroidal = TRUE), rlen = 300,
                user.weights = 0.7)

par(mfrow = c(1, 2))
plot(NBA.SOM4, type = "codes", main = c("Codes X", "Codes Y"))
NBA.SOM4.hc <- cutree(hclust(dist(getCodes(NBA.SOM4, 2))), 5)
add.cluster.boundaries(NBA.SOM4, NBA.SOM4.hc)

bg.pallet <- c("red", "blue", "yellow", "purple", "green")

# make a vector of just the background colors for all map cells

#I think my error is in this line...
position.predictions <- classmat2classvec(predict(NBA.SOM4)$unit.predictions[[2]])


base.color.vector <- bg.pallet[match(position.predictions, levels(NBA$Pos))]

# set alpha to scale with maximum confidence of prediction
bgcols <- c()
max.conf <- apply(getCodes(NBA.SOM4, 2), 1, max)
for (i in 1:length(base.color.vector)) {
  bgcols[i] <- adjustcolor(base.color.vector[i], max.conf[i])
}

par(mar = c(0, 0, 0, 4), xpd = TRUE)
plot(NBA.SOM4, type = "mapping", pchs = 21, col = "black",
     bg = bg.pallet[match(NBA$Pos, levels(NBA$Pos))], bgcol = bgcols)

legend("topright", legend = levels(NBA$Pos), text.col = bg.pallet, bty = "n", 
   inset = c(-0.03, 0))

1 Answer:

Answer 0 (score: 0)

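The kohonen package builds its model by initializing the node attributes from a few randomly selected members of the training set. As a result, two people will rarely get exactly the same final arrangement of nodes. The attribute values should still be the same, only arranged differently; at least, that is my understanding. To get exactly the same arrangement, both kohonen models should be run under the same random seed, that is, with a call to set.seed() before training.

In the code you provided, the variable position.predictions contains some NA values. I think that if you add one line after the assignment to position.predictions to omit the NA values, the node backgrounds will all be filled from the predefined palette. So the script becomes: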

# this is your script
position.predictions <- classmat2classvec(predict(NBA.SOM4)$unit.predictions[[2]])

# add this below and continue
position.predictions <- na.omit(position.predictions)
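
One thing worth checking before relying on na.omit(): it shortens the vector, so the result may no longer hold one entry per map unit. The sketch below uses base R only, with a made-up six-unit vector standing in for position.predictions (the values, the object name unit.classes and the "grey90" fallback colour are assumptions for illustration, not output from the NBA model). It shows one possible way to keep the vector aligned by recolouring the NA units instead of dropping them.

```r
# Hypothetical stand-in for position.predictions: one entry per map unit,
# with NA where no class was assigned
unit.classes <- factor(c("C", NA, "PG", "SF", NA, "C"),
                       levels = c("C", "PF", "PG", "SF", "SG"))
bg.pallet <- c("red", "blue", "yellow", "purple", "green")

# na.omit() drops entries, so the lengths no longer agree
n.units <- length(unit.classes)
n.after.omit <- length(na.omit(unit.classes))

# Alternative: keep the NAs in place and give those units a neutral colour
base.color.vector <- bg.pallet[match(unit.classes, levels(unit.classes))]
base.color.vector[is.na(base.color.vector)] <- "grey90"
```

If the colour vector passed as bgcol is expected to have one value per unit, the second approach keeps each colour attached to the unit it belongs to.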

I think the NA values are returned because kohonen could not recognize its input patterns.
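
The point about random seeds made earlier in this answer can be sketched as follows. This is a minimal illustration on synthetic data, not the NBA set; the helper fit_som and the seed values are assumptions introduced here. It assumes the kohonen package is installed and that training draws from R's random number generator, so resetting the seed before each fit should reproduce the same map.

```r
library(kohonen)

# Synthetic stand-in data: 100 rows, 3 scaled columns
set.seed(1)
X <- scale(matrix(rnorm(300), ncol = 3))

# Hypothetical helper: fix the seed, then train a small SOM
fit_som <- function(seed) {
  set.seed(seed)
  som(X, grid = somgrid(4, 4, "hexagonal"))
}

a <- fit_som(42)
b <- fit_som(42)

# With the same seed, each object should map to the same unit in both fits
same.mapping <- identical(a$unit.classif, b$unit.classif)
```

Calling set.seed() once at the top of the question's script, before sample() and before each som() or xyf() call, would be one way to make repeated runs comparable.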