无监督的SOM可视化

时间:2013-06-20 14:08:53

标签: r self-organizing-maps

需要绘制无监督矩形SOM模型的结果。附加要求:1)将每个节点绘制成具有相应观察类别的饼图;图表的大小应反映节点中的样本数。默认plot.kohonen并不适合这种情况。

1 个答案:

答案 0 :(得分:1)

这是一个可能的解决方案。第一个函数som.prep.df是从第二个' som.draw'中调用的,它只有两个参数SOM模型和观察到的训练集类。

som.prep.df <- function(som.model, obs.classes, scaled) {
  require(reshape2)
  lev <- factor(wine.classes)
  df <- data.frame(cbind(unit=som.model$unit.classif, class=as.integer(lev)))
  # create table
  df2 <- data.frame(table(df))
  df2 <- dcast(df2, unit ~ class, value.var="Freq")
  df2$unit <- as.integer(df2$unit)
  # calc sum
  df2$sum <- rowSums(df2[,-1])
  # calc fraction borders of classes in each node
  tmp <- data.frame(cbind(X0=rep(0,nrow(df2)), 
                          t(apply(df2[,-1], 1, function(x) {
                            cumsum(x[1:(length(x)-1)]) / x[length(x)]
                          }))))
  df2 <- cbind(df2, tmp)
  df2 <- melt(df2, id.vars=which(!grepl("^\\d$", colnames(df2))))
  df2 <- df2[,-ncol(df2)]
  # define border for each classs in each node
  tmp <- t(apply(df2, 1, function(x) {
    c(x[paste0("X", as.character(as.integer(x["variable"])-1))], 
      x[paste0("X", as.character(x["variable"]))])
  }))
  tmp <- data.frame(tmp, stringsAsFactors=FALSE)
  tmp <- sapply(tmp, as.numeric)
  colnames(tmp) <- c("ymin", "ymax")
  df2 <- cbind(df2, tmp)
  # scale size of pie charts
  if (is.logical(scaled)) {
    if (scaled) {
      df2$xmax <- log2(df2$sum)
    } else {
      df2$xmax <- df2$sum
    }
  }
  df2 <- df2[,c("unit", "variable", "ymin", "ymax", "xmax")]
  colnames(df2) <- c("unit", "class", "ymin", "ymax", "xmax")
  # replace classes with original levels names
  df2$class <- levels(lev)[df2$class]
  return(df2)
}

som.draw <- function(som.model, obs.classes, scaled=FALSE) {
  # scaled - make or not a logarithmic scaling of the size of each node
  require(ggplot2)
  require(grid)
  g <- som.model$grid
  df <- som.prep.df(som.model, obs.classes, scaled)
  df <- cbind(g$pts, df[,-1])
  df$class <- factor(df$class)
  g <- ggplot(df, aes(fill=class, ymax=ymax, ymin=ymin, xmax=xmax, xmin=0)) +
    geom_rect() +
    coord_polar(theta="y") +
    facet_wrap(x~y, ncol=g$xdim, nrow=g$ydim) + 
    theme(axis.ticks = element_blank(), 
          axis.text.y = element_blank(),
          axis.text.x = element_blank(),
          panel.margin = unit(0, "cm"),
          strip.background = element_blank(),
          strip.text = element_blank(),
          plot.margin = unit(c(0,0,0,0), "cm"),
          panel.background = element_blank(),
          panel.grid = element_blank())
  return(g)
}

用法示例。

require(kohonen)
data(wines)
som.wines <- som(scale(wines), grid = somgrid(5, 5, "rectangular"))

# Non-scaled map
som.draw(som.wines, wine.classes)

enter image description here

# Scaled map
som.draw(som.wines, wine.classes, TRUE)

enter image description here

此功能也可用于监督模型的可视化。但它仅适用于矩形地图。希望这会对某人有所帮助。

有几种可能的改进:

  1. 选择比对数更好的缩放功能。因为现在具有单个样本的节点在缩放后变得不可见。
  2. 将图例添加到整个图中,这将反映节点的大小。
  3. 或者在每个图表上添加有关节点填充的信息。
  4. PS。代码不是很优雅,所以欢迎任何建议和改进。