Irregular aggregation of a 2D matrix

Time: 2014-11-25 11:08:54

Tags: r matrix binning

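I am trying to bin a symmetric matrix at irregular intervals in R, but I'm not sure how to proceed. My ideas so far:

  • Reshape the matrix to long format, aggregate, and cast it back?
  • Bin as-is in two dimensions (somehow... tapply, aggregate?)
  • Keep the regular grid, but for each of my (larger) irregular bins, replace all the values inside it with their sum?

Here is an example of what I am trying to do: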

set.seed(42)

# symmetric matrix: mirror the lower triangle into the upper triangle
a <- matrix(rpois(1e4, 2), 100)
a[upper.tri(a)] <- t(a)[upper.tri(a)]

image(x=1:100, y=1:100, a, asp=1, frame.plot=FALSE, axes=FALSE)

# vector of irregular breaks for binning
breaks <- c(12, 14, 25, 60, 71, 89)

# white lines show the desired bins
abline(h=breaks-.5, lwd=2, col="white")
abline(v=breaks-.5, lwd=2, col="white")


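(The goal is for each rectangle drawn above to be filled according to the sum of the values inside it.) I would appreciate any pointers on how best to approach this.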

1 answer:

Answer 0 (score: 1)

This answer provides a good starting point, using tapply:

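library("reshape2")  # provides melt()

# long format: one row per cell, with columns Var1 (row), Var2 (column), value
b <- melt(a)

# sum the values within each pair of row/column bins
bb <- with(b, tapply(value, 
    list(
      y=cut(Var1, breaks=c(0, breaks, Inf), include.lowest=TRUE), 
      x=cut(Var2, breaks=c(0, breaks, Inf), include.lowest=TRUE)
    ),
    sum)
)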

bb
#          x
# y          [0,12] (12,14] (14,25] (25,60] (60,71] (71,89] (89,Inf]
#  [0,12]      297      48     260     825     242     416      246
#  (12,14]      48       3      43     141      46      59       33
#  (14,25]     260      43     261     794     250     369      240
#  (25,60]     825     141     794    2545     730    1303      778
#  (60,71]     242      46     250     730     193     394      225
#  (71,89]     416      59     369    1303     394     597      369
#  (89,Inf]    246      33     240     778     225     369      230
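As a cross-check on the binning above, the same sums can probably be obtained in base R alone by multiplying the matrix with a 0/1 bin-membership matrix on both sides. This is only a sketch, not part of the original answer: the names grp, M and bb2 are made up for illustration, and the data are rebuilt from the question so the snippet runs on its own.

```r
set.seed(42)

# symmetric matrix and breaks, as in the question
a <- matrix(rpois(1e4, 2), 100)
a[upper.tri(a)] <- t(a)[upper.tri(a)]
breaks <- c(12, 14, 25, 60, 71, 89)

# bin label for each row/column index (hypothetical helper name: grp)
grp <- cut(seq_len(nrow(a)), breaks = c(0, breaks, Inf), include.lowest = TRUE)

# 100 x 7 indicator matrix: M[i, k] is 1 if index i falls in bin k, else 0
M <- outer(as.integer(grp), seq_len(nlevels(grp)), "==") * 1
colnames(M) <- levels(grp)

# t(M) %*% a %*% M sums a over every (row bin, column bin) pair
bb2 <- crossprod(M, a %*% M)
bb2
```

Because the same bins are applied to rows and columns of a symmetric matrix, bb2 should itself be symmetric, which makes for an easy sanity check with isSymmetric(unname(bb2)).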

These can then be drawn as rectangles using base graphics and rect:

library("reshape2")
library("magrittr")

# long format: one row per bin pair, with columns y, x, value
bsq <- melt(bb)

# convert range notation such as "(12,14]" to its numeric limits
getNum <- . %>%
  as.character %>%
  # remove brackets
  gsub("\\[|\\(|\\]|\\)", "", .) %>%
  # split at the comma and convert
  strsplit(",") %>%
  unlist %>% as.numeric

y <- t(sapply(bsq[,1], getNum))
x <- t(sapply(bsq[,2], getNum))

# the last bin is open-ended (Inf); cap it at the matrix extent so that
# its area is finite and it is not normalised to zero
y[is.infinite(y)] <- nrow(a)
x[is.infinite(x)] <- ncol(a)

# normalise bin intensity by area
bsq$size <- (y[,2] - y[,1]) * (x[,2] - x[,1])
bsq$norm <- bsq$value / bsq$size

# draw rectangles on top of empty plot
plot(1:100, 1:100, type="n", frame.plot=FALSE, axes=FALSE)
rect(ybottom=y[,1], ytop=y[,2],
     xleft=x[,1], xright=x[,2], 
     col=rgb(colorRamp(c("white", "steelblue4"))(bsq$norm / max(bsq$norm)), 
             alpha=255*(bsq$norm / max(bsq$norm)), maxColorValue=255),
     border="white")
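The third idea from the question (keep the regular 100 x 100 grid, but fill every cell of an irregular bin with that bin's sum) might look roughly like the sketch below. It is not part of the original answer: idx, sums and filled are illustrative names, the data are rebuilt from the question so the snippet is self-contained, and unlike the rect version it colours by raw sum rather than by area-normalised intensity.

```r
set.seed(42)

# symmetric matrix and breaks, as in the question
a <- matrix(rpois(1e4, 2), 100)
a[upper.tri(a)] <- t(a)[upper.tri(a)]
breaks <- c(12, 14, 25, 60, 71, 89)

# integer bin index (1..7) for each row/column position
idx <- as.integer(cut(seq_len(nrow(a)), breaks = c(0, breaks, Inf)))

# 7 x 7 table of sums, one entry per (row bin, column bin) pair
sums <- tapply(a, list(idx[as.vector(row(a))], idx[as.vector(col(a))]), sum)

# expand back to 100 x 100: each cell receives the sum of the bin it lies in
filled <- sums[idx, idx]

# same plotting call as in the question, now showing the binned sums
image(x = 1:100, y = 1:100, filled, asp = 1, axes = FALSE)
```

Since the result stays on the regular grid, the white abline separators from the question can be drawn on top unchanged.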
