Expand a matrix of counts

Time: 2015-12-21 18:36:22

Tags: r

Say I have this matrix:

set.seed(10)
mat <- matrix(sample(0:3, 25, TRUE), ncol = 5)
rownames(mat) <- month.abb[1:5]
colnames(mat) <- state.name[1:5]

mat

##     Alabama Alaska Arizona Arkansas California
## Jan       2      0       2        1          3
## Feb       1      1       2        0          2
## Mar       1      1       0        1          3
## Apr       2      2       2        1          1
## May       0      1       1        3          1
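
The default algorithm behind sample() changed in R 3.6.0, so the set.seed(10) call above may produce a different matrix on newer installations. A minimal sketch that builds the same matrix explicitly from the printed values, so that the answers can be checked against the output shown here:

```r
# Values copied column by column from the printed matrix above
mat <- matrix(
  as.integer(c(2, 1, 1, 2, 0,    # Alabama
               0, 1, 1, 2, 1,    # Alaska
               2, 2, 0, 2, 1,    # Arizona
               1, 0, 1, 1, 3,    # Arkansas
               3, 2, 3, 1, 1)),  # California
  ncol = 5,
  dimnames = list(month.abb[1:5], state.name[1:5])
)
mat
```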

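I am looking for an efficient way (non-base solutions are welcome) to expand each observation/row n times, where n is the maximum value in that row, and fill in 1s and 0s as shown below. Within each expanded block, a column with count k gets a 1 in its first k rows and a 0 in the rest. (I am not sure whether this technique has a name; if anyone knows what this expansion is called, I would love to hear it, since that would make it easier to search for.)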

##     Alabama Alaska Arizona Arkansas California
## Jan       1      0       1        1          1
## Jan       1      0       1        0          1
## Jan       0      0       0        0          1
## Feb       1      1       1        0          1
## Feb       0      0       1        0          1
## Mar       1      1       0        1          1
## Mar       0      0       0        0          1
## Mar       0      0       0        0          1
## Apr       1      1       1        1          1
## Apr       1      1       1        0          0
## May       0      1       1        1          1
## May       0      0       0        1          0
## May       0      0       0        1          0

3 answers:

Answer 0: (score: 4)

This also seems to work:

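# Row maxima, picked out by (row, column-of-maximum) index pairs
maxs = mat[cbind(seq_len(nrow(mat)), max.col(mat, "first"))]
# Repeat each row maxs times; copy k of a row is 1 wherever the count is >= k
(mat[rep(seq_len(nrow(mat)), maxs), ] >= sequence(maxs)) + 0L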
#    Alabama Alaska Arizona Arkansas California
#Jan       1      0       1        1          1
#Jan       1      0       1        0          1
#Jan       0      0       0        0          1
#Feb       1      1       1        0          1
#Feb       0      0       1        0          1
#Mar       1      1       0        1          1
#Mar       0      0       0        0          1
#Mar       0      0       0        0          1
#Apr       1      1       1        1          1
#Apr       1      1       1        0          0
#May       0      1       1        1          1
#May       0      0       0        1          0
#May       0      0       0        1          0
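
The one-liner is dense, so here is a sketch of its intermediate steps, using the matrix values printed in the question (entered by hand, since sample() output depends on the R version). The wrapper name expand_counts is an assumption made for illustration and is not part of the original answer.

```r
mat <- matrix(
  as.integer(c(2, 1, 1, 2, 0,  0, 1, 1, 2, 1,  2, 2, 0, 2, 1,
               1, 0, 1, 1, 3,  3, 2, 3, 1, 1)),
  ncol = 5, dimnames = list(month.abb[1:5], state.name[1:5])
)

# expand_counts is a hypothetical wrapper around the answer's two lines
expand_counts <- function(m) {
  # max.col() gives the column index of each row's maximum; indexing with a
  # two-column (row, col) matrix then picks out those cells.
  maxs <- m[cbind(seq_len(nrow(m)), max.col(m, "first"))]
  # Repeat row i maxs[i] times.
  repeated <- m[rep(seq_len(nrow(m)), maxs), , drop = FALSE]
  # sequence(maxs) restarts at 1 for each original row. The comparison recycles
  # that vector down every column, so copy k of a row is TRUE wherever the
  # count is at least k. Adding 0L turns TRUE/FALSE into 1/0.
  (repeated >= sequence(maxs)) + 0L
}

maxs <- unname(apply(mat, 1, max))  # same values as the max.col() approach
sequence(maxs)                      # 1 2 3 1 2 1 2 3 1 2 1 2 3
res <- expand_counts(mat)
dim(res)                            # 13 5
```

One way to sanity-check the result: summing the expanded rows back up by row name, for example with rowsum(res, rownames(res)), should recover the original counts.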

Answer 1: (score: 3)

I don't know what this expansion is called, but here is one way to do it:

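# Expand one row of counts into max(x) rows of 1s and 0s
expand.row <- function(x) {
  # For each column: x[i] ones followed by (max(x) - x[i]) zeros
  out <- matrix(rep(rep(1:0, times = length(x)), c(rbind(x, max(x) - x))), ncol = length(x))
  colnames(out) <- names(x)
  return(out)
}

# lapply() always returns a list; apply() would simplify the result to a
# matrix whenever every row has the same maximum
mat2 <- do.call(rbind, lapply(seq_len(nrow(mat)), function(i) expand.row(mat[i, ])))
rownames(mat2) <- rep(rownames(mat), apply(mat, 1, max))

mat2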

##     Alabama Alaska Arizona Arkansas California
## Jan       1      0       1        1          1
## Jan       1      0       1        0          1
## Jan       0      0       0        0          1
## Feb       1      1       1        0          1
## Feb       0      0       1        0          1
## Mar       1      1       0        1          1
## Mar       0      0       0        0          1
## Mar       0      0       0        0          1
## Apr       1      1       1        1          1
## Apr       1      1       1        0          0
## May       0      1       1        1          1
## May       0      0       0        1          0
## May       0      0       0        1          0

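Hopefully someone will come along with an obvious function from a well-known package, but perhaps this will tide you over in the meantime.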
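
To make the rep()/rbind() trick inside expand.row less opaque, here is a sketch that traces it for the Jan row from the question. The variable names are illustrative only.

```r
# The Jan row from the question, entered by hand
x <- c(Alabama = 2, Alaska = 0, Arizona = 2, Arkansas = 1, California = 3)

# Pair each count with the number of zeros needed to pad it up to max(x);
# c() flattens the 2-row matrix column by column, which interleaves the pairs
counts <- c(rbind(x, max(x) - x))   # 2 1 0 3 2 1 1 2 3 0

# Alternate 1, 0 once per column, then repeat each value by its paired count
cells <- rep(rep(1:0, times = length(x)), counts)

# Fill column by column: every column is a run of 1s followed by a run of 0s
jan <- matrix(cells, ncol = length(x), dimnames = list(NULL, names(x)))
jan
```

Each column of the result sums to the original count, which is a quick check that the padding is right.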

Answer 2: (score: 2)

Here is a dplyr option:

library(dplyr)

# Expand the number of rows
mat.exp = mat[rep(rownames(mat), apply(mat, 1, max)),]

# Get the 1s and 0s right
# (add_rownames(), mutate_each() and funs() are deprecated in current dplyr releases)
mat.exp = mat.exp %>% as.data.frame %>% add_rownames %>%
  group_by(rowname) %>%
  mutate_each(funs(c(rep(1,.[1]), rep(0,n() - .[1]))))

# Convert back to matrix and add back rownames
mat.exp = as.matrix(mat.exp[,-1])
rownames(mat.exp) = rep(rownames(mat), apply(mat, 1, max))

mat.exp

##     Alabama Alaska Arizona Arkansas California
## Jan       1      0       1        1          1
## Jan       1      0       1        0          1
## Jan       0      0       0        0          1
## Feb       1      1       1        0          1
## Feb       0      0       1        0          1
## Mar       1      1       0        1          1
## Mar       0      0       0        0          1
## Mar       0      0       0        0          1
## Apr       1      1       1        1          1
## Apr       1      1       1        0          0
## May       0      1       1        1          1
## May       0      0       0        1          0
## May       0      0       0        1          0