Creating homogeneous matrices by padding with NaN in R

Time: 2017-04-08 23:28:38

Tags: r matrix

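I have several matrices of different sizes, with slightly different row and column orderings. I am trying to organize the matrices so that I can average them. The easiest way (I think) is to make the matrices the same size and then use one of the previously suggested solutions, e.g. Reduce("+", my.list) / length(my.list)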
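For reference, here is a minimal sketch of how that averaging behaves once all matrices share the same dimensions. The matrices a and b are made-up illustration data, not the matrices from this question.

```r
# Two hypothetical 2x2 matrices with identical dimensions and names
a <- matrix(c(1, 2, 3, 4), 2, dimnames = list(0:1, 0:1))
b <- matrix(c(3, 4, 5, 6), 2, dimnames = list(0:1, 0:1))
my.list <- list(a, b)

# Element-wise sum of all matrices, divided by how many there are
avg <- Reduce("+", my.list) / length(my.list)
avg
```

This only works when every matrix in the list has the same shape, which is why the padding step is needed first.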

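I was thinking it might be possible to create a 10x10 template matrix and then apply each matrix to the template, so that if the applied matrix is not 10x10 (e.g. it is 4x4), the rest of the template is filled with NaN. I have provided three sample matrices, followed by three matrices showing what I would like the output to look like.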
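The template idea could be sketched roughly as follows. The ids 0 to 9 and the small 3x3 matrix are assumptions made for illustration; the sketch relies only on base R indexing by row and column names.

```r
# Hypothetical 10x10 template: rows and columns named "0".."9", all NaN
ids <- as.character(0:9)
template <- matrix(NaN, nrow = 10, ncol = 10, dimnames = list(ids, ids))

# Hypothetical smaller matrix covering only ids 0, 2 and 3
small <- matrix(1:9, 3, dimnames = list(c(0, 2, 3), c(0, 2, 3)))

# Copy the small matrix into the matching named cells of the template
padded <- template
padded[rownames(small), colnames(small)] <- small
padded
```

Cells whose row or column name is absent from the small matrix keep the NaN they had in the template.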

Three sample matrices:

           0         1  2         3    4 5 6         7   8 9
0  0.7134503 0.0000000  0 0.0000000 0.00 0 0 0.0000000 0.0 0
1  0.6800000 0.0000000  0 0.0000000 0.00 0 0 0.0000000 0.0 0
2  0.2352941 0.2941176  0 0.0000000 0.00 0 0 0.4117647 0.0 0
3  0.3333333 0.0000000  0 0.0000000 0.00 0 0 0.0000000 0.2 0
4  0.0000000 0.0000000  0 0.0000000 0.00 0 0 0.0000000 0.0 0
5  0.5000000 0.0000000  0 0.0000000 0.25 0 0 0.0000000 0.0 0
6  0.6000000 0.4000000  0 0.0000000 0.00 0 0 0.0000000 0.0 0
7  0.5250000 0.0000000  0 0.0000000 0.00 0 0 0.0000000 0.0 0
8  0.6060606 0.0000000  0 0.2121212 0.00 0 0 0.0000000 0.0 0
9  0         0          0 0         0    0 0 0         0   0

          0   1         2         3         4 5 7   8 9
0 0.5550000 0.0 0.0000000 0.2200000 0.0000000 0 0 0.0 0
1 0.6363636 0.0 0.2727273 0.0000000 0.0000000 0 0 0.0 0
2 0.4516129 0.0 0.0000000 0.2580645 0.0000000 0 0 0.0 0
3 0.4150943 0.0 0.0000000 0.3679245 0.0000000 0 0 0.0 0
4 0.7647059 0.0 0.0000000 0.2352941 0.0000000 0 0 0.0 0
5 0.4285714 0.0 0.0000000 0.0000000 0.0000000 0 0 0.0 0
7 0.2000000 0.2 0.2000000 0.2000000 0.0000000 0 0 0.2 0
8 0.3000000 0.0 0.0000000 0.7000000 0.0000000 0 0 0.0 0
9 0.5555556 0.0 0.0000000 0.0000000 0.2222222 0 0 0.0 0

          0 2         3 4 7 8
0 0.4020101 0 0.5075377 0 0 0
2 0.0000000 0 0.0000000 0 0 0
3 0.6322581 0 0.2322581 0 0 0
4 0.0000000 0 0.0000000 0 0 0
7 0.0000000 0 0.0000000 0 0 0
8 0.4883721 0 0.3488372 0 0 0

Desired output:

           0         1 2  3         4    5 6 7         8   9
0  0.7134503 0.0000000  0 0 0.0000000 0.00 0 0 0.0000000 0.0
1  0.6800000 0.0000000  0 0 0.0000000 0.00 0 0 0.0000000 0.0
2  0.2352941 0.2941176  0 0 0.0000000 0.00 0 0 0.4117647 0.0
3  0.3333333 0.0000000  0 0 0.0000000 0.00 0 0 0.0000000 0.2
4  0.0000000 0.0000000  0 0 0.0000000 0.00 0 0 0.0000000 0.0
5  0.5000000 0.0000000  0 0 0.0000000 0.25 0 0 0.0000000 0.0
6  0.6000000 0.4000000  0 0 0.0000000 0.00 0 0 0.0000000 0.0
7  0.5250000 0.0000000  0 0 0.0000000 0.00 0 0 0.0000000 0.0
8  0.6060606 0.0000000  0 0 0.2121212 0.00 0 0 0.0000000 0.0
9  0.7272727 0.0000000  0 0 0.0000000 0.00 0 0 0.0000000 0.0

          0   1         2         3         4  5  6  7   8  9
0 0.5550000 0.0 0.0000000 0.2200000 0.0000000  0 NA  0 0.0  0
1 0.6363636 0.0 0.2727273 0.0000000 0.0000000  0 NA  0 0.0  0
2 0.4516129 0.0 0.0000000 0.2580645 0.0000000  0 NA  0 0.0  0
3 0.4150943 0.0 0.0000000 0.3679245 0.0000000  0 NA  0 0.0  0
4 0.7647059 0.0 0.0000000 0.2352941 0.0000000  0 NA  0 0.0  0
5 0.4285714 0.0 0.0000000 0.0000000 0.0000000  0 NA  0 0.0  0
6        NA  NA        NA        NA        NA NA NA NA  NA NA
7 0.2000000 0.2 0.2000000 0.2000000 0.0000000  0 NA  0 0.2  0
8 0.3000000 0.0 0.0000000 0.7000000 0.0000000  0 NA  0 0.0  0
9         0   0         0         0         0  0 NA  0   0  0

          0  1  2         3  4  5  6  7  8  9
0 0.4020101 NA  0 0.5075377  0 NA NA  0  0 NA
1        NA NA NA        NA NA NA NA NA NA NA
2 0.0000000 NA  0 0.0000000  0 NA NA  0  0 NA
3 0.6322581 NA  0 0.2322581  0 NA NA  0  0 NA
4 0.0000000 NA  0 0.0000000  0 NA NA  0  0 NA
5        NA NA NA        NA NA NA NA NA NA NA
6        NA NA NA        NA NA NA NA NA NA NA
7 0.0000000 NA  0 0.0000000  0 NA NA  0  0 NA
8 0.4883721 NA  0 0.3488372  0 NA NA  0  0 NA
9        NA NA NA        NA NA NA NA NA NA NA

1 answer:

Answer 0 (score: 2)

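A quick approach: collect the set of unique column names and row names across the list. Create a new matrix with those dimensions, then assign the values using subsetting by row and column names.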

# some dummy data
m1 <- matrix(1:4, 2, dimnames=list(0:1, c(0,3)))
m2 <- matrix(1:9, 3, dimnames=list(0:2, 0:2))
lst <- list(m1, m2)
#> lst
#[[1]]
#  0 3
#0 1 3
#1 2 4

#[[2]]
#  0 1 2
#0 1 4 7
#1 2 5 8
#2 3 6 9

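# Get unique col and row names across all matrices
# (sorted as character strings, which is fine for single-digit names)
nc <- sort(unique(unlist(lapply(lst, colnames))))
nr <- sort(unique(unlist(lapply(lst, rownames))))

# loop through matrices
lst2 <- lapply(lst, function(x) {
  # NA-filled matrix covering every row and column name
  out <- matrix(NA, ncol=length(nc), nrow=length(nr), dimnames=list(nr, nc))
  # two-column matrix of (row name, column name) pairs, in the same
  # column-major order as the values of x
  idx <- as.matrix(expand.grid(rownames(x), colnames(x)))
  out[idx] <- x
  out
})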
# lst2
#[[1]]
#   0  1  2  3
#0  1 NA NA  3
#1  2 NA NA  4
#2 NA NA NA NA

#[[2]]
#  0 1 2  3
#0 1 4 7 NA
#1 2 5 8 NA
#2 3 6 9 NA
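One caveat, offered as a side note rather than as part of the original answer: sort() on row and column names compares them as character strings. That is harmless for the single-digit names used here, but it would misplace a name such as "10". A small sketch with made-up names shows the difference and one possible workaround, assuming all names are numeric:

```r
# Hypothetical names that include a two-digit id
nm <- c("0", "10", "2", "1")

# Character sort compares the strings, so "10" lands before "2"
chr_sorted <- sort(nm)

# Ordering by the numeric value keeps the names as strings
# but arranges them as 0, 1, 2, 10
num_sorted <- nm[order(as.numeric(nm))]
num_sorted
```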

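One comment on your use of Reduce("+", my.list) / length(my.list): if there are NAs, the sum will not behave the way (I think) you want, because any NA propagates into the result. The means can instead be obtained with: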
s <- simplify2array(lst2)
rowMeans(s, dims=2, na.rm = TRUE)
#  0 1 2   3
#0 1 4 7   3
#1 2 5 8   4
#2 3 6 9 NaN
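To make the NA issue concrete, here is a small sketch that rebuilds the two padded matrices by hand and compares the plain Reduce average with an NA-aware one. The names p1, p2, naive and na_mean are made up for this illustration, and the Reduce-based mean is just one possible alternative to rowMeans.

```r
# The two padded matrices from the example above, rebuilt by hand
p1 <- matrix(c(1, 2, NA, NA, NA, NA, NA, NA, NA, 3, 4, NA), 3,
             dimnames = list(0:2, 0:3))
p2 <- matrix(c(1:9, NA, NA, NA), 3, dimnames = list(0:2, 0:3))
padded <- list(p1, p2)

# Plain sum: every cell that is NA in any matrix becomes NA
naive <- Reduce("+", padded) / length(padded)

# NA-aware mean: sum of the non-NA values divided by the non-NA count
total <- Reduce("+", lapply(padded, function(m) replace(m, is.na(m), 0)))
count <- Reduce("+", lapply(padded, function(m) !is.na(m)))
na_mean <- total / count   # 0 / 0 gives NaN where no matrix has a value
na_mean
```

Dividing by the per-cell count, rather than by length(padded), is what keeps a cell present in only one matrix from being halved.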

Another way to get the means:

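# melt each matrix to long format (Var1 = row name, Var2 = column name),
# then merge them on the (row, column) pair
d <- Reduce(function(...) merge(..., by=c("Var1", "Var2"), all=TRUE), lapply(lst, reshape2::melt))
# mean across the value columns, ignoring NAs
v <- rowMeans(d[-(1:2)], na.rm = TRUE)
xtabs(v ~ Var1 + Var2, data=d)
#    Var2
#Var1 0 1 2 3
#   0 1 4 7 3
#   1 2 5 8 4
#   2 3 6 9 0
# note: xtabs shows 0 for the (2, 3) cell, which has no data, rather than NaN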
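The xtabs() table above shows 0 for the (2, 3) cell even though neither matrix has a value there. If that distinction matters, a base R variant along the following lines may help. It is a sketch, not the answerer's method: it swaps reshape2::melt for as.data.frame(as.table(m)) and uses tapply(), which leaves cells with no data as NA.

```r
m1 <- matrix(1:4, 2, dimnames = list(0:1, c(0, 3)))
m2 <- matrix(1:9, 3, dimnames = list(0:2, 0:2))

# Long format: one row per cell, with columns Var1 (row name),
# Var2 (column name) and Freq (the cell value)
long <- do.call(rbind, lapply(list(m1, m2),
                              function(m) as.data.frame(as.table(m))))

# Mean per (row, column) pair; pairs with no data come back as NA
means <- tapply(long$Freq, list(long$Var1, long$Var2), mean)

# Put rows and columns back into sorted name order
means <- means[sort(rownames(means)), sort(colnames(means))]
means
```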