I have a data.frame DF as follows:
u <- c(14381, 20547, 17172, 17753, 667, 17753, 914, 10802, 3346, 17753,
667, 11113, 914, 914, 17753, 11113, 10802, 20547, 14381, 11113,
139, 17753, 17172, 10802, 14381, 20547, 139, 14381, 17753, 10802,
10802, 139, 11113, 10802, 11113, 3346, 11113, 11113, 11113, 10802,
17172, 20547, 914, 17172, 3346, 139, 11113, 139, 914, 10802,
14381, 10802, 17172, 10802, 3346, 17172, 10802, 20547, 15679, 17753,
11113, 11113, 667, 15679, 667, 1204, 355, 1204, 400, 14351,
16405, 12760, 16405, 12760, 11072, 1204, 14351, 265, 16405, 4993,
400, 355, 16405, 4993, 355, 14351, 14351, 14351, 400, 11021,
11072, 1204, 12760, 265, 12760, 265, 400, 265, 1204, 12760,
16405, 11072, 16405, 1204, 11072, 11021, 265, 11072, 18309, 11021,
18309, 4993, 12760, 1204, 11021, 18309, 18309, 265, 14351, 14351,
12759, 12759, 4993, 11038, 12759, 12759, 11038, 12759, 18309, 18309,
1, 4, 4, 3, 6, 1, 1, 2, 10, 11,
1, 2, 1, 7, 1, 2, 1, 1, 1, 1,
5, 1, 2, 3, 2, 2, 2, 2, 1, 1,
5, 1, 7, 2, 1, 2, 2, 2, 2, 1,
2, 2, 1, 4, 1, 3, 1, 1, 2, 3,
2, 3, 1, 1, 2, 1, 1, 1, 1, 1,
1, 2, 2, 1, 1)
DF <- as.data.frame(matrix(u, ncol = 3, nrow = 65, byrow = FALSE))
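Now I need to build a matrix MAT, as shown below:
The question is: how can I build this matrix efficiently from the data frame above? My current approach is as follows: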
DF[, 1] <- as.character(DF[, 1]) # turn into characters
DF[, 2] <- as.character(DF[, 2]) # turn into characters
rows <- unique(DF[,1]) # get the row names
cols <- unique(DF[,2]) # get the column names
MAT <- matrix(0, nrow = length(rows), ncol = length(cols)) # prefill with 0's
dimnames(MAT) <- list(rows, cols)
for (i in seq_len(nrow(DF))) {
  MAT[DF[i, 1], DF[i, 2]] <- DF[i, 3] # fill one cell per data frame row
}
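This works, but it seems inefficient. Since I need to repeat this task about 10K times, any gain in efficiency will pay off. How can I avoid the loop (which keeps copying MAT) and do this more efficiently? I was thinking of dplyr or data.table, but I don't really know how to use those packages. Can anyone help?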
Answer 0 (score: 6)
Using tidyr:
library(tidyr)
spread(DF, V2, V3, fill = 0)
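
Note that spread() returns a data frame with V1 as its first column, not a matrix. If a plain matrix is the goal, one possible base R alternative is to replace the loop with a single assignment through a two-column index matrix. The sketch below is only an illustration under stated assumptions: it uses a small made-up data frame called toy (not the DF from the question) with the same V1, V2, V3 layout, and the names toy and idx are hypothetical.

```r
# Toy data with the same layout as DF:
# V1 = row key, V2 = column key, V3 = value.
# These values are made up for illustration only.
toy <- data.frame(
  V1 = c("a", "b", "a", "c"),
  V2 = c("x", "x", "y", "z"),
  V3 = c(1, 2, 3, 4),
  stringsAsFactors = FALSE
)

rows <- unique(toy$V1) # row names of the result
cols <- unique(toy$V2) # column names of the result

# Prefill with zeros, as in the question.
MAT <- matrix(0, nrow = length(rows), ncol = length(cols),
              dimnames = list(rows, cols))

# Each row of idx is one (row position, column position) pair.
idx <- cbind(match(toy$V1, rows), match(toy$V2, cols))

# Fill all listed cells in one vectorized assignment instead of a loop.
MAT[idx] <- toy$V3

print(MAT)
```

To try this on the real data, toy would be replaced by DF. Whether it is actually faster than spread() for 10K repetitions is something to measure on your own data rather than assume.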