I have a fairly large table; here is a sample:
dput(data)
structure(list(ID = 1:5, area = c(6L, 6L, 6L, 6L, 6L), ERC = c(1L,
1L, 1L, 1L, 1L), a = c(33, 34, 35, 38, 39), b = c(38, 41, 45,
8, NA), c = c(53, 35, 38, 39, 53), d = c(32, 33, 65, 36, 34)), .Names = c("ID",
"area", "ERC", "a", "b", "c", "d"), row.names = c(NA, -5L), class = "data.frame")
I want to create a presence/absence matrix of the observations. The result should look like this:
dput(result)
structure(list(ID = c(1, 2), `32` = c(1, 0), `33` = c(1, 1),
`34` = c(0, 1), `35` = c(0, 1), `37` = c(0, 0), `38` = c(0,
0), `41` = c(1, 0), `53` = c(0, 1), `54` = c(1, 0)), .Names = c("ID",
"32", "33", "34", "35", "37", "38", "41", "53", "54"), row.names = c(NA,
-2L), class = "data.frame")
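The columns hold factors (the observed values), so 1 means present and 0 means absent.
Is there a way to do this without building a simple matrix column for every column by hand?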
Answer 0 (score: 2)
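library(reshape2)
# DF is the data frame from the question (called `data` there)
# melt() stacks a, b, c, d into one `value` column; dcast() counts each value per ID
dcast(melt(DF, id.vars = c("ID", "area", "ERC")),
      ID ~ value, fill = 0, fun.aggregate = length)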
#  ID 8 32 33 34 35 36 38 39 41 45 53 65 NA
#1  1 0  1  1  0  0  0  1  0  0  0  1  0  0
#2  2 0  0  1  1  1  0  0  0  1  0  0  0  0
#3  3 0  0  0  0  1  0  1  0  0  1  0  1  0
#4  4 1  0  0  0  0  1  1  1  0  0  0  0  0
#5  5 0  0  0  1  0  0  0  1  0  0  1  0  1
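Note that fun.aggregate = length counts occurrences, so a value that appears twice in the same row would give 2 rather than 1, and NA gets its own column. A minimal base R sketch of the same idea, assuming strict 0/1 output is wanted and NA should be ignored; the names obs_cols, long, counts and presence are made up for illustration:

```r
# Sample data from the question, named DF as in the answer above
DF <- data.frame(
  ID = 1:5, area = 6L, ERC = 1L,
  a = c(33, 34, 35, 38, 39),
  b = c(38, 41, 45, 8, NA),
  c = c(53, 35, 38, 39, 53),
  d = c(32, 33, 65, 36, 34)
)

# Assumption: these are the columns that hold the observations
obs_cols <- c("a", "b", "c", "d")

# Long form: one (ID, value) pair per observation
long <- data.frame(
  ID    = rep(DF$ID, times = length(obs_cols)),
  value = unlist(DF[obs_cols], use.names = FALSE)
)

# Cross-tabulate ID against value; table() leaves NA out by default
counts <- table(long$ID, long$value)

# Cap the counts at 1 so each cell is strictly presence (1) or absence (0)
presence <- (unclass(counts) > 0) * 1L
presence
```

The row names of presence are the IDs and the column names are the observed values in increasing order.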
Answer 1 (score: 1)
Here is one possible approach using base R.
Input data:
input
  ID area ERC  a  b  c  d
1  1    6   1 33 38 53 32
2  2    6   1 34 41 35 33
3  3    6   1 35 45 38 65
4  4    6   1 38  8 39 36
5  5    6   1 39 NA 53 34
Using apply with a helper function f:
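# All distinct observed values across columns a-d (columns 4 to 7), NA included
unique_n <- unique(as.numeric(unlist(input[, 4:7])))
# For one row, flag which of the distinct values occur in its observation columns
f <- function(input_1, unique_n) {
  as.numeric(unique_n %in% input_1[4:7])
}
# apply() returns one column per input row, so transpose to get one row per ID
count_n <- t(apply(input, 1, f, unique_n = unique_n))
colnames(count_n) <- unique_n
cbind(ID = input[, 1], count_n)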
     ID 33 34 35 38 39 41 45 8 <NA> 53 32 65 36
[1,]  1  1  0  0  1  0  0  0 0    0  1  1  0  0
[2,]  2  1  1  1  0  0  1  0 0    0  0  0  0  0
[3,]  3  0  0  1  1  0  0  1 0    0  0  0  1  0
[4,]  4  0  0  0  1  1  0  0 1    0  0  0  0  1
[5,]  5  0  1  0  0  1  0  0 0    1  1  0  0  0
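The output above keeps the values in order of first appearance and includes an <NA> column. If sorted columns without NA are preferred, one possible variation is sketched below. It relies on sort() dropping NA by default; the names obs, values, presence and result are hypothetical:

```r
# Same input as above
input <- data.frame(
  ID = 1:5, area = 6L, ERC = 1L,
  a = c(33, 34, 35, 38, 39),
  b = c(38, 41, 45, 8, NA),
  c = c(53, 35, 38, 39, 53),
  d = c(32, 33, 65, 36, 34)
)

# Observation columns as a numeric matrix (columns 4 to 7, as in the answer)
obs <- as.matrix(input[, 4:7])

# Distinct observed values in increasing order; sort() removes NA by default
values <- sort(unique(as.vector(obs)))

# One column per value: 1 if the value occurs anywhere in the row, else 0
presence <- sapply(values, function(v) {
  as.integer(rowSums(obs == v, na.rm = TRUE) > 0)
})
colnames(presence) <- values

result <- cbind(ID = input$ID, presence)
result
```

This trades the row-wise apply for one comparison per distinct value, which may or may not be faster on a large table depending on how many distinct values there are.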