Say I have this data frame:
id time
1 A 1
2 D 1
3 E 3
4 H 1
5 I 4
6 J 3
7 L 4
8 M 5
9 N 6
10 O 5
11 P 6
12 Q 7
13 R 7
14 S 2
15 T 6
16 U 8
17 V 4
18 W 2
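I want to convert it to a binary matrix with 8 rows and 18 columns (the number of IDs in the data frame). The matrix should start as all zeros. The value in 'time' is the first row in which a '1' can appear in each column (the order of the letters gives the column number, so in this case A = 1, D = 2, H = 4, etc.). Once a 1 appears in a column, it should be filled down automatically to row 8.
I came up with this clumsy code, but it involves a loop, and I have to think I'm missing a more elegant solution.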
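tmp1 <- unlist(tmp$time)                              # first row that gets a 1, per id
out <- matrix(0, nrow(tmp), 8)                        # 18 x 8 zeros, one row per id
for(i in seq_len(nrow(tmp))){ out[i,tmp1[i]] <- 1 }   # mark the starting position
out <- apply(out,1,cumsum)                            # fill forward; result is 8 x 18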
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18]
[1,] 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
[2,] 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1
[3,] 1 1 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1
[4,] 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 1 1
[5,] 1 1 1 1 1 1 1 1 0 1 0 0 0 1 0 0 1 1
[6,] 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 1
[7,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
[8,] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Answer 0 (score: 2)
Your data:
tmp <- data.frame(id = c("A", "D", "E", "H", "I", "J", "L", "M", "N", "O", "P", "Q", "R", "S", "T", "U", "V", "W"),
time = c(1L, 1L, 3L, 1L, 4L, 3L, 4L, 5L, 6L, 5L, 6L, 7L, 7L, 2L, 6L, 8L, 4L, 2L),
stringsAsFactors = FALSE)
Slightly simpler:
out2 <- sapply(tmp$time, function(i) c(rep(0, i-1), rep(1,8-i+1)))
This gives the same output as yours (and is a bit faster).
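As a further loop-free variant (not part of the original answer), the same matrix can probably be built with a single `outer()` comparison: entry [r, j] is 1 exactly when row r is at or past the start row of column j. This is only a sketch; the names `n_rows` and `out3` are introduced here for illustration.

```r
# Data from the question.
tmp <- data.frame(
  id = c("A", "D", "E", "H", "I", "J", "L", "M", "N",
         "O", "P", "Q", "R", "S", "T", "U", "V", "W"),
  time = c(1L, 1L, 3L, 1L, 4L, 3L, 4L, 5L, 6L,
           5L, 6L, 7L, 7L, 2L, 6L, 8L, 4L, 2L),
  stringsAsFactors = FALSE
)

n_rows <- 8L  # number of rows requested in the question (illustrative name)

# Compare every row index with every start row: TRUE where row >= start.
out3 <- outer(seq_len(n_rows), tmp$time, ">=")
out3 <- out3 * 1  # logical -> numeric 0/1; the 8 x 18 dimensions are kept

# Cross-check against the sapply() version from this answer.
out2 <- sapply(tmp$time, function(i) c(rep(0, i - 1), rep(1, n_rows - i + 1)))
all.equal(out2, out3)
```

The comparison form avoids building each column with `rep()`, which may be easier to read when the number of rows changes.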
Answer 1 (score: 1)
Here is one approach using mtabulate from my qdapTools package (dat is the data frame from the question):
library(qdapTools)
t(mtabulate(lapply(split(dat$time, dat$id), `:`, length(unique(dat$time)))))
## A D E H I J L M N O P Q R S T U V W
## 1 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0
## 2 1 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 1
## 3 1 1 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 1
## 4 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 1 1
## 5 1 1 1 1 1 1 1 1 0 1 0 0 0 1 0 0 1 1
## 6 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 0 1 1
## 7 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 1 1
## 8 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
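Note that the call above takes the last row from `length(unique(dat$time))`, which is 8 here only because every value from 1 to 8 occurs in `time`. For comparison, here is a minimal base-R sketch that produces the same labelled layout without qdapTools. It assumes the number of rows should be `max(dat$time)`; the names `n_rows` and `m` are illustrative and not from the original answer.

```r
# Data from the question, named dat as in this answer.
dat <- data.frame(
  id = c("A", "D", "E", "H", "I", "J", "L", "M", "N",
         "O", "P", "Q", "R", "S", "T", "U", "V", "W"),
  time = c(1L, 1L, 3L, 1L, 4L, 3L, 4L, 5L, 6L,
           5L, 6L, 7L, 7L, 2L, 6L, 8L, 4L, 2L),
  stringsAsFactors = FALSE
)

n_rows <- max(dat$time)  # assumption: the matrix ends at the largest start row

# One column per id: 0 before the start row, 1 from the start row onward.
m <- vapply(
  dat$time,
  function(s) as.integer(seq_len(n_rows) >= s),
  integer(n_rows)
)

# Label rows 1..n_rows and columns by id, as in the output above.
dimnames(m) <- list(as.character(seq_len(n_rows)), dat$id)
m
```

With the labels in place, a single cell can be looked up by name, for example `m["7", "U"]`.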