How do I update all columns of an R data.table row by row?

Time: 2015-06-22 02:25:40

Tags: r data.table

I am trying to create a data.table object by taking pieces from other data.tables and combining them. Here is a simple example:

a <- data.frame(x=1:30)
b <- data.frame(x=10:39)
c <- data.frame(x=20:49)

d <- data.frame(x=50:79)
e <- data.frame(x=60:89)
f <- data.frame(x=70:99)

DT <- data.table(matrix(ncol = 3, nrow = 30))
for (i in seq.int(from = 1, to = 30, by = 3)) {
  set(DT,i,.SD,cbind(a[i,],b[i,],c[i,]))
  set(DT,(i+1),.SD,cbind(d[i,],e[i,],f[i,]))
  set(DT,(i+2),.SD,"")
}

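However, this does not work. What am I doing wrong? Can anyone recommend a better way to achieve this? Doing it this way in R always makes me a little uneasy.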

The desired output should look like this (first few rows shown):

     x  x  x
 1:  1 10 20
 2: 50 60 70
 3:   
 4:  2 11 21
 5: 51 61 71
 6:  
 7:  3 12 22
 8: 52 62 72
 9:         
10:  4 13 23

3 Answers:

Answer 0: (score: 3)

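First, you have conflicting classes, because you are trying to assign both numbers and characters to the same column. I will therefore assign NA instead of "".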
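The class conflict can be seen in isolation with base R. This is only a minimal sketch; the names `mixed` and `with_na` are illustrative and do not come from the question.

```r
# Combining an integer with "" coerces the whole vector to character.
mixed <- c(1L, "")
class(mixed)    # "character"

# NA adopts the type of the other values, so the vector stays integer.
with_na <- c(1L, NA)
class(with_na)  # "integer"
```

Because a column can hold only one type, a numeric column cannot store "" without the column becoming character, whereas NA fits into a column of any type.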

That said, here is a solution using data.table, with a few modifications so that it runs:

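library(data.table)

# Numeric placeholder columns, so the NA filler rows need no type change
DT <- data.table(matrix(0, ncol = 3, nrow = 30))
j <- 1
for (i in seq.int(from = 1, to = 30, by = 3)) {
  # Parentheses around names(DT) make := treat it as a vector of column names
  # (this replaces the deprecated with = FALSE)
  DT[i, (names(DT)) := list(a[j,], b[j,], c[j,])]
  DT[i + 1, (names(DT)) := list(d[j,], e[j,], f[j,])]
  DT[i + 2, (names(DT)) := NA]
  j <- j + 1
}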
DT
    V1 V2 V3
 1:  1 10 20
 2: 50 60 70
 3: NA NA NA
 4:  2 11 21
 5: 51 61 71
 6: NA NA NA
 7:  3 12 22
 8: 52 62 72
 9: NA NA NA
10:  4 13 23
11: 53 63 73
12: NA NA NA
13:  5 14 24
14: 54 64 74
15: NA NA NA
16:  6 15 25
17: 55 65 75
18: NA NA NA
19:  7 16 26
20: 56 66 76
21: NA NA NA
22:  8 17 27
23: 57 67 77
24: NA NA NA
25:  9 18 28
26: 58 68 78
27: NA NA NA
28: 10 19 29
29: 59 69 79
30: NA NA NA
    V1 V2 V3

Another solution using apply (without data.table):

df <- apply(cbind(a,b,c,d,e,f), 1, function(x) rbind(data.frame(x=x[1], y=x[2], z=x[3]), 
                                               data.frame(x=x[4], y=x[5], z=x[6]), 
                                               data.frame(x=NA, y = NA, z = NA)))
df <- do.call("rbind", df)

Answer 1: (score: 3)

Building on @TimBiegeleisen's answer, which was deleted for some reason:

library(data.table)
pt1 <- data.table(a,b,c)
pt2 <- data.table(d,e,f)
out <- rbind(pt1,pt2)
out[c(rbind(matrix(seq(1,nrow(out)),byrow=TRUE,nrow=2),NA))]

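The row index into out above looks like 1 31 NA 2 32 NA 3 33 NA, so it takes the first row of each set of data and puts them together, then the second row of each, and so on. An NA index produces a row in which every value is NA.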

#     x  x  x
# 1:  1 10 20
# 2: 50 60 70
# 3: NA NA NA
# 4:  2 11 21
# 5: 51 61 71
# 6: NA NA NA
# 7:  3 12 22
# 8: 52 62 72
# 9: NA NA NA
#10:  4 13 23
#...
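The index construction in this answer can be tried on its own with a smaller case. The sketch below assumes a hypothetical 3 rows per table (6 stacked rows in total) instead of 30; `n_total` and `idx` are illustrative names.

```r
# Hypothetical small case: 3 rows per table, so 6 rows after stacking.
n_total <- 6

# Row 1 of the matrix holds the indices of the first table (1, 2, 3),
# row 2 those of the second table (4, 5, 6), and row 3 is all NA.
# c() reads the matrix column by column, which interleaves the three rows.
idx <- c(rbind(matrix(seq_len(n_total), byrow = TRUE, nrow = 2), NA))
idx
# [1]  1  4 NA  2  5 NA  3  6 NA
```

Subsetting the stacked table with `idx` then returns the rows in this interleaved order, with an all-NA row wherever the index is NA.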

Answer 2: (score: 2)

There may be more efficient ways:

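library(data.table)

rows2 <- seq.int(1, 30, 3)  # target rows 1, 4, 7, ... for the first group
rows3 <- 1:10               # source rows taken from each input
n2 <- length(rows3)
h1 <- list(a[rows3,], b[rows3,], c[rows3,])
h2 <- list(d[rows3,], e[rows3,], f[rows3,])
# NA rather than "" keeps the filler compatible with the numeric columns
h3 <- list(rep(NA_real_, n2), rep(NA_real_, n2), rep(NA_real_, n2))

DT <- data.table(matrix(0, ncol = 3, nrow = 30))
for (j in 1:3) {
  # set() assigns by reference: rows i of column j receive value
  set(DT, i = rows2, j = j, value = h1[[j]])
  set(DT, i = rows2 + 1, j = j, value = h2[[j]])
  set(DT, i = rows2 + 2, j = j, value = h3[[j]])
}
DT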
    V1 V2 V3
 1:  1 10 20
 2: 50 60 70
 3: NA NA NA
 4:  2 11 21
 5: 51 61 71
 6: NA NA NA
 7:  3 12 22
 8: 52 62 72
 9: NA NA NA
10:  4 13 23
11: 53 63 73
12: NA NA NA
13:  5 14 24
14: 54 64 74
15: NA NA NA
16:  6 15 25
17: 55 65 75
18: NA NA NA
19:  7 16 26
20: 56 66 76
21: NA NA NA
22:  8 17 27
23: 57 67 77
24: NA NA NA
25:  9 18 28
26: 58 68 78
27: NA NA NA
28: 10 19 29
29: 59 69 79
30: NA NA NA
    V1 V2 V3