data.table to long: unlist one list column while repeating another list column

Asked: 2015-08-13 18:42:19

Tags: r data.table

I have a data.table in which two columns are lists of vectors.

   x     y        z
1: 1 1,2,3  8, 9,10
2: 2   5,6        3
3: 3 18,19      1,2

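I want to stretch the table to long form by unlisting one of the list columns (`z`), while keeping the other list column (`y`) and repeating it accordingly. I saw how to almost do this, but I get an error, as shown below: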

library(data.table)

dat <- data.frame(
    x = 1:3,
    stringsAsFactors = FALSE
)

dat[['y']] <- list(1:3, 5:6, 18:19)
dat[['z']] <- list(8:10, 3, 1:2)

setDT(dat)

# inefficient way to get what I want
# (stored in `long` so that `dat` stays unchanged for the attempts below)
a <- unlist(dat[['z']])
long <- dat[rep(seq_len(nrow(dat)), lengths(z)), ]
long[['z']] <- a

long

   x     y  z
1: 1 1,2,3  8
2: 1 1,2,3  9
3: 1 1,2,3 10
4: 2   5,6  3
5: 3 18,19  1
6: 3 18,19  2

# trying to do this the data.table way
# Works but dropped column
dat[, .(z = as.integer(unlist(z))), by = .(x)]

# does not work (gives error)
dat[, .(z = as.integer(unlist(z))), by = .(x, y)]

Error in `[.data.table`(dat, , .(z = as.integer(unlist(z))), by = .(x,  : 
  column or expression 2 of 'by' or 'keyby' is type list. Do not quote column names. Usage: DT[,sum(colC),by=list(colA,month(colB))]

1 answer:

Answer 0 (score: 6)

Just add the y column to the j-expression:
dat[, .(y, z = as.integer(unlist(z))), by = x]
#   x     y  z
#1: 1 1,2,3  8
#2: 1 1,2,3  9
#3: 1 1,2,3 10
#4: 2   5,6  3
#5: 3 18,19  1
#6: 3 18,19  2
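
The call above works because `x` happens to identify each row uniquely, so every group contains exactly one `y` vector that data.table can recycle against the unlisted `z` values. As a minimal sketch for the case where that assumption may not hold, the grouping can be done on a temporary row index instead. The names `row_id` and `long` are made up for this illustration and are not part of the original question or answer.

```r
library(data.table)

# Rebuild the example data exactly as in the question
dat <- data.frame(x = 1:3, stringsAsFactors = FALSE)
dat[['y']] <- list(1:3, 5:6, 18:19)
dat[['z']] <- list(8:10, 3, 1:2)
setDT(dat)

# Group by a temporary row index (hypothetical name: row_id) so that each
# group is exactly one original row, even if x contained duplicates.
# Within a group, x and y have length 1 and are recycled to the length of
# the unlisted z vector.
long <- dat[, .(x, y, z = as.integer(unlist(z))),
            by = .(row_id = seq_len(nrow(dat)))]

# Drop the helper column once the table has been reshaped
long[, row_id := NULL]

print(long)
```

Grouping on a row index keeps the same recycling behaviour as `by = x` without depending on the values stored in `x`.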