I have a question similar to "Subsetting a data.table using another data.table" and "Subset a data.table by matching columns of another data.table". My dt is the same as in those questions.
dt
   id year event
1:  2 2005     1
2:  2 2006     1
3:  2 2007     1
4:  4 2008     1
5:  4 2009     1
6:  2 2005     0
7:  4 2006     0
8:  4 2007     0
9:  2 2008     0
library(data.table)
dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009,2005:2008),
                 event = rep(1:0, times=c(5, 4)))
However, dt1 is a little different:
dt1
   year performance event
1: 2005        1000     1
2: 2006        1001     1
3: 2007        1002     1
4: 2008        1003     1
5: 2009        1004     1
6: 2005        1005     0
7: 2006        1006     0
8: 2007        1007     0
9: 2008        1008     0
dt1 <- data.table(year = c(2005:2009,2005:2008), performance = 1000:1008,
event = rep(1:0, times=c(5, 4)))
I would like to subset dt1 according to the year and event in dt, grouped by id. The desired output would be two different data.tables:
dt1.sub1
   year performance event
1: 2005        1000     1
2: 2006        1001     1
3: 2007        1002     1
4: 2005        1005     0
5: 2008        1008     0
dt1.sub2
   year performance event
1: 2008        1003     1
2: 2009        1004     1
3: 2006        1006     0
4: 2007        1007     0
Is there a way to achieve this without using merge?
Answer 0 (score: 2)
We can use split to create a list of data.tables.
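# split() pairs rows by position, so dt and dt1 must be in the same row order
lst <- split(dt1, dt$id)
# rename the elements from "2" and "4" to dt1.sub1 and dt1.sub2
names(lst) <- paste0('dt1.sub', seq_along(lst))
lst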
#$dt1.sub1
#   year performance event
#1: 2005        1000     1
#2: 2006        1001     1
#3: 2007        1002     1
#4: 2005        1005     0
#5: 2008        1008     0
#
#$dt1.sub2
#   year performance event
#1: 2008        1003     1
#2: 2009        1004     1
#3: 2006        1006     0
#4: 2007        1007     0
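Note that `split(dt1, dt$id)` assigns row i of dt1 the id found in row i of dt, so it depends on the two tables being row-aligned. The sketch below rebuilds the example data and checks that alignment before splitting. The `stopifnot` guard is an addition for illustration and is not part of the original answer.

```r
library(data.table)

dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009, 2005:2008),
                 event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008), performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

# A positional split is only valid if row i of dt describes row i of dt1
stopifnot(nrow(dt) == nrow(dt1),
          all(dt$year == dt1$year),
          all(dt$event == dt1$event))

lst <- split(dt1, dt$id)
names(lst) <- paste0("dt1.sub", seq_along(lst))
```

If the guard fails, the tables are not in the same order and a join on year and event is the safer route.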
It is better to work within the list. However, if you really need separate data.table objects, you can create them with list2env:
list2env(lst, envir = .GlobalEnv)
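As an illustration of working within the list instead of creating separate objects, here is a minimal sketch that applies one summary to every subset and then stacks the pieces back together. The names `means` and `stacked`, and the choice of the mean as the summary, are assumptions made for this example only.

```r
library(data.table)

dt <- data.table(id = c(2,2,2,4,4,2,4,4,2), year = c(2005:2009, 2005:2008),
                 event = rep(1:0, times = c(5, 4)))
dt1 <- data.table(year = c(2005:2009, 2005:2008), performance = 1000:1008,
                  event = rep(1:0, times = c(5, 4)))

lst <- split(dt1, dt$id)
names(lst) <- paste0("dt1.sub", seq_along(lst))

# Apply the same summary to every subset without creating global objects
means <- sapply(lst, function(d) mean(d$performance))

# Stack the pieces back together, keeping the list name as a column
stacked <- rbindlist(lst, idcol = "subset")
```

Keeping the subsets in one list means the same code works unchanged if dt later contains more than two ids.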
Answer 1 (score: 2)
# join dt1 to dt on year and event to pick up id, then collect each id's rows in a list
dt[dt1, on = c('year', 'event')][, .(list(.SD)), by = id]$V1
#[[1]]
#   year event performance
#1: 2005     1        1000
#2: 2006     1        1001
#3: 2007     1        1002
#4: 2005     0        1005
#5: 2008     0        1008
#
#[[2]]
#   year event performance
#1: 2008     1        1003
#2: 2009     1        1004
#3: 2006     0        1006
#4: 2007     0        1007