I have a data frame like this:
thedata <- data.frame(group= c(0,0,0,0,0,1,1,1,1,1)
,experiment = c(0,0,1,1,1,0,0,1,1,1)
,time = c(1,2,3,4,5,1,2,3,4,5))
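I want to split this data frame into a list of data frames covering cumulative spans of time, so that the first data frame in the list would look like this: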
$`1`
group experiment time
0 0 1
0 0 2
0 1 3
1 0 1
1 0 2
1 1 3
The second data frame in the list:
$`2`
group experiment time
0 0 1
0 0 2
0 1 3
0 1 4
1 0 1
1 0 2
1 1 3
1 1 4
The third data frame:
$`3`
group experiment time
0 0 1
0 0 2
0 1 3
0 1 4
0 1 5
1 0 1
1 0 2
1 1 3
1 1 4
1 1 5
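As shown above, the 'splitting' only starts once experiment = 1; all rows with experiment = 0 appear in every data frame.
The goal is to run regressions over this list (using different but similarly structured data).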
Answer 0: (score: 0)
Here is a rather ugly solution:
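library(data.table)
setDT(thedata) # convert to data.table (by reference)
thedata[, split := cumsum(experiment), by = group] # running count of experiment rows within each group
thelist = split(thedata, thedata$split) # one piece per split value; piece "0" holds the pre-experiment rows
newlist = list(rbind(thelist[[1]], thelist[[2]])) # pre-experiment rows plus the first experiment period
# seq_len() yields an empty sequence when there is only one experiment period,
# whereas 2:(length(thelist) - 1) would count backwards in that case
for(i in seq_len(length(thelist) - 2) + 1){
# append each new experiment period to the previous cumulative data frame
newlist[[i]] = rbind(newlist[[i-1]], thelist[[i+1]])
}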
Result:
newlist
[[1]]
group experiment time split
1: 0 0 1 0
2: 0 0 2 0
3: 1 0 1 0
4: 1 0 2 0
5: 0 1 3 1
6: 1 1 3 1
[[2]]
group experiment time split
1: 0 0 1 0
2: 0 0 2 0
3: 1 0 1 0
4: 1 0 2 0
5: 0 1 3 1
6: 1 1 3 1
7: 0 1 4 2
8: 1 1 4 2
[[3]]
group experiment time split
1: 0 0 1 0
2: 0 0 2 0
3: 1 0 1 0
4: 1 0 2 0
5: 0 1 3 1
6: 1 1 3 1
7: 0 1 4 2
8: 1 1 4 2
9: 0 1 5 3
10: 1 1 5 3
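For comparison, here is a minimal base R sketch of the same idea without the loop. It is not the answer's method: it assumes the experiment periods fall on the same time values in every group (as they do in the example data), and the names cutoffs and expanding are made up for illustration. Rows stay in their original order, so each element is sorted by group as in the question's desired output.

```r
# Same example data as in the question
thedata <- data.frame(group      = c(0, 0, 0, 0, 0, 1, 1, 1, 1, 1),
                      experiment = c(0, 0, 1, 1, 1, 0, 0, 1, 1, 1),
                      time       = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5))

# Each distinct time at which experiment == 1 defines one cumulative window
# (assumption: these times are shared across groups)
cutoffs <- sort(unique(thedata$time[thedata$experiment == 1]))

# For each cutoff, keep every row up to and including that time
expanding <- lapply(cutoffs, function(t_max) thedata[thedata$time <= t_max, ])
names(expanding) <- seq_along(expanding)
```

Since expanding is an ordinary list of data frames, a model-fitting function can then be applied to each element with lapply().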