I have a data.table:
library(data.table)
mydata = data.table(MyTimes = as.POSIXct(c("2015-01-01 00:00:03", "2015-01-01 00:00:04", "2015-01-01 00:00:18", "2015-01-01 00:00:48",
                                           "2015-01-01 00:00:48", "2015-01-01 00:00:54", "2015-01-01 00:01:12", "2015-01-01 00:01:45"), tz = "GMT"),
                    othercol = c(1, 2, 3, 4, 5, 6, 7, 8))
mydata
               MyTimes othercol
1: 2015-01-01 00:00:03        1
2: 2015-01-01 00:00:04        2
3: 2015-01-01 00:00:18        3
4: 2015-01-01 00:00:48        4
5: 2015-01-01 00:00:48        5
6: 2015-01-01 00:00:54        6
7: 2015-01-01 00:01:12        7
8: 2015-01-01 00:01:45        8
The data is sorted by time, and I want to split this data frame into 2 data frames, subject to two conditions:
There are 8 rows in this example, and I want to break it in the middle, 4 rows each. Notice, though, that 00:00:48 would then end up in both data frames, which point #2 above does not allow. In other words, a break must never fall between rows that share the same second.
So the data frames here could be
data frame 1:
            MyTimes othercol
2015-01-01 00:00:03        1
2015-01-01 00:00:04        2
2015-01-01 00:00:18        3
2015-01-01 00:00:48        4
2015-01-01 00:00:48        5
data frame 2:
2015-01-01 00:00:54        6
2015-01-01 00:01:12        7
2015-01-01 00:01:45        8
Or they could be like this:
data frame 1:
2015-01-01 00:00:03        1
2015-01-01 00:00:04        2
2015-01-01 00:00:18        3
data frame 2:
2015-01-01 00:00:48        4
2015-01-01 00:00:48        5
2015-01-01 00:00:54        6
2015-01-01 00:01:12        7
2015-01-01 00:01:45        8
Either way, both 00:00:48 rows end up in the same data frame.
Answer 0 (score: 1)
How about this?
split(mydata, as.numeric(mydata$MyTimes) < median(as.numeric(mydata$MyTimes)))
$`FALSE`
               MyTimes othercol
1: 2015-01-01 00:00:48        4
2: 2015-01-01 00:00:48        5
3: 2015-01-01 00:00:54        6
4: 2015-01-01 00:01:12        7
5: 2015-01-01 00:01:45        8
$`TRUE`
               MyTimes othercol
1: 2015-01-01 00:00:03        1
2: 2015-01-01 00:00:04        2
3: 2015-01-01 00:00:18        3
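The reason this keeps equal timestamps together is that the comparison is made on the timestamp value itself, so rows with the same time always get the same TRUE/FALSE flag. A minimal check of that, assuming the example data with othercol running from 1 to 8 and using a plain data.frame (my choice here, so the snippet runs without data.table):

```r
# Example data from the question, rebuilt as a plain data.frame
# (an assumption made here so the check needs no extra packages)
mydata <- data.frame(
  MyTimes = as.POSIXct(c("2015-01-01 00:00:03", "2015-01-01 00:00:04",
                         "2015-01-01 00:00:18", "2015-01-01 00:00:48",
                         "2015-01-01 00:00:48", "2015-01-01 00:00:54",
                         "2015-01-01 00:01:12", "2015-01-01 00:01:45"),
                       tz = "GMT"),
  othercol = 1:8
)

# Same split as above: rows strictly before the median time vs. the rest
secs  <- as.numeric(mydata$MyTimes)
parts <- split(mydata, secs < median(secs))

# Timestamps that occur in both pieces; this should be empty
shared <- intersect(as.numeric(parts[["TRUE"]]$MyTimes),
                    as.numeric(parts[["FALSE"]]$MyTimes))
length(shared) == 0
```

One caveat: the two pieces are only as balanced as the data allows. If many rows share the median timestamp, they all land in the FALSE piece.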
Answer 1 (score: 0)
Not as elegant as @DatamineR's solution, but an alternative using run-length encoding is
library(data.table)
mydata[, grp := rleid(MyTimes)] ## put times into groups
split(mydata, mydata$grp >= ceiling(max(mydata$grp)/2))
$`FALSE`
               MyTimes othercol grp
1: 2015-01-01 00:00:03        1   1
2: 2015-01-01 00:00:04        2   2
3: 2015-01-01 00:00:18        3   3
$`TRUE`
               MyTimes othercol grp
1: 2015-01-01 00:00:48        4   4
2: 2015-01-01 00:00:48        5   4
3: 2015-01-01 00:00:54        6   5
4: 2015-01-01 00:01:12        7   6
5: 2015-01-01 00:01:45        8   7
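Note that the split above is at the middle group, not the middle row, so it can drift away from an even row count when some timestamps repeat many times. If the goal is to get as close as possible to half the rows, one possible variation is to cut at the run boundary nearest to nrow/2. The sketch below is not from either answer: `split_near_middle` is a hypothetical helper name, it uses base R's `rle()` instead of `rleid()`, and it works on a plain data.frame copy of the example data (othercol assumed to run from 1 to 8).

```r
# Example data from the question, rebuilt as a plain data.frame
mydata <- data.frame(
  MyTimes = as.POSIXct(c("2015-01-01 00:00:03", "2015-01-01 00:00:04",
                         "2015-01-01 00:00:18", "2015-01-01 00:00:48",
                         "2015-01-01 00:00:48", "2015-01-01 00:00:54",
                         "2015-01-01 00:01:12", "2015-01-01 00:01:45"),
                       tz = "GMT"),
  othercol = 1:8
)

# Hypothetical helper: cut a time-sorted data.frame after the run of equal
# timestamps whose last row is closest to the halfway point.
split_near_middle <- function(df, col = "MyTimes") {
  runs <- rle(as.numeric(df[[col]]))   # lengths of runs of identical times
  ends <- cumsum(runs$lengths)         # row index at which each run ends
  candidates <- ends[-length(ends)]    # cutting after the last run is pointless
  if (length(candidates) == 0) {
    # Every row has the same timestamp: nothing can be split off
    return(list(first = df, second = df[0, , drop = FALSE]))
  }
  cut <- candidates[which.min(abs(candidates - nrow(df) / 2))]
  list(first  = df[seq_len(cut), , drop = FALSE],
       second = df[-seq_len(cut), , drop = FALSE])
}

parts <- split_near_middle(mydata)
nrow(parts$first)   # rows in the first piece
nrow(parts$second)  # rows in the second piece
```

For the example data the candidate cuts after row 3 and after row 5 are equally far from the middle; `which.min()` takes the first, so this gives the same 3-row and 5-row pieces as above.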