I am new to R and am having some trouble with calculations and comparisons on dates. Basically, I have two data frames:
df1 <- (2,921 rows)
DateTime
1 2013-06-01 00:00:00
2 2013-06-01 03:00:00
3 2013-06-01 06:00:00
4 2013-06-01 09:00:00
5 2013-06-01 12:00:00
6 2013-06-01 15:00:00
7 2013-06-01 18:00:00
8 2013-06-01 21:00:00
9 2013-06-02 00:00:00
10 2013-06-02 03:00:00
and df2 <- (70,816 rows)
Create.Date.Time Service Closing.Date.Time
1 2013-06-01 12:59:00 AV 2013-06-01 13:59:00
2 2013-06-02 07:56:00 SERVICE684793 2013-06-02 08:59:00
3 2013-06-02 09:39:00 SERVICE684793 2013-06-03 12:01:00
4 2013-06-02 14:14:00 SERVICE684796 2013-06-02 14:55:00
5 2013-06-02 17:20:00 SERVICE684797 2013-06-03 12:06:00
6 2013-06-03 07:20:00 SERVICE684793 2013-06-03 07:39:00
7 2013-06-03 08:02:00 SERVICE684839 2013-06-03 12:09:00
8 2013-06-03 08:04:00 SERVICE684841 2013-06-04 08:05:00
9 2013-06-03 08:04:00 SERVICE684841 2013-06-05 08:06:00
10 2013-06-03 08:08:00 SERVICE684841 2013-06-03 08:08:00
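My task is to get, for each value i in df1$DateTime, the cumulative count of df2$Create.Date.Time. In other words, I want to count the number of instances where df2$Create.Date.Time is less than or equal to each df1$DateTime.
For example, for df1$DateTime = 2013-06-02 18:00:00, the cumulative count of df2$Create.Date.Time would be 5 (five of the values in df2$Create.Date.Time are earlier than 2013-06-02 18:00:00).
I also need to do the same for each Service.
I have tried converting the dates (all of which are of class "POSIXct" "POSIXt") to seconds and then comparing them, but I keep running into strange errors. I would appreciate any help.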
Answer 0 (score: 0)
Try:
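library(lubridate)
# Assumes df1 and df2 already exist (see the structure() calls at the end).
# Split each timestamp into a calendar date and seconds since midnight.
df1New <- within(df1, {
  Createtime <- period_to_seconds(hms(strftime(DateTime, "%H:%M:%S")))
  Date <- as.Date(DateTime)
})
df2New <- within(df2, {
  Createtime1 <- period_to_seconds(hms(strftime(Create.Date.Time, "%H:%M:%S")))
  Date <- as.Date(Create.Date.Time)
})
# For each df1 row, count the df2 rows created on the same date at or before
# that time of day. Counts therefore restart at every midnight.
df1New$Num.Closed <- unsplit(lapply(split(df1New, df1New$Date), function(x) {
  x2 <- df2New[df2New$Date %in% x$Date, ]
  unlist(lapply(seq_len(nrow(x)), function(i) {
    x1 <- x[i, ]
    sum(x2$Createtime1 <= x1$Createtime)
  }))
}), df1New$Date)
# Drop the helper columns, keeping DateTime and the count.
df1New[,-(2:3)]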
# DateTime Num.Closed
#1 2013-06-01 00:00:00 0
#2 2013-06-01 03:00:00 0
#3 2013-06-01 06:00:00 0
#4 2013-06-01 09:00:00 0
#5 2013-06-01 12:00:00 0
#6 2013-06-01 15:00:00 1
#7 2013-06-01 18:00:00 1
#8 2013-06-01 21:00:00 1
#9 2013-06-02 00:00:00 0
#10 2013-06-02 03:00:00 0
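The counts above restart at every calendar day, because each df1 row is compared only with df2 rows that share its date (row 9, 2013-06-02 00:00:00, shows 0 even though one record was created on 2013-06-01). If a running total across all dates is what is wanted, one possible sketch is to compare the full timestamps directly with findInterval. It assumes the columns parse cleanly with as.POSIXct in UTC; the names created, grid and Num.Created are illustrative and are not part of the answer above.

```r
# Illustrative data: a subset of the question's timestamps, parsed as UTC (assumption).
created <- as.POSIXct(c("2013-06-01 12:59:00", "2013-06-02 07:56:00",
                        "2013-06-02 09:39:00", "2013-06-02 14:14:00",
                        "2013-06-02 17:20:00", "2013-06-03 07:20:00"),
                      tz = "UTC")
grid <- data.frame(DateTime = as.POSIXct(c("2013-06-01 12:00:00",
                                           "2013-06-01 15:00:00",
                                           "2013-06-02 00:00:00",
                                           "2013-06-02 18:00:00"),
                                         tz = "UTC"))

# findInterval(x, v) returns, for each x, how many elements of the sorted
# vector v are <= x, which is the cumulative count asked for.
created_num <- sort(as.numeric(created))
grid$Num.Created <- findInterval(as.numeric(grid$DateTime), created_num)

# Cross-check with a direct (slower) comparison, one grid time at a time.
check <- vapply(as.numeric(grid$DateTime),
                function(t) sum(created_num <= t),
                integer(1))
stopifnot(identical(check, grid$Num.Created))

print(grid)
```

With this subset, the 2013-06-02 18:00:00 row gets a count of 5, matching the example in the question. Because findInterval works on a sorted vector, it should scale to the full 70,816 rows more comfortably than a row-by-row loop.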
df1 <- structure(list(DateTime = c("2013-06-01 00:00:00", "2013-06-01 03:00:00",
"2013-06-01 06:00:00", "2013-06-01 09:00:00", "2013-06-01 12:00:00",
"2013-06-01 15:00:00", "2013-06-01 18:00:00", "2013-06-01 21:00:00",
"2013-06-02 00:00:00", "2013-06-02 03:00:00")), .Names = "DateTime", class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10"))
df2 <- structure(list(Create.Date.Time = c("2013-06-01 12:59:00", "2013-06-02 07:56:00",
"2013-06-02 09:39:00", "2013-06-02 14:14:00", "2013-06-02 17:20:00",
"2013-06-03 07:20:00", "2013-06-03 08:02:00", "2013-06-03 08:04:00",
"2013-06-03 08:04:00", "2013-06-03 08:08:00"), Service = c("AV",
"SERVICE684793", "SERVICE684793", "SERVICE684796", "SERVICE684797",
"SERVICE684793", "SERVICE684839", "SERVICE684841", "SERVICE684841",
"SERVICE684841"), Closing.Date.Time = c("2013-06-01 13:59:00",
"2013-06-02 08:59:00", "2013-06-03 12:01:00", "2013-06-02 14:55:00",
"2013-06-03 12:06:00", "2013-06-03 07:39:00", "2013-06-03 12:09:00",
"2013-06-04 08:05:00", "2013-06-05 08:06:00", "2013-06-03 08:08:00"
)), .Names = c("Create.Date.Time", "Service", "Closing.Date.Time"
), class = "data.frame", row.names = c("1", "2", "3", "4", "5",
"6", "7", "8", "9", "10"))
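For the per-service variant mentioned in the question, one possible sketch is to split the creation times by Service and apply the same counting to each group. It assumes POSIXct timestamps in UTC; df2_sub, grid_times, count_leq and per_service are hypothetical names introduced only for this illustration.

```r
# Small illustrative subset of df2, with timestamps parsed as UTC (assumption).
df2_sub <- data.frame(
  Create.Date.Time = as.POSIXct(c("2013-06-01 12:59:00", "2013-06-02 07:56:00",
                                  "2013-06-02 09:39:00", "2013-06-02 14:14:00",
                                  "2013-06-03 07:20:00"), tz = "UTC"),
  Service = c("AV", "SERVICE684793", "SERVICE684793",
              "SERVICE684796", "SERVICE684793"),
  stringsAsFactors = FALSE
)
grid_times <- as.POSIXct(c("2013-06-01 18:00:00", "2013-06-02 18:00:00",
                           "2013-06-03 18:00:00"), tz = "UTC")

# Hypothetical helper: number of `times` that are <= each element of `at`.
count_leq <- function(times, at) {
  findInterval(as.numeric(at), sort(as.numeric(times)))
}

# One column of cumulative counts per service, one row per grid time.
counts <- sapply(split(df2_sub$Create.Date.Time, df2_sub$Service),
                 count_leq, at = grid_times)
per_service <- data.frame(DateTime = grid_times, counts, check.names = FALSE)

print(per_service)
```

Each service column is a running count of records created at or before the corresponding DateTime, so summing across the service columns gives the overall cumulative count for that row.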