Calculating date differences in R

Time: 2016-07-25 18:15:50

Tags: r date time

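I have oddly formatted date and time data and need to calculate the difference between the two columns in R. Any help would be greatly appreciated. Thanks.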

TimeStart           TimeEnd
May  1 2016  1:00AM May  1 2016  1:28AM
May  1 2016  1:01AM May  1 2016  1:21AM
May  1 2016  1:00PM May  1 2016  1:13PM
May  1 2016  1:00PM May  4 2016  5:42PM
May  1 2016  1:02PM May  1 2016  1:37PM
May  1 2016  1:02PM May  1 2016  1:14PM
May  1 2016  1:02PM May  1 2016  1:39PM
May  1 2016  1:02PM May  1 2016  1:18PM 

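3 answers:

Answer 0 (score: 0)

See ?strptime for how to specify the format of date/time strings. Note that %p (the AM/PM marker) only takes effect together with %I (12-hour clock), not %H.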

library(data.table)
dat <- read.table(text = "May  1 2016  1:00AM May  1 2016  1:28AM
                   May  1 2016  1:01AM May  1 2016  1:21AM
                   May  1 2016  1:00PM May  1 2016  1:13PM
                   May  1 2016  1:00PM May  4 2016  5:42PM
                   May  1 2016  1:02PM May  1 2016  1:37PM
                   May  1 2016  1:02PM May  1 2016  1:14PM
                   May  1 2016  1:02PM May  1 2016  1:39PM
                   May  1 2016  1:02PM May  1 2016  1:18PM")

dat2 <- setDT(dat)[ , list(start = paste(V1, V2, V3, V4),
                           end = paste(V5, V6, V7, V8))]
# %I (12-hour clock) is needed for %p to be honoured; with %H the PM times stay at 01:xx
dat2[ , c("start", "end") := lapply(.SD, as.POSIXct, format = "%b %d %Y %I:%M%p")]
dat2[ , diff := end - start]
dat2
#                  start                 end      diff
# 1: 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
# 2: 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
# 3: 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
# 4: 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
# 5: 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
# 6: 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
# 7: 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
# 8: 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins
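
As a quick check of the format string, here is a minimal base R sketch. ?strptime documents %p as used in conjunction with %I and not with %H, so the 12-hour field is what makes the PM times land in the afternoon. The helper name parse_12h and tz = "UTC" are my own choices for illustration, and the code assumes an English locale for the month name and AM/PM marker.

```r
# Hypothetical helper: parse strings such as "May 1 2016 1:00PM".
# tz = "UTC" is an assumption so the result does not depend on the local zone.
# %b and %p are locale-dependent; an English locale is assumed.
parse_12h <- function(x) {
  as.POSIXct(x, format = "%b %d %Y %I:%M%p", tz = "UTC")
}

start <- parse_12h("May 1 2016 1:00PM")
end   <- parse_12h("May 4 2016 5:42PM")

format(start, "%H:%M")                 # hour on the 24-hour clock: "13:00"
difftime(end, start, units = "mins")   # elapsed time in minutes
```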

Answer 1 (score: 0)

In dplyr,

library(dplyr)

df %>% 
    # parse datetimes
    mutate_all(as.POSIXct, format = '%b %d %Y %I:%M%p') %>% 
    # add column with time difference
    mutate(elapsed = TimeEnd - TimeStart)

##              TimeStart             TimeEnd   elapsed
## 1 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
## 2 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
## 3 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
## 4 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
## 5 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
## 6 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
## 7 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
## 8 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins

or equivalently in base R,

df$TimeStart <- as.POSIXct(df$TimeStart, format = '%b %d %Y %I:%M%p')
df$TimeEnd <- as.POSIXct(df$TimeEnd, format = '%b %d %Y %I:%M%p')
df$elapsed <- df$TimeEnd - df$TimeStart

df
##              TimeStart             TimeEnd   elapsed
## 1 2016-05-01 01:00:00 2016-05-01 01:28:00   28 mins
## 2 2016-05-01 01:01:00 2016-05-01 01:21:00   20 mins
## 3 2016-05-01 13:00:00 2016-05-01 13:13:00   13 mins
## 4 2016-05-01 13:00:00 2016-05-04 17:42:00 4602 mins
## 5 2016-05-01 13:02:00 2016-05-01 13:37:00   35 mins
## 6 2016-05-01 13:02:00 2016-05-01 13:14:00   12 mins
## 7 2016-05-01 13:02:00 2016-05-01 13:39:00   37 mins
## 8 2016-05-01 13:02:00 2016-05-01 13:18:00   16 mins

Data

df <- structure(list(TimeStart = c("May 1 2016 1:00AM", "May 1 2016 1:01AM", 
    "May 1 2016 1:00PM", "May 1 2016 1:00PM", "May 1 2016 1:02PM", 
    "May 1 2016 1:02PM", "May 1 2016 1:02PM", "May 1 2016 1:02PM"
    ), TimeEnd = c("May 1 2016 1:28AM", "May 1 2016 1:21AM", "May 1 2016 1:13PM", 
    "May 4 2016 5:42PM", "May 1 2016 1:37PM", "May 1 2016 1:14PM", 
    "May 1 2016 1:39PM", "May 1 2016 1:18PM")), class = "data.frame", row.names = c(NA, 
    -8L), .Names = c("TimeStart", "TimeEnd"))
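
The elapsed column in both versions is a difftime, and R picks its units from the data (minutes here). If fixed units are wanted, one option is difftime() with an explicit units argument. A minimal sketch using two of the rows above; the variable names and tz = "UTC" are assumptions of mine, not part of the answer:

```r
fmt <- "%b %d %Y %I:%M%p"

# Two rows from the question; tz = "UTC" is an assumption for reproducibility
start <- as.POSIXct(c("May 1 2016 1:00AM", "May 1 2016 1:00PM"), format = fmt, tz = "UTC")
end   <- as.POSIXct(c("May 1 2016 1:28AM", "May 4 2016 5:42PM"), format = fmt, tz = "UTC")

# Ask for hours explicitly instead of letting R choose the units
elapsed_h <- difftime(end, start, units = "hours")

# Plain numbers in minutes, convenient for further arithmetic
elapsed_min <- as.numeric(difftime(end, start, units = "mins"))
elapsed_min   # 28 4602
```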

Answer 2 (score: 0)

I prefer lubridate for this sort of thing. It is a simple package that parses date-times through a consistently named family of functions.

library(lubridate)

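First, use mdy_hm to parse the date strings. Because apply() returns a matrix, the parsed values are stored as numeric seconds rather than as date-time objects.

df2 <- apply(df, 2, mdy_hm)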

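Then compute the length of each duration in seconds. When there are enough seconds, the printed value also tells you roughly how many minutes that is.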

dseconds(df2[,2]-df2[,1])

The result looks like this:

[1] "1680s (~28 minutes)"     "1200s (~20 minutes)"    
[3] "780s (~13 minutes)"      "276120s (~4602 minutes)"
[5] "2100s (~35 minutes)"     "720s (~12 minutes)"     
[7] "2220s (~37 minutes)"     "960s (~16 minutes)"
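
If you would rather keep real date-time vectors than the numeric matrix that apply() produces, one variation is to parse each column separately. This is only a sketch on two of the rows: the names start, end and elapsed are mine, and it assumes mdy_hm handles the AM/PM suffix as it does in the answer above.

```r
library(lubridate)

# Parse each column on its own so the POSIXct class is preserved
# (mdy_hm uses UTC unless a tz argument is given)
start <- mdy_hm(c("May 1 2016 1:00AM", "May 1 2016 1:00PM"))
end   <- mdy_hm(c("May 1 2016 1:28AM", "May 4 2016 5:42PM"))

# Elapsed time as a lubridate Duration (stored in seconds)
elapsed <- as.duration(end - start)

# The same Duration expressed as a plain number of minutes
elapsed_min <- as.numeric(elapsed, "minutes")
elapsed_min
```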