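Average time difference between two dates per group

Time: 2017-01-26 21:49:43

Tags: r date aggregate summary lubridate

I have a data frame of users and their visit dates. For each person, I am trying to find the average time difference between consecutive visits. The output should be in days, or fractions of a day.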

require(lubridate)
so <- data.frame(visit_dates = c("12/4/2016","12/6/2016","12/7/2016","12/3/2016","12/7/2016","12/10/2016"), person = c("1","1","1","2","2","2"))


# Parse the month/day/year strings into Date objects
so$visit_dates <- mdy(as.character(so$visit_dates))

The output should look like:

person    avgTimeBetweenVisit
1                 1.5
2                 3.5

2 answers:

Answer 0 (score: 1)

How about this:

{{1}}

This link helped me solve the "diff" part: diff operation within a group, after a dplyr::group_by()
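A minimal sketch of a dplyr-based approach of the kind the linked question describes. It is not necessarily the answerer's exact code. It assumes the `so` data frame from the question, sorts the visits within each person, and converts the mean gap to a plain number of days.

```r
library(dplyr)
library(lubridate)

so <- data.frame(
  visit_dates = c("12/4/2016", "12/6/2016", "12/7/2016",
                  "12/3/2016", "12/7/2016", "12/10/2016"),
  person = c("1", "1", "1", "2", "2", "2"),
  stringsAsFactors = FALSE
)
so$visit_dates <- mdy(so$visit_dates)

res <- so %>%
  arrange(person, visit_dates) %>%   # diff() needs each person's visits in date order
  group_by(person) %>%
  summarise(
    # mean of the gaps between consecutive visits, as a number of days
    avgTimeBetweenVisit = as.numeric(mean(diff(visit_dates)), units = "days")
  )
print(res)
```

Sorting before `diff()` matters: with unsorted dates the gaps could be negative and the mean would be misleading.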

Answer 1 (score: 1)

Try data.table:

require(lubridate)
require(data.table)
so <- data.frame(visit_dates = c("12/4/2016","12/6/2016","12/7/2016","12/3/2016","12/7/2016","12/10/2016"), person = c("1","1","1","2","2","2"))


# Parse the month/day/year strings into Date objects
so$visit_dates <- mdy(as.character(so$visit_dates))
# Keying by person and visit_dates sorts the rows, so diff() sees each person's visits in order
so <- data.table(so, key = c("person", "visit_dates"))
res <- so[, .(avgTimeBetweenVisit = mean(diff(visit_dates))), by = person]
print(res)
# person avgTimeBetweenVisit
# 1:      1            1.5 days
# 2:      2            3.5 days
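For comparison, the same per-person average could also be sketched in base R without extra packages. This is an illustrative alternative rather than part of the answer above. The names `avg_gap` and `res_base` are made up for the example, and the dates are converted to day counts so the result is a plain number.

```r
so <- data.frame(
  visit_dates = as.Date(c("12/4/2016", "12/6/2016", "12/7/2016",
                          "12/3/2016", "12/7/2016", "12/10/2016"),
                        format = "%m/%d/%Y"),
  person = c("1", "1", "1", "2", "2", "2"),
  stringsAsFactors = FALSE
)

# as.numeric() on a Date gives days since 1970-01-01, so diff() yields gaps in days
avg_gap <- tapply(as.numeric(so$visit_dates), so$person,
                  function(d) mean(diff(sort(d))))

res_base <- data.frame(person = names(avg_gap),
                       avgTimeBetweenVisit = as.vector(avg_gap))
print(res_base)
```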