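I pieced the following code together from other SO answers, but it gives me an error. I have searched SO for similar errors and their solutions and still can't figure it out, so please help.
For each group ("id"), I want to get the difference between the start times of consecutive rows.
Reproducible data: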
require(dplyr)
df <- data.frame(id = as.numeric(c("1","1","1","2","2","2")),
                 start = c("1/31/17 10:00", "1/31/17 10:02", "1/31/17 10:45",
                           "2/10/17 12:00", "2/10/17 12:20", "2/11/17 09:40"))
time <- strptime(df$start, format = "%m/%d/%y %H:%M")
df %>%
  group_by(id) %>%
  mutate(diff = time - lag(time),
         diff_mins = as.numeric(diff, units = 'mins'))
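This gives me the error:
Error in mutate_impl(.data, dots) : Column `diff` must be length 3 (the group size) or one, not 6
In addition: Warning message:
In unclass(time1) - unclass(time2) : longer object length is not a multiple of shorter object length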
Answer 0 (score: 1)
Do you mean something like this?
There is no need for lag here; a simple diff on the grouped time is enough.
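A minimal sketch of a grouped diff, not necessarily the code this answer originally showed. It assumes the times are first stored as a POSIXct column inside df, that tz = "UTC" is acceptable (an assumption, chosen only to avoid daylight-saving surprises), and that a numeric column named diff_mins is wanted. The times are converted to seconds before calling diff, because a difftime picks its units automatically unless told otherwise.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  start = c("1/31/17 10:00", "1/31/17 10:02", "1/31/17 10:45",
            "2/10/17 12:00", "2/10/17 12:20", "2/11/17 09:40")
)

res <- df %>%
  # Parse inside the data frame so the column is split per group
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M", tz = "UTC")) %>%
  group_by(id) %>%
  # diff() returns n - 1 values per group, so pad the first row with NA
  mutate(diff_mins = c(NA_real_, diff(as.numeric(time)) / 60)) %>%
  ungroup()

print(res)
```

The NA_real_ padding keeps the column type numeric even for a group that has only one row.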
Answer 1 (score: 0)
You can use lag and difftime (per Hadley):
df %>%
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M")) %>%
  group_by(id) %>%
  mutate(diff = difftime(time, lag(time)))
# A tibble: 6 x 4
# Groups:   id [2]
     id start         time                diff
  <dbl> <fct>         <dttm>              <time>
1    1. 1/31/17 10:00 2017-01-31 10:00:00 <NA>
2    1. 1/31/17 10:02 2017-01-31 10:02:00 2
3    1. 1/31/17 10:45 2017-01-31 10:45:00 43
4    2. 1/10/17 12:00 2017-02-10 12:00:00 <NA>
5    2. 2/10/17 12:20 2017-02-10 12:20:00 20
6    2. 2/11/17 09:40 2017-02-11 09:40:00 1280
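The difference from the code in the question is that time is now a column of df, so group_by hands mutate only the rows of each group. In the question, time was a free-standing vector of length 6, which is why the error complains about length 6 versus the group size of 3. If the numeric diff_mins column from the question is also wanted, one possible extension is sketched below. The explicit units = "mins" and tz = "UTC" are assumptions added here and are not part of the answer above.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2),
  start = c("1/31/17 10:00", "1/31/17 10:02", "1/31/17 10:45",
            "2/10/17 12:00", "2/10/17 12:20", "2/11/17 09:40")
)

res <- df %>%
  mutate(time = as.POSIXct(start, format = "%m/%d/%y %H:%M", tz = "UTC")) %>%
  group_by(id) %>%
  mutate(
    # Fix the units so every group reports minutes
    diff = difftime(time, lag(time), units = "mins"),
    # Drop the difftime class to get a plain number of minutes
    diff_mins = as.numeric(diff)
  ) %>%
  ungroup()

print(res)
```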