Group by ID and get the difference between dates in R

Time: 2017-04-10 22:18:02

Tags: r

I have a dataset that looks like this:

  id  Type   Sale     SaleDate  Time    Cat     LoadType      LoadDate
 A11   ABC    123   15/11/2016 00:00    AAA       Unload    23/11/2016
 A11   ABC    123   15/11/2016 00:00    AAA         Load    17/11/2016
A556   ABC    444   09/01/2017 00:00    VVV       Unload    17/01/2017
A556   ABC    444   09/01/2017 00:00    VVV         Load    17/01/2017

For each id, I want the difference in days between its LoadDate values. For example, it should return:

id   ....   LoadDate    DifferenceInDays
A11  ....   23/11/2016      6   
A11  ....   17/11/2016      6

DifferenceInDays should be the same for both rows that share an id.

2 answers:

Answer 0 (score: 2)

You can group by id and then compute max(LoadDate) - min(LoadDate). Assuming your data frame is named myData:

library(dplyr)

myData %>%
  mutate(SaleDate = as.Date(SaleDate, "%d/%m/%Y"),
         LoadDate = as.Date(LoadDate, "%d/%m/%Y")) %>%
  group_by(id) %>%
  summarise(DifferenceInDays = max(LoadDate) - min(LoadDate))

Result:

     id   DifferenceInDays 
  <chr>             <time>
1   A11             6 days
2  A556             0 days

If you want to add the column to the original data frame instead, use mutate() in place of summarise().
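As a rough sketch of the mutate() variant, assuming dplyr is installed: the data frame below rebuilds only three of the question's columns (id, LoadType, LoadDate), and the name result is just for illustration. Wrapping the difference in as.numeric() is an optional extra that stores a plain number of days rather than a difftime.

```r
library(dplyr)

# Trimmed copy of the question's data; only the columns needed here
myData <- data.frame(
  id = c("A11", "A11", "A556", "A556"),
  LoadType = c("Unload", "Load", "Unload", "Load"),
  LoadDate = c("23/11/2016", "17/11/2016", "17/01/2017", "17/01/2017"),
  stringsAsFactors = FALSE
)

result <- myData %>%
  mutate(LoadDate = as.Date(LoadDate, "%d/%m/%Y")) %>%
  group_by(id) %>%
  # mutate() keeps every row, so each row of a group gets the same value
  mutate(DifferenceInDays = as.numeric(
    difftime(max(LoadDate), min(LoadDate), units = "days")
  )) %>%
  ungroup()

print(result)
```

Calling ungroup() at the end is a design choice so that later operations on result are not silently applied per id.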

Answer 1 (score: 1)

I would do this with data.table:

library(data.table)

# Your example data, in a data.frame
df <- read.table(text = 'id  Type Sale SaleDate  Time    Cat     LoadType  LoadDate
A11 ABC 123 15/11/2016 00:00    AAA     Unload  23/11/2016
A11 ABC 123 15/11/2016 00:00    AAA     Load    17/11/2016
A556    ABC 444 09/01/2017 00:00    VVV     Unload  17/01/2017
A556    ABC 444 09/01/2017 00:00    VVV     Load    17/01/2017', header = TRUE)

# convert to a data.table...
dt <- data.table(df, key = 'id')

# ... with the right format for the date
dt[, LoadDate := as.IDate(LoadDate, format = '%d/%m/%Y')]

# compute the difference in days, by id:
dt[, DifferenceInDays := diff(range(LoadDate)), by = id]

This gives the desired output:

> dt
     id Type Sale   SaleDate  Time Cat LoadType   LoadDate DifferenceInDays
1:  A11  ABC  123 15/11/2016 00:00 AAA   Unload 2016-11-23                6
2:  A11  ABC  123 15/11/2016 00:00 AAA     Load 2016-11-17                6
3: A556  ABC  444 09/01/2017 00:00 VVV   Unload 2017-01-17                0
4: A556  ABC  444 09/01/2017 00:00 VVV     Load 2017-01-17                0
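
The two answers compute the same quantity: diff(range(LoadDate)) in the data.table answer and max(LoadDate) - min(LoadDate) in the dplyr answer. A small base R check, using the two LoadDate values for id A11 from the question (the variable names here are only illustrative):

```r
# Two LoadDate values for id A11, taken from the question
load_dates <- as.Date(c("23/11/2016", "17/11/2016"), format = "%d/%m/%Y")

# range() returns c(min, max); diff() subtracts the first from the second
via_range <- diff(range(load_dates))
via_minmax <- max(load_dates) - min(load_dates)

# Both are difftime objects; convert them to plain numbers of days
days_via_range <- as.numeric(via_range, units = "days")
days_via_minmax <- as.numeric(via_minmax, units = "days")

print(c(days_via_range, days_via_minmax))
```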