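R: How do I subtract the date in the next row from each date, by group?

Time: 2018-02-05 23:43:30

Tags: r

I have the following data (this is a sample; my actual dataset is very large):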

ID| Date        
A | 2010-12-30 
A | 2010-12-13 
A | 2010-08-23 
B | 2011-06-24 
B | 2011-06-13 
B | 2010-02-20 

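What I need to do is calculate the difference between dates within each ID. The calculation must start at the first row and subtract the date in the row below.

So, for the data above, the desired output is the DateDiff column below: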

ID| Date        | DateDiff
A | 2010-12-30  | 17       (which is 2010-12-30 - 2010-12-13) 
A | 2010-12-13  | 112      (which is 2010-12-13 - 2010-08-23) 
A | 2010-08-23  | 0        (this result should be 0 as the ID (A) does not match the ID below (B))
B | 2011-06-24  | 11       (which is 2011-06-24 - 2011-06-13)
B | 2011-06-13  | 478      (which is 2011-06-13 - 2010-02-20)
B | 2010-02-20  | 0        (this result is 0 again as there is no ID in the next row thus the ID (B) does not match the ID below)

I used the following code, which gets close to the desired result:

df$DateDiff <- ave(as.numeric(df$DATE), df$ID, FUN=function(x) c(0,abs(diff(x))))

However, it works by subtracting the second date from the first and puts a 0 in the first row, as shown below:

ID| Date        | DateDiff
A | 2010-12-30  | 0 (as there is no date above)       
A | 2010-12-13  | 17
A | 2010-08-23  | 112 (as it calculates the diff between the first date in ID (B) from the last in ID (A))
B | 2011-06-24  | 0
B | 2011-06-13  | 11 
B | 2010-02-20  | 478        

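As you can see, the result is close but not quite there. I have searched for a long time but have struggled to find a solution.

3 answers:

Answer 0 (score: 3)

It seems to me that your code is fine except for the function you pass to ave: it should be c(abs(diff(x)),0). That is, compute the differences first and then append a 0 at the end.

Example: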

ID <- c("A", "A", "A", "B", "B", "B")
DATE <- as.Date(c("2010-12-30",
            "2010-12-13",
            "2010-08-23",
            "2011-06-24",
            "2011-06-13",
            "2010-02-20"
))
df <- data.frame(ID, DATE)

df$DateDiff <- ave(as.numeric(df$DATE), df$ID, FUN=function(x) c(abs(diff(x)),0))

Here is the output: the DateDiff column comes out as 17, 112, 0, 11, 478, 0, which matches the desired result.

Answer 1 (score: 1)

You can use dplyr::lead to shift Date by 1 row within each group:

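library(dplyr)

# Within each ID, subtract each Date from the Date in the next row;
# the last row of a group has no next row, so it gets NA
df %>%
  group_by(ID) %>%
  mutate(DateDiff = abs(dplyr::lead(Date, 1, default=NA) - Date))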

# A tibble: 6 x 3
# Groups: ID [2]
#   ID    Date       DateDiff
#   <chr> <date>     <time>  
# 1 A     2010-12-30 17      
# 2 A     2010-12-13 112     
# 3 A     2010-08-23 <NA>    
# 4 B     2011-06-24 11      
# 5 B     2011-06-13 478     
# 6 B     2010-02-20 <NA>    

Data:

df <- read.table(text="ID Date        
A 2010-12-30 
A 2010-12-13 
A 2010-08-23 
B 2011-06-24 
B 2011-06-13 
B 2010-02-20", header=TRUE, stringsAsFactors=FALSE)
library(lubridate)
df$Date <- ymd(df$Date)
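
The question asks for 0 rather than NA in the last row of each group. The following is a minimal sketch of one way to get that with the same lead approach, assuming a plain numeric day count is acceptable. The data frame is rebuilt here so the block runs on its own, and the name result is only illustrative.

```r
library(dplyr)

# Same sample data as in the question, with Date stored as class Date
df <- data.frame(
  ID   = c("A", "A", "A", "B", "B", "B"),
  Date = as.Date(c("2010-12-30", "2010-12-13", "2010-08-23",
                   "2011-06-24", "2011-06-13", "2010-02-20")),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  # lead() yields NA for the last row of each group;
  # coalesce() replaces that NA with 0
  mutate(DateDiff = coalesce(abs(as.numeric(Date - lead(Date))), 0)) %>%
  ungroup()
```

Converting to numeric before calling coalesce keeps the replacement value 0 and the column the same type, at the cost of losing the "days" unit that a difftime column would carry.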

Answer 2 (score: 1)

With data.table:

library(data.table)
setDT(data)[,.(Date,c(-diff(Date),0)),by=ID]
                   ID       Date       V2
1:                 A  2010-12-30  17 days
2:                 A  2010-12-13 112 days
3:                 A  2010-08-23   0 days
4:                 B  2011-06-24  11 days
5:                 B  2011-06-13 478 days
6:                 B  2010-02-20   0 days
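
In the output above the new column gets the default name V2 and is a difftime. As a minimal sketch, assuming a numeric column named DateDiff is wanted and that rows are already sorted newest first within each ID (as in the sample data), the column can be added by reference instead. The name dt is chosen here for illustration.

```r
library(data.table)

# Same sample data as in the question, with Date stored as class Date
dt <- data.table(
  ID   = c("A", "A", "A", "B", "B", "B"),
  Date = as.Date(c("2010-12-30", "2010-12-13", "2010-08-23",
                   "2011-06-24", "2011-06-13", "2010-02-20"))
)

# Within each ID: diff() gives next-row minus current-row (negative here,
# because dates descend), so negate it; the trailing 0 fills the last row
dt[, DateDiff := c(-as.numeric(diff(Date)), 0), by = ID]
```

Because the sign is flipped rather than wrapped in abs(), an out-of-order date would show up as a negative value instead of being hidden.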