I have the following data (this is a sample; my actual data set is very large):
ID| Date
A | 2010-12-30
A | 2010-12-13
A | 2010-08-23
B | 2011-06-24
B | 2011-06-13
B | 2010-02-20
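What I need to do is calculate the difference between dates within each ID. The calculation must start at the first row and subtract the date in the row below.
So, for the data above, the desired output is the DateDiff column below: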
ID| Date | DateDiff
A | 2010-12-30 | 17 (which is 2010-12-30 - 2010-12-13)
A | 2010-12-13 | 112 (which is 2010-12-13 - 2010-08-23)
A | 2010-08-23 | 0 (this result should be 0 as the ID (A) does not match the ID below (B)
B | 2011-06-24 | 11 (which is 2011-06-24 - 2011-06-13)
B | 2011-06-13 | 478 (which is 2011-06-13 - 2010-02-20)
B | 2010-02-20 | 0 (this result is 0 again as there is no ID in the next row thus the ID (B) does not match the ID below)
I used the following code, which comes close to the desired result:
df$DateDiff <- ave(as.numeric(df$DATE), df$ID, FUN=function(x) c(0,abs(diff(x))))
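However, it subtracts the second date from the first and places the result on the second row, so the first row of each ID gets 0, as shown here: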
ID| Date | DateDiff
A | 2010-12-30 | 0 (as there is no date above)
A | 2010-12-13 | 17
A | 2010-08-23 | 112 (as it calculates the diff between the first date in ID (B) from the last in ID (A)
B | 2011-06-24 | 0
B | 2011-06-13 | 11
B | 2010-02-20 | 478
As you can see, the result is close but not quite there. I have searched for a long time but have struggled to find a solution.
Answer 0: (score: 3)
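Your code looks fine to me except for the function you pass to ave, which should be c(abs(diff(x)),0), i.e. compute the differences and then append a 0 at the end.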
Example:
ID <- c("A", "A", "A", "B", "B", "B")
DATE <- as.Date(c("2010-12-30",
"2010-12-13",
"2010-08-23",
"2011-06-24",
"2011-06-13",
"2010-02-20"
))
df <- data.frame(ID, DATE)
df$DateDiff <- ave(as.numeric(df$DATE), df$ID, FUN=function(x) c(abs(diff(x)),0))
This yields the DateDiff column requested in the question: 17, 112, 0 for ID A and 11, 478, 0 for ID B.
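To see why moving the 0 matters, here is a minimal base R sketch on the dates of ID A alone. The variable names (gaps, pad_first, pad_last) are made up for illustration. diff() returns one value fewer than its input, so the only question is which end gets the padding zero.

```r
# Dates for ID "A" from the question, newest first, as day counts
x <- as.numeric(as.Date(c("2010-12-30", "2010-12-13", "2010-08-23")))

# diff() returns length(x) - 1 values: x[2] - x[1], then x[3] - x[2]
gaps <- abs(diff(x))

# Zero first: each gap is stored on the lower row of its pair
# (this is what the code in the question does)
pad_first <- c(0, gaps)

# Zero last: each gap is stored on the upper row of its pair
# (this is the layout the question asks for)
pad_last <- c(gaps, 0)

print(pad_first)  # 0 17 112
print(pad_last)   # 17 112 0
```

Because ave applies the function once per ID, the trailing 0 lands on the last row of every group.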
Answer 1: (score: 1)
You can use dplyr::lead to shift Date by 1 row within each ID:
library(dplyr)
df %>%
group_by(ID) %>%
# lead() returns the Date from the next row of the same ID (NA on the last row)
mutate(DateDiff = abs(dplyr::lead(Date, 1, default=NA) - Date))
# A tibble: 6 x 3
# Groups: ID [2]
# ID Date DateDiff
# <chr> <date> <time>
# 1 A 2010-12-30 17
# 2 A 2010-12-13 112
# 3 A 2010-08-23 <NA>
# 4 B 2011-06-24 11
# 5 B 2011-06-13 478
# 6 B 2010-02-20 <NA>
Data:
df <- read.table(text="ID Date
A 2010-12-30
A 2010-12-13
A 2010-08-23
B 2011-06-24
B 2011-06-13
B 2010-02-20", header=TRUE, stringsAsFactors=FALSE)
library(lubridate)
df$Date <- ymd(df$Date)
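The output above has NA on the last row of each ID, whereas the question asks for 0. One possible adjustment, sketched here under the assumption that a plain numeric day count is acceptable, is to convert the difference to numeric and replace the NA with dplyr::coalesce. The name result is only illustrative.

```r
library(dplyr)

# Same sample data as in the question, built without lubridate
df <- data.frame(
  ID = c("A", "A", "A", "B", "B", "B"),
  Date = as.Date(c("2010-12-30", "2010-12-13", "2010-08-23",
                   "2011-06-24", "2011-06-13", "2010-02-20")),
  stringsAsFactors = FALSE
)

result <- df %>%
  group_by(ID) %>%
  mutate(
    # lead(Date) is NA on the last row of each ID;
    # coalesce() replaces that NA with 0
    DateDiff = coalesce(as.numeric(abs(lead(Date) - Date), units = "days"), 0)
  ) %>%
  ungroup()

print(result$DateDiff)  # 17 112 0 11 478 0
```

Converting to numeric first keeps both arguments of coalesce the same type, at the cost of dropping the difftime units.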
Answer 2: (score: 1)
Using data.table:
library(data.table)
setDT(data)[,.(Date,c(-diff(Date),0)),by=ID]
ID Date V2
1: A 2010-12-30 17 days
2: A 2010-12-13 112 days
3: A 2010-08-23 0 days
4: B 2011-06-24 11 days
5: B 2011-06-13 478 days
6: B 2010-02-20 0 days
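The result above leaves the new column with the default name V2 and in "days" units. As a hedged variation, assuming the dates are already sorted newest first within each ID as in the question, the column can be named and stored as a plain number with :=. The table name dt is illustrative.

```r
library(data.table)

# Same sample data as in the question
dt <- data.table(
  ID = c("A", "A", "A", "B", "B", "B"),
  Date = as.Date(c("2010-12-30", "2010-12-13", "2010-08-23",
                   "2011-06-24", "2011-06-13", "2010-02-20"))
)

# := adds the column by reference, once per ID.
# diff() is negative here because the dates are descending, so negate it;
# as.numeric() drops the "days" units and the trailing 0 pads the last row.
dt[, DateDiff := c(-as.numeric(diff(Date)), 0), by = ID]

print(dt$DateDiff)  # 17 112 0 11 478 0
```

If the dates might not be sorted, abs() in place of the negation would match the approach in the first answer.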