Difference between the first and last date within the same individual in R

Time: 2018-02-07 21:37:47

Tags: r date difference

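Good afternoon. I am not an R user, but I need to get the difference between the first date and the last date within each RFID in order to create a new column X. The first value needs to be 1 (not zero), the second 2, ..., and so on up to n.

Here is a sample of the data.

Thanks in advance.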

RFID              visit_date   ADFI   location
985152014315936   2017-11-25   2133   16
985152014315936   2017-11-26   2186   16
985152014315936   2017-11-27   3489   16
985152014315936   2017-11-28   2432   16
985152014315937   2017-11-24     15   17
985152014315937   2017-11-25   1512   17
985152014315937   2017-11-26   2378   17
985152014315937   2017-11-27   3241   17
985152014315938   2017-11-24    584   17
985152014315938   2017-11-25   1689   17
985152014315938   2017-11-26   2807   17
985152014315938   2017-11-27   2369   17
985152014315938   2017-11-28   2576   17
985152014315939   2017-11-25   1084   17
985152014315939   2017-11-26   3489   17
985152014315939   2017-11-27   2630   17
985152014315939   2017-11-28   3585   17
985152014315939   2017-11-29   3433   17
985152014315939   2017-11-30   2962   17

2 answers:

Answer 0 (score: 1)

Here is a solution using dplyr and lubridate:

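library(dplyr)      # group_by(), mutate(), %>%
library(lubridate)  # ymd()

# For each RFID, X is the span between the last and the first visit date.
df %>%
  group_by(RFID) %>%
  mutate(X = max(ymd(visit_date)) - min(ymd(visit_date)))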
# A tibble: 19 x 5
# Groups:   RFID [4]
#              RFID visit_date  ADFI location X
#             <dbl> <fct>      <int>    <int> <time>
# 1 985152014315936 2017-11-25  2133       16 3
# 2 985152014315936 2017-11-26  2186       16 3
# 3 985152014315936 2017-11-27  3489       16 3
# 4 985152014315936 2017-11-28  2432       16 3
# 5 985152014315937 2017-11-24    15       17 3
# 6 985152014315937 2017-11-25  1512       17 3
# 7 985152014315937 2017-11-26  2378       17 3
# 8 985152014315937 2017-11-27  3241       17 3
# 9 985152014315938 2017-11-24   584       17 4
#10 985152014315938 2017-11-25  1689       17 4
#11 985152014315938 2017-11-26  2807       17 4
#12 985152014315938 2017-11-27  2369       17 4
#13 985152014315938 2017-11-28  2576       17 4
#14 985152014315939 2017-11-25  1084       17 5
#15 985152014315939 2017-11-26  3489       17 5
#16 985152014315939 2017-11-27  2630       17 5
#17 985152014315939 2017-11-28  3585       17 5
#18 985152014315939 2017-11-29  3433       17 5
#19 985152014315939 2017-11-30  2962       17 5

Sample data

df <- read.table(text =
    "RFID              visit_date   ADFI   location
985152014315936   2017-11-25   2133   16
985152014315936   2017-11-26   2186   16
985152014315936   2017-11-27   3489   16
985152014315936   2017-11-28   2432   16
985152014315937   2017-11-24     15   17
985152014315937   2017-11-25   1512   17
985152014315937   2017-11-26   2378   17
985152014315937   2017-11-27   3241   17
985152014315938   2017-11-24    584   17
985152014315938   2017-11-25   1689   17
985152014315938   2017-11-26   2807   17
985152014315938   2017-11-27   2369   17
985152014315938   2017-11-28   2576   17
985152014315939   2017-11-25   1084   17
985152014315939   2017-11-26   3489   17
985152014315939   2017-11-27   2630   17
985152014315939   2017-11-28   3585   17
985152014315939   2017-11-29   3433   17
985152014315939   2017-11-30   2962   17", header = TRUE)
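
The question can also be read as asking for a running day number within each RFID (1 on the first visit date, 2 on the following day, and so on) rather than a single span per RFID. A minimal sketch of that reading with dplyr, assuming visit_date parses with as.Date() and using only a few rows of the sample data; the names `visits` and `numbered` are illustrative and not part of the original answer:

```r
library(dplyr)

# A few rows taken from the sample data; RFID is kept as character here.
visits <- data.frame(
  RFID = c("985152014315936", "985152014315936", "985152014315936",
           "985152014315937", "985152014315937"),
  visit_date = c("2017-11-25", "2017-11-26", "2017-11-27",
                 "2017-11-24", "2017-11-25"),
  stringsAsFactors = FALSE
)

numbered <- visits %>%
  group_by(RFID) %>%
  # Days elapsed since the first visit of this RFID, shifted so the first day is 1.
  mutate(X = as.integer(as.Date(visit_date) - min(as.Date(visit_date))) + 1L) %>%
  ungroup()

print(numbered)
```

With this approach a missing day leaves a gap in X. If consecutive numbering is wanted regardless of gaps, dplyr's row_number() within each group (after sorting by visit_date) may be a better fit.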

Answer 1 (score: 0)

Using data.table:

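library(data.table)
# Convert to a data.table, then add the span between last and first visit per RFID.
data <- data.table(data)
data[, diff := max(as.Date(visit_date)) - min(as.Date(visit_date)), by = RFID]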

If you want to add 1:

data[, diff := max(as.Date(visit_date)) - min(as.Date(visit_date)) + 1, by = RFID]
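
Subtracting two Date values yields a difftime object, so the new column is not a plain number. If an ordinary integer column is preferred, the span can be converted explicitly. A small sketch under that assumption, using a few rows of the sample data; the names `dt` and `days_inclusive` are illustrative only:

```r
library(data.table)

# First and last visit rows for two RFIDs from the sample data.
dt <- data.table(
  RFID = c("985152014315936", "985152014315936",
           "985152014315938", "985152014315938"),
  visit_date = as.Date(c("2017-11-25", "2017-11-28",
                         "2017-11-24", "2017-11-28"))
)

# Span in whole days between last and first visit, plus 1 so that
# both the first and the last day are counted.
dt[, days_inclusive := as.integer(
       difftime(max(visit_date), min(visit_date), units = "days")) + 1L,
   by = RFID]

print(dt)
```

Passing units = "days" to difftime() makes the unit explicit instead of relying on the default chosen for the subtraction.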