Remove rows with old dates in R

Time: 2017-09-19 12:45:21

Tags: r date time

I have this table:

Date  | Column1 | Column2
------+---------+--------
6/1/1 | A       | 3
5/1/1 | B       | 4
4/1/1 | C       | 5
1/1/1 | A       | 1
7/1/1 | B       | 2
1/1/1 | C       | 3

I need this table:

Date  | Column1 | Column2
------+---------+--------
6/1/1 | A       | 3
4/1/1 | C       | 5
7/1/1 | B       | 2

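How can I remove the old rows based on two conditions (Column1, Column2)?

1 answer:

Answer 0 (score: 0)

Group by Column1 and Column2, sort by date in descending order, then keep the first row of each group with slice, like this: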

library(dplyr)
ans <- df %>%
         group_by(Column1, Column2) %>%
         arrange(desc(as.Date(Date))) %>%   # newest first; arrange() ignores groups unless .by_group = TRUE
         slice(1) %>%              # keep the first (newest) row of each group
         ungroup()
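
One detail about the chain above: arrange() ignores grouping by default, so the rows are sorted across the whole table, and slice(1) then takes the first (newest) row inside each group. The sketch below is only an illustration of that behaviour. The data frame `toy` and its column `Key` are hypothetical, and the dates are deliberately ISO-formatted so that as.Date can parse them without a format string.

```r
library(dplyr)

# Hypothetical toy data, not taken from the question; ISO dates on purpose
toy <- data.frame(
  Date = c("2017-01-01", "2017-06-01", "2017-03-01", "2017-02-01"),
  Key  = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Sort the whole table newest first, then take the first row per group
newest_global <- toy %>%
  group_by(Key) %>%
  arrange(desc(as.Date(Date))) %>%
  slice(1) %>%
  ungroup()

# Same idea with the sort made explicitly group-aware
newest_by_group <- toy %>%
  group_by(Key) %>%
  arrange(desc(as.Date(Date)), .by_group = TRUE) %>%
  slice(1) %>%
  ungroup()
```

Both variants should keep the same row per group, so the plain arrange() call is enough for this purpose.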

Your error occurs because your date format is a bit unusual. I suggest using lubridate::parse_date_time, which is more flexible than the base R date-time functions:

library(lubridate)
library(dplyr)
ans <- df %>%
         group_by(Column1, Column2) %>%
         arrange(desc(parse_date_time(Date, orders = "mdy"))) %>%   # newest first; dates parsed as month-day-year
         slice(1) %>%              # keep the first (newest) row of each group
         ungroup()
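
To see why the parsing step matters, it can help to look at the parsed values on their own. The sketch below uses three hypothetical month/day/year strings (not the question's data) and passes the parse order through the `orders` argument of parse_date_time. It contrasts comparing the strings as plain text with comparing the parsed dates.

```r
library(lubridate)

# Illustrative strings in month/day/year order (hypothetical values)
raw <- c("6/1/2001", "5/1/2001", "10/1/2001")

# Compared as plain text, "6/1/2001" looks like the latest entry
latest_as_text <- max(raw)

# Parsed with an explicit month-day-year order, the October date is the latest
parsed <- parse_date_time(raw, orders = "mdy")
latest_parsed <- raw[which.max(parsed)]
```

The text comparison and the date comparison disagree here, which is the kind of silent mistake that explicit parsing avoids.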

Edit

Based on @count's helpful comment, we can simplify the dplyr chain to:

library(lubridate)
library(dplyr)
ans <- df %>%
         group_by(Column1, Column2) %>%
         slice(which.max(parse_date_time(Date, orders = "mdy"))) %>%   # keep the row with the latest Date in each group
         ungroup()
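
For a reproducible check, here is a minimal sketch that rebuilds the question's table and applies the which.max approach. Two things are assumptions rather than facts from the thread: the abbreviated dates are written out as month/day/2001, and the parse order is passed through the `orders` argument of parse_date_time. Note that in this particular sample every (Column1, Column2) pair is unique, so grouping by both columns keeps all six rows, while grouping by Column1 alone gives the three rows asked for.

```r
library(dplyr)
library(lubridate)

# Sample table from the question; the full year 2001 is an assumption
df <- data.frame(
  Date    = c("6/1/2001", "5/1/2001", "4/1/2001", "1/1/2001", "7/1/2001", "1/1/2001"),
  Column1 = c("A", "B", "C", "A", "B", "C"),
  Column2 = c(3, 4, 5, 1, 2, 3),
  stringsAsFactors = FALSE
)

# Keep the newest row for each Column1 value
latest <- df %>%
  group_by(Column1) %>%
  slice(which.max(parse_date_time(Date, orders = "mdy"))) %>%
  ungroup()

# Grouping by both columns: each pair occurs once here, so nothing is dropped
latest_pairs <- df %>%
  group_by(Column1, Column2) %>%
  slice(which.max(parse_date_time(Date, orders = "mdy"))) %>%
  ungroup()
```

If the real data does repeat (Column1, Column2) pairs, the two-column grouping from the answer applies unchanged.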