Remove rows that share the same date timestamp

Time: 2017-12-10 12:14:50

Tags: r

My data is in the following format:

DF <- data.frame(ids = c("uniqueid1", "uniqueid1", "uniqueid1", "uniqueid2", "uniqueid2", "uniqueid2", "uniqueid2", "uniqueid3", "uniqueid3", "uniqueid3", "uniqueid4", "uniqueid4", "uniqueid4"), stock_year = c("April 2014", "March 2012", "April 2014", "January 2017", "January 2016", "January 2015", "January 2014", "November 2011", "November 2011", "December 2009", "August 2001", "July 2000", "May 1999"))

         ids    stock_year
1  uniqueid1    April 2014
2  uniqueid1    March 2012
3  uniqueid1    April 2014
4  uniqueid2  January 2017
5  uniqueid2  January 2016
6  uniqueid2  January 2015
7  uniqueid2  January 2014
8  uniqueid3 November 2011
9  uniqueid3 November 2011
10 uniqueid3 December 2009
11 uniqueid4   August 2001
12 uniqueid4     July 2000
13 uniqueid4      May 1999

How can I completely remove every row belonging to an ID that has the same value more than once in the stock_year column?

An example of the expected output is:

DF <- data.frame(ids = c("uniqueid2", "uniqueid2", "uniqueid2", "uniqueid2", "uniqueid4", "uniqueid4", "uniqueid4"), stock_year = c("January 2017", "January 2016", "January 2015", "January 2014", "August 2001", "July 2000", "May 1999"))


        ids   stock_year
1 uniqueid2 January 2017
2 uniqueid2 January 2016
3 uniqueid2 January 2015
4 uniqueid2 January 2014
5 uniqueid4  August 2001
6 uniqueid4    July 2000
7 uniqueid4     May 1999

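1 Answer:

Answer 0: (score: 2)

We can group by 'ids', check each group for duplicates, and use filter to keep only those 'ids' whose stock_year values contain no duplicates: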

library(dplyr)
DF %>%
  group_by(ids) %>%
  filter(!anyDuplicated(stock_year))

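Or using ave from base R. Because ave is called on a character vector, it returns the anyDuplicated result as a string; comparing with == 0 keeps the groups that have no duplicates:

DF[with(DF, ave(as.character(stock_year), ids, FUN=anyDuplicated) == 0),]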
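
As a cross-check that needs no packages, here is a minimal sketch of a different base R route: find the ids that repeat an (ids, stock_year) pair with duplicated(), then drop every row for those ids. The names dup_ids and result are illustrative and not part of the original answer, and stringsAsFactors = FALSE is an assumption made so the columns stay character.

```r
# Sample data from the question (stringsAsFactors = FALSE is an assumption)
DF <- data.frame(
  ids = c("uniqueid1", "uniqueid1", "uniqueid1",
          "uniqueid2", "uniqueid2", "uniqueid2", "uniqueid2",
          "uniqueid3", "uniqueid3", "uniqueid3",
          "uniqueid4", "uniqueid4", "uniqueid4"),
  stock_year = c("April 2014", "March 2012", "April 2014",
                 "January 2017", "January 2016", "January 2015", "January 2014",
                 "November 2011", "November 2011", "December 2009",
                 "August 2001", "July 2000", "May 1999"),
  stringsAsFactors = FALSE
)

# ids for which some (ids, stock_year) pair occurs more than once
dup_ids <- unique(DF$ids[duplicated(DF[c("ids", "stock_year")])])

# Keep only the rows whose id never repeats a stock_year
result <- DF[!DF$ids %in% dup_ids, ]
rownames(result) <- NULL  # renumber rows to match the expected output
result
```

On the sample data this should leave only the uniqueid2 and uniqueid4 rows, matching the expected output in the question.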