I have a data frame sorted in descending order on the Calved column, as shown below.
Served Calved ProfileID
1 2015-07-29 2017-05-07 1346
2 2015-07-29 2017-05-06 2645
3 2016-06-12 2017-05-05 3687
4 2016-05-19 2017-05-05 3687
5 2015-05-21 2017-05-05 3687
6 2013-05-08 2017-05-05 3687
7 2015-08-08 2016-05-04 4235
8 2015-06-14 2016-05-04 4235
9 2015-05-31 2016-05-04 4235
10 2013-08-13 2014-05-02 5425
11 2013-07-23 2014-05-02 5425
12 2012-03-01 2014-05-02 5425
13 2017-07-11 2013-04-22 5425
14 2012-11-01 2013-04-22 5425
15 2015-12-23 2013-04-22 5425
16 2014-05-10 2013-04-22 5425
I want to remove duplicates from the Calved column, keeping one row per ProfileID for each Calved date, like so:
Served Calved ProfileID
1 2015-07-29 2017-05-07 1346
2 2015-07-29 2017-05-06 2645
3 2016-06-12 2017-05-05 3687
7 2015-08-08 2016-05-04 4235
10 2013-08-13 2014-05-02 5425
13 2017-07-11 2013-04-22 5425
I achieved this using:
on_served_profileID<-master_arranged[!duplicated(master_arranged[c("Calved","ProfileID")]),]
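I would like to add an AND condition so that the row kept for each Calved date is one where Served is earlier than Calved, rather than simply the first occurrence of each date.
For row 13 of the output, I would rather have row 14, because its Served date is earlier than its Calved date. In other words, I do not just want the first occurrence of each date in the Calved column. The desired output is: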
Served Calved ProfileID
1 2015-07-29 2017-05-07 1346
2 2015-07-29 2017-05-06 2645
3 2016-06-12 2017-05-05 3687
7 2015-08-08 2016-05-04 4235
10 2013-08-13 2014-05-02 5425
14 2012-11-01 2013-04-22 5425
I tried variations of:
on_served_profileID<-master_arranged[!duplicated(master_arranged[c("Calved","ProfileID")])& master_arranged$Served < master_arranged$Calved,]
This was an attempt to select the rows where the Served date is earlier than the Calved date, hence the & condition "$Served < $Calved".
Any help is greatly appreciated.
Answer 0 (score: 1)
Hope this helps!
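library(dplyr)
# Convert the character columns to Date so that `<` compares dates, not strings
df$Served <- as.Date(df$Served)
df$Calved <- as.Date(df$Calved)
df %>%
group_by(Calved, ProfileID) %>%
# Within each (Calved, ProfileID) group, take the first Served date that precedes Calved
summarise(Served = Served[first(which(Served < Calved))]) %>%
# Restore the original descending order on Calved
arrange(desc(Calved))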
The output is:
Calved ProfileID Served
1 2017-05-07 1346 2015-07-29
2 2017-05-06 2645 2015-07-29
3 2017-05-05 3687 2016-06-12
4 2016-05-04 4235 2015-08-08
5 2014-05-02 5425 2013-08-13
6 2013-04-22 5425 2012-11-01
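If the whole row should be kept rather than a summarised Served value, one possible variant is to filter first and then deduplicate with `distinct()`. This is only a sketch, not the answer's method: it uses a subset of the question's data (the ProfileID 5425 rows) for brevity, and the name `result` is made up for illustration.

```r
library(dplyr)

# Subset of the question's data (ProfileID 5425 only), used here for brevity.
df <- data.frame(
  Served = as.Date(c("2013-08-13", "2013-07-23", "2012-03-01", "2017-07-11",
                     "2012-11-01", "2015-12-23", "2014-05-10")),
  Calved = as.Date(c(rep("2014-05-02", 3), rep("2013-04-22", 4))),
  ProfileID = 5425L
)

result <- df %>%
  # Drop rows where the service date is not before the calving date
  filter(Served < Calved) %>%
  # Keep the first remaining row per (Calved, ProfileID), with all columns
  distinct(Calved, ProfileID, .keep_all = TRUE) %>%
  arrange(desc(Calved))

print(result)
```

With this variant, a (Calved, ProfileID) pair that has no row with Served earlier than Calved is dropped from the result entirely.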
Sample data:
df <- structure(list(Served = c("2015-07-29", "2015-07-29", "2016-06-12",
"2016-05-19", "2015-05-21", "2013-05-08", "2015-08-08", "2015-06-14",
"2015-05-31", "2013-08-13", "2013-07-23", "2012-03-01", "2017-07-11",
"2012-11-01", "2015-12-23", "2014-05-10"), Calved = c("2017-05-07",
"2017-05-06", "2017-05-05", "2017-05-05", "2017-05-05", "2017-05-05",
"2016-05-04", "2016-05-04", "2016-05-04", "2014-05-02", "2014-05-02",
"2014-05-02", "2013-04-22", "2013-04-22", "2013-04-22", "2013-04-22"
), ProfileID = c(1346L, 2645L, 3687L, 3687L, 3687L, 3687L, 4235L,
4235L, 4235L, 5425L, 5425L, 5425L, 5425L, 5425L, 5425L, 5425L
)), .Names = c("Served", "Calved", "ProfileID"), class = "data.frame", row.names = c("1",
"2", "3", "4", "5", "6", "7", "8", "9", "10", "11", "12", "13",
"14", "15", "16"))
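For comparison, the attempt in the question can probably be repaired in base R as well. There, `duplicated()` is evaluated on the full data frame, so when the first row of a group fails `Served < Calved` (row 13), the `&` removes that row but does not promote the next valid one (row 14). Filtering first and deduplicating afterwards avoids this. The sketch below rebuilds the sample data so that it runs on its own; `valid` and `result` are illustrative names.

```r
# Same values as the sample data, with the date columns already converted.
df <- data.frame(
  Served = as.Date(c("2015-07-29", "2015-07-29", "2016-06-12", "2016-05-19",
                     "2015-05-21", "2013-05-08", "2015-08-08", "2015-06-14",
                     "2015-05-31", "2013-08-13", "2013-07-23", "2012-03-01",
                     "2017-07-11", "2012-11-01", "2015-12-23", "2014-05-10")),
  Calved = as.Date(c("2017-05-07", "2017-05-06", rep("2017-05-05", 4),
                     rep("2016-05-04", 3), rep("2014-05-02", 3),
                     rep("2013-04-22", 4))),
  ProfileID = c(1346L, 2645L, rep(3687L, 4), rep(4235L, 3), rep(5425L, 7))
)

# Step 1: keep only rows where the service date precedes the calving date.
valid <- df[df$Served < df$Calved, ]

# Step 2: among those rows, keep the first one per (Calved, ProfileID).
result <- valid[!duplicated(valid[c("Calved", "ProfileID")]), ]

print(result)
```

Because base R subsetting preserves row names, the rows kept are 1, 2, 3, 7, 10 and 14, which matches the desired output in the question.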