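I have a dataset similar to the one below, and for each group of rows in which all the other columns match, I want to get the earliest SORT_DT. Please help me solve this.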
library(data.table)  # provides fread()
df <- fread("CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE SORT_DT
107 10120 1 2014-05-12 G/L 2015-01-09
107 10120 1 2014-05-12 G/L 2015-11-10
107 10120 1 2014-05-12 G/L 2014-06-18
107 10120 1 2014-05-12 G/L 2014-05-13
107 10120 1 2014-05-12 G/L 2015-07-10
107 10120 1 2014-05-12 G/L 2015-10-09
107 10120 1 2014-05-12 G/L 2016-04-08
107 10120 1 2014-05-12 G/L 2016-01-08
107 10120 1 2014-05-12 G/L 2016-12-22
107 10120 1 2014-05-12 G/L 2017-01-13
107 10120 1 2014-05-12 G/L 2016-07-08
108 10120 1 2014-05-12 G/L 2017-04-14
108 10120 1 2014-05-12 G/L 2017-04-17
108 10120 1 2014-05-12 G/L 2016-08-31
108 10120 1 2014-05-12 G/L 2015-04-10
108 10120 1 2014-05-12 G/L 2016-12-22")
The output should look like this:
CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE SORT_DT
1 107 10120 1 2014-05-12 G/L 2014-05-13
2 108 10120 1 2014-05-12 G/L 2015-04-10
Answer 0: (score: 0)
Try this:
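# Group by every column except SORT_DT and take the minimum of each column within each group
df2 <- aggregate(df, list(df$CUST_NO, df$ID_NO, df$SYMBOL, df$AUTO_CREATE_DT, df$CLASS_TYPE), FUN = min)
# Keep the original columns and drop the Group.1 ... Group.5 columns that aggregate() adds
df2 <- df2[c("CUST_NO", "ID_NO", "SYMBOL", "AUTO_CREATE_DT", "CLASS_TYPE", "SORT_DT")]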
Answer 1: (score: 0)
aggregate(SORT_DT ~ ., data = df, min)
# CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE SORT_DT
# 1 107 10120 1 2014-05-12 G/L 2014-05-13
# 2 108 10120 1 2014-05-12 G/L 2015-04-10
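
Since the question builds df with fread, the object is already a data.table, so a grouped minimum is another possible route. The following is only a minimal sketch, assuming the data.table package is installed; the names dt and earliest are made up for illustration, and the rows are a subset of the data in the question.

```r
library(data.table)

# A few rows copied from the question's data (subset, for brevity)
dt <- fread("CUST_NO ID_NO SYMBOL AUTO_CREATE_DT CLASS_TYPE SORT_DT
107 10120 1 2014-05-12 G/L 2015-01-09
107 10120 1 2014-05-12 G/L 2014-06-18
107 10120 1 2014-05-12 G/L 2014-05-13
108 10120 1 2014-05-12 G/L 2017-04-14
108 10120 1 2014-05-12 G/L 2015-04-10
108 10120 1 2014-05-12 G/L 2016-12-22")

# Group by every column except SORT_DT and keep the smallest SORT_DT per group.
# ISO-formatted dates (YYYY-MM-DD) sort correctly whether fread reads them
# as dates or as character strings.
earliest <- dt[, .(SORT_DT = min(SORT_DT)),
               by = .(CUST_NO, ID_NO, SYMBOL, AUTO_CREATE_DT, CLASS_TYPE)]

print(earliest)
```

For these rows the result should have one row per CUST_NO, with SORT_DT 2014-05-13 for 107 and 2015-04-10 for 108, matching the output requested in the question.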