Get all rows corresponding to the maximum date in R

Asked: 2016-05-29 11:43:15

Tags: r max

My data frame looks like this, but with 30 more variables:

Data frame

I only need the values from the row with the maximum date for each 'code', so the output should look like this:

Required form

Can anyone help me solve this?

2 answers:

Answer 0 (score: 4)

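We can use data.table. Convert the 'data.frame' to a 'data.table' with setDT(df1), order the rows by 'Date' in descending order, and then, grouped by 'code', take the first row of each group with head.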

library(data.table)
setDT(df1)[order(-as.Date(Date, '%m/%d/%Y')), head(.SD, 1), by = code]
#       code bill      Date Type Month   KM
#1: C111574885   50 9/25/2015  red     9 1070
#2: C111519730  200 6/25/2015 blue     6  350
#3: D100000468   40  6/4/2015  red     6 1240
#4: D100000470  500 3/13/2015  red     3 1000

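Alternatively, after ordering by 'code' and descending 'Date', we can use unique with by = 'code' to keep the first row (the one with the maximum date) for each code.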

unique(setDT(df1)[order(code, -as.Date(Date, '%m/%d/%Y'))], by = 'code')
#  bill       code      Date Type Month   KM
#1:  200 C111519730 6/25/2015 blue     6  350
#2:   50 C111574885 9/25/2015  red     9 1070
#3:   40 D100000468  6/4/2015  red     6 1240
#4:  500 D100000470 3/13/2015  red     3 1000
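
For readers without data.table, the same order-then-keep-first idea can be sketched in base R. This is only an illustration under assumptions: the data frame df below is a cut-down copy of three columns from df1, and the variable names parsed, sorted and latest are made up for the example.

```r
# Cut-down copy of df1 (three of its columns), used only for illustration
df <- data.frame(
  code = c("C111519730", "C111519730", "D100000470", "C111574885",
           "C111574885", "C111574885", "D100000468"),
  Date = c("4/9/2015", "6/25/2015", "3/13/2015", "1/9/2015",
           "9/20/2015", "9/25/2015", "6/4/2015"),
  KM   = c(100, 350, 1000, 450, 900, 1070, 1240),
  stringsAsFactors = FALSE
)

# Parse the "m/d/Y" strings into real dates so they compare chronologically
parsed <- as.Date(df$Date, format = "%m/%d/%Y")

# Sort by code, then by date with the newest first
sorted <- df[order(df$code, -as.numeric(parsed)), ]

# Keep only the first row seen for each code, i.e. the newest one
latest <- sorted[!duplicated(sorted$code), ]
rownames(latest) <- NULL
print(latest)
```

Like the unique(..., by = 'code') call, this returns exactly one row per code, even when several rows share the maximum date.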

Data

df1 <-  structure(list(bill = c(100, 200, 500, 900, 150, 50, 40), 
code = c("C111519730", 
"C111519730", "D100000470", "C111574885", "C111574885", "C111574885", 
"D100000468"), Date = c("4/9/2015", "6/25/2015", "3/13/2015", 
"1/9/2015", "9/20/2015", "9/25/2015", "6/4/2015"), Type = c("red", 
"blue", "red", "red", "blue", "red", "red"), Month = c(4, 6, 
3, 1, 9, 9, 6), KM = c(100, 350, 1000, 450, 900, 1070, 1240)),
 .Names = c("bill", 
"code", "Date", "Type", "Month", "KM"),
 row.names = c(NA, -7L), class = "data.frame")

Answer 1 (score: 3)

This can also be done with dplyr (assuming your data is a data frame named dt):

library(dplyr)

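# Date is stored as "m/d/Y" text, so parse it before taking the max
dt %>% group_by(code) %>%
  filter(as.Date(Date, '%m/%d/%Y') == max(as.Date(Date, '%m/%d/%Y')))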
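
Two details of the filter approach are easy to miss: the dates have to be compared as dates rather than as text, and filtering on equality with the group maximum keeps every tied row instead of just one. The base R sketch below illustrates both on made-up data; the data frame df and the names string_max, group_max and latest_all are hypothetical and not part of the original answers.

```r
# Hypothetical data: code "A" has two rows on its latest date
df <- data.frame(
  code = c("A", "A", "A", "B", "B"),
  Date = c("9/30/2015", "10/1/2015", "10/1/2015", "2/5/2015", "12/1/2014"),
  bill = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Comparing the raw strings picks "9/30/2015", which is not the latest date
string_max <- max(df$Date[df$code == "A"])
print(string_max)

# Parse the strings so that comparisons are chronological
d <- as.Date(df$Date, format = "%m/%d/%Y")

# ave() gives every row the maximum (numeric) date found within its code
group_max <- ave(as.numeric(d), df$code, FUN = max)

# Keep all rows whose date equals their group's maximum, ties included
latest_all <- df[as.numeric(d) == group_max, ]
print(latest_all)
```

Here code "A" contributes two rows, because both share the date 10/1/2015. If only one row per code is wanted, the order-and-take-first approach from the first answer is the better fit.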