I have a data frame like this:
> df
Keyword Date Pos Bid
a 1/14/14 1 5
a 1/15/14 1 5
a 1/16/14 1 5
b 2/4/14 5 9
b 2/5/14 2 9
b 2/5/14 2 9
c 3/21/14 3 5
c 3/23/14 1 9
c 3/23/14 2 10
I am able to filter it so that I get only the latest observations for each keyword:
Late = ddply(df, 'Keyword', function(x) {
  Date = as.Date(x$Date, '%m/%d/%y')
  x[Date == max(Date), c('Keyword', 'Date', 'Pos', 'Bid')]
})
> Late
Keyword Date Pos Bid
a 1/16/14 1 5
b 2/5/14 2 9
b 2/5/14 2 9
c 3/23/14 1 9
c 3/23/14 2 10
Now I want one row per keyword, with the unique date, the lowest position and the highest bid:
WANT THIS:
> Late
Keyword Date Pos Bid
a 1/16/14 1 5
b 2/5/14 2 9
c 3/23/14 1 10
So I ran another ddply:
Late = ddply(Late, .(Keyword, Date), function(x)
  c(Keyword = unique(x$Keyword),
    Date = unique(as.Date(x$Date, '%m/%d/%y')),
    Pos = min(x$Pos),
    Bid = max(x$Bid)))
But this gives me garbage dates:
> Late
Keyword Date Pos Bid
a 16086 1 5
b 16086 2 9
c 16088 1 10
I have tried various ways of handling Date, but none of them work. What am I missing?
Thanks.
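Answer 0 (score: 1)
Try this, which returns the original date string for the latest date so that it is not coerced to a number: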
ddply(df, .(Keyword), function(x)
  c(Date = as.character(x$Date[which.max(as.Date(x$Date, '%m/%d/%y'))]),
    Pos = min(x$Pos),
    Bid = max(x$Bid)))
## Keyword Date Pos Bid
## 1 a 1/16/14 1 5
## 2 b 2/5/14 2 9
## 3 c 3/23/14 1 10
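The numbers in the question appear because c() builds a single atomic vector and drops the Date class, leaving only the underlying day count. For the same reason, the c() call above mixes a character date with numbers, so Pos and Bid come back as character. A possible alternative, sketched here under the assumption that Date is stored as text in month/day/year form, is to return a one-row data.frame per group so that each column keeps its own type. Unlike the call above, this sketch takes the minimum Pos and maximum Bid over the latest date's rows only, which matches the two-step approach in the question; the sample data is retyped from the question.

```r
library(plyr)

# Sample data retyped from the question; Date is assumed to be text.
df <- data.frame(
  Keyword = c("a", "a", "a", "b", "b", "b", "c", "c", "c"),
  Date    = c("1/14/14", "1/15/14", "1/16/14", "2/4/14", "2/5/14",
              "2/5/14", "3/21/14", "3/23/14", "3/23/14"),
  Pos     = c(1, 1, 1, 5, 2, 2, 3, 1, 2),
  Bid     = c(5, 5, 5, 9, 9, 9, 5, 9, 10),
  stringsAsFactors = FALSE
)

Late <- ddply(df, .(Keyword), function(x) {
  d <- as.Date(x$Date, "%m/%d/%y")
  latest <- x[d == max(d), ]          # rows on the most recent date only
  # A data.frame (rather than c()) keeps Date as Date and Pos/Bid numeric.
  data.frame(Date = max(d),
             Pos  = min(latest$Pos),
             Bid  = max(latest$Bid))
})
```

Here Late$Date is a real Date column, so it sorts and compares correctly; format(Late$Date, "%m/%d/%y") converts it back to text if a display string is needed.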