Subset the last entry of each day in a data frame

Time: 2013-11-04 16:02:00

Tags: r dataframe

> dataframe

    time        value                date
1      2           28 2013-07-08 15:13:35
2      4            8 2013-07-08 15:14:06
3      7            2 2013-07-08 15:43:05
4      8           25 2013-07-09 16:30:41
5     11           12 2013-07-09 19:04:40
6     14           41 2013-07-09 19:20:14
7     18           12 2013-07-10 00:39:04
8     22           12 2013-07-10 08:27:02

Does anyone know how to subset a data frame so that only the last entry of each day is kept? That is, to get:

    time        value                date
3      7            2 2013-07-08 15:43:05
6     14           41 2013-07-09 19:20:14
8     22           12 2013-07-10 08:27:02

Many thanks!

2 answers:

Answer 0 (score: 2)

I like to do this with data.table. Assuming your data.frame is called df, then...

#  Load required package
require( data.table )
dt <- data.table( df )

#  Make dates out of your date-time column
dt[ , date1 := as.Date( date ) ]

#  Subset to last row in each group
dt[ , .SD[.N] , by = date1 ]
#        date1 time value                date
#1: 2013-07-08    7     2 2013-07-08 15:43:05
#2: 2013-07-09   14    41 2013-07-09 19:20:14
#3: 2013-07-10   22    12 2013-07-10 08:27:02
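
Note that `.SD[.N]` returns the last row of each group as stored, so it relies on the rows already being in chronological order within each day. Below is a minimal sketch of an order-independent variant. It assumes the `date` column holds timestamps in `YYYY-MM-DD HH:MM:SS` form and that treating them as UTC is acceptable; the names `stamp`, `day`, and `last_per_day` are illustrative and not part of the original answer.

```r
library(data.table)

# Sample data copied from the question
df <- data.frame(
  time  = c(2, 4, 7, 8, 11, 14, 18, 22),
  value = c(28, 8, 2, 25, 12, 41, 12, 12),
  date  = c("2013-07-08 15:13:35", "2013-07-08 15:14:06",
            "2013-07-08 15:43:05", "2013-07-09 16:30:41",
            "2013-07-09 19:04:40", "2013-07-09 19:20:14",
            "2013-07-10 00:39:04", "2013-07-10 08:27:02"),
  stringsAsFactors = FALSE
)

dt <- as.data.table(df)

# Shuffle the rows on purpose to show that the result does not depend on row order
dt <- dt[c(3, 1, 2, 6, 4, 5, 8, 7)]

# Parse the timestamps (assumption: UTC) and derive the calendar day
dt[, stamp := as.POSIXct(date, tz = "UTC")]
dt[, day := as.Date(stamp, tz = "UTC")]

# Sort by timestamp first, then take the last row of each day
last_per_day <- dt[order(stamp), .SD[.N], by = day]
print(last_per_day)
```

If the data are known to be sorted already, the shorter `dt[ , .SD[.N] , by = date1 ]` form from the answer should be sufficient.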

Answer 1 (score: 1)

Here is a base R way using by and tail.

df<-read.table(text="    time        value                date
1      2           28 '2013-07-08 15:13:35'
2      4            8 '2013-07-08 15:14:06'
3      7            2 '2013-07-08 15:43:05'
4      8           25 '2013-07-09 16:30:41'
5     11           12 '2013-07-09 19:04:40'
6     14           41 '2013-07-09 19:20:14'
7     18           12 '2013-07-10 00:39:04'
8     22           12 '2013-07-10 08:27:02'", header=TRUE, stringsAsFactors=FALSE)

days <- cut(as.POSIXct(df$date), breaks='days')
results <- by(df, INDICES=days, FUN=tail, n=1)
do.call(rbind, results)

#            time value                date
# 2013-07-08    7     2 2013-07-08 15:43:05
# 2013-07-09   14    41 2013-07-09 19:20:14
# 2013-07-10   22    12 2013-07-10 08:27:02
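
As a possible alternative in base R, `duplicated()` with `fromLast = TRUE` can flag every row except the last one of each day. This is only a sketch: it assumes the rows are already in chronological order and that the first ten characters of `date` are the calendar day. The names `day` and `last_rows` are illustrative.

```r
# Sample data copied from the question
df <- data.frame(
  time  = c(2, 4, 7, 8, 11, 14, 18, 22),
  value = c(28, 8, 2, 25, 12, 41, 12, 12),
  date  = c("2013-07-08 15:13:35", "2013-07-08 15:14:06",
            "2013-07-08 15:43:05", "2013-07-09 16:30:41",
            "2013-07-09 19:04:40", "2013-07-09 19:20:14",
            "2013-07-10 00:39:04", "2013-07-10 08:27:02"),
  stringsAsFactors = FALSE
)

# Extract the calendar day from the leading "YYYY-MM-DD" part of each timestamp
day <- as.Date(substr(df$date, 1, 10))

# Scanning from the bottom, the first occurrence of each day is its last row;
# keep exactly those rows (assumes df is sorted chronologically)
last_rows <- df[!duplicated(day, fromLast = TRUE), ]
print(last_rows)
```

Unlike the `by`/`rbind` approach, this keeps the original row names of the data frame.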