> dataframe
  time value                date
1    2    28 2013-07-08 15:13:35
2    4     8 2013-07-08 15:14:06
3    7     2 2013-07-08 15:43:05
4    8    25 2013-07-09 16:30:41
5   11    12 2013-07-09 19:04:40
6   14    41 2013-07-09 19:20:14
7   18    12 2013-07-10 00:39:04
8   22    12 2013-07-10 08:27:02
Does anyone know how to subset this data frame to the last entry of each day? That is, to get:
  time value                date
3    7     2 2013-07-08 15:43:05
6   14    41 2013-07-09 19:20:14
8   22    12 2013-07-10 08:27:02
Thanks a lot!
Answer 0 (score: 2)
I like to use data.table for this. Assuming your data.frame is called df, then...
# Load required package
library(data.table)
dt <- data.table( df )
# Make dates out of your date-time column
dt[ , date1 := as.Date( date ) ]
# Subset to last row in each group
dt[ , .SD[.N] , by = date1 ]
#        date1 time value                date
#1: 2013-07-08    7     2 2013-07-08 15:43:05
#2: 2013-07-09   14    41 2013-07-09 19:20:14
#3: 2013-07-10   22    12 2013-07-10 08:27:02
Answer 1 (score: 1)
Here is a base R way using by and tail.
df<-read.table(text=" time value date
1 2 28 '2013-07-08 15:13:35'
2 4 8 '2013-07-08 15:14:06'
3 7 2 '2013-07-08 15:43:05'
4 8 25 '2013-07-09 16:30:41'
5 11 12 '2013-07-09 19:04:40'
6 14 41 '2013-07-09 19:20:14'
7 18 12 '2013-07-10 00:39:04'
8 22 12 '2013-07-10 08:27:02'", header=TRUE, stringsAsFactors=FALSE)
days <- cut(as.POSIXct(df$date), breaks='days')
results <- by(df, INDICES=days, FUN=tail, n=1)
do.call(rbind, results)
#            time value                date
# 2013-07-08    7     2 2013-07-08 15:43:05
# 2013-07-09   14    41 2013-07-09 19:20:14
# 2013-07-10   22    12 2013-07-10 08:27:02
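Both approaches take the last row of each group in whatever order the rows currently have, so they rely on the data already being sorted by time. If that is not guaranteed, a minimal base R sketch is to sort first and then keep the last row per day with duplicated(..., fromLast = TRUE). The names ts, ord, df_sorted, day and last_per_day are only illustrative, and tz = "UTC" is an assumption about how the timestamps should be interpreted.

```r
# Rebuild the example data from the question
df <- data.frame(
  time  = c(2, 4, 7, 8, 11, 14, 18, 22),
  value = c(28, 8, 2, 25, 12, 41, 12, 12),
  date  = c("2013-07-08 15:13:35", "2013-07-08 15:14:06",
            "2013-07-08 15:43:05", "2013-07-09 16:30:41",
            "2013-07-09 19:04:40", "2013-07-09 19:20:14",
            "2013-07-10 00:39:04", "2013-07-10 08:27:02"),
  stringsAsFactors = FALSE
)

# Parse the timestamps (assumed to be UTC) and sort chronologically
ts        <- as.POSIXct(df$date, tz = "UTC")
ord       <- order(ts)
df_sorted <- df[ord, ]

# Calendar day of each (sorted) row
day <- as.Date(ts[ord], tz = "UTC")

# Scanning from the end, the first occurrence of each day is its last entry
last_per_day <- df_sorted[!duplicated(day, fromLast = TRUE), ]
last_per_day
```

This keeps the original row names, so the result can be traced back to the rows of df.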