R - Subset a time series to one observation per hour when there are multiple observations per hour

Time: 2014-02-28 22:24:49

Tags: r time-series

My dataset contains multiple observations per hour. I want to subset the data by pulling just one observation for each hour.

I tried:

> subset(date, !duplicated(ceiling(hour(date)))) 

But it only pulled a vector of length 24 out of a date vector with about 57,000 entries in total.

> subset(date, !duplicated(hour(date)))
 [1] "2012-12-31 23:55:00 PST" "2012-12-31 22:25:00 PST" "2012-12-31 21:40:00 PST" "2012-12-31 20:55:00 PST" "2012-12-31 19:25:00 PST"
 [6] "2012-12-31 18:40:00 PST" "2012-12-31 17:55:00 PST" "2012-12-31 16:25:00 PST" "2012-12-31 15:40:00 PST" "2012-12-31 14:55:00 PST"
[11] "2012-12-31 13:25:00 PST" "2012-12-31 12:40:00 PST" "2012-12-31 11:55:00 PST" "2012-12-31 09:40:00 PST" "2012-12-31 08:55:00 PST"
[16] "2012-12-31 07:25:00 PST" "2012-12-31 06:40:00 PST" "2012-12-31 05:55:00 PST" "2012-12-31 04:25:00 PST" "2012-12-31 03:40:00 PST"
[21] "2012-12-31 02:10:00 PST" "2012-12-31 00:40:00 PST" "2012-12-30 10:25:00 PST" "2012-12-30 01:25:00 PST"

> head(date,100)
[1] "2012-12-31 23:55:00 PST" "2012-12-31 23:10:00 PST" "2012-12-31 22:25:00 PST" "2012-12-31 21:40:00 PST" "2012-12-31 20:55:00 PST"
[6] "2012-12-31 20:10:00 PST" "2012-12-31 19:25:00 PST" "2012-12-31 18:40:00 PST" "2012-12-31 17:55:00 PST" "2012-12-31 17:10:00 PST"
[11] "2012-12-31 16:25:00 PST" "2012-12-31 15:40:00 PST" "2012-12-31 14:55:00 PST" "2012-12-31 14:10:00 PST" "2012-12-31 13:25:00 PST"
[16] "2012-12-31 12:40:00 PST" "2012-12-31 11:55:00 PST" "2012-12-31 11:10:00 PST" "2012-12-31 09:40:00 PST" "2012-12-31 08:55:00 PST"
[21] "2012-12-31 08:10:00 PST" "2012-12-31 07:25:00 PST" "2012-12-31 06:40:00 PST" "2012-12-31 05:55:00 PST" "2012-12-31 05:10:00 PST"
[26] "2012-12-31 04:25:00 PST" "2012-12-31 03:40:00 PST" "2012-12-31 02:10:00 PST" "2012-12-31 00:40:00 PST" "2012-12-30 23:55:00 PST"
[31] "2012-12-30 23:10:00 PST" "2012-12-30 22:25:00 PST" "2012-12-30 21:40:00 PST" "2012-12-30 20:55:00 PST" "2012-12-30 20:10:00 PST"
[36] "2012-12-30 19:25:00 PST" "2012-12-30 17:55:00 PST" "2012-12-30 16:25:00 PST" "2012-12-30 15:40:00 PST" "2012-12-30 14:55:00 PST"
[41] "2012-12-30 14:10:00 PST" "2012-12-30 13:25:00 PST" "2012-12-30 12:40:00 PST" "2012-12-30 11:55:00 PST" "2012-12-30 11:10:00 PST"
[46] "2012-12-30 10:25:00 PST" "2012-12-30 09:40:00 PST" "2012-12-30 08:55:00 PST" "2012-12-30 08:10:00 PST" "2012-12-30 07:25:00 PST"
[51] "2012-12-30 06:40:00 PST" "2012-12-30 05:55:00 PST" "2012-12-30 05:10:00 PST" "2012-12-30 04:25:00 PST" "2012-12-30 03:40:00 PST"
[56] "2012-12-30 02:55:00 PST" "2012-12-30 02:10:00 PST" "2012-12-30 01:25:00 PST" "2012-12-30 00:40:00 PST" "2012-12-29 23:55:00 PST"
[61] "2012-12-29 23:10:00 PST" "2012-12-29 22:25:00 PST" "2012-12-29 20:55:00 PST" "2012-12-29 20:10:00 PST" "2012-12-29 19:25:00 PST"
[66] "2012-12-29 18:40:00 PST" "2012-12-29 17:55:00 PST" "2012-12-29 17:10:00 PST" "2012-12-29 16:25:00 PST" "2012-12-29 15:40:00 PST"
[71] "2012-12-29 14:55:00 PST" "2012-12-29 14:10:00 PST" "2012-12-29 13:25:00 PST" "2012-12-29 12:40:00 PST" "2012-12-29 11:55:00 PST"
[76] "2012-12-29 11:10:00 PST" "2012-12-29 10:25:00 PST" "2012-12-29 09:40:00 PST" "2012-12-29 08:55:00 PST" "2012-12-29 08:10:00 PST"
[81] "2012-12-29 07:25:00 PST" "2012-12-29 06:40:00 PST" "2012-12-29 05:55:00 PST" "2012-12-29 05:10:00 PST" "2012-12-29 04:25:00 PST"
[86] "2012-12-29 03:40:00 PST" "2012-12-29 02:55:00 PST" "2012-12-29 02:10:00 PST" "2012-12-29 01:25:00 PST" "2012-12-29 00:40:00 PST"
[91] "2012-12-28 23:55:00 PST" "2012-12-28 23:10:00 PST" "2012-12-28 21:40:00 PST" "2012-12-28 20:55:00 PST" "2012-12-28 20:10:00 PST"
[96] "2012-12-28 19:25:00 PST" "2012-12-28 18:40:00 PST" "2012-12-28 17:10:00 PST" "2012-12-28 15:40:00 PST" "2012-12-28 14:55:00 PST"

> length(date)
[1] 57169

> length(subset(date, !duplicated(hour(date))))
[1] 24

> class(date)
[1] "POSIXlt" "POSIXt" 

Any ideas?

1 answer:

Answer 0 (score: 0)

# Example data: 100,000 evenly spaced timestamps over nine days
date <- seq(as.POSIXct("2014-01-01"), as.POSIXct("2014-01-10"), length.out = 100000)
head(date)
## [1] "2014-01-01 00:00:00 EST" "2014-01-01 00:00:07 EST"
## [3] "2014-01-01 00:00:15 EST" "2014-01-01 00:00:23 EST"
## [5] "2014-01-01 00:00:31 EST" "2014-01-01 00:00:38 EST"

# Key on year, month, day and hour, and keep the first timestamp for each key
head(subset(date, !duplicated(format(date, "%Y%m%d%H"))))
## [1] "2014-01-01 00:00:00 EST" "2014-01-01 01:00:00 EST"
## [3] "2014-01-01 02:00:00 EST" "2014-01-01 03:00:00 EST"
## [5] "2014-01-01 04:00:01 EST" "2014-01-01 05:00:01 EST"
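
The reason the original attempt returns only 24 values is most likely that hour() gives just the hour of day (0 to 23), so duplicated() treats 10:00 on one day and 10:00 on the next as the same key. Including the date in the key avoids that. Below is a minimal sketch of the difference using base R only; the 45-minute sample data, the UTC time zone, and the names hour_of_day, hour_key, one_per_hour, obs and value are made up for illustration and are not from the question.

```r
# Hypothetical sample data: one timestamp every 45 minutes over three days.
date <- seq(as.POSIXct("2012-12-28 00:10:00", tz = "UTC"),
            as.POSIXct("2012-12-30 23:55:00", tz = "UTC"),
            by = "45 min")

# Hour of day only: at most 24 distinct keys, however many days there are.
hour_of_day <- format(date, "%H")
length(unique(hour_of_day))

# Date plus hour: one distinct key per calendar hour.
hour_key <- format(date, "%Y%m%d%H")
one_per_hour <- date[!duplicated(hour_key)]
length(one_per_hour)

# The same key can subset a data frame; `value` is a made-up column.
obs <- data.frame(date = date, value = seq_along(date))
obs_hourly <- obs[!duplicated(format(obs$date, "%Y%m%d%H")), ]
```

Note that !duplicated() keeps the first row it sees for each hour, so on data sorted newest-first (as in the question) that is the latest observation within each hour. Sort the data first if a different one is wanted.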