Filter a time series by a list of week numbers (R)

Time: 2016-04-30 16:06:50

Tags: r time-series subset trading

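I have a data frame containing a time series sampled every 30 minutes (for 2016). I need to build a subset that contains every Wednesday at 10:30:00 if the week has no holiday falling on Sunday through Wednesday, and every Thursday at 11:00:00 if the week does have a holiday falling on Sunday through Wednesday. This reproduces the release schedule of the EIA weekly oil report. I do not want to use xts.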

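I know how to subset by day of the week and time. What I don't know is how to make the subset conditional on whether a week contains a date from a list of dates. How can I do this?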

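The code below creates the subsets by day of the week and time, without filtering by holidays. It also includes the list of holiday dates to be used as the filter.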

#Make time sequence every 30mins with Time & DayWk columns 
Calendar30mn <- as.data.frame(seq(as.POSIXlt("2016/1/1 00:00:00"), as.POSIXlt("2016/12/31 23:59:59"), by="30 mins"))
colnames(Calendar30mn) <- "DateTime"
Calendar30mn$Time <- strftime(Calendar30mn$DateTime, format="%H:%M:%S")
Calendar30mn$DayWk <- weekdays(Calendar30mn$DateTime)

#List of US Federal holidays falling on Sunday/Monday/Tuesday/Wednesday
FedHolidaysSuntoWed <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084), class = "Date")  

-----

#Subset for Wednesday 10:30:00
EIAOildates1 <- subset (Calendar30mn, Time == "10:30:00" & DayWk == "Wednesday")

#Subset for Thursday 11:00:00
EIAOildates2 <- subset (Calendar30mn, Time == "11:00:00" & DayWk == "Thursday")

#Bind subsets (rbind only stacks them; the result is not yet sorted)
EIAOildates <- rbind(EIAOildates1, EIAOildates2)

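The code above produces EIAOildates1, which contains every Wednesday at 10:30:00. I want that subset to contain a Wednesday 10:30:00 only if no day of that week is in FedHolidaysSuntoWed, and the reverse for EIAOildates2.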
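The holiday list is stored as raw day counts since 1970-01-01, which is hard to read at a glance. A minimal sketch for inspecting it follows; the name `holiday_check` is introduced here only for illustration and is not part of the original code.

```r
# Same values as FedHolidaysSuntoWed in the code above.
FedHolidaysSuntoWed <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084), class = "Date")

# Show each holiday as a calendar date together with its weekday number.
# The "%u" format gives 1 = Monday ... 7 = Sunday and does not depend on the locale.
holiday_check <- data.frame(Date = format(FedHolidaysSuntoWed),
                            WeekdayNumber = format(FedHolidaysSuntoWed, "%u"))
print(holiday_check)
```

For 2016 every entry in the list falls on a Monday, so each one puts its week into the "Thursday 11:00:00" case.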

1 Answer:

Answer 0: (score: 0)

Here is the answer:

library(lubridate)

#Make time sequence every 30mins with Time & DayWk & WkNumber columns 
Calendar30mn <- as.data.frame(seq(as.POSIXlt("2016/1/1 00:00:00"), as.POSIXlt("2016/12/31 23:59:59"), by="30 mins"))
colnames(Calendar30mn) <- "DateTime"
Calendar30mn$Time <- strftime(Calendar30mn$DateTime, format="%H:%M:%S")
Calendar30mn$DayWk <- weekdays(Calendar30mn$DateTime)
Calendar30mn$WkNumber <- week(Calendar30mn$DateTime)

#List of US Federal holidays falling on Sunday/Monday/Tuesday/Wednesday & Corresponding WkNumber
FedHolidaysSuntoWed <- structure(c(16818, 16846, 16951, 16986, 17049, 17161, 17084), class = "Date")  
FedHolidaysSuntoWedWkNumber <- week(FedHolidaysSuntoWed)

#Subset for Wednesday 10:30:00
EIAOildates1 <- subset (Calendar30mn, Time == "10:30:00" & DayWk == "Wednesday"
                        & !(Calendar30mn$WkNumber %in% FedHolidaysSuntoWedWkNumber))

#Subset for Thursday 11:00:00
EIAOildates2 <- subset (Calendar30mn, Time == "11:00:00" & DayWk == "Thursday"
                        & (Calendar30mn$WkNumber %in% FedHolidaysSuntoWedWkNumber))

#Bind and sort subsets  
EIAOildates <- rbind(EIAOildates1, EIAOildates2)
EIAOildates <- EIAOildates[order(EIAOildates$DateTime), ]

Here is a sample of the EIAOildates output:

                 DateTime     Time     DayWk WkNumber
262   2016-01-06 10:30:00 10:30:00 Wednesday        1
598   2016-01-13 10:30:00 10:30:00 Wednesday        2
983   2016-01-21 11:00:00 11:00:00  Thursday        3
1270  2016-01-27 10:30:00 10:30:00 Wednesday        4
1606  2016-02-03 10:30:00 10:30:00 Wednesday        5
1942  2016-02-10 10:30:00 10:30:00 Wednesday        6
2327  2016-02-18 11:00:00 11:00:00  Thursday        7
16726 2016-12-14 10:30:00 10:30:00 Wednesday       50
17062 2016-12-21 10:30:00 10:30:00 Wednesday       51
17447 2016-12-29 11:00:00 11:00:00  Thursday       52
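One caveat about the approach above: lubridate's `week()` counts complete seven-day periods since January 1, so its "weeks" start on whatever weekday January 1 falls on. In 2016 that is a Friday, so a Sunday-to-Wednesday holiday and the following Thursday share a week number, but this may not line up in other years. The following is a rough base-R sketch of an alternative that keys each date by the Sunday starting its week, assuming a week runs Sunday through Saturday as in the question. The helper `week_start` and the `schedule` data frame are hypothetical names introduced here, and the sketch works on dates only rather than on the 30-minute grid.

```r
# Holidays from the question, written as dates (same values as FedHolidaysSuntoWed).
holidays <- as.Date(c("2016-01-18", "2016-02-15", "2016-05-30", "2016-07-04",
                      "2016-09-05", "2016-10-10", "2016-12-26"))

# Hypothetical helper: the Sunday that starts the week containing each date.
# as.POSIXlt()$wday is 0 for Sunday ... 6 for Saturday, independent of the locale.
week_start <- function(d) d - as.POSIXlt(d)$wday

days <- seq(as.Date("2016-01-01"), as.Date("2016-12-31"), by = "day")
wday <- as.POSIXlt(days)$wday

# TRUE for every day whose Sunday-based week contains a listed holiday.
holiday_week <- week_start(days) %in% week_start(holidays)

# Wednesday (wday 3) in ordinary weeks, Thursday (wday 4) in holiday weeks.
release_day <- days[(wday == 3 & !holiday_week) | (wday == 4 & holiday_week)]
release_time <- ifelse(as.POSIXlt(release_day)$wday == 3, "10:30:00", "11:00:00")

schedule <- data.frame(Date = release_day, Time = release_time)
head(schedule, 3)
```

Using weekday numbers instead of `weekdays()` also avoids depending on English day names, which `weekdays()` only returns in an English locale. The resulting dates could then be matched against the DateTime column of Calendar30mn if the full rows are needed.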