xts - how to subset by each day of the week

Time: 2016-05-14 08:39:56

Tags: r xts

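I understand that similar questions have already been answered. My problem is that I have 15-minute time series data covering 2033 days. I want to plot the series for each day of the week (Monday to Sunday), for example what the average Monday looks like.

I tried grouping with .indexwday, but then the series for each day starts at 13:00.

I am new to this, so please let me know if I need to provide more detail.

Sample data (xts)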

  • 2008-01-01 00:00:00 16
  • 2008-01-01 00:15:00 56
  • 2008-01-01 00:30:00 136
  • 2008-01-01 00:45:00 170
  • 2008-01-01 01:00:00 132

...

  • 2013-07-25 22:30:00 95
  • 2013-07-25 22:45:00 82
  • 2013-07-25 23:00:00 66
  • 2013-07-25 23:15:00 65
  • 2013-07-25 23:30:00 66
  • 2013-07-25 23:45:00 46

The plot below is probably closer to what I have in mind (it is the average of all Mondays).

[Image: plot of the average over all Mondays]

3 answers:

Answer 0: (score: 4)

Here is another solution that doesn't depend on any packages other than xts and zoo.

library(xts)  # also attaches zoo

# example data
ix <- seq(as.POSIXct("2008-01-01"), as.POSIXct("2013-07-26"), by="15 min")
set.seed(21)
x <- xts(sample(200, length(ix), TRUE), ix)

# aggregate by 15-minute observations for each weekday
a <- lapply(split.default(x, format(index(x), "%A")),         # split by weekday
  function(x) aggregate(x, format(index(x), "%H:%M"), mean))  # aggregate by 15-min
# merge aggregated data into one zoo object, ordering columns
z <- do.call(merge, a)[,c("Sunday", "Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday")]
# convert index to POSIXct to make plotting easier
index(z) <- as.POSIXct(index(z), format="%H:%M")
# plot
plot(z, type="l", nc=1, ylim=range(z), main="Average daily volume", las=1)

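Setting ylim forces every panel to use the same y-axis range. Otherwise each range would depend on its own series, which can make the panels hard to compare if the values vary a lot.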

[Image: the resulting plot, one panel per weekday]
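If only the average Monday is needed, as in the question, a smaller variant of the same idea might look like the sketch below. It selects Mondays with `.indexwday()` (which the question mentions) and then averages by time of day. The four weeks of random data and the UTC time zone are my own assumptions for illustration, not the asker's data.

```r
library(xts)

# Hypothetical example data: four weeks of 15-minute observations in UTC,
# starting on Monday 2008-01-07
ix <- seq(as.POSIXct("2008-01-07 00:00:00", tz = "UTC"),
          by = "15 min", length.out = 4 * 7 * 96)
set.seed(21)
x <- xts(sample(200, length(ix), TRUE), ix)

# .indexwday() returns 0 (Sunday) to 6 (Saturday), so 1 selects Mondays
mondays <- x[.indexwday(x) == 1]

# Average all Mondays by time of day; the result is a zoo object
# with one value per 15-minute slot
monday_avg <- aggregate(mondays, format(index(mondays), "%H:%M"), mean)
```

Keeping the index in one explicit time zone means the weekday and the "%H:%M" label are derived from the same clock, which seems like the safer setup when splitting by day.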

Answer 1: (score: 2)

Try this:

#Get necessary packages
install.packages("lubridate")
install.packages("magrittr")
install.packages("dplyr")
install.packages("ggplot2")
install.packages("scales")

#Import packages
library(lubridate,warn=F)
library(dplyr,warn=F)
library(magrittr)
library(ggplot2,warn=F)
library(scales, warn=F)

#Getting the data
tstart = as.POSIXct('2008-01-01 00:00:00')
tend = as.POSIXct('2013-07-25 23:45:00')
ttimes <- seq(from = tstart,to=tend,by='15 mins')
tvals <- sample(seq(1,200),length(ttimes),T)
tsdata <- data.frame(Dates=ttimes,Vals=tvals)
tsdata <- tsdata %>% mutate(DayofWeek = wday(Dates,label=T), Hours = as.POSIXct(strftime(Dates,format="%H:%M:%S"),format="%H:%M:%S"))

#Pick a day at a time. I am using Mondays for this example.
tsdata_monday <- tsdata %>% filter(DayofWeek=='Mon') %>% group_by(Hours) %>% summarise(meanVals=mean(Vals)) %>% as.data.frame()

#Plotting the graph of mean values versus times for Monday:
ggplot(tsdata_monday) + aes(x=Hours,y=meanVals) + geom_line() + scale_x_datetime(breaks=date_breaks("4 hour"), labels=date_format("%H:%M"))

[Image: Monday plot]

#If you want, you can go ahead and plot all the days. But please keep in mind
#that this does not look good at all: there are too many plots for the plot
#window to display nicely.
alltsdata <- tsdata %>% group_by(DayofWeek, Hours) %>% summarise(MeanVals=mean(Vals)) %>% as.data.frame()

ggplot(alltsdata) + aes(x=Hours,y=MeanVals) + geom_line() + scale_x_datetime(breaks=date_breaks("4 hour"), labels=date_format("%H:%M")) + facet_grid(.~DayofWeek)

[Image: Full plot]

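I suggest plotting one day at a time, or using a for loop or one of the apply function variants to draw the plots.

Also, when filtering by day of the week, note that the day names are abbreviated as follows: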
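The loop idea could be sketched roughly as follows. This is a base R version under my own assumptions: the three months of random data, the UTC time zone, and the names `DayNum`, `TimeOfDay`, `day_means` and `day_names` are all made up for illustration.

```r
# Hypothetical example data in the same shape as tsdata above
ttimes <- seq(as.POSIXct("2008-01-01 00:00:00", tz = "UTC"),
              as.POSIXct("2008-03-31 23:45:00", tz = "UTC"), by = "15 min")
set.seed(1)
tsdata <- data.frame(Dates = ttimes, Vals = sample(200, length(ttimes), TRUE))
tsdata$DayNum <- as.POSIXlt(tsdata$Dates)$wday      # 0 = Sunday ... 6 = Saturday
tsdata$TimeOfDay <- format(tsdata$Dates, "%H:%M")   # 15-minute slot label

# One vector of time-of-day means per weekday
day_means <- lapply(split(tsdata, tsdata$DayNum),
                    function(d) tapply(d$Vals, d$TimeOfDay, mean))

# Draw one plot per weekday instead of squeezing all seven into one window
day_names <- c("Sunday", "Monday", "Tuesday", "Wednesday",
               "Thursday", "Friday", "Saturday")
for (k in names(day_means)) {
  plot(as.numeric(day_means[[k]]), type = "l",
       xlab = "15-minute slot of the day", ylab = "Mean value",
       main = day_names[as.integer(k) + 1])
}
```

Each iteration produces a separate plot, so the result can be paged through or written to individual files.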

unique(tsdata$DayofWeek)
[1] Tues  Wed   Thurs Fri   Sat   Sun   Mon 
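Because these labels are plain text, they may differ between lubridate versions and locales. A filter on the numeric weekday avoids depending on them. Below is a minimal base R sketch, assuming a UTC time zone and a made-up two-week date range.

```r
# Hypothetical example: 14 consecutive days starting Tuesday 2008-01-01 (UTC)
dates <- seq(as.POSIXct("2008-01-01 00:00:00", tz = "UTC"),
             by = "1 day", length.out = 14)

# POSIXlt stores the weekday as a number: 0 = Sunday, 1 = Monday, ..., 6 = Saturday
day_num <- as.POSIXlt(dates)$wday

# Keep only Mondays, without relying on any label spelling
mondays <- dates[day_num == 1]
format(mondays, "%Y-%m-%d")
```

The same `day_num == 1` condition could be used inside `filter()` in place of `DayofWeek=='Mon'`.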

Hope this helps.

Answer 2: (score: 1)

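apply.daily does exactly what you want (assuming your data is called d.xts and is an xts object):

apply.daily(d.xts,sum)

Another solution is to use aggregate:

aggregate(d.xts,as.Date(index(d.xts)),sum)

Note that the answers are slightly different: apply.daily runs from start(d.xts) to end(d.xts), whereas aggregate runs from midnight to midnight.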
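To see how the two calls compare on a concrete series, a small sketch like the following may help. The data is hypothetical: three complete days of 15-minute observations with a UTC index. In that setup I would expect both calls to give the same daily sums; differences should only show up when the series does not start and end on day boundaries, or when time zones come into play.

```r
library(xts)

# Hypothetical example data: three full days of 15-minute observations in UTC
ix <- seq(as.POSIXct("2008-01-01 00:00:00", tz = "UTC"),
          by = "15 min", length.out = 3 * 96)
d.xts <- xts(seq_along(ix), ix)

# Daily sums via xts: one row per day, stamped with that day's last observation
by_apply <- apply.daily(d.xts, sum)

# Daily sums via aggregate: a zoo object indexed by calendar date
by_aggregate <- aggregate(d.xts, as.Date(index(d.xts)), sum)

# Put the values side by side for comparison
cbind(apply.daily = as.numeric(by_apply), aggregate = as.numeric(by_aggregate))
```

The two results also differ in their index: apply.daily keeps a date-time stamp, while aggregate returns plain dates.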