ts() function - defining unique quarters

Time: 2018-05-14 03:45:02

Tags: r time-series

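My dataset looks like this (shown below).

I am trying to convert this data into a ts() object by grouping it into quarters that correspond to the seasons listed in the Season column. So, for example, Q1 would be SU15, Q2 would be FA15, and so on. Note that not all seasons have the same number of data points. I tried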

library(lubridate)

DF_ts <- ts(DF$Variable, frequency = 52,
            start = decimal_date(ymd(DF$Date[1])))

DF_ts_2 <- aggregate(DF_ts, nfrequency = 4)

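But I noticed that the quarters are not grouped the way I want. I also have a very large dataset, so I need to automate this to some extent. The data below is only a sample. I also need to make sure the seasons stay in order (i.e. Summer 2015, Fall 2015, Winter 2015, Spring 2016).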

Date	Season	Variable
6/20/15	SU15	67859
6/27/15	SU15	75251
7/4/15	SU15	100085
7/11/15	FA15	98760
7/18/15	FA15	95053
7/25/15	FA15	91286
8/1/15	FA15	88573
8/8/15	FA15	23084
8/15/15	FA15	31939
8/22/15	FA15	31445
8/29/15	FA15	30854
9/5/15	FA15	21890
9/12/15	FA15	29948
9/19/15	FA15	54254
9/26/15	FA15	52819
10/3/15	FA15	51974
10/10/15	WN15	55826
10/17/15	WN15	53300
10/24/15	WN15	52442
10/31/15	WN15	23084
11/7/15	WN15	31939
11/14/15	WN15	31445
11/21/15	WN15	30854
11/28/15	WN15	21890
12/5/15	WN15	29948
12/12/15	WN15	54254
12/19/15	WN15	52819
12/26/15	WN15	51974
1/2/16	WN15	55826
1/9/16	SP16	53300
1/16/16	SP16	52442
1/23/16	SP16	23084
1/30/16	SP16	31939
2/6/16	SP16	31445
2/13/16	SP16	30854
2/20/16	SP16	21890
2/27/16	SP16	29948
3/5/16	SP16	54254
3/12/16	SP16	52819
3/19/16	SP16	51974
3/26/16	SP16	55826
4/2/16	SP16	53300
		

Data in dput format.

DF <-
structure(list(Date = structure(c(28L, 29L, 33L, 30L, 31L, 32L, 
34L, 38L, 35L, 36L, 37L, 42L, 39L, 40L, 41L, 9L, 6L, 7L, 8L, 
10L, 14L, 11L, 12L, 13L, 18L, 15L, 16L, 17L, 2L, 5L, 1L, 3L, 
4L, 22L, 19L, 20L, 21L, 26L, 23L, 24L, 25L, 27L, 28L, 29L, 33L, 
30L, 31L, 32L, 34L, 38L, 35L, 36L, 37L, 42L, 39L, 40L, 41L, 9L, 
6L, 7L, 8L, 10L, 14L, 11L, 12L, 13L, 18L, 15L, 16L, 17L, 2L, 
5L, 1L, 3L, 4L, 22L, 19L, 20L, 21L, 26L, 23L, 24L, 25L, 27L), .Label = c("1/16/16", 
"1/2/16", "1/23/16", "1/30/16", "1/9/16", "10/10/15", "10/17/15", 
"10/24/15", "10/3/15", "10/31/15", "11/14/15", "11/21/15", "11/28/15", 
"11/7/15", "12/12/15", "12/19/15", "12/26/15", "12/5/15", "2/13/16", 
"2/20/16", "2/27/16", "2/6/16", "3/12/16", "3/19/16", "3/26/16", 
"3/5/16", "4/2/16", "6/20/15", "6/27/15", "7/11/15", "7/18/15", 
"7/25/15", "7/4/15", "8/1/15", "8/15/15", "8/22/15", "8/29/15", 
"8/8/15", "9/12/15", "9/19/15", "9/26/15", "9/5/15"), class = "factor"), 
    Season = structure(c(3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 4L, 4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 
    2L, 2L, 2L, 3L, 3L, 3L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 
    1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 4L, 
    4L, 4L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L
    ), .Label = c("FA15", "SP16", "SU15", "WN15"), class = "factor"), 
    Variable = c(67859L, 75251L, 100085L, 98760L, 95053L, 91286L, 
    88573L, 23084L, 31939L, 31445L, 30854L, 21890L, 29948L, 54254L, 
    52819L, 51974L, 55826L, 53300L, 52442L, 23084L, 31939L, 31445L, 
    30854L, 21890L, 29948L, 54254L, 52819L, 51974L, 55826L, 53300L, 
    52442L, 23084L, 31939L, 31445L, 30854L, 21890L, 29948L, 54254L, 
    52819L, 51974L, 55826L, 53300L, 67859L, 75251L, 100085L, 
    98760L, 95053L, 91286L, 88573L, 23084L, 31939L, 31445L, 30854L, 
    21890L, 29948L, 54254L, 52819L, 51974L, 55826L, 53300L, 52442L, 
    23084L, 31939L, 31445L, 30854L, 21890L, 29948L, 54254L, 52819L, 
    51974L, 55826L, 53300L, 52442L, 23084L, 31939L, 31445L, 30854L, 
    21890L, 29948L, 54254L, 52819L, 51974L, 55826L, 53300L)), class = "data.frame", row.names = c(NA, 
-84L))

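2 answers:

Answer 0: (score: 0)

I am not sure whether the following does what you want. I use package zoo to aggregate the time series by quarter with aggregate.zoo. The package provides the function as.yearqtr, which returns the year and quarter of a given date. Since no aggregation function is passed, aggregate uses its default, sum.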

library(zoo)

DF$Date <- as.Date(DF$Date, "%m/%d/%y")

z <- zoo(DF$Variable, order.by = DF$Date)
aggregate(z, as.yearqtr(index(z)))
#2015 Q2 2015 Q3 2015 Q4 2016 Q1 2016 Q2 
# 286220 1499980 1083498 1091202  106600
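
As a small illustration of what as.yearqtr does here, the sketch below applies it to four dates taken from the question's table. The vector names d and q are only for illustration. Note that as.yearqtr assigns calendar quarters, so 7/4/15 (labelled SU15 in the data) lands in 2015 Q3; calendar quarters need not coincide with the Season labels.

```r
library(zoo)

# Four dates from the question's table, parsed with the same format string
d <- as.Date(c("6/20/15", "7/4/15", "10/10/15", "1/9/16"), "%m/%d/%y")

# as.yearqtr() maps each date to its calendar year and quarter
q <- as.yearqtr(d)
format(q)
# "2015 Q2" "2015 Q3" "2015 Q4" "2016 Q1"
```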

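Answer 1: (score: 0)

The question never defines the desired output, so we will assume that we wish to find the mean of Variable for each year/quarter as a ts series.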

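Compute the two-letter season, season, and from it the quarter qtr, where SP is quarter 1, SU is quarter 2, and so on. Also compute the two-digit year yy, and from yy and qtr compute the "yearqtr" class object yq. Then aggregate zoo(Variable) by yq using mean. (If a different aggregation function is wanted instead of mean, just replace mean with it.) Finally, convert the resulting zoo object ag to ts.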

library(zoo)

season <- gsub("\\d", "", DF$Season)
qtr <- match(season, c("SP", "SU", "FA", "WN"))
yy <- sub("..", "", DF$Season)
yq <- as.yearqtr(paste(yy, qtr), "%y %q")

ag <- aggregate(zoo(DF$Variable), yq, mean)
as.ts(ag)
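
If the seasons themselves should serve as the quarters, as the question describes (SU15 as Q1, FA15 as Q2, and so on), one possible base R sketch is to order the Season levels by their earliest date and aggregate per season. The data frame DF_small is a hypothetical six-row subset of the question's data, and the choice start = c(2015, 1) is an assumption that simply labels the first season as quarter 1 of 2015; neither comes from the answers above.

```r
# Hypothetical small subset of the question's data
DF_small <- data.frame(
  Date     = c("6/20/15", "6/27/15", "7/11/15", "7/18/15", "10/10/15", "1/9/16"),
  Season   = c("SU15", "SU15", "FA15", "FA15", "WN15", "SP16"),
  Variable = c(67859, 75251, 98760, 95053, 55826, 53300),
  stringsAsFactors = FALSE
)
DF_small$Date <- as.Date(DF_small$Date, "%m/%d/%y")

# Order the season labels by the earliest date on which each one occurs
first_date <- tapply(as.numeric(DF_small$Date), DF_small$Season, min)
lev <- names(first_date)[order(first_date)]
DF_small$Season <- factor(DF_small$Season, levels = lev)

# Mean of Variable per season, in chronological order
means <- tapply(DF_small$Variable, DF_small$Season, mean)

# Assumption: the first season is treated as quarter 1 of 2015
season_ts <- ts(as.numeric(means), start = c(2015, 1), frequency = 4)
season_ts
```

Because the levels are derived from the dates rather than typed by hand, the same steps should carry over to a larger dataset with more seasons, provided each Season label covers a contiguous run of dates.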