Grouping by rolling date intervals in Netezza

Date: 2016-02-03 17:37:55

Tags: postgresql aggregate netezza

I have a table in Netezza that looks like this:

Date         Stock    Return
2015-01-01   A        xxx
2015-01-02   A        xxx
2015-01-03   A        0
2015-01-04   A        0
2015-01-05   A        xxx
2015-01-06   A        xxx
2015-01-07   A        xxx
2015-01-08   A        xxx
2015-01-09   A        xxx
2015-01-10   A        0
2015-01-11   A        0
2015-01-12   A        xxx
2015-01-13   A        xxx
2015-01-14   A        xxx
2015-01-15   A        xxx
2015-01-16   A        xxx
2015-01-17   A        0
2015-01-18   A        0
2015-01-19   A        xxx
2015-01-20   A        xxx

The data represents stock returns for various stocks and dates. What I need to do is group the data by a given interval and by the day within that interval. A further difficulty is that weekends (the 0 rows) must be discarded (public holidays can be ignored), and the start date of the first interval should be an arbitrary date.

For example, my output should look like this:

Interval    Q01    Q02    Q03    Q04    Q05
1           xxx    xxx    xxx    xxx    xxx
2           xxx    xxx    xxx    xxx    xxx
3           xxx    xxx    xxx    xxx    xxx 
4           xxx    xxx    xxx    xxx    xxx

This output represents intervals of 5 business days each, with the average return as the result. In terms of the raw data above, with a start date of January 1st, the first interval covers the 1st/2nd/5th/6th/7th (the 3rd and 4th, being a weekend, are ignored): Q01 would be the 1st, Q02 the 2nd, Q03 the 5th, and so on. The second interval runs over the 8th/9th/12th/13th/14th.
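The intended grouping can be sketched outside the database. Below is an illustrative Python sketch (not the Netezza SQL; the helper name `bin_business_days` is invented for this example) that assigns each weekday an (interval, position) pair while skipping weekends:

```python
from datetime import date, timedelta

def bin_business_days(start, end, interval_len=5):
    """Walk the calendar from start to end, skip weekends, and assign
    each business day an (interval, position-in-interval) pair."""
    bins = {}
    pos = 0
    d = start
    while d <= end:
        if d.weekday() < 5:  # Mon-Fri only; public holidays are ignored
            bins[d] = (pos // interval_len + 1, pos % interval_len + 1)
            pos += 1
        d += timedelta(days=1)
    return bins

bins = bin_business_days(date(2015, 1, 1), date(2015, 1, 20))
# Jan 1/2/5/6/7 land in interval 1 as positions 1..5 (Q01..Q05),
# and Jan 8 opens interval 2, matching the description above.
```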

What I tried, without success, was

CEIL(CAST(EXTRACT(DOY FROM DATE) AS FLOAT) / CAST(10 AS FLOAT)) AS interval
EXTRACT(DAY FROM DATE) % 10 AS DAYinInterval

I also tried using a rolling counter with a variable start date to zero out my DOY, something like

CEIL(CAST(EXTRACT(DOY FROM DATE) - EXTRACT(DOY FROM 'start-date') AS FLOAT) / CAST(10 AS FLOAT)) AS Interval

The one thing that came closest to what I expect is this:

SUM(Number) OVER (ORDER BY Date ASC ROWS 10 PRECEDING) AS Counter

Unfortunately, it counts from 1 to 10 and then keeps producing 11s, when it should go from 1 to 10 and then start over again.
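The wrap-around behaviour the question is after can be obtained by taking a running counter modulo the interval length; a minimal Python illustration (not Netezza SQL):

```python
# A running counter 1, 2, 3, ... wraps back into 1..10 via modulo,
# instead of continuing on to 11, 12, ...
counter = range(1, 26)
day_in_interval = [(c - 1) % 10 + 1 for c in counter]
# yields 1..10, then 1..10 again, then 1..5
```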

I would love to know how to achieve this in an elegant way. Thanks!

1 Answer:

Answer 0 (score: 1)

I'm not entirely sure I understand the question, but I think I see what you're after, so I'll take a shot at it with some windowed aggregates and subqueries.

Here is the sample data, with some random non-zero values inserted on the weekdays:

    DATE    | STOCK | RETURN
------------+-------+--------
 2015-01-01 | A     |     16
 2015-01-02 | A     |     80
 2015-01-03 | A     |      0
 2015-01-04 | A     |      0
 2015-01-05 | A     |     60
 2015-01-06 | A     |     25
 2015-01-07 | A     |     12
 2015-01-08 | A     |      1
 2015-01-09 | A     |     81
 2015-01-10 | A     |      0
 2015-01-11 | A     |      0
 2015-01-12 | A     |     35
 2015-01-13 | A     |     20
 2015-01-14 | A     |     69
 2015-01-15 | A     |     72
 2015-01-16 | A     |     89
 2015-01-17 | A     |      0
 2015-01-18 | A     |      0
 2015-01-19 | A     |    100
 2015-01-20 | A     |     67
(20 rows)

Here is my take on it, with embedded comments:

select avg(return),
   date_period,
   day_period
from (
        -- use row_number to generate a sequential value for each DOW,
        -- with a WHERE to filter out the weekends
      select date,
         stock,
         return,
         date_period ,
         row_number() over (partition by date_period order by date asc) day_period
      from (
            -- bin out the entries by date_period using the first_value of the entire set as the starting point
            -- modulo 7
            select date,
               stock,
               return,
               date + (first_value(date) over (order by date asc) - date) % 7 date_period
            from stocks
            where date >= '2015-01-01'
            -- setting the starting period date here
         )
         foo
      where extract (dow from date) not in (1,7)
   )
   foo
group by date_period, day_period
order by date_period asc;

Result:

    AVG     | DATE_PERIOD | DAY_PERIOD
------------+-------------+------------
  16.000000 | 2015-01-01  |          1
  80.000000 | 2015-01-01  |          2
  60.000000 | 2015-01-01  |          3
  25.000000 | 2015-01-01  |          4
  12.000000 | 2015-01-01  |          5
   1.000000 | 2015-01-08  |          1
  81.000000 | 2015-01-08  |          2
  35.000000 | 2015-01-08  |          3
  20.000000 | 2015-01-08  |          4
  69.000000 | 2015-01-08  |          5
  72.000000 | 2015-01-15  |          1
  89.000000 | 2015-01-15  |          2
 100.000000 | 2015-01-15  |          3
  67.000000 | 2015-01-15  |          4
(14 rows)
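As a side note on the binning expression: the query's `date + (first_value(date) over (order by date asc) - date) % 7` relies on the SQL modulo preserving the sign of the (negative) date difference. The same arithmetic can be checked in plain Python, rearranged so the remainder stays non-negative (an illustrative sketch, not part of the answer's code; the helper name `date_period` is borrowed from the query's alias):

```python
from datetime import date, timedelta

def date_period(d, start):
    """Snap d back to the most recent date that is a whole number
    of weeks after the start date -- the bin used by the query."""
    return d - timedelta(days=(d - start).days % 7)

start = date(2015, 1, 1)
# Jan 5 falls in the bin that starts Jan 1; Jan 14 in the bin that starts Jan 8.
```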

Change the start date to '2015-01-03' to see whether it adjusts appropriately:

...
from stocks
where date >= '2015-01-03'
...

Result:

    AVG     | DATE_PERIOD | DAY_PERIOD
------------+-------------+------------
  60.000000 | 2015-01-03  |          1
  25.000000 | 2015-01-03  |          2
  12.000000 | 2015-01-03  |          3
   1.000000 | 2015-01-03  |          4
  81.000000 | 2015-01-03  |          5
  35.000000 | 2015-01-10  |          1
  20.000000 | 2015-01-10  |          2
  69.000000 | 2015-01-10  |          3
  72.000000 | 2015-01-10  |          4
  89.000000 | 2015-01-10  |          5
 100.000000 | 2015-01-17  |          1
  67.000000 | 2015-01-17  |          2
(12 rows)