Find the average of the following data set

Time: 2014-06-13 20:46:17

Tags: sql oracle plsql

Here is the data.

select * from  (
    select to_date('20140601','YYYYMMDD') log_date, null weight  from dual
    union
    select to_date('20140601','YYYYMMDD')+1 log_date, 0 weight   from dual
    union
    select to_date('20140601','YYYYMMDD')+2 log_date, 4 weight   from dual
    union
    select to_date('20140601','YYYYMMDD')+3 log_date, 4 weight  from dual
    union
    select to_date('20140601','YYYYMMDD')+4 log_date, null weight from dual
    union
    select to_date('20140601','YYYYMMDD')+5 log_date, 8 weight  from dual);

Log_date   weight  avg_weight
----------------------------------
6/1/2014   NULL    0    (0/1)            No previous data, so the NULL counts as 0
6/2/2014   0       0    (0+0)/2
6/3/2014   4       4/3  (0+0+4)/3
6/4/2014   4       2    (0+0+4+4)/4
6/5/2014   NULL    2    (0+0+4+4+2)/5    NULL takes the previous day's average, 2
6/6/2014   8       3    (0+0+4+4+2+8)/6

So the average of the above data should be 3.
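
To make the rule in the table explicit, here is a minimal Python sketch of the recurrence. It is not SQL and not part of the original post, and the function name `running_average` is an assumption made for illustration. A NULL (`None`) weight is replaced by the previous row's average, or by 0 on the first row, and each average is the running total divided by the number of rows seen so far.

```python
def running_average(weights):
    """Return the running average per row, filling None with the previous average."""
    averages = []
    total = 0.0          # sum of the (filled-in) weights so far
    previous_avg = 0.0   # used when a weight is missing; 0 before the first row
    for count, weight in enumerate(weights, start=1):
        value = previous_avg if weight is None else weight
        total += value
        previous_avg = total / count
        averages.append(previous_avg)
    return averages


# The six rows from the question: NULL, 0, 4, 4, NULL, 8
print(running_average([None, 0, 4, 4, None, 8]))
```

The last element of the returned list is the overall figure asked for, 3 for the sample data.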

How can I do this in SQL rather than PL/SQL? Any help is appreciated.

1 answer:

Answer 0: (score: 0)

I just learned how to use recursive CTEs and am really excited! Hope this helps...

; WITH RawData (log_Date, Weight) AS (
                select cast('2014-06-01' as SMALLDATETIME)+0, null
    UNION ALL   select cast('2014-06-01' as SMALLDATETIME)+1, 0
    UNION ALL   select cast('2014-06-01' as SMALLDATETIME)+2, 4
    UNION ALL   select cast('2014-06-01' as SMALLDATETIME)+3, 4
    UNION ALL   select cast('2014-06-01' as SMALLDATETIME)+4, null
    UNION ALL   select cast('2014-06-01' as SMALLDATETIME)+5, 8
)
, IndexedData (Id, log_Date, Weight) AS (
    SELECT ROW_NUMBER() OVER (ORDER BY log_Date)
         , log_Date
         , Weight
    FROM RawData
)
, ResultData (Id, log_Date, Weight, total, avg_weight) AS (
    SELECT    Id
            , log_Date
            , Weight
            , CAST(CASE WHEN Weight IS NULL THEN 0 ELSE Weight END AS FLOAT)
            , CAST(CASE WHEN Weight IS NULL THEN 0 ELSE Weight END AS FLOAT)
        FROM IndexedData
        WHERE Id = 1
    UNION ALL
    SELECT    i.Id
            , i.log_Date
            , i.Weight
            , CAST(r.total + CASE WHEN i.Weight IS NULL THEN r.avg_weight ELSE i.Weight END AS FLOAT)
            , CAST(r.total + CASE WHEN i.Weight IS NULL THEN r.avg_weight ELSE i.Weight END AS FLOAT) / i.Id
    FROM ResultData r
    JOIN IndexedData i ON i.Id = r.Id + 1
)
SELECT Log_Date, Weight, avg_weight FROM ResultData
OPTION (MAXRECURSION 0)

This gives the output:

Log_Date                Weight      avg_weight
----------------------- ----------- ----------------------
2014-06-01 00:00:00     NULL        0
2014-06-02 00:00:00     0           0
2014-06-03 00:00:00     4           1.33333333333333
2014-06-04 00:00:00     4           2
2014-06-05 00:00:00     NULL        2
2014-06-06 00:00:00     8           3

Note that in my answer I modified the "data" part of your question, because it did not compile for me. It is still the same data, so I hope it helps.

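Edit: By default, MAXRECURSION is set to 100. This means the query does not work for raw data with more than 101 rows. By adding OPTION (MAXRECURSION 0) I removed this limit, so the query works for any amount of input data. However, if the query has not been tested thoroughly, this can lead to infinite recursion.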
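
On the recursion depth: the recursive member joins on i.Id = r.Id + 1, so each step consumes exactly one row and the recursion stops by itself once the rows run out, provided the Ids are unique and consecutive (which ROW_NUMBER() gives). The sketch below illustrates this with a port of the query to SQLite, run through Python's `sqlite3` module. It is an illustration under assumptions, not the SQL Server query itself: the table name `indexed_data`, the function name `running_average_sql`, and the use of COALESCE instead of CASE are hypothetical choices, and SQLite has no MAXRECURSION option.

```python
import sqlite3


def running_average_sql(weights):
    """Compute the running averages with a recursive CTE in in-memory SQLite."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE indexed_data (id INTEGER PRIMARY KEY, weight REAL)")
    # Ids are assigned 1, 2, 3, ... in input order (the role of ROW_NUMBER()).
    conn.executemany(
        "INSERT INTO indexed_data (id, weight) VALUES (?, ?)",
        [(i, w) for i, w in enumerate(weights, start=1)],
    )
    rows = conn.execute(
        """
        WITH RECURSIVE result_data (id, total, avg_weight) AS (
            -- Anchor: the first row; a NULL weight counts as 0.
            SELECT id, COALESCE(weight, 0.0), COALESCE(weight, 0.0)
            FROM indexed_data
            WHERE id = 1
            UNION ALL
            -- Step: exactly one next row; a NULL weight takes the previous average.
            SELECT i.id,
                   r.total + COALESCE(i.weight, r.avg_weight),
                   (r.total + COALESCE(i.weight, r.avg_weight)) / i.id
            FROM result_data AS r
            JOIN indexed_data AS i ON i.id = r.id + 1
        )
        SELECT avg_weight FROM result_data ORDER BY id
        """
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


# 150 rows is more than the 101 rows that SQL Server's default limit allows.
print(len(running_average_sql([1] * 150)))
```

With 150 input rows the recursion still ends after one step per row, which is the behavior the Id join relies on.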