Creating a running variance column in MySQL

Asked: 2014-01-11 00:14:10

Tags: mysql

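I want to create a column in a MySQL table that computes the running variance (or standard deviation, whichever is simpler) of the last five values of another column. I currently sort the data by three columns: ID, date, and counter (the counter counts up from 1 within each ID-date pair). Each time a new ID-date combination starts, the variance column should reset. Here is an example of what I am after: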

+----+--------+---------+-------+--------------------------+
| ID |  date  | counter | value |        var(value)        |
+----+--------+---------+-------+--------------------------+
| 11 | 1/1/13 |       1 | 2.1   | var(2.1)                 |
| 11 | 1/1/13 |       2 | 2.4   | var(2.1,2.4)             |
| 11 | 1/1/13 |       3 | 2.3   | var(2.1,2.4,2.3)         |
| 11 | 1/1/13 |       4 | 2.5   | var(2.1,2.4,2.3,2.5)     |
| 11 | 1/1/13 |       5 | 2.3   | var(2.1,2.4,2.3,2.5,2.3) |
| 11 | 1/1/13 |       6 | 2.5   | var(2.4,2.3,2.5,2.3,2.5) |
| 11 | 1/3/13 |       1 | 5.4   | var(5.4)                 |
| 11 | 1/3/13 |       2 | 4.3   | var(5.4,4.3)             |
| 11 | 1/3/13 |       3 | 3.4   | var(5.4,4.3,3.4)         |
| 11 | 1/3/13 |       4 | 2.1   | var(5.4,4.3,3.4,2.1)     |
+----+--------+---------+-------+--------------------------+

Does anyone know how to do this in MySQL? I have not found a solution to any similar problem.

Many thanks!

2 answers:

Answer 0 (score: 2)

You can use a self-join, grouped and combined with a suitable aggregate function such as VARIANCE():

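SELECT   a.*, VARIANCE(b.value)           -- VARIANCE() is the population variance (same as VAR_POP())
FROM     my_table a
    JOIN my_table b ON b.ID       = a.ID
                   AND b.date     = a.date
                   AND b.counter <= a.counter
                   AND b.counter  > a.counter - 5   -- keep only the last five rows of the group
GROUP BY a.ID, a.date, a.counter, a.value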

See it on sqlfiddle.
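
To check a query's output against the table in the question, a small reference computation outside the database can help. The following is a minimal Python sketch and not part of the original answer; the function name `rolling_variance` and the tuple layout of the rows are assumptions made for illustration. It computes the population variance (which is what MySQL's VARIANCE() returns) over at most the last five values within each (ID, date) group.

```python
from collections import defaultdict, deque
from statistics import pvariance


def rolling_variance(rows, window=5):
    """Hypothetical helper: rows is an iterable of (id, date, counter, value).

    Returns a list of (id, date, counter, value, variance), where variance
    is the population variance of the last `window` values in the same
    (id, date) group, up to and including the current row.
    """
    # One bounded buffer per (id, date) group; a new group starts empty,
    # which is what "resets" the running variance.
    recent = defaultdict(lambda: deque(maxlen=window))
    result = []
    # Only the counter order inside a group matters for correctness;
    # sorting by id and date just keeps the output grouped.
    for id_, date, counter, value in sorted(rows, key=lambda r: (r[0], r[1], r[2])):
        buf = recent[(id_, date)]
        buf.append(value)  # the oldest value drops out once the buffer is full
        result.append((id_, date, counter, value, pvariance(list(buf))))
    return result


if __name__ == "__main__":
    sample = [
        (11, "1/1/13", 1, 2.1), (11, "1/1/13", 2, 2.4), (11, "1/1/13", 3, 2.3),
        (11, "1/1/13", 4, 2.5), (11, "1/1/13", 5, 2.3), (11, "1/1/13", 6, 2.5),
        (11, "1/3/13", 1, 5.4), (11, "1/3/13", 2, 4.3),
    ]
    for row in rolling_variance(sample):
        print(row)
```

If the sample variance is wanted instead, `statistics.variance` would be the counterpart of MySQL's VAR_SAMP(), but it needs at least two values, so the first row of each group would have to be handled separately.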

Answer 1 (score: 0)

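The variance is the mean of the squared differences between each value and the average. So you can do this with a lot of joins and arithmetic. Something like this: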

select t.*,
       (case when t1.date is null then 0
             when t2.date is null
             then (pow(t.value - (t.value + t1.value) / 2, 2) +
                  pow(t1.value - (t.value + t1.value) / 2, 2))/2
             when t3.date is null
             then (pow(t.value - (t.value + t1.value + t2.value) / 3, 2) +
                   pow(t1.value - (t.value + t1.value + t2.value) / 3, 2) + 
                   pow(t2.value - (t.value + t1.value + t2.value) / 3, 2)
                  ) / 3
             when t4.date is null
             then (pow(t.value - (t.value + t1.value + t2.value + t3.value) / 4, 2) +
                   pow(t1.value - (t.value + t1.value + t2.value + t3.value) / 4, 2) + 
                   pow(t2.value - (t.value + t1.value + t2.value + t3.value) / 4, 2) +
                   pow(t3.value - (t.value + t1.value + t2.value + t3.value) / 4, 2)
                  ) / 4
             else (pow(t.value - (t.value + t1.value + t2.value + t3.value + t4.value) / 5, 2) +
                   pow(t1.value - (t.value + t1.value + t2.value + t3.value + t4.value) / 5, 2) + 
                   pow(t2.value - (t.value + t1.value + t2.value + t3.value + t4.value) / 5, 2) +
                   pow(t3.value - (t.value + t1.value + t2.value + t3.value + t4.value) / 5, 2) +
                   pow(t4.value - (t.value + t1.value + t2.value + t3.value + t4.value) / 5, 2)
                  ) / 5
        end) as var
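-- each join matches ID as well as date, so rows from other IDs are not mixed in
from t left outer join
     t t1
     on t.ID = t1.ID and t.date = t1.date and t.counter = t1.counter + 1 left outer join
     t t2
     on t.ID = t2.ID and t.date = t2.date and t.counter = t2.counter + 2 left outer join
     t t3
     on t.ID = t3.ID and t.date = t3.date and t.counter = t3.counter + 3 left outer join
     t t4
     on t.ID = t4.ID and t.date = t4.date and t.counter = t4.counter + 4;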
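
The CASE expression above writes out the definition term by term for one to five values. As a sanity check on that arithmetic, the same population variance can also be obtained from two averages via the identity Var(x) = mean(x^2) - mean(x)^2. The Python sketch below is an illustration only, not part of the original answer, and both function names are hypothetical.

```python
def variance_by_definition(values):
    """Mean of squared deviations from the mean (what the CASE expression spells out)."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def variance_by_identity(values):
    """Same quantity computed as mean(x^2) - mean(x)^2."""
    n = len(values)
    mean = sum(values) / n
    mean_of_squares = sum(v * v for v in values) / n
    return mean_of_squares - mean * mean


if __name__ == "__main__":
    window = [2.4, 2.3, 2.5, 2.3, 2.5]  # last five values of the first group in the question
    print(variance_by_definition(window))
    print(variance_by_identity(window))
```

The two results agree up to floating-point rounding. The identity form needs less arithmetic, but it can lose precision when the values are large compared with their spread, so the definition form is the safer of the two for a check like this.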