Question

I have this table:

create table t (value int, dt date);

 value |     dt     
-------+------------
    10 | 2012-10-30
    15 | 2012-10-29
  null | 2012-10-28
  null | 2012-10-27
     7 | 2012-10-26

I want this output:

 value |     dt     
-------+------------
    10 | 2012-10-30
     5 | 2012-10-29
     5 | 2012-10-28
     5 | 2012-10-27
     7 | 2012-10-26

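With the table ordered by dt descending, I want each run of nulls, together with the non-null value just before it, replaced by that non-null value divided evenly over the run. In this example, 15 is the previous non-null value for the next two nulls, so each of the three rows gets 15 / 3 = 5.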

SQL Fiddle

Answer 1

I found a surprisingly simple solution:

SELECT max(value) OVER (PARTITION BY grp)
      / count(*)  OVER (PARTITION BY grp) AS value
      ,dt
FROM   (
   SELECT *, count(value) OVER (ORDER BY dt DESC) AS grp
   FROM   t
   ) a;

-> sqlfiddle

Since count() ignores NULL values, you can use a running count (the default behavior of a window function with ORDER BY) to group the values quickly (-> grp).
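
To see the grouping, it may help to run the inner query on its own. A minimal sketch, assuming the table t from the question; the comments show the grp values I would expect for the sample rows:

```sql
-- Running count of non-null values, newest date first.
-- count(value) skips NULLs, so a NULL row keeps the count of the
-- last non-null row before it and lands in the same group.
SELECT value, dt, count(value) OVER (ORDER BY dt DESC) AS grp
FROM   t
ORDER  BY dt DESC;

--  value |     dt     | grp
-- -------+------------+-----
--     10 | 2012-10-30 |   1
--     15 | 2012-10-29 |   2
--   null | 2012-10-28 |   2
--   null | 2012-10-27 |   2
--      7 | 2012-10-26 |   3
```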

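Every group has exactly one non-null value, so min / max / sum all return the same result in a second window function. Divide by the number of members in the grp (count(*) this time, so that NULL rows are counted too!) and we are done.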
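
One caveat worth checking: value is an int here, so the division is integer division and truncates (15 / 3 just happens to be exact). Note also that avg() would not do the job directly, because it ignores NULL rows and would return 15 for the middle group. A hedged variant, assuming you want fractional results and a fixed row order; the cast to numeric and the outer ORDER BY are my additions, not part of the query above:

```sql
-- Same idea as above, but with a cast so the division is not truncated
-- and an ORDER BY so the row order does not depend on the plan.
SELECT CAST(sum(value) OVER (PARTITION BY grp) AS numeric)
       / count(*) OVER (PARTITION BY grp) AS value
      ,dt
FROM  (
   SELECT *, count(value) OVER (ORDER BY dt DESC) AS grp
   FROM   t
   ) a
ORDER  BY dt DESC;
```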

Answer 2

As a puzzle, here is a solution... in practice it may perform horribly, depending on the nature of your data. In any case, watch your indexes:

create database tmp;
create table t (value float, dt date); -- if you use int, you need to care about rounding
insert into t values (10, '2012-10-30'), (15, '2012-10-29'), (null, '2012-10-28'), (null, '2012-10-27'), (7, '2012-10-26');

select t1.dt, t1.value, t2.dt, t2.value, count(*) cnt 
from t t1, t t2, t t3 
where 
    t2.dt >= t1.dt and t2.value is not null 
    and not exists (
        select * 
        from t 
        where t.dt < t2.dt and t.dt >= t1.dt and t.value is not null
    ) 
    and t3.dt <= t2.dt 
    and not exists (
        select * 
        from t where t.dt >= t3.dt and t.dt < t2.dt and t.value is not null
    ) 
group by t1.dt;

+------------+-------+------------+-------+-----+
| dt         | value | dt         | value | cnt |
+------------+-------+------------+-------+-----+
| 2012-10-26 |     7 | 2012-10-26 |     7 |   1 |
| 2012-10-27 |  NULL | 2012-10-29 |    15 |   3 |
| 2012-10-28 |  NULL | 2012-10-29 |    15 |   3 |
| 2012-10-29 |    15 | 2012-10-29 |    15 |   3 |
| 2012-10-30 |    10 | 2012-10-30 |    10 |   1 |
+------------+-------+------------+-------+-----+
5 rows in set (0.00 sec)

select dt, value/cnt 
from (
    select t1.dt , t2.value, count(*) cnt 
    from t t1, t t2, t t3 
    where 
        t2.dt >= t1.dt and t2.value is not null 
        and not exists (
            select * 
            from t 
            where t.dt < t2.dt and t.dt >= t1.dt and t.value is not null
        ) 
        and t3.dt <= t2.dt 
        and not exists (
            select * 
            from t 
            where t.dt >= t3.dt and t.dt < t2.dt and t.value is not null
        ) 
    group by t1.dt
) x;

+------------+-----------+
| dt         | value/cnt |
+------------+-----------+
| 2012-10-26 |         7 |
| 2012-10-27 |         5 |
| 2012-10-28 |         5 |
| 2012-10-29 |         5 |
| 2012-10-30 |        10 |
+------------+-----------+
5 rows in set (0.00 sec)

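Explanation:

t1 is the original table
t2 is the row with the earliest date at or after t1's date that has a non-null value
t3 is all the rows in between, so we can group by the others and count

Sorry I can't be clearer. It's confusing for me too :-)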
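
For what it's worth, the three roles described above can also be spelled out with explicit joins. This is only a sketch, assuming the same table t with a float value column; it groups by every non-aggregated column, so it should also run where MySQL's ONLY_FULL_GROUP_BY mode is enabled:

```sql
SELECT t1.dt, t2.value / count(*) AS value
FROM   t t1
-- t2: the first non-null row at or after t1's date
JOIN   t t2
  ON   t2.dt >= t1.dt
 AND   t2.value IS NOT NULL
 AND   NOT EXISTS (
          SELECT 1 FROM t
          WHERE  t.dt >= t1.dt AND t.dt < t2.dt AND t.value IS NOT NULL
       )
-- t3: every row that shares t2 as its first non-null row (the group members)
JOIN   t t3
  ON   t3.dt <= t2.dt
 AND   NOT EXISTS (
          SELECT 1 FROM t
          WHERE  t.dt >= t3.dt AND t.dt < t2.dt AND t.value IS NOT NULL
       )
GROUP  BY t1.dt, t2.dt, t2.value
ORDER  BY t1.dt DESC;
```

For the sample data this pairs 2012-10-27, 2012-10-28 and 2012-10-29 with the value 15 and a count of 3, the same as the cnt column in the first result set.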

Average over a hard-to-define partition
