Greenplum 4.3: Filling in missing values

Time: 2018-04-12 04:29:00

Tags: sql greenplum

I am working with SQL on Greenplum Database 4.3.23.0.

I have a table containing data like this:

ID  ID2    Code Type_ID Status  y_id    Latest_flag
10  10001   205   7        P    114         Y
10  10001   205   7        P    116       NULL
10  10002   205   6        P    116         Y
10  10002   205   6        P    120         Y

Expected output

ID  ID2    Code Type_ID Status  y_id    Latest_flag
10  10001   205   7        P    114         Y
**10    10001   205   7        P    115     Y**
10  10001   205   7        P    116       NULL
10  10002   205   6        P    116         Y
**10    10002   205   6        P    117     Y**
**10    10002   205   6        P    118     Y**
**10    10002   205   6        P    119     Y**
10  10002   205   6        P    120         Y

SQL query so far

with a as
(
select
  id,
  id2,
  generate_series(minAD, maxAD, 1) dt
from (
       select
         id,
         id2,
         min(y_id) minAD,
         max(y_id) maxAD
       from table1
       where id in (10) 
       group by id, id2) a
)
select distinct a.* from (
select
a.id,
a.id2,
a.dt,
code,
type_id,
status,
Latest_flag
from a
left join (select id,
           id2,
           y_id,
           t1.code,
           t1.type_id,
           t1.status,
           t1.latest_flag
         from table1 t1 where latest_flag = 'Y' 
        ) t1 on
               t1.id = a.id and t1.id2 = a.id2
and y_id <= dt 
order by t1.id, t1.id2, y_id desc) a

Output

ID  ID2    Code Type_ID Status  y_id    Latest_flag
10  10001   205   7        P    114         Y
10  10001   205   7        P    115         Y
**10    10001   205   7        P    116         Y**
10  10002   205   6        P    116         Y
10  10002   205   6        P    117         Y
10  10002   205   6        P    118         Y
10  10002   205   6        P    119         Y
10  10002   205   6        P    120         Y

If an ID already has every y_id value with no gaps, nothing needs to be done. The gaps must be filled in based on Latest_flag = 'Y'.

Thanks

1 answer:

Answer 0: (score: 0)

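You are close. You can write a subquery that selects the rows where latest_flag IS NULL from the table, then LEFT JOIN it to the generate_series CTE. A generated row that matches the subquery is one whose Latest_flag is NULL in the table, so use a CASE WHEN expression to set Latest_flag: NULL when there is a match, 'Y' otherwise.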

Like this:

WITH a 
     AS (SELECT a.*, 
                Generate_series(minad, maxad, 1) dt 
         FROM   (SELECT id, 
                        id2, 
                        code, 
                        type_id, 
                        status, 
                        Min(y_id) minAD, 
                        Max(y_id) maxAD 
                 FROM   table1 
                 WHERE  id IN ( 10 ) 
                 GROUP  BY id, 
                           id2, 
                           code, 
                           type_id, 
                           status) a) 
SELECT a.id, 
       a.id2, 
       a.code, 
       a.type_id, 
       a.status, 
       dt  "y_id", 
       CASE 
         WHEN t.id IS NULL 
              AND t.id2 IS NULL THEN 'Y' 
         ELSE  NULL 
       END Latest_flag 
FROM   a 
       LEFT JOIN (SELECT * 
                  FROM   table1 
                  WHERE  latest_flag IS NULL) t 
              ON a.id = t.id 
                 AND a.id2 = t.id2 
                 AND a.dt = t.y_id 
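
To see what the gap-filling step contributes, the series generation can be run on its own. This is only a sketch of that one step, using the same table1 and columns as the query above; the aliases r, min_y and max_y are illustrative names, not part of the original query.

```sql
-- Sketch: expand each (id, id2) group into one row per y_id between
-- its smallest and largest existing y_id, gaps included.
SELECT r.id,
       r.id2,
       generate_series(r.min_y, r.max_y, 1) AS y_id
FROM  (SELECT id,
              id2,
              min(y_id) AS min_y,   -- first y_id present for the group
              max(y_id) AS max_y    -- last y_id present for the group
       FROM   table1
       WHERE  id IN ( 10 )
       GROUP  BY id,
                 id2) r;
```

With the sample data from the question this gives 114 to 116 for id2 = 10001 and 116 to 120 for id2 = 10002, i.e. eight rows. Row order is not guaranteed without an ORDER BY. The LEFT JOIN in the full query then only has to decide which of these rows get Latest_flag = NULL.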

sqlfiddle: http://sqlfiddle.com/#!15/05b28/28
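
To reproduce this outside of sqlfiddle, a minimal setup might look like the following. The rows are the ones from the question; the column types are an assumption, since the question does not give the table definition.

```sql
-- Hypothetical schema: the column types are assumed, not taken from the question.
-- On Greenplum you may also want an explicit DISTRIBUTED BY clause.
CREATE TABLE table1 (
    id          integer,
    id2         integer,
    code        integer,
    type_id     integer,
    status      char(1),
    y_id        integer,
    latest_flag char(1)
);

-- Sample rows copied from the question.
INSERT INTO table1 (id, id2, code, type_id, status, y_id, latest_flag) VALUES
    (10, 10001, 205, 7, 'P', 114, 'Y'),
    (10, 10001, 205, 7, 'P', 116, NULL),
    (10, 10002, 205, 6, 'P', 116, 'Y'),
    (10, 10002, 205, 6, 'P', 120, 'Y');
```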