Select only rows from partitions where two columns are both non-null and non-zero

Date: 2017-10-19 21:31:31

Tags: postgresql select multiple-columns partition

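I am using PostgreSQL 9.6.

I spent a lot of time looking for a solution but could not find one. I came up with my own approach, but it seems this could be done more elegantly.

How can I select only the rows from partitions, defined by the combination of two columns (date and unique_id), that have at least four consecutive rows starting from rn 1 with no null or zero values in either of the two columns var1 and var2? Thanks.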

date     unique_id   var1   var2      rn
01-03-03 abc         .5       .4       1
01-03-03 abc         .3       .1       2
01-03-03 abc         .6       .3       3
01-03-03 abc         .4       .1       4
01-03-03 abc         .3       .3       5
01-03-03 abc         .7       .8       6
01-03-03 xyz         .4       .1       1
01-03-03 xyz         .6       .5       2
01-03-03 xyz         .5       .3       3
01-03-03 xyz         .3       null     4
01-03-03 xyz         .1       .2       5
01-03-03 xyz         .1       .9       6
01-03-03 efg         .2       .3       1
01-03-03 efg         .3       .8       2
01-03-03 efg         .4       .2       3
01-03-03 efg          0         0      4
01-03-03 efg          0         0      5
01-03-03 efg          0         0      6
01-03-03 lmn         .4       .1       1
01-03-03 lmn         .6       .5       2
01-03-03 lmn         .5       .3       3
01-03-03 lmn         .3       .9       4
01-03-03 lmn         null     null     5
01-03-03 lmn         null     null     6

Desired result of the select:

date     unique_id   var1   var2      rn
01-03-03 abc         .5       .4       1
01-03-03 abc         .3       .1       2
01-03-03 abc         .6       .3       3
01-03-03 abc         .4       .1       4
01-03-03 abc         .3       .3       5
01-03-03 abc         .7       .8       6
01-03-03 lmn         .4       .1       1
01-03-03 lmn         .6       .5       2
01-03-03 lmn         .5       .3       3
01-03-03 lmn         .3       .9       4
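
To pin down the rule behind this expected output, here is a small Python sketch over the sample rows. It assumes one particular reading of the requirement: a (date, unique_id) partition qualifies when its rows with rn 1 through 4 all exist and have non-null, non-zero var1 and var2, and only the valid rows of a qualifying partition are returned. The names SAMPLE, ROWS, is_valid and select_rows are made up for illustration.

```python
from itertools import groupby

# Sample data from the question; None stands for SQL NULL.
SAMPLE = {
    "abc": [(.5, .4), (.3, .1), (.6, .3), (.4, .1), (.3, .3), (.7, .8)],
    "xyz": [(.4, .1), (.6, .5), (.5, .3), (.3, None), (.1, .2), (.1, .9)],
    "efg": [(.2, .3), (.3, .8), (.4, .2), (0, 0), (0, 0), (0, 0)],
    "lmn": [(.4, .1), (.6, .5), (.5, .3), (.3, .9), (None, None), (None, None)],
}

# Flatten into (date, unique_id, var1, var2, rn) tuples, rn starting at 1.
ROWS = [("01-03-03", uid, v1, v2, rn)
        for uid, pairs in SAMPLE.items()
        for rn, (v1, v2) in enumerate(pairs, start=1)]


def is_valid(row):
    """True when both var1 and var2 are neither NULL nor zero."""
    return all(v is not None and v != 0 for v in (row[2], row[3]))


def select_rows(rows, min_run=4):
    """Keep valid rows of partitions whose rows rn 1..min_run are all valid."""
    ordered = sorted(rows, key=lambda r: (r[0], r[1], r[4]))
    selected = []
    for _, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        partition = list(group)
        head = [r for r in partition if r[4] <= min_run]
        if len(head) == min_run and all(is_valid(r) for r in head):
            selected.extend(r for r in partition if is_valid(r))
    return selected


if __name__ == "__main__":
    for row in select_rows(ROWS):
        print(row)
```

Under this reading, xyz and efg drop out because their fourth row is null or zero, while lmn keeps only its first four rows.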

This is what I did (it works, but it is ugly):

SELECT *
FROM (
    SELECT
        unique_id,
        date,
        var1,
        var2,
        rn,  -- exposed so the outer ORDER BY can reference it
        sum(c1) OVER (PARTITION BY unique_id, date) AS sum_c1,
        sum(c2) OVER (PARTITION BY unique_id, date) AS sum_c2,
        sum(c3) OVER (PARTITION BY unique_id, date) AS sum_c3
    FROM
        table1,
        -- c1: 1 when the row is within rn 1..4 and both values are non-null and non-zero
        LATERAL (SELECT
            CASE
                WHEN rn <= 4 AND var1 IS NOT NULL AND var1 <> 0
                     AND var2 IS NOT NULL AND var2 <> 0 THEN 1
                ELSE 0
            END AS c1
        ) AS test1,
        -- c2: same test for rn 1..5
        LATERAL (SELECT
            CASE
                WHEN rn <= 5 AND var1 IS NOT NULL AND var1 <> 0
                     AND var2 IS NOT NULL AND var2 <> 0 THEN 1
                ELSE 0
            END AS c2
        ) AS test2,
        -- c3: same test for rn 1..6
        LATERAL (SELECT
            CASE
                WHEN rn <= 6 AND var1 IS NOT NULL AND var1 <> 0
                     AND var2 IS NOT NULL AND var2 <> 0 THEN 1
                ELSE 0
            END AS c3
        ) AS test3
) t
WHERE
    sum_c1 = 4 AND var1 IS NOT NULL AND var1 <> 0 AND var2 IS NOT NULL AND var2 <> 0
    OR
    sum_c2 = 5 AND var1 IS NOT NULL AND var1 <> 0 AND var2 IS NOT NULL AND var2 <> 0
    OR
    sum_c3 = 6 AND var1 IS NOT NULL AND var1 <> 0 AND var2 IS NOT NULL AND var2 <> 0
ORDER BY
    date, unique_id, rn

1 answer:

Answer 0 (score: 1)

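select * from table1 as t1
where
  -- keep only rows whose own values are non-null and non-zero
  coalesce(t1.var1, 0) <> 0 and
  coalesce(t1.var2, 0) <> 0 and
  -- and whose partition has no null or zero value in its first four rows
  not exists (
    select 1 from table1 as t2
    where
      t1.date = t2.date and
      t1.unique_id = t2.unique_id and
      t2.rn <= 4 and
      (coalesce(t2.var1, 0) = 0 or
       coalesce(t2.var2, 0) = 0))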
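
As a sanity check for the NOT EXISTS approach, here is a sketch that loads the sample rows from the question into an in-memory SQLite database and runs a query of that shape. It is not PostgreSQL: SQLite is used only as a stand-in so the check is self-contained, on the assumption that COALESCE and correlated NOT EXISTS behave the same way in both for this case. The query in the sketch rejects a partition as soon as any of its rows with rn <= 4 has a null or zero value, and it also filters the outer row on the same condition. SAMPLE, QUERY and run_query are illustrative names.

```python
import sqlite3

# Sample data from the question; None is stored as SQL NULL.
SAMPLE = {
    "abc": [(.5, .4), (.3, .1), (.6, .3), (.4, .1), (.3, .3), (.7, .8)],
    "xyz": [(.4, .1), (.6, .5), (.5, .3), (.3, None), (.1, .2), (.1, .9)],
    "efg": [(.2, .3), (.3, .8), (.4, .2), (0, 0), (0, 0), (0, 0)],
    "lmn": [(.4, .1), (.6, .5), (.5, .3), (.3, .9), (None, None), (None, None)],
}

QUERY = """
SELECT t1.unique_id, t1.rn
FROM table1 AS t1
WHERE COALESCE(t1.var1, 0) <> 0
  AND COALESCE(t1.var2, 0) <> 0
  AND NOT EXISTS (
        SELECT 1
        FROM table1 AS t2
        WHERE t2.date = t1.date
          AND t2.unique_id = t1.unique_id
          AND t2.rn <= 4
          AND (COALESCE(t2.var1, 0) = 0 OR COALESCE(t2.var2, 0) = 0))
ORDER BY t1.date, t1.unique_id, t1.rn
"""


def run_query():
    """Build the sample table in memory and return (unique_id, rn) pairs."""
    rows = [("01-03-03", uid, v1, v2, rn)
            for uid, pairs in SAMPLE.items()
            for rn, (v1, v2) in enumerate(pairs, start=1)]
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE table1 "
            "(date TEXT, unique_id TEXT, var1 REAL, var2 REAL, rn INTEGER)"
        )
        conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?, ?)", rows)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    for unique_id, rn in run_query():
        print(unique_id, rn)
```

One caveat with this shape: NOT EXISTS only proves that no bad row is present among rn 1 to 4. A partition with fewer than four rows would still pass, so if that can happen in the real table an extra count check is needed.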