IDs new this month compared with last month, and IDs from last month missing this month

Time: 2019-02-04 07:11:51

Tags: sql postgresql

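I am trying to get the IDs that are new this month compared with last month, and the IDs from last month that are missing this month. I have a table with two fields: snapshotname and assetsrcsysid. I have loaded sample data for 201901, 201902 and 201812. Ideally I would run this over data going back to 201401, but for a trial run I am working with these 3 months.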

Please suggest how to improve this query so that it covers every month back to 201401.

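I have used the query below to get snapshotname, incoming_sysid (IDs present this month that were not present last month) and outgoing_sysid (IDs present last month that are missing this month). I ran it with PostgreSQL on Windows 7 64-bit.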

WITH incoming AS
  (SELECT snapshotname,
          count(DISTINCT assetsrcsysid) AS incoming_sysid
   FROM public.sampleassetsrcsysid
   WHERE snapshotname = '201901'
     AND assetsrcsysid NOT IN
       (SELECT assetsrcsysid
        FROM public.sampleassetsrcsysid
        WHERE snapshotname = '201812' )
   GROUP BY snapshotname),

     outgoing AS
  (SELECT snapshotname,
          count(DISTINCT assetsrcsysid) AS outgoing_sysid
   FROM public.sampleassetsrcsysid
   WHERE snapshotname = '201812'
     AND assetsrcsysid NOT IN
       (SELECT assetsrcsysid
        FROM public.sampleassetsrcsysid
        WHERE snapshotname = '201901' )
   GROUP BY snapshotname)

SELECT incoming.snapshotname,
       incoming.incoming_sysid,
       outgoing.outgoing_sysid
FROM incoming,
     outgoing
WHERE 1=1;

Expected result:

snapshotname  incoming_sysid  outgoing_sysid
201902          4               3
201901          3               5

Dummy data:

snapshotname    assetsrcsysid
201901  s1
201901  s2
201901  s3
201901  s4
201901  s5
201901  s6
201901  s15
201812  s1
201812  s2
201812  s3
201812  s4
201812  s7
201812  s9
201812  s10
201812  s11
201812  s12
201902  s1
201902  s2
201902  s3
201902  s13
201902  s17
201902  s19
201902  s20
201902  s5
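
To reproduce this, the dummy data above can be loaded with a script along these lines. The column types are an assumption (text for both); the real table may be defined differently.

```sql
-- Assumption: both columns are text; adjust to match the real table definition.
create table if not exists public.sampleassetsrcsysid (
    snapshotname  text not null,
    assetsrcsysid text not null
);

insert into public.sampleassetsrcsysid (snapshotname, assetsrcsysid) values
    ('201901', 's1'),  ('201901', 's2'),  ('201901', 's3'),  ('201901', 's4'),
    ('201901', 's5'),  ('201901', 's6'),  ('201901', 's15'),
    ('201812', 's1'),  ('201812', 's2'),  ('201812', 's3'),  ('201812', 's4'),
    ('201812', 's7'),  ('201812', 's9'),  ('201812', 's10'), ('201812', 's11'),
    ('201812', 's12'),
    ('201902', 's1'),  ('201902', 's2'),  ('201902', 's3'),  ('201902', 's13'),
    ('201902', 's17'), ('201902', 's19'), ('201902', 's20'), ('201902', 's5');
```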


1 answer:

Answer 0 (score: 0)

Assuming I understand your question correctly, I would take a quite different approach using window functions. It starts with two things:

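  • One record per snapshotname/assetsrcsysid pair, which makes window functions easier to use.
  • An enumeration of the periods, so that they are just consecutive numbers.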
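
As a small illustration of the second point, dense_rank() can turn the snapshot names into consecutive integers. This sketch assumes snapshotname sorts chronologically as text (true for YYYYMM strings); the alias p is only illustrative.

```sql
-- Number the distinct snapshots 1, 2, 3, ... in chronological order.
select snapshotname,
       dense_rank() over (order by snapshotname) as snapshotname_index
from (select distinct snapshotname
      from public.sampleassetsrcsysid
     ) p
order by snapshotname;
```

With the dummy data from the question this gives 201812 → 1, 201901 → 2, 201902 → 3. Adjacent indexes only correspond to adjacent calendar months if no month is missing from the table.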

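Then we can use lead() and lag() to determine whether a given assetsrcsysid is present in the next or the previous month, which simplifies the query considerably: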

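with s as (
      -- one row per (snapshotname, assetsrcsysid), with the snapshots numbered 1, 2, 3, ...
      select s.*,
             dense_rank() over (order by snapshotname) as snapshotname_index
      from (select distinct snapshotname, assetsrcsysid
            from public.sampleassetsrcsysid s
           ) s
     )
select snapshotname,
       -- IDs not present in the immediately preceding snapshot
       -- (every ID counts as incoming for the earliest snapshot)
       sum( (prev_snapshotname_index is distinct from (snapshotname_index - 1))::int ) as num_incoming,
       -- IDs not present in the immediately following snapshot,
       -- reported on the last snapshot that still contains them
       -- (every ID counts as outgoing for the latest snapshot)
       sum( (next_snapshotname_index is distinct from (snapshotname_index + 1))::int ) as num_outgoing
from (select s.*,
             -- order each ID's rows by period so lag()/lead() see the neighboring snapshots
             lag(snapshotname_index) over (partition by assetsrcsysid order by snapshotname_index) as prev_snapshotname_index,
             lead(snapshotname_index) over (partition by assetsrcsysid order by snapshotname_index) as next_snapshotname_index
      from s
     ) s
group by snapshotname
order by snapshotname;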
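
The expected result in the question reports outgoing IDs against the month in which they went missing, and leaves out the earliest snapshot. The following is a minimal sketch of one way to get that shape with the same lag()/lead() idea. The names flagged, per_month, new_here, gone_next and prev_gone are illustrative, and it assumes there are no missing months between snapshots.

```sql
with s as (
      -- one row per (snapshotname, assetsrcsysid), snapshots numbered 1, 2, 3, ...
      select d.snapshotname,
             d.assetsrcsysid,
             dense_rank() over (order by d.snapshotname) as snapshotname_index
      from (select distinct snapshotname, assetsrcsysid
            from public.sampleassetsrcsysid
           ) d
     ),
     flagged as (
      -- for each ID, the index of the previous and next snapshot it appears in
      select s.*,
             lag(snapshotname_index)  over w as prev_index,
             lead(snapshotname_index) over w as next_index
      from s
      window w as (partition by assetsrcsysid order by snapshotname_index)
     ),
     per_month as (
      select snapshotname,
             snapshotname_index,
             -- IDs absent from the immediately preceding snapshot
             sum( (prev_index is distinct from (snapshotname_index - 1))::int ) as new_here,
             -- IDs absent from the immediately following snapshot
             sum( (next_index is distinct from (snapshotname_index + 1))::int ) as gone_next
      from flagged
      group by snapshotname, snapshotname_index
     )
select snapshotname,
       new_here  as incoming_sysid,
       prev_gone as outgoing_sysid
from (select per_month.*,
             -- shift the "gone" count forward to the month where the IDs are missing
             lag(gone_next) over (order by snapshotname_index) as prev_gone
      from per_month
     ) m
where snapshotname_index > 1   -- the earliest snapshot has nothing to compare against
order by snapshotname desc;
```

On the dummy data this returns 201902 with 4 incoming and 3 outgoing, and 201901 with 3 incoming and 5 outgoing, which matches the expected result in the question.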