Finding symmetric pairs in PostgreSQL

Time: 2016-12-30 03:51:54

Tags: sql postgresql

I have a complex query to run on a large PostgreSQL table. Here is a sample of the data:

[image: data sample, not reproduced]

My goal is to populate the to_from column with the character y or n.

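Take the first row as an example: the value in start is 48749 and the value in end is 50699. If another row exists in the table where the values are reversed, i.e. end = 48749 and start = 50699, I want to set to_from to y in both rows. If the reverse does not exist, the first row should be set to n. The key here is to go through every row and search the table for its reverse. If the reverse is found, y should be inserted. However, if several rows contain the reverse, only the first reverse row should receive y.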

I know I should structure my query along the lines of:
SELECT  *
FROM    mytable 
WHERE   NOT EXISTS
AND
WHERE EXISTS

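But I'm not sure how to produce the output I want. Should I create a duplicate table and work from there? Any guidance on where to start or which steps to take?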

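Here is an example of what the output should look like (if the table had 10 rows). Once a record has been used in a pair, it cannot be paired with another record.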

So:

> my_table
   ogc_fid track_fid start_gid end_gid to_from
1        1         1       100      82       y
2        2         2        82     100       y
3        3         3       100      82       y
4        4         4       100      32       n
5        5         5        82     100       y
6        6         6        82     100       y
7        7         7        82     100       n
8        8         8       100      82       y
9        9         9        34     100       n
10      10        10        31     100       n
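
To try the answers below against this sample, the ten rows can be loaded with a small setup script. This is only a sketch: the column types (integer ids, a one-character to_from) and the primary key on ogc_fid are assumptions, since the question does not state the schema, and the table is named mytable to match the queries in this thread.

```sql
-- Hypothetical schema: the types and the primary key are assumptions.
CREATE TABLE mytable (
    ogc_fid   integer PRIMARY KEY,
    track_fid integer,
    start_gid integer,
    end_gid   integer,
    to_from   char(1)          -- left NULL here; to be filled with 'y' or 'n'
);

-- The ten sample rows from the expected-output listing, without to_from.
INSERT INTO mytable (ogc_fid, track_fid, start_gid, end_gid) VALUES
    (1,  1,  100, 82),
    (2,  2,  82,  100),
    (3,  3,  100, 82),
    (4,  4,  100, 32),
    (5,  5,  82,  100),
    (6,  6,  82,  100),
    (7,  7,  82,  100),
    (8,  8,  100, 82),
    (9,  9,  34,  100),
    (10, 10, 31,  100);
```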

4 answers:

Answer 0: (score: 0)

I think you want to use row_number() and a join to identify the first of the matches:

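-- Column names follow the sample output (start_gid, end_gid); an unquoted
-- column named end would be a syntax error, since END is reserved in PostgreSQL.
select t.*,
       coalesce(t2.new_to_from, 'n') as new_to_from
from (select m.*,
             -- number the rows within each (start_gid, end_gid) direction
             row_number() over (partition by start_gid, end_gid order by ogc_fid) as seqnum
      from mytable m
     ) t left join
     (select m.*, 'y' as new_to_from,
             row_number() over (partition by start_gid, end_gid order by ogc_fid) as seqnum
      from mytable m
     ) t2
     -- match only the first row of a direction with the first row of its reverse
     on t2.start_gid = t.end_gid and t2.end_gid = t.start_gid and
        t2.seqnum = 1 and t.seqnum = 1;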

Answer 1: (score: 0)

You can use greatest and least to count the reversed rows. If more than one such row exists, assign y to the first such pair; otherwise assign n.

SELECT ogc_fid,
       track_fid,
       wkb_geo,
       start_gid,
       end_gid,
       CASE
           WHEN count(*) over(partition BY grtst,lst) > 1 THEN 'y'
                --AND row_number() over(partition BY grtst,lst
                                      --ORDER BY track_fid)<=2 THEN 'y'
           WHEN count(*) over(partition BY grtst,lst) = 1 THEN 'n'
       END AS to_from
FROM
  (SELECT ogc_fid,
          track_fid,
          wkb_geo,
          start_gid,
          end_gid,
          greatest(start_gid,end_gid) AS grtst,
          least(start_gid,end_gid) AS lst
   FROM mytable) t

Answer 2: (score: 0)

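Number the rows within each start_gid and end_gid combination. Then use LEAST and GREATEST to look at the gid combinations regardless of direction (100/82 = 82/100) and see which records have no partner (i.e. no other record with that combination and that row number).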

select
  ogc_fid, track_fid, start_gid, end_gid, to_from,
  case when count(*) over (partition by small_gid, large_gid, rn) = 1 then 'n' else 'y' end
from
(
  select 
    ogc_fid, track_fid, start_gid, end_gid, to_from,
    least(start_gid, end_gid) as small_gid,
    greatest(start_gid, end_gid) as large_gid,
    row_number() over(partition by start_gid, end_gid order by track_fid) as rn
  from mytable
) numbered;
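
The query above only returns the computed flag. If the goal is to store it in to_from, the same numbering idea could be written back with UPDATE ... FROM. The following is a rough sketch rather than part of the original answer: it assumes ogc_fid uniquely identifies a row, and the names paired and pair_size are made up for illustration.

```sql
-- Sketch: persist the pairing result in to_from.
-- Assumption: ogc_fid is unique, so it can join the subquery back to the table.
UPDATE mytable m
SET to_from = CASE WHEN paired.pair_size = 1 THEN 'n' ELSE 'y' END
FROM (
    SELECT ogc_fid,
           -- rows sharing (small_gid, large_gid, rn) are the two halves of a pair
           count(*) OVER (PARTITION BY small_gid, large_gid, rn) AS pair_size
    FROM (
        SELECT ogc_fid,
               least(start_gid, end_gid)    AS small_gid,
               greatest(start_gid, end_gid) AS large_gid,
               -- n-th row of one direction pairs with n-th row of the reverse
               row_number() OVER (PARTITION BY start_gid, end_gid
                                  ORDER BY track_fid) AS rn
        FROM mytable
    ) numbered
) paired
WHERE paired.ogc_fid = m.ogc_fid;
```

On the ten sample rows shown in the question, this should mark rows 1, 2, 3, 5, 6 and 8 with y and the rest with n, which agrees with the expected output listed there.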

Answer 3: (score: 0)

EXISTS() yields a boolean value, which can be used in a CASE WHEN ... conditional expression:

UPDATE mytable t
SET to_from = CASE WHEN EXISTS( SELECT * FROM mytable x
                          WHERE x.start_gid = t.end_gid
                          AND x.end_gid = t.start_gid )
                        AND NOT EXISTS( SELECT * FROM mytable nx
                          WHERE nx.start_gid = t.start_gid
                          AND nx.end_gid = t.end_gid
                          AND nx.ogc_fid < t.ogc_fid -- tie-breaker :: only the first (lowest ogc_fid) will get a 'y'
                        )
                THEN 'y' ELSE 'n' END
        ;