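I have a complex query to run against a large PostgreSQL table. Here is a sample of the data:
My goal is to populate the to_from column with the character y or n.
Take the first row as an example: the value in start is 48749 and the value in end is 50699. If another row exists in the table with the values reversed, i.e. end = 48749 and start = 50699, I want to fill the to_from column of both rows with y. If no reverse exists, the first row should be filled with n. The key here is to go through each row and search the table for its reverse. If a reverse is found, y should be inserted. However, if several rows contain the reverse, only the first reverse row should receive y.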
I know I should structure my query along the lines of
SELECT *
FROM mytable
WHERE NOT EXISTS
AND
WHERE EXISTS
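but I'm not sure how to produce the output I want. Should I create a duplicate of the table and work from there? Any guidance on where to start or which steps to take?
Below is an example of what the output should look like (if the table had 10 rows). Once a record has been used in a pair, it cannot be used for another record.
So: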
> my_table
   ogc_fid track_fid start_gid end_gid to_from
1        1         1       100      82       y
2        2         2        82     100       y
3        3         3       100      82       y
4        4         4       100      32       n
5        5         5        82     100       y
6        6         6        82     100       y
7        7         7        82     100       n
8        8         8       100      82       y
9        9         9        34     100       n
10      10        10        31     100       n
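The pairing rule described above (a record that has been used in one pair cannot be used in another) can also be written down procedurally, which gives a reference to check SQL results against on small samples. The following is only a minimal Python sketch, not a PostgreSQL solution: the function name `pair_flags` and the `(track_fid, start_gid, end_gid)` tuple layout are assumptions made for illustration, and rows are assumed to be matched in track_fid order.

```python
def pair_flags(rows):
    """Return one 'y'/'n' flag per row; rows are (track_fid, start_gid, end_gid)."""
    ordered = sorted(rows)                      # process rows in track_fid order
    flags = {fid: "n" for fid, _, _ in ordered}
    used = set()                                # track_fids already consumed by a pair
    for fid, start, end in ordered:
        if fid in used:
            continue
        # Look for the earliest unused row whose values are the reverse of this row.
        for other_fid, other_start, other_end in ordered:
            if (other_fid != fid and other_fid not in used
                    and other_start == end and other_end == start):
                used.update((fid, other_fid))
                flags[fid] = flags[other_fid] = "y"
                break
    return [flags[fid] for fid, _, _ in ordered]


# The ten sample rows from the expected output above.
sample = [
    (1, 100, 82), (2, 82, 100), (3, 100, 82), (4, 100, 32), (5, 82, 100),
    (6, 82, 100), (7, 82, 100), (8, 100, 82), (9, 34, 100), (10, 31, 100),
]
print(pair_flags(sample))
```

On the sample rows this reproduces the to_from column shown above. The nested loop is quadratic, so it is meant for verifying small extracts rather than for a large table.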
Answer 0 (score: 0)
I think you want to use row_number() and a join to identify the first of the matches:
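-- Column names follow the sample table (start_gid, end_gid);
-- a bare "end" is a reserved word in PostgreSQL and would need quoting.
select t.*,
       coalesce(t2.new_to_from, 'n') as new_to_from
from (select m.*,
             -- number duplicates of the same direction, lowest ogc_fid first
             row_number() over (partition by start_gid, end_gid order by ogc_fid) as seqnum
      from mytable m
     ) t left join
     (select m.*, 'y' as new_to_from,
             row_number() over (partition by start_gid, end_gid order by ogc_fid) as seqnum
      from mytable m
     ) t2
     on t2.start_gid = t.end_gid and t2.end_gid = t.start_gid and
        t2.seqnum = 1 and t.seqnum = 1;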
Answer 1 (score: 0)
You can use greatest and least to count the reverse rows. If more than one such row exists, assign y to the first such pair; otherwise assign n.
SELECT ogc_fid,
       track_fid,
       wkb_geo,
       start_gid,
       end_gid,
       CASE
           WHEN count(*) over(partition BY grtst,lst) > 1 THEN 'y'
           --AND row_number() over(partition BY grtst,lst
           --ORDER BY track_fid)<=2 THEN 'y'
           WHEN count(*) over(partition BY grtst,lst) = 1 THEN 'n'
       END AS to_from
FROM
  (SELECT ogc_fid,
          track_fid,
          wkb_geo,
          start_gid,
          end_gid,
          greatest(start_gid,end_gid) AS grtst,
          least(start_gid,end_gid) AS lst
   FROM mytable) t
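Answer 2 (score: 0)
Number the rows within each start_gid/end_gid combination. Then use LEAST and GREATEST to treat the two directions of a gid combination as the same (100/82 = 82/100) and see which records have no partner (i.e. no other record with that combination and row number).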
select
  ogc_fid, track_fid, start_gid, end_gid, to_from,
  -- a (small_gid, large_gid, rn) group holding a single row has no partner
  case when count(*) over (partition by small_gid, large_gid, rn) = 1 then 'n' else 'y' end as new_to_from
from
(
  select
    ogc_fid, track_fid, start_gid, end_gid, to_from,
    least(start_gid, end_gid) as small_gid,
    greatest(start_gid, end_gid) as large_gid,
    -- n-th row of one direction is matched with the n-th row of the other
    row_number() over(partition by start_gid, end_gid order by track_fid) as rn
  from mytable
) numbered;
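To see how the row numbering pairs up records one to one, the query can be tried on the ten sample rows from the question. The sketch below is an illustration under stated assumptions, not a PostgreSQL test: it uses Python's built-in sqlite3 module with an in-memory database, SQLite's two-argument min()/max() stand in for least()/greatest() (which SQLite does not provide), and window functions require SQLite 3.25 or newer. The names `SAMPLE`, `QUERY`, and `run_pairing_query` are hypothetical.

```python
import sqlite3

# Sample rows from the question: (ogc_fid, track_fid, start_gid, end_gid).
SAMPLE = [
    (1, 1, 100, 82), (2, 2, 82, 100), (3, 3, 100, 82), (4, 4, 100, 32),
    (5, 5, 82, 100), (6, 6, 82, 100), (7, 7, 82, 100), (8, 8, 100, 82),
    (9, 9, 34, 100), (10, 10, 31, 100),
]

# SQLite dialect: scalar min()/max() replace PostgreSQL's least()/greatest().
QUERY = """
SELECT track_fid,
       CASE WHEN count(*) OVER (PARTITION BY small_gid, large_gid, rn) = 1
            THEN 'n' ELSE 'y' END AS new_to_from
FROM (
    SELECT track_fid,
           min(start_gid, end_gid) AS small_gid,
           max(start_gid, end_gid) AS large_gid,
           row_number() OVER (PARTITION BY start_gid, end_gid
                              ORDER BY track_fid) AS rn
    FROM mytable
) numbered
ORDER BY track_fid
"""


def run_pairing_query():
    """Load the sample into an in-memory table and return (track_fid, flag) rows."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (ogc_fid INTEGER, track_fid INTEGER, "
        "start_gid INTEGER, end_gid INTEGER)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", SAMPLE)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


for track_fid, flag in run_pairing_query():
    print(track_fid, flag)
```

On the sample data the flags come out as y, y, y, n, y, y, n, y, n, n, matching the expected output in the question: the fourth 82/100 row (track_fid 7) has no fourth 100/82 row to pair with.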
Answer 3 (score: 0)
EXISTS() yields a boolean value, which can be used in a CASE WHEN ... conditional expression:
UPDATE mytable t
SET to_from = CASE WHEN EXISTS( SELECT * FROM mytable x
                                WHERE x.start_gid = t.end_gid
                                AND x.end_gid = t.start_gid )
                    AND NOT EXISTS( SELECT * FROM mytable nx
                                    WHERE nx.start_gid = t.start_gid
                                    AND nx.end_gid = t.end_gid
                                    AND nx.ogc_fid < t.ogc_fid -- tie-breaker :: no earlier duplicate, so only the first will get a 'y'
                                  )
              THEN 'y' ELSE 'n' END
;
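Before running an UPDATE on a large table, it can help to preview the values the CASE expression would write. The sketch below is an illustration under stated assumptions: it loads the question's ten sample rows into an in-memory SQLite database from Python (not PostgreSQL) and evaluates the EXISTS / NOT EXISTS expression in a plain SELECT, with the tie-breaker written so that the row with the lowest ogc_fid in each direction wins. `SAMPLE`, `PREVIEW`, and `preview_flags` are hypothetical names.

```python
import sqlite3

# Sample rows from the question: (ogc_fid, track_fid, start_gid, end_gid).
SAMPLE = [
    (1, 1, 100, 82), (2, 2, 82, 100), (3, 3, 100, 82), (4, 4, 100, 32),
    (5, 5, 82, 100), (6, 6, 82, 100), (7, 7, 82, 100), (8, 8, 100, 82),
    (9, 9, 34, 100), (10, 10, 31, 100),
]

# Same boolean logic as the UPDATE, but read-only.
PREVIEW = """
SELECT t.ogc_fid,
       CASE WHEN EXISTS (SELECT 1 FROM mytable x
                         WHERE x.start_gid = t.end_gid
                           AND x.end_gid = t.start_gid)
             AND NOT EXISTS (SELECT 1 FROM mytable nx
                             WHERE nx.start_gid = t.start_gid
                               AND nx.end_gid = t.end_gid
                               AND nx.ogc_fid < t.ogc_fid)  -- no earlier duplicate
            THEN 'y' ELSE 'n' END AS to_from
FROM mytable t
ORDER BY t.ogc_fid
"""


def preview_flags():
    """Return (ogc_fid, flag) rows the CASE expression yields on the sample."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (ogc_fid INTEGER, track_fid INTEGER, "
        "start_gid INTEGER, end_gid INTEGER)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?)", SAMPLE)
    result = con.execute(PREVIEW).fetchall()
    con.close()
    return result


for ogc_fid, flag in preview_flags():
    print(ogc_fid, flag)
```

On the sample data this flags only rows 1 and 2 with y, i.e. the first row in each direction. That follows the "only the first reverse row" wording of the question rather than the one-to-one pairing shown in its expected output, so the preview is worth checking against whichever rule is actually intended.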