Platform: PostgreSQL
Tables:
shortlist: name (text), city (text)...
data1: name (text), ranking (integer), score1 (double)...
data2: name (text), ranking (integer), score1 (double)...
data3: name (text), ranking (integer), score1 (double)...
data4: name (text), ranking (integer), score1 (double)...
There is a limited number of these similarly structured data tables.
I want to join each row in shortlist to the row in each data table whose name is most similar, as determined by similarity(shortlist.name, data#.name).
Pseudocode for the same idea:
for each s_row in shortlist:
select shortlist.*
join (SELECT data1.*, similarity(s_row.name, data1.name) AS sim FROM data1 ORDER BY sim DESC LIMIT 1)
join (SELECT data2.*, similarity(s_row.name, data2.name) AS sim FROM data2 ORDER BY sim DESC LIMIT 1)
join (SELECT data3.*, similarity(s_row.name, data3.name) AS sim FROM data3 ORDER BY sim DESC LIMIT 1)
join (SELECT data4.*, similarity(s_row.name, data4.name) AS sim FROM data4 ORDER BY sim DESC LIMIT 1)
Is there a way to do this in SQL?
Answer 0 (score: 1)
I'm not entirely sure this is what you mean, but something like this:
select s.name,
d1.name as d1_name,
d2.name as d2_name
from shortlist s
left join lateral (
SELECT data1.*, similarity(s.name, data1.name) AS sim
FROM data1
ORDER BY sim DESC
LIMIT 1
) d1 on true
left join lateral (
SELECT data2.*, similarity(s.name, data2.name) AS sim
FROM data2
ORDER BY sim DESC
LIMIT 1
) d2 on true
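You want an outer join (left join) for each table; otherwise a shortlist row disappears from the result whenever the lateral subquery for at least one table returns no row.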
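
As a rough sketch, the same pattern extended to all four tables might look like the following. It assumes that similarity() is the function provided by the pg_trgm extension and that the server supports LATERAL (PostgreSQL 9.3 or later). The output aliases (d1_name, d1_sim, and so on) and the inner alias d are illustrative names, not part of the original schema. NULLS LAST is added because a descending sort in PostgreSQL puts NULLs first by default, so a row with a NULL name could otherwise be picked as the "best" match.

```sql
-- Assumption: similarity() comes from the pg_trgm extension.
CREATE EXTENSION IF NOT EXISTS pg_trgm;

SELECT s.*,
       d1.name AS d1_name, d1.sim AS d1_sim,
       d2.name AS d2_name, d2.sim AS d2_sim,
       d3.name AS d3_name, d3.sim AS d3_sim,
       d4.name AS d4_name, d4.sim AS d4_sim
FROM shortlist s
-- One lateral subquery per data table; each returns at most one row:
-- the row whose name is most similar to the current shortlist name.
LEFT JOIN LATERAL (
    SELECT d.name, similarity(s.name, d.name) AS sim
    FROM data1 d
    ORDER BY sim DESC NULLS LAST   -- keep NULL similarities from sorting first
    LIMIT 1
) d1 ON true
LEFT JOIN LATERAL (
    SELECT d.name, similarity(s.name, d.name) AS sim
    FROM data2 d
    ORDER BY sim DESC NULLS LAST
    LIMIT 1
) d2 ON true
LEFT JOIN LATERAL (
    SELECT d.name, similarity(s.name, d.name) AS sim
    FROM data3 d
    ORDER BY sim DESC NULLS LAST
    LIMIT 1
) d3 ON true
LEFT JOIN LATERAL (
    SELECT d.name, similarity(s.name, d.name) AS sim
    FROM data4 d
    ORDER BY sim DESC NULLS LAST
    LIMIT 1
) d4 ON true;
```

Each subquery scans its whole data table once per shortlist row, so this is probably only practical for modest table sizes. Other columns such as ranking or score1 can be added to the inner SELECT lists in the same way as name.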