I have a table of pairs that looks like this:
+---------+----------+
| left_id | right_id |
+---------+----------+
| a       | b        |
| a       | c        |
+---------+----------+
And a table of values:
+----+-------+
| id | value |
+----+-------+
| a  |     1 |
| a  |     2 |
| a  |     3 |
| b  |     1 |
| b  |     4 |
| b  |     5 |
| c  |     1 |
| c  |     2 |
| c  |     3 |
| c  |     4 |
+----+-------+
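For each pair, I want to compare the two sets of values by computing the size of their union, their intersection, and their set difference in each direction, so that the output looks like this: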
+---------+----------+-------+--------------+-----------+------------+
| left_id | right_id | union | intersection | left_diff | right_diff |
+---------+----------+-------+--------------+-----------+------------+
| a       | b        |     5 |            1 |         2 |          2 |
| a       | c        |     4 |            3 |         0 |          1 |
+---------+----------+-------+--------------+-----------+------------+
What is the best way to solve this with PostgreSQL?
Update: here is a rextester link with the data: https://rextester.com/RWID9864
Answer 0 (score: 1)
You need scalar subqueries to do this.
The union can also be expressed with an `OR` condition, which makes that query a bit shorter to write. For the intersection, however, you need a somewhat longer query.
To compute the differences, use the `EXCEPT` operator.
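As a minimal sketch of the scalar-subquery approach described above: the table names `pairs` and `"values"` are assumptions borrowed from the other answer in this thread, and the query is an illustration rather than the original author's code. The union is counted with an `OR` condition, the intersection with `INTERSECT`, and each difference with `EXCEPT`.

```sql
-- Sketch only: table names pairs and "values" are assumed, not confirmed by this answer.
select p.left_id,
       p.right_id,
       -- union: distinct values that belong to either id
       (select count(distinct v."value")
          from "values" v
         where v.id = p.left_id
            or v.id = p.right_id) as "union",
       -- intersection: values present for both ids
       (select count(*)
          from (select "value" from "values" where id = p.left_id
                intersect
                select "value" from "values" where id = p.right_id) i) as intersection,
       -- left_diff: values of left_id that right_id does not have
       (select count(*)
          from (select "value" from "values" where id = p.left_id
                except
                select "value" from "values" where id = p.right_id) d) as left_diff,
       -- right_diff: values of right_id that left_id does not have
       (select count(*)
          from (select "value" from "values" where id = p.right_id
                except
                select "value" from "values" where id = p.left_id) d) as right_diff
from pairs p;
```

`INTERSECT` and `EXCEPT` without `ALL` remove duplicates, so each count is a set size. The identifiers `"values"` and `"union"` are quoted because both are reserved words in PostgreSQL.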
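Answer 1 (score: 1)
I can't tell what is making it slow for you, because I can't see the table sizes or the explain plan. Assuming both tables are large enough that nested loops are inefficient, and not daring to think about joining values to itself, I would try to rewrite it without scalar subqueries, like this: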
    select p.*,
           coalesce(stats."union", 0) "union",
           coalesce(stats.intersection, 0) intersection,
           coalesce(stats.left_cnt - stats.intersection, 0) left_diff,
           coalesce(stats.right_cnt - stats.intersection, 0) right_diff
    from pairs p
    left join (
        select left_id,
               right_id,
               count(*) "union",
               count(has_left and has_right) intersection,
               count(has_left) left_cnt,
               count(has_right) right_cnt
        from (
            select p.*,
                   v."value" the_value,
                   true has_left
            from pairs p
            join "values" v on v.id = p.left_id
        ) l
        full join (
            select p.*,
                   v."value" the_value,
                   true has_right
            from pairs p
            join "values" v on v.id = p.right_id
        ) r using(left_id, right_id, the_value)
        group by left_id,
                 right_id
    ) stats on p.left_id = stats.left_id
           and p.right_id = stats.right_id;
Every join condition here allows a hash and/or merge join, so the planner has a chance to avoid nested loops.
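To see which join strategy the planner actually picks, one option is to prefix the query with `EXPLAIN`. The sketch below does this for the full-join part only, using the same table names as above. It is an illustration, and the plan you get will depend on your data, statistics, and settings.

```sql
-- Sketch: inspect the plan of the full-join part of the query.
-- Note that the analyze option actually executes the statement.
explain (analyze, buffers)
select left_id,
       right_id,
       count(*) as "union"
from (
    select p.left_id, p.right_id, v."value" as the_value
    from pairs p
    join "values" v on v.id = p.left_id
) l
full join (
    select p.left_id, p.right_id, v."value" as the_value
    from pairs p
    join "values" v on v.id = p.right_id
) r using (left_id, right_id, the_value)
group by left_id, right_id;
```

In the output, look at the join nodes: hash or merge joins there, rather than nested loops, would indicate that the rewrite is being planned as intended.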