I have the following table with some sample records:
id | attr1_id | attr2_id | user_id | rating_id | override_comment
------+----------+----------+-------------------+-----------+------------------
1 | 188 | 201 | user_1@domain.com | 3 |
2 | 193 | 201 | user_2@domain.com | 2 |
3 | 193 | 201 | user_2@domain.com | 1 |
4 | 194 | 201 | user_2@domain.com | 1 |
5 | 194 | 201 | user_1@domain.com | 1 |
6 | 192 | 201 | user_2@domain.com | 1 |
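The combination (attr1_id, attr2_id, user_id) is UNIQUE, which means each user can create only one record for a given pair of attribute IDs.
My goal is to count the rows with rating_id = 1, but counting each combination of attr1_id and attr2_id only once, and only where no other row (from another user) exists with rating_id > 1 for the same attr1_id and attr2_id.
Note that the combination of attr1_id and attr2_id can be switched, so given the following two records: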
id | attr1_id | attr2_id | user_id | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
20 | 5 | 2 | user_1@domain.com | 3 |
21 | 2 | 5 | user_2@domain.com | 1 |
no rows should be counted, because both rows refer to the same combination of attr_ids and one of them has rating_id > 1.
However, if these rows exist:
id | attr1_id | attr2_id | user_id | rating_id | override_comment
------+----------+----------+--------------------+-----------+------------------
20 | 5 | 2 | user_1@domain.com | 1 |
21 | 2 | 5 | user_2@domain.com | 1 |
22 | 2 | 5 | user_3@domain.com | 1 |
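all of them should be counted as a single row, because they all share the same combination of attr1_id and attr2_id and all have rating_id = 1.
My approach so far is the following, but it results in no rows being selected at all.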
SELECT *
FROM compatibility c
WHERE rating_id > 1
AND NOT EXISTs
(SELECT *
FROM compatibility c2
WHERE c.rating_id > 1
AND (
(c.attr1_id = c2.attr1_id) AND (c.attr2_id = c2.attr2_id)
OR
(c.attr1_id = c2.attr2_id) AND (c.attr2_id = c2.attr1_id)
)
)
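How can I achieve this?
Answer 0 (score: 2)
My goal is to count the rows with rating_id = 1, but count each combination of attr1_id and attr2_id only once, and only where no other row (by another user) has rating_id > 1
Your original query was on the right track for excluding the offending rows. You just had > where you needed = in the outer filter (rating_id > 1 instead of rating_id = 1). The tricky counting step was missing.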
SELECT count(*) AS ct
FROM (
SELECT 1
FROM compatibility c
WHERE rating_id = 1
AND NOT EXISTS (
SELECT 1
FROM compatibility c2
WHERE c2.rating_id > 1
AND (c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id OR
c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id))
GROUP BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
) sub;
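To sanity-check the query above against the six sample rows from the question, here is a minimal sketch that runs it in an in-memory SQLite database from Python. Several things are assumptions of the sketch and not part of the answer: SQLite stands in for Postgres, its two-argument scalar min()/max() replace least()/greatest(), the helper name count_clean_pairs is hypothetical, and no UNIQUE constraint is declared because sample rows 2 and 3 as printed share the same key.

```python
import sqlite3

# Sample rows from the question: (id, attr1_id, attr2_id, user_id, rating_id).
ROWS = [
    (1, 188, 201, "user_1@domain.com", 3),
    (2, 193, 201, "user_2@domain.com", 2),
    (3, 193, 201, "user_2@domain.com", 1),
    (4, 194, 201, "user_2@domain.com", 1),
    (5, 194, 201, "user_1@domain.com", 1),
    (6, 192, 201, "user_2@domain.com", 1),
]

# Assumption: SQLite has no least()/greatest(), so its two-argument
# scalar min()/max() are used as stand-ins in the GROUP BY.
QUERY = """
SELECT count(*) AS ct
FROM (
   SELECT 1
   FROM   compatibility c
   WHERE  rating_id = 1
   AND    NOT EXISTS (
      SELECT 1
      FROM   compatibility c2
      WHERE  c2.rating_id > 1
      AND   (c2.attr1_id = c.attr1_id AND c2.attr2_id = c.attr2_id OR
             c2.attr1_id = c.attr2_id AND c2.attr2_id = c.attr1_id))
   GROUP  BY min(attr1_id, attr2_id), max(attr1_id, attr2_id)
) sub
"""


def count_clean_pairs(rows):
    """Load rows into a throwaway table and return the query's count."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility ("
        "id INTEGER PRIMARY KEY, attr1_id INTEGER, attr2_id INTEGER, "
        "user_id TEXT, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?, ?, ?)", rows)
    (ct,) = con.execute(QUERY).fetchone()
    con.close()
    return ct


print(count_clean_pairs(ROWS))  # prints 2: pairs (194, 201) and (192, 201)
```

Pair (193, 201) drops out because row 2 carries rating_id = 2, and pair (188, 201) has no row with rating_id = 1 at all.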
This alternative may also be faster:
SELECT count(*) AS ct
FROM (
SELECT 1 -- selecting more columns for count only would be a waste
FROM compatibility
GROUP BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
HAVING every(rating_id = 1)
) sub;
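This is similar to @Clodoaldo's query or this earlier answer with more explanation. every(rating_id = 1) is simpler than not bool_or(rating_id > 1), but it also excludes groups containing a rating < 1, which is probably fine (or even better) for your case.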
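The difference between these group filters is easiest to see on a few small groups. The sketch below mirrors them as plain Python predicates over one group's ratings; it also includes a max(rating_id) = 1 test, which this answer goes on to suggest. The function names are hypothetical, the groups are assumed to be non-empty and free of NULLs, and ratings below 1 are made up for illustration (the question does not say whether they occur).

```python
def every_eq_1(ratings):
    # Mirrors every(rating_id = 1): all ratings in the group equal 1.
    return all(r == 1 for r in ratings)


def not_bool_or_gt_1(ratings):
    # Mirrors not bool_or(rating_id > 1): no rating in the group exceeds 1.
    return not any(r > 1 for r in ratings)


def max_eq_1(ratings):
    # Mirrors max(rating_id) = 1: the largest rating in the group is 1.
    return max(ratings) == 1


# Hypothetical groups; columns are every / not bool_or / max.
for group in ([1, 1], [1, 2], [0, 1], [0, 0]):
    print(group, every_eq_1(group), not_bool_or_gt_1(group), max_eq_1(group))
```

All three agree as long as every rating is at least 1. They only diverge on groups such as [0, 1] or [0, 0], so the choice matters only if ratings below 1 can exist.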
MySQL does not currently implement the (standard SQL!) every(). Since you only want to eliminate rating_id > 1, this simple expression does what you need and works in both RDBMS:
HAVING max(rating_id) = 1
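As a rough check of the GROUP BY / HAVING max(rating_id) = 1 form on the swapped-pair scenarios from the question, the following sketch runs it in an in-memory SQLite database from Python. The assumptions here: SQLite is only a stand-in for Postgres or MySQL, the two-argument scalar min()/max() take the place of least()/greatest(), and the helper name count_pairs and the trimmed column list are hypothetical.

```python
import sqlite3

# Assumption: SQLite's two-argument scalar min()/max() stand in for
# least()/greatest(); the one-argument max(rating_id) is the aggregate.
QUERY = """
SELECT count(*) AS ct
FROM (
   SELECT 1
   FROM   compatibility
   GROUP  BY min(attr1_id, attr2_id), max(attr1_id, attr2_id)
   HAVING max(rating_id) = 1
) sub
"""


def count_pairs(rows):
    """rows: (attr1_id, attr2_id, user_id, rating_id) tuples."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility ("
        "attr1_id INTEGER, attr2_id INTEGER, user_id TEXT, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?, ?)", rows)
    (ct,) = con.execute(QUERY).fetchone()
    con.close()
    return ct


# Scenario 1 from the question: same pair in swapped order, one rating above 1.
mixed = [(5, 2, "user_1@domain.com", 3), (2, 5, "user_2@domain.com", 1)]
# Scenario 2 from the question: same pair three times, all ratings equal to 1.
all_ones = [
    (5, 2, "user_1@domain.com", 1),
    (2, 5, "user_2@domain.com", 1),
    (2, 5, "user_3@domain.com", 1),
]

print(count_pairs(mixed))     # prints 0
print(count_pairs(all_ones))  # prints 1
```

Both scenarios come out as the question requires: the swapped rows land in the same group, and a single rating above 1 disqualifies the whole group.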
Using count(*) as a window aggregate function, with no subquery:
SELECT count(*) OVER () AS ct
FROM compatibility
GROUP BY least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
HAVING max(rating_id) = 1
LIMIT 1;
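Window functions are applied after the aggregate step. Building on that, two aggregation steps happen in a single query level. First, GROUP BY folds equivalent (attr1_id, attr2_id) pairs, and HAVING excludes the groups that contain a differing rating_id. Then count(*) OVER () counts the remaining groups, and LIMIT 1 returns a single row (all rows are identical).
MySQL had no window functions when this was written (they were added in MySQL 8.0), so this variant was Postgres only.
It is the shortest variant, not necessarily the fastest.
SQL Fiddle. (On pg9.2, because pg9.3 is currently offline.)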
Answer 1 (score: 1)
If I understand correctly, you want the attribute pairs whose rating is always "1".
This should give you those pairs:
select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
min(rating_id) as minri, max(rating_id) as maxri
from compatibility c
group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
having min(rating_id) = 1 and max(rating_id) = 1;
To get the count, just use it as a subquery:
select count(*)
from (select least(attr1_id, attr2_id) as a1, greatest(attr1_id, attr2_id) as a2,
min(rating_id) as minri, max(rating_id) as maxri
from compatibility c
group by least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
having min(rating_id) = 1 and max(rating_id) = 1
) c
Answer 2 (score: 1)
Done in PostgreSQL, since SQLFiddle is not working right now:
select count(*)
from (
select least(attr1_id, attr2_id), greatest(attr1_id, attr2_id)
from compatibility
group by 1, 2
having not bool_or(rating_id > 1)
) s
;
count
-------
2
(1 row)
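Answer 3 (score: 0)
I would use CASE .. WHEN to reorder the attributes so that the smaller one always comes first, and then group on that fixed order. An example query follows: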
SELECT attrSmall,
attrLarge,
MAX(rating_id) as ratingMax
FROM (
SELECT CASE WHEN c.attr1_id < c.attr2_id
THEN c.attr1_id
ELSE c.attr2_id END as attrSmall,
CASE WHEN c.attr1_id < c.attr2_id
THEN c.attr2_id
ELSE c.attr1_id END as attrLarge,
c.rating_id
FROM compatibility c) as c1
GROUP BY attrSmall, attrLarge
HAVING MAX(rating_id) = 1
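As a quick check that the CASE-based reordering groups swapped pairs as intended, here is a sketch that runs a version of this query in an in-memory SQLite database from Python, using the six sample rows from the question. The assumptions: SQLite stands in for the target database, the GROUP BY alias is spelled attrSmall, the aggregate is repeated in HAVING instead of referencing the ratingMax alias (not every RDBMS accepts an alias there), an ORDER BY is added for stable output, and the helper name clean_pairs is hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, attr1_id, attr2_id, user_id, rating_id).
ROWS = [
    (1, 188, 201, "user_1@domain.com", 3),
    (2, 193, 201, "user_2@domain.com", 2),
    (3, 193, 201, "user_2@domain.com", 1),
    (4, 194, 201, "user_2@domain.com", 1),
    (5, 194, 201, "user_1@domain.com", 1),
    (6, 192, 201, "user_2@domain.com", 1),
]

# CASE puts the smaller attribute first, so (5, 2) and (2, 5) would
# fall into the same group. ORDER BY is an addition for stable output.
QUERY = """
SELECT attrSmall, attrLarge, MAX(rating_id) AS ratingMax
FROM (
    SELECT CASE WHEN c.attr1_id < c.attr2_id
                THEN c.attr1_id ELSE c.attr2_id END AS attrSmall,
           CASE WHEN c.attr1_id < c.attr2_id
                THEN c.attr2_id ELSE c.attr1_id END AS attrLarge,
           c.rating_id
    FROM compatibility c) AS c1
GROUP BY attrSmall, attrLarge
HAVING MAX(rating_id) = 1
ORDER BY attrSmall, attrLarge
"""


def clean_pairs(rows):
    """Return (attrSmall, attrLarge, ratingMax) for pairs rated only 1."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE compatibility ("
        "id INTEGER PRIMARY KEY, attr1_id INTEGER, attr2_id INTEGER, "
        "user_id TEXT, rating_id INTEGER)"
    )
    con.executemany("INSERT INTO compatibility VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


print(clean_pairs(ROWS))  # prints [(192, 201, 1), (194, 201, 1)]
```

The query lists the qualifying pairs and does not count them; wrapping it in SELECT count(*) FROM (...) would give the number the question asks for.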