My table looks like this:
| id | Category |
|----|----------|
| 1 | Red |
| 1 | Cat |
| 2 | Blue |
| 3 | Yellow |
| 3 | Dog |
| 3 | Bike |
| 4 | Blue |
| 4 | Cat |
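What I want is to group by id and keep only the ids that have all three of the following characteristics (as encoded in the query below): at least one of Red, Yellow, Blue; at least one of Cat, Dog, Fish; and none of Bike, Car, Bus.
So in the sample table above, I want to keep and group ids 1 and 4, but exclude 2 and 3.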
Here is my code so far:
SELECT id
FROM table
GROUP BY id
HAVING (sum(case when code_value IN ('Red', 'Yellow', 'Blue') then 1 else 0 end) > 0)
AND
(sum(case when code_value IN ('Cat', 'Dog', 'Fish') then 1 else 0 end) > 0)
AND
-- none of the excluded values may appear for this id
(sum(case when code_value IN ('Bike', 'Car', 'Bus') then 1 else 0 end) = 0)
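The concept seems to work, but it is slow. I am wondering whether anyone has a better idea. Note that in some cases I will have more than 3 characteristics, so a solution that is easy to extend would be ideal.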
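Answer 0 (score: 0)
You can first find the ids that contain at least one of the three excluded values (Bike, Car, Bus), and then exclude those ids. Something like this: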
SELECT id
FROM table
where id NOT IN
(
select id from table where code_value in ('Bike', 'Car', 'Bus')
)
GROUP BY id
HAVING( (sum(case when (code_value IN ('Red', 'Yellow', 'Blue')) then 1 else 0 end)>0
AND
sum(case when (code_value IN ('Cat', 'Dog', 'Fish')) then 1 else 0 end)>0
))
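To check this against the sample data, here is a minimal sketch that runs the same query through Python's built-in sqlite3 module. The table name items is my own stand-in for the placeholder table (which is a reserved word), and I am assuming the value column is called code_value as in the queries, even though the sample table labels it Category.

```python
import sqlite3

# Sample rows from the question. `items` is a hypothetical table name used
# in place of the question's placeholder `table`.
ROWS = [(1, "Red"), (1, "Cat"), (2, "Blue"), (3, "Yellow"),
        (3, "Dog"), (3, "Bike"), (4, "Blue"), (4, "Cat")]

QUERY = """
SELECT id
FROM items
WHERE id NOT IN (
    SELECT id FROM items WHERE code_value IN ('Bike', 'Car', 'Bus')
)
GROUP BY id
HAVING SUM(CASE WHEN code_value IN ('Red', 'Yellow', 'Blue') THEN 1 ELSE 0 END) > 0
   AND SUM(CASE WHEN code_value IN ('Cat', 'Dog', 'Fish') THEN 1 ELSE 0 END) > 0
ORDER BY id
"""


def matching_ids(rows):
    """Load rows into an in-memory database and return the ids that qualify."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, code_value TEXT)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    ids = [row[0] for row in conn.execute(QUERY)]
    conn.close()
    return ids


print(matching_ids(ROWS))  # [1, 4]
```

Id 3 is dropped by the NOT IN subquery because it has Bike, and id 2 is dropped by the HAVING clause because it has no animal.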
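Answer 1 (score: 0)
Consider storing your selection characteristics in separate lookup tables, colorsTable, animalsTable, and vehiclesTable (each holding distinct code_value entries), which can be extended without limit. Then join them to the main aggregate query as derived tables (or views):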
SELECT t.id
FROM myTable As t
LEFT JOIN
    (SELECT s1.id, count(*) As cnt1
     FROM myTable s1 INNER JOIN colorsTable s2
       ON s1.code_value = s2.code_value
     GROUP BY s1.id) As a
  ON t.id = a.id
LEFT JOIN
    (SELECT s1.id, count(*) As cnt2
     FROM myTable s1 INNER JOIN animalsTable s2
       ON s1.code_value = s2.code_value
     GROUP BY s1.id) As b
  ON t.id = b.id
LEFT JOIN
    (SELECT s1.id, count(*) As cnt3
     FROM myTable s1 INNER JOIN vehiclesTable s2
       ON s1.code_value = s2.code_value
     GROUP BY s1.id) As c
  ON t.id = c.id
-- cnt3 IS NULL means the id has no row matching vehiclesTable
WHERE a.cnt1 > 0 AND b.cnt2 > 0 AND c.cnt3 IS NULL
GROUP BY t.id
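Since the question asks about going beyond three characteristics, here is a rough sketch of how the joins above could be generated from a list of lookup tables instead of being written by hand. It uses Python's built-in sqlite3 module. The GROUPS list, the required flag, and the helper names are my own assumptions for illustration, not part of the original query.

```python
import sqlite3

ROWS = [(1, "Red"), (1, "Cat"), (2, "Blue"), (3, "Yellow"),
        (3, "Dog"), (3, "Bike"), (4, "Blue"), (4, "Cat")]

# (lookup table name, its values, required?) -- required=True means the id
# must match at least one value, required=False means it must match none.
# This structure is a hypothetical convention, not from the original answer.
GROUPS = [
    ("colorsTable", ["Red", "Yellow", "Blue"], True),
    ("animalsTable", ["Cat", "Dog", "Fish"], True),
    ("vehiclesTable", ["Bike", "Car", "Bus"], False),
]


def build_query(groups):
    """Build one LEFT JOIN on a derived table per lookup table."""
    joins, conditions = [], []
    for i, (name, _values, required) in enumerate(groups):
        alias = f"g{i}"
        joins.append(
            f"LEFT JOIN (SELECT s1.id, COUNT(*) AS cnt "
            f"FROM myTable s1 INNER JOIN {name} s2 "
            f"ON s1.code_value = s2.code_value "
            f"GROUP BY s1.id) AS {alias} ON t.id = {alias}.id"
        )
        conditions.append(f"{alias}.cnt > 0" if required else f"{alias}.cnt IS NULL")
    return ("SELECT t.id FROM myTable AS t " + " ".join(joins)
            + " WHERE " + " AND ".join(conditions)
            + " GROUP BY t.id ORDER BY t.id")


def ids_via_lookup_tables(rows, groups):
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE myTable (id INTEGER, code_value TEXT)")
    conn.executemany("INSERT INTO myTable VALUES (?, ?)", rows)
    for name, values, _required in groups:
        # Table names come from the fixed list above, never from user input.
        conn.execute(f"CREATE TABLE {name} (code_value TEXT PRIMARY KEY)")
        conn.executemany(f"INSERT INTO {name} VALUES (?)", [(v,) for v in values])
    ids = [row[0] for row in conn.execute(build_query(groups))]
    conn.close()
    return ids


print(ids_via_lookup_tables(ROWS, GROUPS))  # [1, 4]
```

Adding a fourth characteristic then means appending one entry to GROUPS and creating its lookup table; the query text itself does not need to be edited.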
Answer 2 (score: 0)
This won't affect performance, but I would write the HAVING clause as:
HAVING sum(code_value IN ('Red', 'Yellow', 'Blue')) > 0 AND
sum(code_value IN ('Cat', 'Dog', 'Fish')) > 0 AND
sum(code_value IN ('Bike', 'Car', 'Bus')) = 0
If you then write the full query as:
SELECT id
FROM table
WHERE code_value IN ('Red', 'Yellow', 'Blue', 'Cat', 'Dog', 'Fish', 'Bike', 'Car', 'Bus')
GROUP BY id
HAVING sum(code_value IN ('Red', 'Yellow', 'Blue')) > 0 AND
sum(code_value IN ('Cat', 'Dog', 'Fish')) > 0 AND
sum(code_value IN ('Bike', 'Car', 'Bus')) = 0
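the WHERE clause reduces the size of the data before the GROUP BY. This version can also take advantage of an index on table(code_value, id). That may help performance, depending on the distribution of the data.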
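As a quick sanity check that the WHERE pre-filter does not change which ids come back, here is a small sketch using Python's built-in sqlite3 module. The table name items, the index name, and the extra row (5, "Green") are hypothetical additions of mine. Note that SUM over a boolean expression relies on the database treating comparisons as 0/1, which MySQL and SQLite do; PostgreSQL, for example, would need the CASE form instead.

```python
import sqlite3

# Sample rows from the question plus one hypothetical id whose only value is
# outside all three lists, to exercise the pre-filter.
ROWS = [(1, "Red"), (1, "Cat"), (2, "Blue"), (3, "Yellow"),
        (3, "Dog"), (3, "Bike"), (4, "Blue"), (4, "Cat"), (5, "Green")]

PREFILTER = """
WHERE code_value IN ('Red', 'Yellow', 'Blue', 'Cat', 'Dog', 'Fish',
                     'Bike', 'Car', 'Bus')
"""

GROUPING = """
GROUP BY id
HAVING SUM(code_value IN ('Red', 'Yellow', 'Blue')) > 0
   AND SUM(code_value IN ('Cat', 'Dog', 'Fish')) > 0
   AND SUM(code_value IN ('Bike', 'Car', 'Bus')) = 0
ORDER BY id
"""


def run(rows, prefilter):
    """Return qualifying ids, with or without the WHERE pre-filter."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (id INTEGER, code_value TEXT)")
    # Composite index in the column order suggested in the answer.
    conn.execute("CREATE INDEX idx_items_code_id ON items (code_value, id)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    sql = "SELECT id FROM items " + (PREFILTER if prefilter else "") + GROUPING
    ids = [row[0] for row in conn.execute(sql)]
    conn.close()
    return ids


print(run(ROWS, prefilter=True))   # [1, 4]
print(run(ROWS, prefilter=False))  # [1, 4]
```

The results match because any id that passes the HAVING clause must have at least one color row and one animal row, so the filter only removes rows that could never contribute to a qualifying id. Whether the index is actually used, and how much it helps, depends on the engine and the data, so it is worth checking the query plan on the real table.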