MySQL query to find records related to both category A and category B

Asked: 2012-10-26 08:48:52

Tags: mysql sql database

Here is a simplified version of my data:

products:
+----+-----------+
| id | name      |
+----+-----------+
|  1 | Product X |
|  2 | Product Y |
|  3 | Product Z |
+----+-----------+

categories:
+----+---------------+
| id | name          |
+----+---------------+
|  1 | Hotel         |
|  2 | Accommodation |
+----+---------------+

category_product:
+----+------------+-------------+
| id | product_id | category_id |
+----+------------+-------------+
|  1 |          1 |           1 |
|  2 |          1 |           2 |
|  3 |          2 |           1 |
|  4 |          3 |           2 |
+----+------------+-------------+

How can I build an efficient query that retrieves only the products related to both categories "Hotel" and "Accommodation" (e.g. Product X)?

I first tried a join approach:

SELECT *
FROM products p
JOIN category_product cp
ON p.id = cp.product_id
WHERE cp.category_id = 1 OR cp.category_id = 2

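^ This doesn't work, because it doesn't restrict the results to products that are in both categories; a product in either one matches.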
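To make the problem concrete, here is a minimal sketch that loads the sample tables above and runs the OR join. It assumes Python's built-in sqlite3 module as a stand-in for MySQL (this query uses only standard SQL), and the helper name build_sample_db is made up for illustration.

```python
import sqlite3


def build_sample_db():
    """Create the sample tables from the question in an in-memory SQLite
    database (SQLite is an assumption here; the original data is in MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE category_product (
            id INTEGER PRIMARY KEY,
            product_id INTEGER,
            category_id INTEGER
        );
        INSERT INTO products VALUES
            (1, 'Product X'), (2, 'Product Y'), (3, 'Product Z');
        INSERT INTO categories VALUES (1, 'Hotel'), (2, 'Accommodation');
        INSERT INTO category_product VALUES
            (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2);
    """)
    return conn


# The join from the question: OR matches a product that has EITHER category.
OR_JOIN = """
    SELECT p.name, cp.category_id
    FROM products p
    JOIN category_product cp ON p.id = cp.product_id
    WHERE cp.category_id = 1 OR cp.category_id = 2
    ORDER BY p.id, cp.category_id
"""

conn = build_sample_db()
or_join_rows = conn.execute(OR_JOIN).fetchall()
for row in or_join_rows:
    print(row)
```

Product Y and Product Z each come back with a single category, and Product X comes back twice, so the WHERE clause alone cannot express "has both".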

I found an approach that works using subqueries... but I've been warned against subqueries for performance reasons:

SELECT *
FROM products p
WHERE
(
    SELECT id
    FROM category_product
    WHERE product_id = p.id
    AND category_id = 1
)
AND
(
    SELECT id
    FROM category_product
    WHERE product_id = p.id
    AND category_id = 2
)

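Is there a better solution (or what are the alternatives)? I've considered denormalizing the categories into extra columns on the product, but ideally I'd like to avoid that. Hoping there's a magic solution!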

Update

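I've run some of the (great) solutions provided in the answers. My data is 235,000 category_product rows and 58,000 products. Obviously, benchmarks always depend on the environment, indexes, and so on.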

"Relational division" @podiluska

2 categories: 2826 rows ~ 20 ms
5 categories: 46 rows ~ 25-30 ms
8 categories: 1 row ~ 25-30 ms

"Where exists" @Tim Schmelter

2 categories: 2826 rows ~ 5-7 ms
5 categories: 46 rows ~ 30 ms
8 categories: 1 row ~ 300 ms

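We can see the results start to diverge as more categories are thrown in. I'll look at using "relational division" because it gives consistent timings, but implementation concerns may lead me to look at "where exists" as well (long form: http://pastebin.com/6NRX0QbJ).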
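For anyone who wants to repeat this kind of comparison, here is a rough sketch of a timing harness. Everything in it is an assumption for illustration: it uses Python's sqlite3 rather than MySQL, small randomly generated data rather than my real tables, a composite index on (product_id, category_id) that may not match any real schema, and made-up helper names. The timings it prints say nothing about MySQL; the point is only that both query shapes return the same products and can be measured side by side.

```python
import random
import sqlite3
import time


def build_random_db(n_products=2000, n_categories=8, seed=0):
    """Synthetic data: each product gets each category with probability 0.5."""
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE category_product (
            id INTEGER PRIMARY KEY,
            product_id INTEGER,
            category_id INTEGER
        );
        -- Assumed index; adjust to whatever the real schema uses.
        CREATE INDEX idx_cp_product_category
            ON category_product (product_id, category_id);
    """)
    conn.executemany(
        "INSERT INTO products VALUES (?, ?)",
        [(i, f"Product {i}") for i in range(1, n_products + 1)],
    )
    links = [
        (p, c)
        for p in range(1, n_products + 1)
        for c in range(1, n_categories + 1)
        if rng.random() < 0.5
    ]
    conn.executemany(
        "INSERT INTO category_product (product_id, category_id) VALUES (?, ?)",
        links,
    )
    return conn


def division_ids(conn, wanted):
    """GROUP BY / HAVING form ("relational division")."""
    marks = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT product_id FROM category_product
        WHERE category_id IN ({marks})
        GROUP BY product_id
        HAVING COUNT(DISTINCT category_id) = ?
        ORDER BY product_id
    """
    return [r[0] for r in conn.execute(sql, (*wanted, len(wanted)))]


def exists_ids(conn, wanted):
    """One EXISTS subquery per required category ("where exists")."""
    clause = ("EXISTS (SELECT 1 FROM category_product cp "
              "WHERE cp.product_id = p.id AND cp.category_id = ?)")
    where = " AND ".join(clause for _ in wanted)
    sql = f"SELECT p.id FROM products p WHERE {where} ORDER BY p.id"
    return [r[0] for r in conn.execute(sql, wanted)]


def timed(fn, conn, wanted):
    """Run fn once and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    ids = fn(conn, wanted)
    return ids, (time.perf_counter() - start) * 1000.0


conn = build_random_db()
comparison = {}
for n in (2, 5, 8):
    wanted = list(range(1, n + 1))
    div_ids, div_ms = timed(division_ids, conn, wanted)
    ex_ids, ex_ms = timed(exists_ids, conn, wanted)
    comparison[n] = (div_ids == ex_ids, len(div_ids))
    print(f"{n} categories: {len(div_ids)} rows, "
          f"division {div_ms:.2f} ms, exists {ex_ms:.2f} ms")
```

A single run per query is noisy; repeating each measurement and taking the median would give steadier numbers.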

5 Answers:

Answer 0: (score: 4)

SELECT p.*
FROM products p
INNER JOIN
(
    SELECT product_id
    FROM category_product
    WHERE category_id IN (1, 2)
    GROUP BY product_id
    HAVING COUNT(DISTINCT category_id) = 2
) pc
    ON p.id = pc.product_id

This technique is known as "relational division".
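As a hedged illustration of how this extends to any number of categories, the sketch below builds the IN list and the HAVING count from a list of ids. It assumes Python's sqlite3 in place of MySQL, and the function names build_sample_db and products_in_all_categories are hypothetical.

```python
import sqlite3


def build_sample_db():
    """Sample tables from the question, in an in-memory SQLite database
    (an assumption; the question's data lives in MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE category_product (
            id INTEGER PRIMARY KEY,
            product_id INTEGER,
            category_id INTEGER
        );
        INSERT INTO products VALUES
            (1, 'Product X'), (2, 'Product Y'), (3, 'Product Z');
        INSERT INTO category_product VALUES
            (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2);
    """)
    return conn


def products_in_all_categories(conn, category_ids):
    """Return (id, name) for products linked to every id in category_ids."""
    wanted = sorted(set(category_ids))  # duplicates would break the count
    if not wanted:
        return []  # design choice: an empty requirement matches nothing
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT p.id, p.name
        FROM products p
        INNER JOIN (
            SELECT product_id
            FROM category_product
            WHERE category_id IN ({placeholders})
            GROUP BY product_id
            -- a product qualifies only if it hit every requested category
            HAVING COUNT(DISTINCT category_id) = ?
        ) pc ON p.id = pc.product_id
        ORDER BY p.id
    """
    return conn.execute(sql, (*wanted, len(wanted))).fetchall()


conn = build_sample_db()
both = products_in_all_categories(conn, [1, 2])
hotel_only = products_in_all_categories(conn, [1])
print(both)
print(hotel_only)
```

The number compared in HAVING must equal the number of distinct ids in the IN list, which is why the helper de-duplicates its input first.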

Answer 1: (score: 0)

select *
from products p
where
    (
        select
            count(distinct cp.category_id)
        from category_product as cp
        where
            cp.product_id = p.id and
            cp.category_id in (1, 2)
    ) = 2

Or you can use exists:

select *
from products p
where
    exists
    (
        select
            count(distinct cp.category_id)
        from category_product as cp
        where
            cp.product_id = p.id and
            cp.category_id in (1, 2)
        having count(distinct cp.category_id) = 2
    )

Answer 2: (score: 0)

I would use EXISTS:

SELECT p.* FROM products p
WHERE EXISTS
(
    SELECT 1 FROM category_product cp
    WHERE cp.product_id = p.id
    AND category_id = 1
)
AND EXISTS
(
    SELECT 1 FROM category_product cp
    WHERE cp.product_id = p.id
    AND category_id = 2
)
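A minimal sketch of the same idea for a variable list of categories follows: it emits one EXISTS clause per required category. It assumes Python's sqlite3 instead of MySQL, and build_sample_db and products_with_exists are hypothetical names.

```python
import sqlite3


def build_sample_db():
    """Sample tables from the question, in an in-memory SQLite database
    (an assumption; the question's data lives in MySQL)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE category_product (
            id INTEGER PRIMARY KEY,
            product_id INTEGER,
            category_id INTEGER
        );
        INSERT INTO products VALUES
            (1, 'Product X'), (2, 'Product Y'), (3, 'Product Z');
        INSERT INTO category_product VALUES
            (1, 1, 1), (2, 1, 2), (3, 2, 1), (4, 3, 2);
    """)
    return conn


def products_with_exists(conn, category_ids):
    """Return (id, name) for products that have a link row for every id."""
    wanted = sorted(set(category_ids))
    if not wanted:
        return []  # design choice: an empty requirement matches nothing
    # One correlated EXISTS per required category, all joined with AND.
    exists_clause = """EXISTS (
            SELECT 1 FROM category_product cp
            WHERE cp.product_id = p.id AND cp.category_id = ?
        )"""
    where = "\n        AND ".join(exists_clause for _ in wanted)
    sql = f"""
        SELECT p.id, p.name
        FROM products p
        WHERE {where}
        ORDER BY p.id
    """
    return conn.execute(sql, wanted).fetchall()


conn = build_sample_db()
exists_both = products_with_exists(conn, [1, 2])
exists_accommodation = products_with_exists(conn, [2])
print(exists_both)
print(exists_accommodation)
```

The query text grows with the number of categories, one subquery each, whereas the GROUP BY form keeps a single subquery and only lengthens its IN list.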

Answer 3: (score: 0)

SELECT categories.name, products.name
FROM category_product, categories, products
WHERE category_product.product_id = products.id
  AND category_product.category_id = categories.id
  AND
  (
      -- count this product's links to the two wanted categories
      SELECT COUNT(DISTINCT cp.category_id)
      FROM category_product cp
      WHERE cp.product_id = products.id
        AND (cp.category_id = 1 OR cp.category_id = 2)
  ) = 2

Answer 4: (score: -1)

SELECT p.id
FROM products p
JOIN category_product cp
ON p.id = cp.product_id
WHERE cp.category_id IN (1,2)
GROUP BY p.id
HAVING COUNT(DISTINCT cp.category_id) = 2