SQL optimization using an anti-join

Time: 2015-01-12 09:58:24

Tags: sql query-optimization anti-join

I have a recursive category table and a company table with the following fields:

category(id, name, parent) // parent is foreign key to category id :)
company(id, category_1, category_2, category_3) // category_* is foreign key to category id

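The category tree has a maximum depth of 3:

category cx -> category cy -> category cz

Knowing that a company's categories are always linked to the last category in the chain (the cz level), I want all the categories the company is linked to (c1z, c2z, c3z, c1y, c2y, c3y, c1x, c2x, c3x) for my search engine. // c1y is the parent of category_1, c1x is the parent of the parent of category_1, and so on.

The best query I have come up with is: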

SELECT
  ID,
  NAME
FROM category c3
WHERE ID IN (
    select category_1 from company where id=:companyId
  union
    select category_2 from company where id=:companyId
  union
    select category_3 from company where id=:companyId
  union
    select parent from category where id in (
      select category_1 from company where id=:companyId
      union
      select category_2 from company where id=:companyId
      union
      select category_3 from company where id=:companyId
    )
  union
    select parent from category where id in (
      select parent from category where id in (
        select category_1 from company where id=:companyId
        union
        select category_2 from company where id=:companyId
        union
        select category_3 from company where id=:companyId
      )
    )
  )

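It has a lot of duplication: once for the company's category_* columns, and again because that whole block is repeated several times.

Is there any way to remove all this duplication?

- Update -

Suppose we solve the category_* fields by using two tables. What about the 3-level recursion problem?

For example, if there were only one category column, the query would be: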

SELECT
  ID,
  NAME
FROM category
WHERE ID IN (
  select category_1 from company where id=:companyId
  union
  select parent from category where id in (
    select category_1 from company where id=:companyId
  )
  union
  select parent from category where id in (
    select parent from category where id in (
      select category_1 from company where id=:companyId
    )
  )
);

2 answers:

Answer 0: (score: 1)

If you want to join the data, use something like the following (SQL Server example):

DECLARE @category TABLE (id INT IDENTITY(1,1), name VARCHAR(30), parent INT) -- parent is foreign key to category id :)
DECLARE @company TABLE (id INT IDENTITY(1,1), category_1 INT, category_2 INT, category_3 INT) --category_* is foreign key to category->id


INSERT INTO @category (name, parent )
VALUES('Top category', null), ('Cars', 1)

INSERT INTO @company (category_1, category_2 , category_3 )
VALUES(2, null, null), (2, 2, null), (2, 2, 2)


SELECT t1.*, t2.*
FROM @category AS t1
INNER JOIN @company AS t2
    ON t1.id = t2.category_1 OR t1.id = t2.category_2 OR t1.id = t2.category_3

The code above produces:

id  name  parent  id  category_1  category_2  category_3
2   Cars  1       1   2           NULL        NULL
2   Cars  1       2   2           2           NULL
2   Cars  1       3   2           2           2

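However, this database structure is wrong!

Instead of one table

company(id, category_1, category_2, category_3)

create two tables

company(id, name)
comp_cat(id, comp_id, cat_id)

Why? I don't want to answer directly, so let me ask you: 1) What happens when a company is related to more than 3 categories? 2) Why store NULLs when the second and third categories are not set?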
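
A minimal sketch of that two-table layout follows. The column types, the `NOT NULL` and `UNIQUE` constraints, and the literal company id `1` are assumptions added for illustration; only the table and column names come from the outline above.

```sql
-- Hypothetical DDL; types and constraints are assumptions, not from the thread.
CREATE TABLE category (
    id     INT PRIMARY KEY,
    name   VARCHAR(30) NOT NULL,
    parent INT REFERENCES category (id)   -- NULL for a top-level category
);

CREATE TABLE company (
    id   INT PRIMARY KEY,
    name VARCHAR(100) NOT NULL
);

-- One row per (company, category) link: no NULL placeholders
-- and no built-in limit of three categories per company.
CREATE TABLE comp_cat (
    id      INT PRIMARY KEY,
    comp_id INT NOT NULL REFERENCES company (id),
    cat_id  INT NOT NULL REFERENCES category (id),
    UNIQUE (comp_id, cat_id)
);

-- Categories directly linked to one company (id 1 is an example value).
SELECT c.id, c.name
FROM comp_cat AS cc
JOIN category AS c ON c.id = cc.cat_id
WHERE cc.comp_id = 1;
```

With this layout the three `category_*` sub-selects collapse into a single lookup on `comp_cat`, whatever the number of categories per company.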

If this is SQL Server, you can use Common Table Expressions:

;WITH CTE AS
(
    SELECT id, category_1 AS cat_id
    FROM @company 
    WHERE NOT category_1 IS NULL
    UNION ALL
    SELECT id, category_2 AS cat_id
    FROM @company 
    WHERE NOT category_2 IS NULL
    UNION ALL
    SELECT id, category_3 AS cat_id
    FROM @company 
    WHERE NOT category_3 IS NULL
)
SELECT DISTINCT t1.*, t2.*
FROM CTE AS t1 INNER JOIN @category AS t2 ON t1.cat_id = t2.id 

Cheers, Maciej

Answer 1: (score: 0)

I used a common table expression query. This is the final query I came up with:

with cte as (
    select category_1 as id from company where id=:companyId and category_1 is not null
    union
    select category_2 as id from company where id=:companyId and category_2 is not null
    union
    select category_3 as id from company where id=:companyId and category_3 is not null
)
select id, name from category where id in (
    select id from cte
    union
    select parent from category where id in (select id from cte)
    union
    select parent from category where id in (
        select parent from category where id in (select id from cte)
    )
);
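
As a possible alternative, the nested `select parent ...` levels could be replaced by a recursive common table expression that climbs the `parent` chain. The sketch below is untested against the asker's database and assumes a DBMS that accepts the `WITH RECURSIVE` keyword (PostgreSQL, SQLite, MySQL 8+); SQL Server and Oracle spell it as plain `WITH`. The names `leaf`, `linked`, and `depth` are made up for illustration.

```sql
WITH RECURSIVE
leaf (id) AS (
    -- the categories stored directly on the company row
    SELECT category_1 FROM company WHERE id = :companyId
    UNION
    SELECT category_2 FROM company WHERE id = :companyId
    UNION
    SELECT category_3 FROM company WHERE id = :companyId
),
linked (id, depth) AS (
    -- anchor member: the leaf categories, skipping unset columns
    SELECT id, 1 FROM leaf WHERE id IS NOT NULL
    UNION ALL
    -- recursive member: step from each category to its parent
    SELECT c.parent, l.depth + 1
    FROM linked AS l
    JOIN category AS c ON c.id = l.id
    WHERE c.parent IS NOT NULL
      AND l.depth < 3            -- the tree has at most 3 levels
)
SELECT DISTINCT c.id, c.name
FROM linked AS l
JOIN category AS c ON c.id = l.id;
```

The `depth < 3` guard mirrors the stated maximum depth and also stops the recursion if the data ever contained a cycle. If the tree grew deeper, only that constant would need to change.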

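This is the best way I could come up with. Thanks to @Maciej for showing the way, and to @Nicholai for the information about DBMS support.

If only there were a way to turn one row's columns into rows, as in MATLAB... :P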
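
Some databases can turn the columns of one row into several rows. A hedged sketch, assuming PostgreSQL syntax (`CROSS JOIN LATERAL` over a `VALUES` list); the aliases `co`, `v`, and `cat_id` are illustrative. SQL Server offers a similar construct with `CROSS APPLY (VALUES ...)`.

```sql
-- Emit one row per non-NULL category_* column of the chosen company.
SELECT v.cat_id
FROM company AS co
CROSS JOIN LATERAL (
    VALUES (co.category_1), (co.category_2), (co.category_3)
) AS v (cat_id)
WHERE co.id = :companyId
  AND v.cat_id IS NOT NULL;
```

This reads the `company` row once instead of three times; duplicates are kept, so wrap it in `SELECT DISTINCT` if the same category can appear in more than one column.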