Multiple left joins - keeping the returned number of rows down?

Time: 2012-02-23 00:34:18

Tags: sql sql-server tsql

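I'm trying to figure out how best to query a schema that consists of one central table plus a number of "attribute" tables (sorry, not sure of the best terminology here) that record one-to-many relationships. In the business layer, each of these tables corresponds to a collection that may contain zero or more elements.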

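Right now the code I'm looking at retrieves the data by fetching a list of values from the main table, then looping over it and querying each of the "accessory" tables to populate these collections.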

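I'd like to get this down to a single query if I can. I tried using multiple LEFT JOINs, but that effectively joins on the cross product of the values in the accessory tables, which causes an explosion of rows, especially as you add more joins. The table in question has five such relationships, so the number of rows returned per record can be huge and consists almost entirely of redundant data.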

Here's a smaller synthetic example with some tables, data, the query structure I'm using, and the results:

Database structure & data:

create table Containers (
  Id int not null primary key,
  Name nvarchar(8) not null);

create table Containers_Animals (
  Container int not null references Containers(Id),
  Animal nvarchar(8) not null,
  primary key (Container, Animal)
  );

create table Containers_Foods (
  Container int not null references Containers(Id),
  Food nvarchar(8) not null,
  primary key (Container, Food)
  );

insert into Containers (Id, Name) 
  values (0, 'box'), (1, 'sack'), (2, 'bucket');

insert into Containers_Animals (Container, Animal)
  values (1, 'monkey'), (2, 'dog'), (2, 'whale'), (2, 'lemur'); 

insert into Containers_Foods (Container, Food)
  values (1, 'lime'), (2, 'bread'), (2, 'chips'), (2, 'apple'), (2, 'grape');

Which is coupled to a business object like this:

class Container {
    public string Name;
    public string[] Animals;  // may be empty
    public string[] Foods;    // may be empty
}

This is how I'm constructing the query:

select c.Name container, a.Animal animal, f.Food food from Containers c
  left join Containers_Animals a on a.Container = c.Id
  left join Containers_Foods f on f.Container = c.Id;

Which gives these results:

container animal   food
--------- -------- --------
box       NULL     NULL
sack      monkey   lime
bucket    dog      apple
bucket    dog      bread
bucket    dog      chips
bucket    dog      grape
bucket    lemur    apple
bucket    lemur    bread
bucket    lemur    chips
bucket    lemur    grape
bucket    whale    apple
bucket    whale    bread
bucket    whale    chips
bucket    whale    grape

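What I'd like to see instead is, for each record in the root table, a number of rows equal to the largest number of values associated with it in any one of the relationships, with the empty spaces filled in with NULL. That would keep the number of rows returned way, way, way down while still being easy to transform into objects. Something like this: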

container animal   food
--------- -------- --------
box       NULL     NULL
sack      monkey   lime
bucket    dog      apple
bucket    lemur    bread
bucket    whale    chips
bucket    NULL     grape
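
To put numbers on the difference between the two result sets, here is a minimal Python sketch of the row arithmetic. It assumes only the sample data above; the helper names `joined_rows` and `packed_rows` are made up for illustration.

```python
from math import prod

# Child-row counts per container in the sample data: (animals, foods).
child_counts = {"box": (0, 0), "sack": (1, 1), "bucket": (3, 4)}

def joined_rows(counts):
    """Rows produced by chaining LEFT JOINs: the child counts multiply,
    and an empty child still yields one NULL-padded row."""
    return prod(max(1, n) for n in counts)

def packed_rows(counts):
    """Rows needed when values are lined up by position instead:
    the largest child collection, with a minimum of one row."""
    return max([1, *counts])

joined_total = sum(joined_rows(c) for c in child_counts.values())
packed_total = sum(packed_rows(c) for c in child_counts.values())
print(joined_total, packed_total)  # prints "14 6"

# A hypothetical record with five relationships of four values each.
print(joined_rows((4,) * 5), packed_rows((4,) * 5))  # prints "1024 4"
```

The totals match the 14 rows of the current output and the 6 rows of the desired output, and the last line hints at how quickly the product grows with five relationships.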

Is this possible?

3 Answers:

Answer 0: (score: 4)

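Why not return two datasets, each ordered by container, and then perform a logical merge join on them in the client? What you're asking for makes the database engine do more work, with a more complex query, for (to me) little benefit.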

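It would look something like this. Use two left joins (one per dataset) to make sure each dataset has at least one instance of every container name, then loop through them simultaneously. Here's some rough pseudocode: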

' Both recordsets are ordered by container and hold at least one row per container.
Dim CurrentContainer
If Not Animals.Eof Then
   CurrentContainer = Animals.Container
End If
Do While Not Animals.Eof Or Not Foods.Eof
   Row = New Couplet(AnimalType, FoodType)
   If Not Animals.Eof AndAlso Animals.Container = CurrentContainer Then
      Row.AnimalType = Animals.Animal
      Animals.MoveNext
   End If
   If Not Foods.Eof AndAlso Foods.Container = CurrentContainer Then
      Row.FoodType = Foods.Food
      Foods.MoveNext
   End If
   ' Move on only once both recordsets have left the current container.
   If (Animals.Eof OrElse Animals.Container <> CurrentContainer) _
      AndAlso (Foods.Eof OrElse Foods.Container <> CurrentContainer) Then
      If Not Animals.Eof Then
         CurrentContainer = Animals.Container
      ElseIf Not Foods.Eof Then
         CurrentContainer = Foods.Container
      End If
   End If
   'Process the row, output it, put it in a stack, build a new recordset, whatever.
Loop
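
The same idea can be sketched in runnable form. This is only an illustration in Python, not the code of any particular client library: `merge_by_container` is a made-up name, and the two input lists stand in for the two ordered recordsets, assuming each was produced by a left join from Containers so that every container appears at least once in both.

```python
from itertools import groupby, zip_longest
from operator import itemgetter

def merge_by_container(animal_rows, food_rows):
    """Pair animals and foods by position within each container.

    Both inputs are lists of (container, value) tuples in the same
    container order, with a (container, None) row where a container
    has no values (the effect of the left joins).
    """
    animal_groups = groupby(animal_rows, key=itemgetter(0))
    food_groups = groupby(food_rows, key=itemgetter(0))
    for (container, animals), (other, foods) in zip(animal_groups, food_groups):
        if container != other:
            raise ValueError("recordsets are not aligned by container")
        # zip_longest pads the shorter side with None, like the NULLs wanted.
        for animal_row, food_row in zip_longest(animals, foods):
            yield (container,
                   animal_row[1] if animal_row else None,
                   food_row[1] if food_row else None)

# Stand-ins for the two recordsets built from the question's sample data.
animal_rows = [("box", None), ("sack", "monkey"),
               ("bucket", "dog"), ("bucket", "lemur"), ("bucket", "whale")]
food_rows = [("box", None), ("sack", "lime"),
             ("bucket", "apple"), ("bucket", "bread"),
             ("bucket", "chips"), ("bucket", "grape")]

merged = list(merge_by_container(animal_rows, food_rows))
for row in merged:
    print(row)
```

With the sample data this yields six rows, ending with ('bucket', None, 'grape'), which is the shape asked for in the question.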

However, of course what you're asking for is possible! Here are two ways.

  1. Process the inputs separately and join them on their position:

    WITH CA AS (
        SELECT *,
            Row_Number() OVER (PARTITION BY Container ORDER BY Animal) Pos
        FROM Containers_Animals
    ), CF AS (
        SELECT *,
            Row_Number() OVER (PARTITION BY Container ORDER BY Food) Pos
        FROM Containers_Foods
    )
    SELECT
        C.Name,
        CA.Animal,
        CF.Food
    FROM
        Containers C
        LEFT JOIN (
            SELECT Container, Pos FROM CA
            UNION SELECT Container, Pos FROM CF
        ) P ON C.Id = P.Container
        LEFT JOIN CA
            ON C.Id = CA.Container
            AND P.Pos = CA.Pos
        LEFT JOIN CF
            ON C.Id = CF.Container
            AND P.Pos = CF.Pos;
    
  2. Concatenate the inputs vertically and pivot them:

    WITH FoodAnimals AS (
        SELECT
            C.Name,
            1 Which,
            CA.Animal Item,
            Row_Number() OVER (PARTITION BY C.Id ORDER BY (CA.Animal)) Pos
        FROM
            Containers C
            LEFT JOIN Containers_Animals CA
                ON C.Id = CA.Container
        UNION
        SELECT
            C.Name,
            2 Which,
            CF.Food,
            Row_Number() OVER (PARTITION BY C.Id ORDER BY (CF.Food)) Pos
        FROM
            Containers C
            LEFT JOIN Containers_Foods CF
                ON C.Id = CF.Container
    )
    SELECT
        P.Name,
        P.[1] Animal,
        P.[2] Food
    FROM
        FoodAnimals FA
        PIVOT (Max(Item) FOR Which IN ([1], [2])) P;
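
As a quick sanity check of the first approach, here is a sketch that runs it against the question's sample data from Python using the standard-library `sqlite3` module. Several things here are assumptions rather than part of the original: SQLite stands in for SQL Server (window functions need SQLite 3.25 or later), the `SELECT *` is spelled out as explicit columns, and an `ORDER BY` is added so the output order is deterministic.

```python
import sqlite3

# Schema and sample rows from the question; SQLite accepts them as written.
SETUP = """
create table Containers (Id int not null primary key, Name nvarchar(8) not null);
create table Containers_Animals (
  Container int not null references Containers(Id),
  Animal nvarchar(8) not null,
  primary key (Container, Animal));
create table Containers_Foods (
  Container int not null references Containers(Id),
  Food nvarchar(8) not null,
  primary key (Container, Food));
insert into Containers (Id, Name) values (0, 'box'), (1, 'sack'), (2, 'bucket');
insert into Containers_Animals (Container, Animal)
  values (1, 'monkey'), (2, 'dog'), (2, 'whale'), (2, 'lemur');
insert into Containers_Foods (Container, Food)
  values (1, 'lime'), (2, 'bread'), (2, 'chips'), (2, 'apple'), (2, 'grape');
"""

# Approach 1: number each child table's rows per container, then join on position.
QUERY = """
WITH CA AS (
    SELECT Container, Animal,
        ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Animal) AS Pos
    FROM Containers_Animals
), CF AS (
    SELECT Container, Food,
        ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Food) AS Pos
    FROM Containers_Foods
)
SELECT C.Name, CA.Animal, CF.Food
FROM Containers C
    LEFT JOIN (
        SELECT Container, Pos FROM CA
        UNION SELECT Container, Pos FROM CF
    ) P ON C.Id = P.Container
    LEFT JOIN CA ON C.Id = CA.Container AND P.Pos = CA.Pos
    LEFT JOIN CF ON C.Id = CF.Container AND P.Pos = CF.Pos
ORDER BY C.Id, P.Pos;
"""

def run_positional_join():
    """Build the sample database in memory and return the query's rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(SETUP)
        return conn.execute(QUERY).fetchall()
    finally:
        conn.close()

rows = run_positional_join()
for row in rows:
    print(row)
```

Under those assumptions the query returns six rows, one for box, one for sack, and four for bucket, with NULL (None in Python) in the animal column of the last bucket row.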
    

Answer 1: (score: 0)

; with a as (
    select ID, c.Name container, a.Animal animal
    , r=row_number()over(partition by c.ID order by a.Animal)
    from Containers c
      left join Containers_Animals a on a.Container = c.Id
)
, b as (
    select ID, c.Name container, f.Food food
    , r=row_number()over(partition by c.ID order by f.Food)
    from Containers c
      left join Containers_Foods f on f.Container = c.Id

)
select a.container, a.animal, b.food
from a
left join b on a.container=b.container and a.r=b.r
union
select b.container, a.animal, b.food
from b
left join a on a.container=b.container and a.r=b.r

Answer 2: (score: 0)

WITH
  ca_ranked AS (
    SELECT
      *,
      rnk = ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Animal)
    FROM Containers_Animals
  ),
  cf_ranked AS (
    SELECT
      *,
      rnk = ROW_NUMBER() OVER (PARTITION BY Container ORDER BY Food)
    FROM Containers_Foods
  )
SELECT
  container = c.Name,
  animal    = ca.Animal,
  food      = cf.Food
FROM ca_ranked ca
  FULL  JOIN cf_ranked cf ON ca.Container = cf.Container AND ca.rnk = cf.rnk
  RIGHT JOIN Containers c ON c.Id = COALESCE(ca.Container, cf.Container)
;