Question

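I have a largeish (tens of millions of rows) SQL table that lists attribute types and attribute values. I want to investigate the relationships between subsets of these attributes (three or four at a time) for a given object. An object may have some, all, or none of the attributes I'm interested in. If it has none of them, I can treat it as if it doesn't exist.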

Id | AttributeType | AttributeValue 
------------------------------------
01 |     01        |     100        
01 |     02        |     4500
01 |     04        |      D
01 |     15        |      E

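The problem is that I want a result containing all of the attribute types I'm after if any of them exist for an object, but no result at all if none of them exist.

Running this query: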

select
    case 
        when Att1.id is null then Att2.id 
        else Att1.id
    end as Id, 
    Att1.AttributeValue as Attribute5, 
    Att2.AttributeValue as Attribute6
 from Attributes Att1
full outer join Attributes Att2
on Att1.id = Att2.id
and Att1.AttributeType = 5
and Att2.AttributeType = 6

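doesn't work for Id 1: it has neither of these attribute types, but the query still creates null records on either side of the join, so I see something like this: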

Id | Attribute5 | Attribute6
-----------------------------
01 |    100     |   Null     
01 |   4500     |   Null

If I try to avoid creating the null records, I miss records that I do want to see. This query:

select
     case 
         when Att1.id is null then Att2.id 
         else Att1.id
     end as Id, 
    Att1.AttributeValue as Attribute1, 
    Att2.AttributeValue as Attribute3
from Attributes Att1
full outer join Attributes Att2
on Att1.id = Att2.id
where Att1.AttributeType = 1
and Att2.AttributeType = 3

produces nothing, but should produce:

 Id | Attribute1 | Attribute3
-----------------------------
 01 |    100     |   Null

I can solve both of these problems by using a left join:

select
    case 
        when Att1.id is null then Att2.id 
        else Att1.id
    end as Id, 
    Att1.AttributeValue as Attribute1, 
    Att2.AttributeValue as Attribute3
from Attributes Att1
left join Attributes Att2
on Att1.id = Att2.id
and Att2.AttributeType = 3
where Att1.AttributeType = 1

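which produces the correct output.

The problem with this is that it doesn't treat the attributes equally. If Id 01 has a value for attribute 01 but not for 03, that's fine, but if it has no 01 and does have 03, I won't see it. This becomes a real problem once I scale up to three or four joins.

Ideally, given that I will have to run this query for different attribute types, and given how much processing time it takes to build the attribute table in the first place, I'd like a single query that returns all of the results I need and none of the ones I don't.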

Answer 1

You may want to use SQL Server's PIVOT feature (http://technet.microsoft.com/en-us/library/ms177410(v=sql.105).aspx).

I think the syntax for your example would be:

SELECT Id, [01], [02], [04], [15], [06]
from
(SELECT Id, AttributeType, AttributeValue From Attributes) att
PIVOT
(
    MAX(AttributeValue)
    for AttributeType IN ([01], [02], [04], [15], [06])
    ) AS myPivot

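This gives you one column for each of the listed AttributeTypes, each holding that attribute's value. Note that you have to use an aggregate function, which is why I used MAX(). If there are multiple records for the same Id / AttributeType combination, you only get the value returned by MAX(). For your example, I get: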

Id  01    02      04   15   06
01  100   4500    D    E    NULL

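With millions of rows I'm not sure how well it will perform, but it should be the simplest solution I know of, and it works for a reasonable number of columns. The NULLs should take care of themselves.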
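The PIVOT query above returns a row for every Id in the table, including ids that have none of the wanted attributes. A minimal sketch of one way to drop those ids is to filter the derived table before pivoting. This is an assumption layered on the answer, not part of it: it assumes AttributeType is an integer column, and it uses types 1 and 3 from the question purely for illustration.

```sql
-- Sketch: restrict the source rows to the wanted types before pivoting,
-- so ids that have none of these types produce no output row.
-- Assumes AttributeType is an int column (an assumption, not stated in the thread).
SELECT Id, [1] AS Attribute1, [3] AS Attribute3
FROM
(
    SELECT Id, AttributeType, AttributeValue
    FROM Attributes
    WHERE AttributeType IN (1, 3)   -- keep this list in sync with the IN list below
) AS att
PIVOT
(
    MAX(AttributeValue)
    FOR AttributeType IN ([1], [3])
) AS myPivot;
```

With the sample data in the question, Id 01 would come back with Attribute1 = 100 and Attribute3 = NULL, while an Id with neither type would not appear.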

Answer 2

Try something like this...

select distinct
    base.id, 
    Att1.AttributeValue as Attribute1, 
    Att2.AttributeValue as Attribute2,
    Att3.AttributeValue as Attribute3,
    Att4.AttributeValue as Attribute4
from Attributes base
left join Attributes Att1 on base.id = Att1.id and Att1.AttributeType = 1
left join Attributes Att2 on base.id = Att2.id and Att2.AttributeType = 2
left join Attributes Att3 on base.id = Att3.id and Att3.AttributeType = 3
left join Attributes Att4 on base.id = Att4.id and Att4.AttributeType = 4
where base.id = 1

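You need a "static" table to join the attributes onto...

Ideally, since you're not using anything from the base table other than the ID, it would perform better if you didn't use the whole table here. Given the layout, though, this will work, if only as an example. If you know you're looking at IDs 1, 3, 5 and 7, it may be better to put them in a variable or temp table and join to that, which saves joining to the Attributes table one extra time.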
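The variable/temp table suggestion could look roughly like the sketch below. The name @Ids and the id list are hypothetical, and the sketch assumes Id is an integer and that each Id / AttributeType pair occurs at most once; with duplicates, the joins would multiply rows.

```sql
-- Hypothetical list of ids to inspect; @Ids is an illustrative name.
DECLARE @Ids TABLE (Id int PRIMARY KEY);
INSERT INTO @Ids (Id) VALUES (1), (3), (5), (7);

SELECT
    ids.Id,
    Att1.AttributeValue AS Attribute1,
    Att3.AttributeValue AS Attribute3
FROM @Ids AS ids
LEFT JOIN Attributes AS Att1
    ON Att1.Id = ids.Id AND Att1.AttributeType = 1
LEFT JOIN Attributes AS Att3
    ON Att3.Id = ids.Id AND Att3.AttributeType = 3
-- Drop ids that have neither of the wanted attributes.
WHERE Att1.Id IS NOT NULL OR Att3.Id IS NOT NULL;
```

Because @Ids holds each id once, the DISTINCT from the query above should not be needed under these assumptions.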

Answer 3

Preselect your data.

SELECT *
FROM Attributes
WHERE AttributeType IN (...)

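Then do the full outer join on this reduced set (as a view or in a WITH clause).

You could also try selecting only the IDs and joining that to your existing full outer join, to see which is faster.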
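The preselect-then-join idea can be sketched with a WITH clause as follows, reusing attribute types 5 and 6 from the question. The CTE names Att5 and Att6 are illustrative, and the sketch assumes at most one row per Id / AttributeType pair.

```sql
-- Sketch: reduce the table to one small set per wanted type,
-- then full outer join the reduced sets on Id.
WITH Att5 AS (
    SELECT Id, AttributeValue
    FROM Attributes
    WHERE AttributeType = 5
),
Att6 AS (
    SELECT Id, AttributeValue
    FROM Attributes
    WHERE AttributeType = 6
)
SELECT
    COALESCE(Att5.Id, Att6.Id) AS Id,
    Att5.AttributeValue AS Attribute5,
    Att6.AttributeValue AS Attribute6
FROM Att5
FULL OUTER JOIN Att6
    ON Att5.Id = Att6.Id;
```

Because each side only contains rows of its own type, an Id with neither attribute never enters the join, and both attributes are treated equally. When a third set is added, its join condition would need to compare against COALESCE(Att5.Id, Att6.Id), since either of the first two sides may be NULL.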

Answer 4

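-- The attribute types of interest go into a table variable.
DECLARE @SearchFilter TABLE (AttributeType int)
INSERT @SearchFilter VALUES (1),(5),(17),(32)

-- Return every attribute row for ids that have at least one of those types.
SELECT a.*
FROM Attributes a
WHERE EXISTS (
  SELECT AttributeType FROM Attributes WHERE id = a.id
  INTERSECT
  SELECT AttributeType FROM @SearchFilter
)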

Selecting NULL from a table when the record may not exist
