Question

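I know a lot of people have observed this behavior, but I'd like to know whether anyone can explain why it happens. When I create a small table to build an example that uses the PIVOT function, I get the results I expect: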

CREATE TABLE dbo.AverageFishLength
    (
      Fishtype VARCHAR(50) ,
      AvgLength DECIMAL(8, 2) ,
      FishAge_Years INT
    )
INSERT  INTO dbo.AverageFishLength
        ( Fishtype, AvgLength, FishAge_Years )
VALUES  ( 'Muskie', 32.75, 3 ),
        ( 'Muskie', 37.5, 4 ),
        ( 'Muskie', 39.75, 5 ),
        ( 'Walleye', 16.5, 3 ),
        ( 'Walleye', 18.25, 4 ),
        ( 'Walleye', 20.0, 5 ),
        ( 'Northern Pike', 20.75, 3 ),
        ( 'Northern Pike', 23.25, 4 ),
        ( 'Northern Pike', 26.0, 5 );

Here is the pivot query:

SELECT  Fishtype ,
        [3] AS [3 Years Old] ,
        [4] AS [4 Years Old] ,
        [5] AS [5 Years Old]
FROM    dbo.AverageFishLength   PIVOT( SUM(AvgLength) 
                                FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl

The result is one row per fish type, with all three age columns filled in.

However, if I create the table with an identity column, the results are split across separate rows:

DROP TABLE dbo.AverageFishLength
CREATE TABLE dbo.AverageFishLength
    (
      ID INT IDENTITY(1,1) ,
      Fishtype VARCHAR(50) ,
      AvgLength DECIMAL(8, 2) ,
      FishAge_Years INT
    )
INSERT  INTO dbo.AverageFishLength
        ( Fishtype, AvgLength, FishAge_Years )
VALUES  ( 'Muskie', 32.75, 3 ),
        ( 'Muskie', 37.5, 4 ),
        ( 'Muskie', 39.75, 5 ),
        ( 'Walleye', 16.5, 3 ),
        ( 'Walleye', 18.25, 4 ),
        ( 'Walleye', 20.0, 5 ),
        ( 'Northern Pike', 20.75, 3 ),
        ( 'Northern Pike', 23.25, 4 ),
        ( 'Northern Pike', 26.0, 5 );

The exact same query:

SELECT  Fishtype ,
        [3] AS [3 Years Old] ,
        [4] AS [4 Years Old] ,
        [5] AS [5 Years Old]
FROM    dbo.AverageFishLength   PIVOT( SUM(AvgLength) 
                                FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl

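This time the results are different: there are nine rows, one per source row, each with a value in only one age column and NULL in the other two.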

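It looks to me as though the ID column is being used in the query even though it doesn't appear in the query at all. It's almost as if it is implicitly part of the query but not shown in the result set.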

Can anyone explain why this happens?

Answer 1

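This happens because the ID column is unique for every row, and because you are querying the table directly (without a subquery), that column is included in the GROUP BY that the aggregate function requires.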

The MSDN documentation for FROM states the following:

table_source PIVOT <pivot_clause>

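Specifies that the table_source is pivoted based on the pivot_column. table_source is a table or table expression. The output is a table that contains all columns of the table_source except the pivot_column and value_column. The columns of the table_source, except the pivot_column and value_column, are called the grouping columns of the pivot operator.

PIVOT performs a grouping operation on the input table with regard to the grouping columns and returns one row for each group. Additionally, the output contains one column for each value specified in the column_list that appears in the pivot_column of the input_table.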
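One way to see the grouping columns for yourself, assuming the second version of the table (the one with the ID column) is in place, is to select every column the PIVOT produces instead of a hand-picked list. This is only a quick check, not something taken from the documentation quoted above:

```sql
-- Return every column the PIVOT outputs: the grouping columns
-- plus one column per value named in the IN list.
SELECT  *
FROM    dbo.AverageFishLength
PIVOT   ( SUM(AvgLength)
          FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl;
```

The output should include ID and Fishtype alongside [3], [4] and [5], which shows that ID takes part in the grouping even when the outer SELECT leaves it out.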

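Your version is basically saying SELECT * FROM yourtable and then PIVOT the data. Even though the ID column is not in your final SELECT list, it is a grouping element in the query. If you compare the PIVOT to a "pre-PIVOT" example, you can see what your version is doing. This example uses a CASE expression and an aggregate function: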

SELECT  Fishtype,
        SUM(CASE WHEN FishAge_Years = 3 THEN AvgLength ELSE 0 END) AS [3],
        SUM(CASE WHEN FishAge_Years = 4 THEN AvgLength ELSE 0 END) AS [4],
        SUM(CASE WHEN FishAge_Years = 5 THEN AvgLength ELSE 0 END) AS [5]
FROM    dbo.AverageFishLength
GROUP BY Fishtype, ID;

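The result will be skewed because, even though ID is not in the final select list, it is still being used for grouping, and since its values are unique you get multiple rows.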
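For comparison, here is a sketch of the same CASE-based query grouped by Fishtype alone, which is what the original PIVOT was expected to do. It uses only the table and columns from the question:

```sql
-- Group only by Fishtype, so each fish type collapses into a single row.
SELECT  Fishtype,
        SUM(CASE WHEN FishAge_Years = 3 THEN AvgLength ELSE 0 END) AS [3],
        SUM(CASE WHEN FishAge_Years = 4 THEN AvgLength ELSE 0 END) AS [4],
        SUM(CASE WHEN FishAge_Years = 5 THEN AvgLength ELSE 0 END) AS [5]
FROM    dbo.AverageFishLength
GROUP BY Fishtype;
```

With ID out of the GROUP BY, this returns three rows, one per fish type.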

The easiest way to resolve this when using PIVOT is with a subquery:

SELECT  Fishtype ,
        [3] AS [3 Years Old] ,
        [4] AS [4 Years Old] ,
        [5] AS [5 Years Old]
FROM    ( SELECT Fishtype, AvgLength, FishAge_Years
          FROM   dbo.AverageFishLength ) d
PIVOT   ( SUM(AvgLength)
          FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl;

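In this version you return only the columns from your table that you actually need and want. That excludes ID, so it is not used to group the data.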
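If you prefer, the same column restriction can be written as a common table expression instead of a derived table. This is a sketch of an equivalent form; the CTE name FishSource is just an illustrative choice and does not come from the question:

```sql
-- Limit the PIVOT input to the three columns it needs,
-- so ID never reaches the PIVOT and cannot become a grouping column.
-- FishSource is a hypothetical name chosen for this example.
WITH FishSource AS
(
    SELECT  Fishtype, AvgLength, FishAge_Years
    FROM    dbo.AverageFishLength
)
SELECT  Fishtype,
        [3] AS [3 Years Old],
        [4] AS [4 Years Old],
        [5] AS [5 Years Old]
FROM    FishSource
PIVOT   ( SUM(AvgLength)
          FOR FishAge_Years IN ( [3], [4], [5] ) ) AS PivotTbl;
```

Note that in T-SQL any statement preceding the WITH in the same batch must be terminated with a semicolon.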
