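Inner join between a table and a subquery on the same table

Time: 2015-02-06 14:43:25

Tags: sql-server subquery inner-join

SQL Server 2012.

Edit: My original query was more complicated than it needed to be, because I was trying to run a DISTINCT query on a subset of the table's fields and then join it back to the table itself to get the other (text) fields. The following query also solves the problem: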

SELECT DISTINCT
    p1.id
    ,p1.Name
    ,CAST( p1.[Description] AS nvarchar(max)) AS Description
    ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Categories
  FROM [dbo].[Company] AS p1
  ORDER BY p1.Id

I have a table with data similar to this (each company has multiple records that are identical except for the Category field):

+----+------+-----------------+----------+
| Id | Name | Description     | Category |
+----+------+-----------------+----------+
| 1  | AAA  | <loads of text> | cat1     |
| 1  | AAA  | <loads of text> | cat2     |
| 2  | BBB  | <even more text>| cat1     |
| 2  | BBB  | <even more text>| cat3     |
+----+------+-----------------+----------+

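I am trying to write a query that returns this result (one unique record per company, with the categories aggregated into a single field):

+----+------+-----------------+------------+
| Id | Name | Description     | Category   |
+----+------+-----------------+------------+
| 1  | AAA  | <loads of text> | cat1, cat2 |
| 2  | BBB  | <even more text>| cat1, cat3 |
+----+------+-----------------+------------+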

Using information from various threads, I came up with this:

SELECT 
    t1.Id
    ,t2.Name
    ,t2.[Description]
    ,t1.Category
FROM  [dbo].[Company] AS t2 
INNER JOIN (SELECT DISTINCT p1.Id
      ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS p1
  ) AS t1 ON t1.Id = t2.Id
  ORDER BY t1.Id

The result contains one row for every record in the Company table, with the categories aggregated into the Category field:

+----+------+-----------------+------------+
| Id | Name | Description     | Category   |
+----+------+-----------------+------------+
| 1  | AAA  | <loads of text> | cat1, cat2 |
| 1  | AAA  | <loads of text> | cat1, cat2 |
| 2  | BBB  | <even more text>| cat1, cat3 |
| 2  | BBB  | <even more text>| cat1, cat3 |
+----+------+-----------------+------------+

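I thought an INNER JOIN only selects rows when both tables match. The subquery on its own produces the expected result (one record per Id, with the categories aggregated). I tried adding another GROUP BY clause to the whole query, but that failed because I cannot include the Description field in the GROUP BY clause, since it is a text-type field.

What am I missing?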

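3 Answers:

Answer 0: (score: 2)

I would try DISTINCT in the outer query. That should fix your problem unless the description or name differs between some rows of the same Id, which is entirely possible: your database table should really be two tables, and you probably never wrote any code to ensure that the description and name stay the same for each Id. If you have a composite unique index on id, name and description, you are probably fine.

If you do have multiple descriptions for one Id, you will need to use aggregation in the outer query to deal with it. Alternatively, fix the data and add a unique index to prevent the problem in the future.

As for why you are seeing this: the join works correctly, but you have a one-to-many relationship between the derived table and the other table. That is why you get multiple records, and why DISTINCT should fix it, unless the name and description data differ within the same Id.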
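The one-to-many effect can be reproduced in isolation. The following is a minimal sketch, assuming a table variable named @Company (a hypothetical stand-in for [dbo].[Company]) filled with the sample rows from the question; it counts how many joined rows each Id produces.

```sql
-- Hypothetical sample data mirroring the table in the question.
DECLARE @Company TABLE (
    Id            int,
    Name          varchar(10),
    [Description] varchar(max),
    Category      varchar(10)
);

INSERT INTO @Company (Id, Name, [Description], Category)
VALUES (1, 'AAA', 'loads of text',  'cat1'),
       (1, 'AAA', 'loads of text',  'cat2'),
       (2, 'BBB', 'even more text', 'cat1'),
       (2, 'BBB', 'even more text', 'cat3');

-- Derived table d: one row per Id.
-- Base table c:    two rows per Id.
-- Each row of d matches every base row with the same Id,
-- so the join returns as many rows per Id as the base table holds.
SELECT d.Id, COUNT(*) AS JoinedRows
FROM (SELECT DISTINCT Id FROM @Company) AS d
INNER JOIN @Company AS c ON c.Id = d.Id
GROUP BY d.Id
ORDER BY d.Id;
```

With this sample data the query should report two joined rows for each Id, which matches the duplicated output shown in the question.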

Try this:

SELECT DISTINCT
    t1.Id
    ,t2.Name
    ,cast(t2.[Description] as nvarchar(max))
    ,t1.Category
FROM  [dbo].[Company] AS t2 
INNER JOIN (SELECT DISTINCT p1.Id
      ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS p1
  ) AS t1 ON t1.Id = t2.Id
  ORDER BY t1.Id

Or you could fix your poor table design.
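As a rough illustration of what a two-table design could look like, here is a sketch. The table names CompanyInfo and CompanyCategory, the column sizes, and the key choices are all assumptions made for this example and do not come from the original schema. STUFF is used only to strip the leading separator from the concatenated list.

```sql
-- Hypothetical normalized schema: company attributes are stored once,
-- and each category gets its own row in a child table.
CREATE TABLE dbo.CompanyInfo (
    Id            int           NOT NULL PRIMARY KEY,
    Name          nvarchar(100) NOT NULL,
    [Description] nvarchar(max) NULL
);

CREATE TABLE dbo.CompanyCategory (
    CompanyId int          NOT NULL REFERENCES dbo.CompanyInfo (Id),
    Category  nvarchar(50) NOT NULL,
    PRIMARY KEY (CompanyId, Category)
);

-- One row per company; no DISTINCT or self-join is needed.
SELECT c.Id,
       c.Name,
       c.[Description],
       -- Build ', cat1, cat2' and remove the first two characters.
       STUFF((SELECT ', ' + cc.Category
              FROM dbo.CompanyCategory AS cc
              WHERE cc.CompanyId = c.Id
              ORDER BY cc.Category
              FOR XML PATH('')), 1, 2, '') AS Categories
FROM dbo.CompanyInfo AS c
ORDER BY c.Id;
```

Because Name and Description live in exactly one row per company, they can no longer drift apart between rows of the same Id. Note that FOR XML PATH('') used this way escapes characters such as & and < in the category text.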

Answer 1: (score: 1)

SELECT  *
FROM    ( SELECT    t1.Id ,
                    t2.Name ,
                    t2.[Description] ,
                    t1.Category ,
                    ROW_NUMBER() OVER ( PARTITION BY t1.Id, t2.Name,
                                        t1.Category ORDER BY t1.id ) row_num
          FROM      [dbo].[Company] AS t2
                    INNER JOIN ( SELECT DISTINCT
                                        p1.Id ,
                                        ( SELECT    [Category] + ', '
                                          FROM      [dbo].[Company] AS p2
                                          WHERE     p2.Id = p1.Id
                                          ORDER BY  Name
                                        FOR
                                          XML PATH('')
                                        ) AS Category
                                 FROM   [dbo].[Company] AS p1
                               ) AS t1 ON t1.Id = t2.Id
        ) t1
WHERE t1.row_num = 1 -- keep only the first row of each (Id, Name, Category) group
ORDER BY t1.Id

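Answer 2: (score: 1)

To work around the annoying problem of text as a data type, you can cast the column when you select it. If possible, I would change the column to varchar(max) permanently.

Something like this: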

SELECT 
    t1.Id
    , t2.Name
    , cast(t2.[Description] as varchar(max)) as Description
    , t1.Category
FROM  [dbo].[Company] AS t2 
INNER JOIN (SELECT DISTINCT p1.Id
      ,( SELECT [Category] + ', '
           FROM [dbo].[Company] AS p2
          WHERE p2.Id = p1.Id
          ORDER BY Name
            FOR XML PATH('') ) AS Category
  FROM [dbo].[Company] AS p1
  ) AS t1 ON t1.Id = t2.Id
  GROUP BY t1.Id
    , t2.Name
    , cast(t2.[Description] as varchar(max))
    , t1.Category
  ORDER BY t1.Id
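
For the permanent change mentioned above, a minimal sketch might look like the following. It assumes [Description] is currently a text column that allows NULLs; if the column is ntext, nvarchar(max) would be the matching target type. Try it on a copy of the table first.

```sql
-- Assumption: [Description] is a nullable text column.
-- State NULL or NOT NULL explicitly so the nullability matches the existing definition.
ALTER TABLE [dbo].[Company]
    ALTER COLUMN [Description] varchar(max) NULL;
```

Once the column is varchar(max), it can appear in DISTINCT and GROUP BY directly, so the casts in the queries above are no longer needed.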