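I have come across many posts on this problem, and the proposed solutions all tend to take the same approach, which is very inconvenient in my case.
Most of the time, something like this is suggested: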
DECLARE @Actors TABLE ( [Id] INT, [Name] VARCHAR(20), [MovieId] INT );
DECLARE @Movie TABLE ( [Id] INT, [Name] VARCHAR(20), [FranchiseId] INT );

INSERT INTO @Actors ( Id, Name, MovieId )
VALUES ( 1, 'Sean Connery', 1 ),
       ( 2, 'Gert Fröbe', 1 ),
       ( 3, 'Honor Blackman', 1 ),
       ( 4, 'Daniel Craig', 2 ),
       ( 5, 'Judi Dench', 2 ),
       ( 6, 'Harrison Ford', 3 );
INSERT INTO @Movie ( Id, Name, FranchiseId )
VALUES ( 1, 'Goldfinger', 1 ),
       ( 2, 'Skyfall', 1 ),
       ( 3, 'Return of the Jedi', 2 );

SELECT m.Name,
       STUFF(( SELECT ',' + a_c.Name
               FROM @Actors a_c
               WHERE a_c.MovieId = m.Id
               FOR XML PATH('')
             ), 1, 1, '')
FROM @Actors a
JOIN @Movie m ON a.MovieId = m.Id
GROUP BY m.Id,
         m.Name;
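The problem is (how shall I put it?) that you don't really get access to the grouped items the way you do with Count(), Max(), Min() and so on. Instead, you rebuild the outer query inside the subquery and use its WHERE clause to force the relevant values to match the ones in the outer query's GROUP BY clause.
If it isn't clear what I mean: I extended the example above with one more table, and you will see that I also had to extend the "inner query".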
DECLARE @Actors TABLE ( [Id] INT, [Name] VARCHAR(20), [MovieId] INT );
DECLARE @Movie TABLE ( [Id] INT, [Name] VARCHAR(20), [FranchiseId] INT );
DECLARE @Franchise TABLE ( [Id] INT, [Name] VARCHAR(20) );

INSERT INTO @Actors ( Id, Name, MovieId )
VALUES ( 1, 'Sean Connery', 1 ),
       ( 2, 'Gert Fröbe', 1 ),
       ( 3, 'Honor Blackman', 1 ),
       ( 4, 'Daniel Craig', 2 ),
       ( 5, 'Judi Dench', 2 ),
       ( 6, 'Harrison Ford', 3 );
INSERT INTO @Movie ( Id, Name, FranchiseId )
VALUES ( 1, 'Goldfinger', 1 ),
       ( 2, 'Skyfall', 1 ),
       ( 3, 'Return of the Jedi', 2 );

INSERT INTO @Franchise ( Id, Name )
VALUES ( 1, 'James Bond' ),
       ( 2, 'Star Wars' );

SELECT f.Name,
       STUFF(( SELECT ',' + a_c.Name
               FROM @Actors a_c
               JOIN @Movie m_c ON a_c.MovieId = m_c.Id
               WHERE m_c.FranchiseId = f.Id
               FOR XML PATH('')
             ), 1, 1, '')
FROM @Actors a
JOIN @Movie m ON a.MovieId = m.Id
JOIN @Franchise f ON m.FranchiseId = f.Id -- join on the franchise's key, not the movie's
GROUP BY f.Id,
         f.Name;
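Now take this a step further and imagine a huge, very complex query that groups on several values across many tables. Performance is a concern, and I don't want to rebuild the whole join pattern in the "inner query".
Is there another way to do this? One that doesn't kill performance and doesn't require duplicating the join pattern?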
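Answer 0: (score: 0)
You can simplify the query with a common table expression (CTE). That way, you only need to specify the JOINs once. Also, you really only need one column in the GROUP BY: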
WITH idsAndNames AS -- the CTE that is used in two places
(
    SELECT f.Id   AS FranchiseId,
           f.Name AS FranchiseName,
           m.Id   AS MovieId,
           m.Name AS MovieName,
           a.Id   AS ActorId,
           a.Name AS ActorName
    FROM @Actors a
    JOIN @Movie m ON a.MovieId = m.Id
    JOIN @Franchise f ON m.FranchiseId = f.Id
)
SELECT n.FranchiseName,
       STUFF(( SELECT ',' + x.ActorName -- you might need a DISTINCT here, btw.
               FROM idsAndNames x
               -- correlate on the grouped column: n.FranchiseId is not in the
               -- GROUP BY, so SQL Server would reject a reference to it here
               WHERE x.FranchiseName = n.FranchiseName
               FOR XML PATH('')), 1, 1, '')
FROM idsAndNames n
GROUP BY /* n.FranchiseId, */ n.FranchiseName -- name might suffice if it's unique
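Answer 1: (score: 0)
Contrary to what I said in this comment, you don't need a GROUP BY clause at all [struck out: nor a WHERE clause].
You only need the outer SELECT to "iterate" over all franchises (or whatever you want to group by). Then, in the inner SELECT, you need some JOINs to reach the franchise key column [struck out: and, instead of a WHERE clause that filters on the outer franchise's key, you simply use the outer franchise's key directly in the INNER JOIN]: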
SELECT f.Name AS FranchiseName,
       COALESCE(STUFF(( SELECT DISTINCT ', ' + a.Name
                        FROM @Actors a
                        JOIN @Movie m ON a.MovieId = m.Id
                        WHERE m.FranchiseId = f.Id
                        ORDER BY ', ' + a.Name -- this is optional
                        FOR XML PATH('')
                      ), 1, 2, ''), '') AS ActorNames -- 2 = length of the ', ' separator
FROM @Franchise f;
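Source: "High Performance T-SQL Using Window Functions" by Itzik Ben-Gan. Because SQL Server unfortunately has no aggregate/window function for concatenating values, the book's author recommends something like the above as the next-best solution.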
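A note for readers on newer versions: the limitation mentioned above applies to SQL Server releases before 2017. From SQL Server 2017 onwards there is a built-in STRING_AGG aggregate, which lets the grouped rows be concatenated directly, without a correlated subquery. The following is only a minimal sketch, assuming SQL Server 2017 or later and assuming the @Actors, @Movie and @Franchise table variables from the question are declared in the same batch:

```sql
-- Assumption: SQL Server 2017+ (STRING_AGG is not available in earlier versions).
-- Assumption: @Actors, @Movie and @Franchise are declared and filled as in the question.
SELECT f.Name AS FranchiseName,
       STRING_AGG(a.Name, ', ') WITHIN GROUP (ORDER BY a.Name) AS ActorNames
FROM @Franchise f
JOIN @Movie m ON m.FranchiseId = f.Id
JOIN @Actors a ON a.MovieId = m.Id
GROUP BY f.Id, f.Name;
```

Unlike the DISTINCT variant, this sketch does not remove duplicate actor names, and because it uses inner joins, a franchise without any actors would not appear in the result at all.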
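P.S.: I have removed my previous solution that substituted an additional JOIN for a WHERE clause; I am now fairly certain that a WHERE clause is likely to perform better. Nevertheless, I have left some evidence of my previous solution (i.e. the struck-through text), because I referred to it in an earlier comment.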