I'm trying to select one row per user. I don't care which image I get. This query works in MySQL, but not in SQL Server:
SELECT users.id, (images.path + images.name) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
Answer 0 (score: 12)
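Solutions posted so far that use MIN/MAX aggregates or ROW_NUMBER may not be the most efficient (depending on data distribution), because they typically have to examine all matching rows before choosing one per group.
Using the AdventureWorks sample database to illustrate, the following queries all select a single TransactionType and ReferenceOrderID from the Transaction History table for each ProductID: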
MIN/MAX aggregate
SELECT
p.ProductID,
MIN(th.TransactionType + STR(th.ReferenceOrderID, 11))
FROM Production.Product AS p
INNER JOIN Production.TransactionHistory AS th ON
th.ProductID = p.ProductID
GROUP BY
p.ProductID;
ROW_NUMBER
WITH x AS
(
SELECT
th.ProductID,
th.TransactionType,
th.ReferenceOrderID,
rn = ROW_NUMBER() OVER (PARTITION BY th.ProductID ORDER BY (SELECT NULL))
FROM Production.TransactionHistory AS th
)
SELECT
p.ProductID,
x.TransactionType,
x.ReferenceOrderID
FROM Production.Product AS p
INNER JOIN x ON x.ProductID = p.ProductID
WHERE
x.rn = 1
OPTION (MAXDOP 1);
The ANY aggregate
SELECT
q.ProductID,
q.TransactionType,
q.ReferenceOrderID
FROM
(
SELECT
p.ProductID,
th.TransactionType,
th.ReferenceOrderID,
rn = ROW_NUMBER() OVER (
PARTITION BY p.ProductID
ORDER BY p.ProductID)
FROM Production.Product AS p
JOIN Production.TransactionHistory AS th ON p.ProductID = th.ProductID
) AS q
WHERE
q.rn = 1;
For details on the ANY aggregate, see this blog post.
TOP
SELECT p.ProductID,
(
-- No ORDER BY, so could be any row
SELECT TOP (1)
th.TransactionType + STR( th.ReferenceOrderID, 11)
FROM Production.TransactionHistory AS th WITH (FORCESEEK)
WHERE
th.ProductID = p.ProductID
)
FROM Production.Product AS p;
CROSS APPLY with TOP (1)
The previous query requires concatenation and returns NULL for products with no transaction history. Using CROSS APPLY with TOP solves both problems:
SELECT
p.Name,
ca.TransactionType,
ca.ReferenceOrderID
FROM Production.Product AS p
CROSS APPLY
(
SELECT TOP (1)
th.TransactionType,
th.ReferenceOrderID
FROM Production.TransactionHistory AS th WITH (FORCESEEK)
WHERE
th.ProductID = p.ProductID
) AS ca;
With optimal indexing, APPLY is likely to be the most efficient option if each user typically has many images.
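Applied to the tables from the question, the idea might look like the sketch below. It assumes the tables live in the dbo schema and that dbo.images has user_id, path and name columns; the index name IX_images_user_id is hypothetical, and whether this index is actually "optimal" depends on your data and workload.

```sql
-- Hypothetical index: lets each per-user lookup seek straight to one row
-- and read path and name from the index without touching the base table.
CREATE NONCLUSTERED INDEX IX_images_user_id
ON dbo.images (user_id)
INCLUDE (path, name);

SELECT
    u.id,
    ca.path + ca.name AS image_path
FROM dbo.users AS u
CROSS APPLY
(
    -- No ORDER BY, so this could be any one of the user's rows
    SELECT TOP (1)
        i.path,
        i.name
    FROM dbo.images AS i
    WHERE
        i.user_id = u.id
) AS ca;
```

As with an inner join, CROSS APPLY drops users that have no images; OUTER APPLY would keep them with a NULL image_path.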
Answer 1 (score: 4)
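If a user has multiple images and you only want one, which one do you want? MySQL has loosey-goosey syntax that doesn't force you to choose and just hands you some arbitrary value; SQL Server makes you choose. One way is MIN: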
SELECT u.id, MIN(i.path + i.name) AS image_path
FROM dbo.users AS u
INNER JOIN dbo.images AS i
ON u.id = i.user_id
GROUP BY u.id;
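You could also substitute MAX for MIN. And depending on the version of SQL Server, and on whether you actually need more columns, there may be other ways to do this more efficiently (avoiding some of the sort/group work). For example, if you wanted the path and the name separately, this would not work out well: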
SELECT u.id, MIN(i.path), MIN(i.name)
FROM dbo.users AS u
INNER JOIN dbo.images AS i
ON u.id = i.user_id
GROUP BY u.id;
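...because in theory you could get the path and the name from two different rows, and the result would no longer make sense. So instead you could do this: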
;WITH x AS
(
SELECT user_id, path, name, rn = ROW_NUMBER() OVER
(PARTITION BY user_id ORDER BY (SELECT NULL))
FROM dbo.images
)
SELECT u.id, x.path, x.name
FROM dbo.users AS u
INNER JOIN x
ON u.id = x.user_id
WHERE x.rn = 1;
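As an aside on the MIN(i.path), MIN(i.name) query earlier in this answer, here is a small self-contained illustration of how separate aggregates can combine values from different rows. The two sample rows are made up for the example.

```sql
-- Two made-up image rows for a single user
SELECT MIN(v.path) AS path, MIN(v.name) AS name
FROM (VALUES ('/a/', 'z.png'),
             ('/b/', 'c.png')) AS v(path, name);
-- Returns path = '/a/' and name = 'c.png',
-- a combination that does not exist in either source row.
```

The ROW_NUMBER form avoids this because path and name are always read from the same row.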
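Whether it makes sense to use the ROW_NUMBER variation in your existing case depends largely on how the two tables are indexed, but you can try this approach and compare plans/performance: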
;WITH x AS
(
SELECT user_id, path + name AS image_path, rn = ROW_NUMBER() OVER
(PARTITION BY user_id ORDER BY (SELECT NULL))
FROM dbo.images
)
SELECT u.id, x.image_path
FROM dbo.users AS u
INNER JOIN x
ON u.id = x.user_id
WHERE x.rn = 1;
(And try replacing SELECT NULL with the leading column of a narrow index on dbo.images.)
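A sketch of that substitution, assuming a narrow nonclustered index on dbo.images whose leading column is user_id (the index name below is hypothetical). Whether it changes the plan at all is something to check against your own data:

```sql
-- Assumption: a narrow index such as the hypothetical
-- IX_images_user_id ON dbo.images (user_id) exists.
;WITH x AS
(
    SELECT user_id, path + name AS image_path, rn = ROW_NUMBER() OVER
      (PARTITION BY user_id ORDER BY user_id)  -- was ORDER BY (SELECT NULL)
    FROM dbo.images
)
SELECT u.id, x.image_path
FROM dbo.users AS u
INNER JOIN x
ON u.id = x.user_id
WHERE x.rn = 1;
```

Because the ORDER BY column is the same as the partitioning column, the row that ends up with rn = 1 is still arbitrary within each user, which is fine here since any image will do.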
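P.S. Don't use the AS 'alias' syntax. That form is deprecated, and it makes aliases look like string literals. Also, always use the schema prefix, and use table aliases so you don't have to repeat the full table name throughout the query...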
Answer 2 (score: 3)
You need an aggregate function. The right aggregate function depends on your application, which means you are the only one who can say. A crude hack:
SELECT users.id, max(images.path + images.name) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
MySQL's handling of the GROUP BY clause is widely regarded as BAD.
Answer 3 (score: 2)
Use MAX or MIN as needed:
SELECT users.id, max(images.path + images.name) as image_path
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
Answer 4 (score: 1)
If a user has multiple images, this selects the first entry (alphabetically):
SELECT users.id, min(images.path + images.name) as image_path
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
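Answer 5 (score: 1)
When you use GROUP BY, you can only select the columns you group by, plus aggregate functions of the other columns.
Here is one way to do that: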
SELECT users.id, (MAX(images.path) + MAX(images.name)) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id
Though you are more likely to want:
SELECT users.id, MAX(images.path + images.name) as 'image_path'
FROM users
JOIN images ON images.user_id = users.id
GROUP BY users.id