I have a large dataset of sent emails and their status codes.
ID Recipient Date Status
1 someone@example.com 01/01/2010 1
2 someone@example.com 02/01/2010 1
3 them@example.com 01/01/2010 1
4 them@example.com 02/01/2010 2
5 them@example.com 03/01/2010 1
6 others@example.com 01/01/2010 1
7 others@example.com 02/01/2010 2
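For this example, what I need to retrieve is the count of all emails sent to each recipient, together with the most recent status code for that recipient.
The first part is fairly simple: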
SELECT Recipient, Count(*) EmailCount
FROM Messages
GROUP BY Recipient
ORDER BY Recipient
This gives me:
Recipient EmailCount
someone@example.com 2
them@example.com 3
others@example.com 2
How can I also get the most recent status code?
The end result should be:
Recipient EmailCount LastStatus
someone@example.com 2 1
them@example.com 3 1
others@example.com 2 2
Thanks.
(The server is Microsoft SQL Server 2008; the query is run through an OleDbConnection in .NET.)
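Answer 0 (score: 4)
This is an example of a "greatest-n-per-group" query. I think it is easiest to understand if you split it into two subqueries and then join the results.
The first subquery is what you already have.
The second subquery uses the window function ROW_NUMBER to number each recipient's emails, starting at 1 for the most recent, then 2, 3, and so on.
You then join the result of the first query to the rows of the second query whose row number is 1, i.e. the most recent one. Doing it this way guarantees that you get only one row per recipient, even when there are ties.
Here is the query: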
SELECT T1.Recipient, T1.EmailCount, T2.Status FROM
(
SELECT Recipient, COUNT(*) AS EmailCount
FROM Messages
GROUP BY Recipient
) T1
JOIN
(
SELECT
Recipient,
Status,
ROW_NUMBER() OVER (PARTITION BY Recipient ORDER BY Date Desc) AS rn
FROM Messages
) T2
ON T1.Recipient = T2.Recipient AND T2.rn = 1
This gives the following result:
Recipient EmailCount Status
others@example.com 2 2
someone@example.com 2 1
them@example.com 3 1
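To check the query against the sample data without a SQL Server instance, here is a minimal sketch that uses Python's sqlite3 module as a stand-in. This is an assumption for illustration only: it requires an SQLite build with window-function support (3.25 or later), the dates are rewritten as ISO strings so that text ordering matches chronological ordering, the ORDER BY at the end is added just to make the output deterministic, and the helper name latest_status_per_recipient is hypothetical.

```python
import sqlite3

# Sample rows from the question. Dates are stored as ISO strings
# (an assumption) so that ordering by text is ordering by date.
ROWS = [
    (1, "someone@example.com", "2010-01-01", 1),
    (2, "someone@example.com", "2010-01-02", 1),
    (3, "them@example.com", "2010-01-01", 1),
    (4, "them@example.com", "2010-01-02", 2),
    (5, "them@example.com", "2010-01-03", 1),
    (6, "others@example.com", "2010-01-01", 1),
    (7, "others@example.com", "2010-01-02", 2),
]

# Same shape as the query above; ORDER BY added for a stable row order.
QUERY = """
SELECT T1.Recipient, T1.EmailCount, T2.Status
FROM (
    SELECT Recipient, COUNT(*) AS EmailCount
    FROM Messages
    GROUP BY Recipient
) T1
JOIN (
    SELECT Recipient, Status,
           ROW_NUMBER() OVER (PARTITION BY Recipient ORDER BY Date DESC) AS rn
    FROM Messages
) T2
ON T1.Recipient = T2.Recipient AND T2.rn = 1
ORDER BY T1.Recipient
"""


def latest_status_per_recipient(rows):
    """Load rows into an in-memory table and run the count + latest-status query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Messages (ID INTEGER, Recipient TEXT, Date TEXT, Status INTEGER)"
    )
    conn.executemany("INSERT INTO Messages VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in latest_status_per_recipient(ROWS):
        print(row)
```

Adding a second message for one recipient on that recipient's latest date still yields exactly one row for them, because only the row numbered 1 is joined. Which of the tied statuses is picked is not determined unless a tie-breaker (for example ID) is added to the ORDER BY inside the OVER clause.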
Answer 1 (score: 2)
It's not very pretty, but I would probably just use a couple of subselects:
SELECT Recipient,
COUNT(*) EmailCount,
(SELECT Status
FROM Messages M2
WHERE Recipient = M.Recipient
AND Date = (SELECT MAX(Date)
FROM Messages
WHERE Recipient = M2.Recipient)) AS LastStatus
FROM Messages M
GROUP BY Recipient
ORDER BY Recipient
Answer 2 (score: 2)
SELECT
M.Recipient,
C.EmailCount,
M.Status
FROM
(
SELECT Recipient, Count(*) EmailCount
FROM Messages
GROUP BY Recipient
) C
JOIN
(
SELECT Recipient, MAX(Date) AS LastDate
FROM Messages
GROUP BY Recipient
) MD ON C.Recipient = MD.Recipient
JOIN
Messages M ON MD.Recipient = M.Recipient AND MD.LastDate = M.Date
ORDER BY
Recipient
In my experience, aggregates mostly scale better than ranking functions.
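One thing to keep in mind with the join on MAX(Date): if two messages for the same recipient share the latest date, both rows match and that recipient appears twice. The sketch below illustrates this using Python's sqlite3 module as a stand-in for SQL Server (an assumption for illustration). The dates are rewritten as ISO strings, the row with ID 8 is a hypothetical tied message that is not in the question's data, and the helper name count_with_latest is made up for this example.

```python
import sqlite3

# Sample rows from the question, dates as ISO strings (an assumption).
ROWS = [
    (1, "someone@example.com", "2010-01-01", 1),
    (2, "someone@example.com", "2010-01-02", 1),
    (3, "them@example.com", "2010-01-01", 1),
    (4, "them@example.com", "2010-01-02", 2),
    (5, "them@example.com", "2010-01-03", 1),
    (6, "others@example.com", "2010-01-01", 1),
    (7, "others@example.com", "2010-01-02", 2),
]

# Hypothetical extra message that ties with ID 7 on the latest date.
TIED_ROW = (8, "others@example.com", "2010-01-02", 1)

# Same joins as above; the ORDER BY is qualified and extended with Status
# only to make the row order deterministic for this demonstration.
QUERY = """
SELECT M.Recipient, C.EmailCount, M.Status
FROM (
    SELECT Recipient, COUNT(*) AS EmailCount
    FROM Messages
    GROUP BY Recipient
) C
JOIN (
    SELECT Recipient, MAX(Date) AS LastDate
    FROM Messages
    GROUP BY Recipient
) MD ON C.Recipient = MD.Recipient
JOIN Messages M ON MD.Recipient = M.Recipient AND MD.LastDate = M.Date
ORDER BY M.Recipient, M.Status
"""


def count_with_latest(rows):
    """Run the MAX(Date) join against an in-memory copy of the rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Messages (ID INTEGER, Recipient TEXT, Date TEXT, Status INTEGER)"
    )
    conn.executemany("INSERT INTO Messages VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    print(len(count_with_latest(ROWS)))               # one row per recipient
    print(len(count_with_latest(ROWS + [TIED_ROW])))  # one extra row for the tie
```

If the Date column carries a time component and exact ties cannot occur in practice, this is not a concern; otherwise a tie-breaker is needed.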
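Answer 3 (score: 1)
You can't easily do this in a single query, because COUNT(*) is a group function whereas the latest status comes from a specific row. Here is a query that gets the latest status for each user: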
SELECT M.Recipient, M.Status FROM Messages M
WHERE M.Date = (SELECT MAX(SUB.Date) FROM MESSAGES SUB
WHERE SUB.Recipient = M.Recipient)
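If the count and the latest status are fetched as two separate queries, they can be combined on the client side. The following is only a sketch of that idea, written in Python with sqlite3 as a stand-in for SQL Server and the .NET client (both assumptions). The dates are rewritten as ISO strings, and the function name count_and_latest is hypothetical.

```python
import sqlite3

# Sample rows from the question, dates as ISO strings (an assumption).
ROWS = [
    (1, "someone@example.com", "2010-01-01", 1),
    (2, "someone@example.com", "2010-01-02", 1),
    (3, "them@example.com", "2010-01-01", 1),
    (4, "them@example.com", "2010-01-02", 2),
    (5, "them@example.com", "2010-01-03", 1),
    (6, "others@example.com", "2010-01-01", 1),
    (7, "others@example.com", "2010-01-02", 2),
]

COUNT_QUERY = "SELECT Recipient, COUNT(*) FROM Messages GROUP BY Recipient"

LATEST_QUERY = """
SELECT M.Recipient, M.Status FROM Messages M
WHERE M.Date = (SELECT MAX(SUB.Date) FROM Messages SUB
                WHERE SUB.Recipient = M.Recipient)
"""


def count_and_latest(rows):
    """Run both queries and merge them into {recipient: (count, status)}."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Messages (ID INTEGER, Recipient TEXT, Date TEXT, Status INTEGER)"
    )
    conn.executemany("INSERT INTO Messages VALUES (?, ?, ?, ?)", rows)
    counts = dict(conn.execute(COUNT_QUERY).fetchall())
    # If several rows tie on the latest date, the last one read wins here.
    latest = dict(conn.execute(LATEST_QUERY).fetchall())
    conn.close()
    return {recipient: (counts[recipient], latest[recipient]) for recipient in sorted(counts)}


if __name__ == "__main__":
    for recipient, (email_count, status) in count_and_latest(ROWS).items():
        print(recipient, email_count, status)
```

This costs two round trips to the server, so it mainly makes sense when the two result sets are needed separately anyway.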
Answer 4 (score: 0)
You could use a ranking function. Something like this (untested):
WITH MyResults AS
(
SELECT Recipient, Status, ROW_NUMBER() OVER (PARTITION BY Recipient ORDER BY [Date] DESC) AS [row_number]
FROM Messages
)
SELECT MyResults.Recipient, MyCounts.EmailCount, MyResults.Status
FROM (
SELECT Recipient, Count(*) EmailCount
FROM Messages
GROUP BY Recipient
) MyCounts
INNER JOIN MyResults
ON MyCounts.Recipient = MyResults.Recipient
WHERE MyResults.[row_number] = 1