I have the following table:
MyTable
ID
MessageType
MessageDate
MessageBody
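The table has a few million rows, but only 100 unique MessageType values.
What I need is a sample row for each MessageType (it must include at least MessageType and MessageBody), but I can't use DISTINCT,
because that only gets me the MessageType column.
I was thinking of something like: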
SELECT TOP 5 *
FROM MyTable
WHERE MessageType IN (SELECT DISTINCT MessageType FROM MyTable)
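I know this doesn't work, because it just gives me the top 5 rows overall, but I'm not sure how to make SQL loop through each type.
Thanks for your help.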
Answer 0 (score: 2)
Row_Number version
;WITH cte AS
(
SELECT ID,
MessageType,
MessageDate,
MessageBody,
ROW_NUMBER() OVER (PARTITION BY MessageType ORDER BY (SELECT 0)) AS RN
FROM MyTable
)
SELECT ID,
MessageType,
MessageDate,
MessageBody
FROM cte
WHERE RN <=5
CROSS APPLY version
WITH m1 AS
(
SELECT DISTINCT MessageType
FROM MyTable
)
SELECT m2.*
FROM m1
CROSS APPLY
(
SELECT TOP 5 *
FROM MyTable m2
WHERE m2.MessageType = m1.MessageType
) m2
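Answer 1 (score: 2)
Martin, if I'm reading your answer correctly, I think what you'll produce is 5 samples of each message type. Marc_s only wants one sample from each message type.
I think what you need is: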
SELECT ID,
MessageType,
MessageDate,
MessageBody
FROM (
SELECT ID,
MessageType,
MessageDate,
MessageBody,
ROW_NUMBER() OVER (PARTITION BY MessageType ORDER BY NEWID()) AS RN
-- I am using NEWID() because it will produce a nice random sampling,
-- but ORDER BY (SELECT 0), as in the Row_Number version above, will be faster.
FROM MyTable
) sampling
WHERE RN = 1
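To try the one-row-per-type idea without touching the real table, here is a minimal sketch against a table variable. The column types and the sample rows are my own assumptions (the question does not give them), and the multi-row VALUES syntax assumes SQL Server 2008 or later.

```sql
-- Hypothetical stand-in for MyTable; column types are assumed, not taken from the question.
DECLARE @MyTable TABLE
(
    ID          INT IDENTITY(1,1) PRIMARY KEY,
    MessageType VARCHAR(20),
    MessageDate DATETIME,
    MessageBody VARCHAR(200)
);

-- Made-up sample rows: three message types, some with more than one row.
INSERT INTO @MyTable (MessageType, MessageDate, MessageBody)
VALUES ('Error', '20110101', 'Disk full'),
       ('Error', '20110102', 'Timeout'),
       ('Info',  '20110103', 'Job started'),
       ('Info',  '20110104', 'Job finished'),
       ('Warn',  '20110105', 'Low memory');

-- Number the rows within each MessageType in random order, then keep the first of each.
SELECT ID,
       MessageType,
       MessageDate,
       MessageBody
FROM (
    SELECT ID,
           MessageType,
           MessageDate,
           MessageBody,
           ROW_NUMBER() OVER (PARTITION BY MessageType ORDER BY NEWID()) AS RN
    FROM @MyTable
) sampling
WHERE RN = 1;
```

This should return one row for each of the three types. Which Error or Info row comes back can change from run to run because of NEWID().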