Select rows based on the MAX of one column and the related MIN of another

Date: 2016-06-22 15:44:44

Tags: sql sql-server-2008 tsql

I have a table set up like this:

CREATE TABLE dbo.IntervalCounts (
    item_id int NOT NULL,
    interval_time time(0) NOT NULL,
    interval_count int DEFAULT 0 NOT NULL
)

Each item_id has 96 interval_time values, from 00:00 to 23:45 in 15-minute increments. Every interval_time has an interval_count >= 0. The table has approximately 200 million rows.

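For each item_id, I need to select the row with the highest interval_count and then, if several rows tie for that count, keep only the one with the earliest interval_time.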

So, if I have item_id 1 with a maximum count of 100:

item_id   interval_time interval_count
1         00:00         100
1         13:15         100
1         07:45         100
1         19:30         100

I want to get a single row:

item_id   interval_time interval_count
1         00:00         100

Getting the first part of the selection (the rows with the maximum count) is simple. I have:

SELECT a.item_id, a.interval_time, a.interval_count
    FROM dbo.IntervalCounts a
    LEFT JOIN dbo.IntervalCounts b
        ON a.item_id = b.item_id
        AND a.interval_count < b.interval_count
    WHERE 1=1
    AND b.interval_count IS NULL

However, reducing that to a single row per item_id has proven tricky.

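This triple self-join ran for an hour and a half before I killed it (I will be running it regularly, and ideally it should take no more than 15 minutes).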

SELECT a.item_id, a.interval_time, a.interval_count
    FROM dbo.IntervalCounts a
    LEFT JOIN dbo.IntervalCounts b
        ON a.item_id = b.item_id
        AND a.interval_count < b.interval_count
    LEFT JOIN dbo.IntervalCounts c
        ON a.item_id = c.item_id
        -- if I remove this line, it will ALWAYS give me the 00:00 interval
        -- if I keep it, it runs way too long
        AND a.interval_count = c.interval_count
        AND a.interval_time > c.interval_time
    WHERE 1=1
    AND b.interval_count IS NULL
    AND c.interval_time IS NULL

Doing something like the following feels clumsy, and I was again forced to kill the execution after about an hour and a half:

DECLARE @tempTable TABLE
    (
    item_id int,
    interval_time time(0),
    interval_count int
    )

INSERT INTO @tempTable
SELECT a.item_id, a.interval_time, a.interval_count
FROM dbo.IntervalCounts a
LEFT JOIN dbo.IntervalCounts b
    ON a.item_id = b.item_id
    AND a.interval_count < b.interval_count
WHERE 1=1
AND b.interval_count IS NULL

SELECT a.item_id, a.interval_time, a.interval_count
FROM @tempTable a
LEFT JOIN @tempTable b
    ON a.item_id = b.item_id
    AND a.interval_time > b.interval_time
WHERE 1=1
AND b.interval_time IS NULL

There has to be a better way, but I'm stumped. How can I do this in a way that doesn't run forever?

1 Answer:

Answer 0: (score: 4)

You're overthinking it. You can use ROW_NUMBER:

WITH CTE AS
(
    SELECT  *,
            RN = ROW_NUMBER() OVER(PARTITION BY item_id 
                                   ORDER BY interval_count DESC, interval_time)
    FROM dbo.IntervalCounts
)
SELECT *
FROM CTE
WHERE RN = 1;
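
To see why this works, here is a minimal sketch that runs the same pattern against the four tied rows from the question. The table variable @SampleCounts, the CTE name Ranked, and the extra row with a count of 42 are hypothetical and only there for illustration. Within each item_id, ROW_NUMBER assigns 1 to the row with the highest interval_count, and the second ORDER BY key breaks ties in favour of the earliest interval_time.

```sql
-- Hypothetical sample data: the four tied rows from the question,
-- plus one made-up row with a lower count.
DECLARE @SampleCounts TABLE
(
    item_id        int     NOT NULL,
    interval_time  time(0) NOT NULL,
    interval_count int     NOT NULL
);

INSERT INTO @SampleCounts (item_id, interval_time, interval_count)
VALUES (1, '00:00', 100),
       (1, '13:15', 100),
       (1, '07:45', 100),
       (1, '19:30', 100),
       (1, '08:00',  42);

-- Rank rows per item_id: highest count first, earliest time breaks ties.
WITH Ranked AS
(
    SELECT  item_id, interval_time, interval_count,
            RN = ROW_NUMBER() OVER(PARTITION BY item_id
                                   ORDER BY interval_count DESC, interval_time ASC)
    FROM @SampleCounts
)
SELECT item_id, interval_time, interval_count
FROM Ranked
WHERE RN = 1;
```

For this sample the query keeps only the 00:00 row with a count of 100. Unlike the self-joins, it reads the table once instead of comparing every row against the other rows for the same item_id. On the real 200-million-row table, an index keyed on (item_id, interval_count DESC, interval_time) may let SQL Server skip the sort for the window function, but that is an assumption to check against the actual execution plan, not something measured here.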