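SQL query to return the top n records, where records with a NULL field count individually and other records are grouped by that field

Time: 2018-10-08 10:05:04

Tags: sql sql-server

I have a table A with a filter column: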

| id |  name  | filter |
| 1  |  joe   |  a     |
| 2  |  anna  |  a     |
| 3  |  mike  | null   |
| 4  |  frank | null   |
| 5  |  sarah |  b     |
| 6  |  jamie |  b     |

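Assume the records are ordered by id. Records with the same filter value should count as only one; as the examples below show, each record with a NULL filter counts on its own.

TOP(1) should return: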

| id |  name  | filter |
| 1  |  joe   |  a     |
| 2  |  anna  |  a     |

TOP(2) should return:

| id |  name  | filter |
| 1  |  joe   |  a     |
| 2  |  anna  |  a     |
| 3  |  mike  | null   |

TOP(3) should return:

| id |  name  | filter |
| 1  |  joe   |  a     |
| 2  |  anna  |  a     |
| 3  |  mike  | null   |
| 4  |  frank | null   |

TOP(4) should return:

| id |  name  | filter |
| 1  |  joe   |  a     |
| 2  |  anna  |  a     |
| 3  |  mike  | null   |
| 4  |  frank | null   |
| 5  |  sarah |  b     |
| 6  |  jamie |  b     |
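
The expected results above can be restated as a rule: walk the rows in id order, give every non-NULL filter value one group and every NULL row a group of its own, number the groups by their first id, and keep the rows whose group number is at most n. Below is a minimal Python sketch of that rule, meant only for checking expectations. The function name `top_groups` and the tuple layout are assumptions made for this illustration, not part of the question.

```python
# Sample rows from the question: (id, name, filter); None stands for NULL.
ROWS = [
    (1, "joe", "a"),
    (2, "anna", "a"),
    (3, "mike", None),
    (4, "frank", None),
    (5, "sarah", "b"),
    (6, "jamie", "b"),
]


def top_groups(rows, n):
    """Return the rows that belong to the first n groups, in id order."""
    ordered = sorted(rows, key=lambda r: r[0])
    group_no = {}  # group key -> 1-based group number, assigned on first sight
    result = []
    for row_id, name, flt in ordered:
        # NULL filters never share a group, so key them by their own id.
        key = ("null", row_id) if flt is None else ("value", flt)
        if key not in group_no:
            group_no[key] = len(group_no) + 1
        if group_no[key] <= n:
            result.append((row_id, name, flt))
    return result
```

Because rows are visited in id order, a group's number is determined by its smallest id, which is the same ordering the SQL answers below build with MIN(id).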

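5 answers:

Answer 0 (score: 2)

On second thought, you are trying to select the first n distinct filters. Simply find the smallest id for each filter and number them: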


DECLARE @A TABLE(id INT, name VARCHAR(100), filter VARCHAR(100));
INSERT INTO @A VALUES
(1, 'joe',     'y' ), -- 1st
(2, 'anna',    'x' ), -- 2nd
(3, 'mike',    NULL), -- 3rd
(4, 'frank',   NULL), -- 4th
(5, 'sarah',   'x' ),
(6, 'jamie',   'y' ),
(9, 'forrest', 'z' ); -- 5th

WITH filter_minid AS (
    SELECT filter, MIN(id) AS minid
    FROM @A
    GROUP BY filter, CASE WHEN filter IS NULL THEN id END
), filter_minid_number AS (
    SELECT filter, minid, ROW_NUMBER() OVER (ORDER BY minid) AS rn
    FROM filter_minid
)
SELECT a.id, a.name, a.filter
FROM @A a
INNER JOIN filter_minid_number f ON a.filter = f.filter OR a.id = f.minid
WHERE f.rn <= 5 -- this is where you filter for the first n distinct filters
ORDER BY a.id;

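Answer 1 (score: 2)

You can use a windowed MIN() to group the rows that share the same filter (each NULL value goes into its own group), then use DENSE_RANK() to turn those minimum ids into consecutive group numbers that you can filter on afterwards.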

IF OBJECT_ID('tempdb..#Values') IS NOT NULL
    DROP TABLE #Values

CREATE TABLE #Values (
    ID INT IDENTITY,
    Name VARCHAR(10),
    Filter VARCHAR(10))

INSERT INTO #Values (
    Name,
    Filter)
VALUES
    ('joe', 'a'),
    ('anna', 'a'),
    ('mike', NULL),
    ('frank', NULL),
    ('sarah', 'b'),
    ('jamie', 'b'),
    ('john', 'a')

DECLARE @v_TopFilter INT = 4 -- Your top filter here

;WITH MinimumByFilter AS
(
    SELECT
        V.*,
        MinimumIDByFilter = MIN(V.ID) OVER (
            PARTITION BY 
                V.Filter,
                CASE WHEN V.Filter IS NULL THEN V.ID END)
    FROM
        #Values AS V
),
DenseRank AS
(
    SELECT
        M.*,
        DenseRank = DENSE_RANK() OVER(ORDER BY M.MinimumIDByFilter ASC)
    FROM
        MinimumByFilter AS M
)
SELECT
    D.ID,
    D.Name,
    D.Filter
FROM
    DenseRank AS D
WHERE
    D.DenseRank <= @v_TopFilter
ORDER BY
    D.ID ASC

Here you can see what the functions return:

ID  Name    Filter  MinimumIDByFilter   DenseRank
1   joe     a       1                   1
2   anna    a       1                   1
7   john    a       1                   1
3   mike    NULL    3                   2
4   frank   NULL    4                   3
5   sarah   b       5                   4
6   jamie   b       5                   4
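
The two computed columns in this output can be reproduced outside the database. The following Python sketch mirrors the two steps (minimum id per partition, then a dense rank over those minimums). It models the query rather than running it, and the helper name `min_id_and_dense_rank` is an assumption made for illustration.

```python
# Rows as inserted into #Values, with the IDENTITY values written out.
VALUES = [
    (1, "joe", "a"), (2, "anna", "a"), (3, "mike", None), (4, "frank", None),
    (5, "sarah", "b"), (6, "jamie", "b"), (7, "john", "a"),
]


def partition_key(row):
    """Model PARTITION BY Filter, CASE WHEN Filter IS NULL THEN ID END."""
    row_id, _, flt = row
    return (flt, row_id if flt is None else None)


def min_id_and_dense_rank(rows):
    """Return (id, name, filter, min_id, dense_rank), ordered by min_id then id."""
    # Step 1: smallest id within each partition.
    min_id = {}
    for row in rows:
        key = partition_key(row)
        min_id[key] = min(min_id.get(key, row[0]), row[0])

    # Step 2: dense rank over the minimums: equal values share a rank, no gaps.
    rank_of = {m: i for i, m in enumerate(sorted(set(min_id.values())), start=1)}

    out = []
    for row in rows:
        m = min_id[partition_key(row)]
        out.append((row[0], row[1], row[2], m, rank_of[m]))
    return sorted(out, key=lambda t: (t[3], t[0]))
```

Filtering the returned tuples on the last element being at most 4 corresponds to `D.DenseRank <= @v_TopFilter` in the query.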

Answer 2 (score: 1)

You can try this.

DECLARE @Tbl TABLE ( id INT,  name  varchar(10), filter varchar(10))

INSERT INTO @Tbl VALUES
(1 ,'joe', 'a'),
(2 ,'anna', 'a'),
(3 ,'mike', null),
(4 ,'frank', null),
(5 ,'sarah', 'b'),
(6 ,'jamie', 'b')

DECLARE @TOP INT = 3

SELECT id, name, filter FROM 
    ( SELECT *, DENSE_RANK() OVER(ORDER BY SUB_RNK) RNK
        FROM ( SELECT *, 
            MIN(id) OVER(PARTITION BY ISNULL(filter,id) ) SUB_RNK
          FROM @Tbl ) T1
    ) T2
WHERE 
    T2.RNK <= @TOP

Result (TOP 3):

id          name       filter
----------- ---------- ----------
1           joe        a
2           anna       a
3           mike       NULL
4           frank      NULL

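Answer 3 (score: 0)

Using a subquery (MySQL syntax):

CREATE PROCEDURE `top` (IN x INT UNSIGNED)
BEGIN
    -- MySQL rejects LIMIT directly inside an IN subquery, so wrap it in a derived table
    select * from tableA where `filter` in
        (select `filter` from (select distinct `filter` from tableA LIMIT x) AA);
END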

Using a join:

CREATE PROCEDURE `top` (IN x INT UNSIGNED)
BEGIN
    select * from tableA A
      join (select distinct `filter` from tableA LIMIT x) AA
      on A.`filter` = AA.`filter`;
END

Answer 4 (score: 0)

You can use dense_rank():

select t.*
from (select t.*,
             dense_rank() over (order by filter,
                                         (case when filter is null then id end)
                                ) as seqnum
      from t
     ) t
where seqnum <= ?;  -- whatever your limit is

If you want to use top for this, you can use top with ties:

select top (?) with ties t.*
from (select t.*,
             dense_rank() over (order by filter,
                                         (case when filter is null then id end)
                                ) as seqnum
      from t
     ) t
order by seqnum;
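
One thing to check before relying on these two queries: they rank groups by the filter value itself rather than by each group's smallest id. SQL Server sorts NULL before any other value in ascending order, so on the question's sample data the NULL rows receive the lowest ranks. The Python sketch below models that ordering under this assumption; it is not the query itself, and `dense_rank_by_filter` is a hypothetical name.

```python
# Sample rows from the question: (id, name, filter); None stands for NULL.
ROWS = [
    (1, "joe", "a"), (2, "anna", "a"), (3, "mike", None),
    (4, "frank", None), (5, "sarah", "b"), (6, "jamie", "b"),
]


def dense_rank_by_filter(rows):
    """Model DENSE_RANK() OVER (ORDER BY filter, CASE WHEN filter IS NULL THEN id END)."""
    def sort_key(row):
        row_id, _, flt = row
        if flt is None:
            # NULL filters sort first; the id then separates them.
            return (0, "", row_id)
        # Non-NULL filters: the CASE yields NULL, so only the filter value matters.
        return (1, flt, 0)

    distinct_keys = sorted({sort_key(r) for r in rows})
    rank_of = {k: i for i, k in enumerate(distinct_keys, start=1)}
    return {r[0]: rank_of[sort_key(r)] for r in rows}
```

In this model ids 3 and 4 get ranks 1 and 2, so a limit of 1 keeps only mike, whereas the TOP(1) result expected in the question keeps the 'a' group. Ordering by a per-group minimum id, as the earlier answers do, follows the id order instead.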