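Select N rows while avoiding duplicates on a non-key, non-indexed column

Time: 2014-06-17 05:15:14

Tags: sql-server tsql

Using T-SQL, how can I select n rows while avoiding duplicate results on a non-key, non-indexed column?

Sample table: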

ID_ | state    | customer | memo
------------------------------------------
1   |  abc     | 123      | memo text xyz
2   |  abc     | 123      | memo text abc
3   |  abc     | 456      | memo text def
4   |  abc     | 456      | memo text rew
5   |  abc     | 789      | memo text yte
6   |  def     | 123      | memo text hrd
7   |  def     | 432      | memo text dfg
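
To try the queries in this thread, the sample table can be recreated roughly as follows. This is only a sketch: the table name `table1` (borrowed from the answers) and all column types are assumptions, since the real schema is not shown. The multi-row VALUES syntax needs SQL Server 2008 or later.

```sql
-- Assumed schema: the table name and column types are guesses for illustration.
CREATE TABLE table1 (
    ID_      INT          NOT NULL PRIMARY KEY,
    state    VARCHAR(10)  NOT NULL,
    customer INT          NOT NULL,
    memo     VARCHAR(100) NOT NULL
);

-- Rows copied from the sample table above.
INSERT INTO table1 (ID_, state, customer, memo) VALUES
    (1, 'abc', 123, 'memo text xyz'),
    (2, 'abc', 123, 'memo text abc'),
    (3, 'abc', 456, 'memo text def'),
    (4, 'abc', 456, 'memo text rew'),
    (5, 'abc', 789, 'memo text yte'),
    (6, 'def', 123, 'memo text hrd'),
    (7, 'def', 432, 'memo text dfg');
```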

I want to select 2 memos for state 'abc', but the returned memos should not belong to the same customer.

memo
----
memo text xyz
memo text def

PS: the only available selection criterion is state (e.g. where state = 'abc').

I managed to do it in a very inefficient way:

SELECT top 2 MAX(memo)
FROM table
WHERE state = 'abc'
GROUP BY customer

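This works for small sample sizes, but the production table has over 1 billion rows.

2 Answers:

Answer 0: (score: 4)

You can try the following query against your real data volume. I am not sure how it performs on a table with a billion rows, so you will need to test that yourself.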

SELECT memo
FROM   (SELECT memo,
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY (SELECT 0)) AS RN
        FROM   table1 WHERE state = 'abc') T
WHERE  RN = 1 

You can check the SQL Fiddle.
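
Note that the query above returns one memo for every customer in the state, not just n of them. A minimal sketch of capping the result at n rows, assuming a hypothetical variable `@n` (not part of the original answer) and SQL Server 2008 or later for the inline DECLARE initialisation:

```sql
DECLARE @n INT = 2;  -- hypothetical variable: how many memos to return

SELECT TOP (@n) memo
FROM   (SELECT memo,
               -- number the rows within each customer; ORDER BY (SELECT 0)
               -- means "no particular order", so any row may get RN = 1
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY (SELECT 0)) AS RN
        FROM   table1
        WHERE  state = 'abc') T
WHERE  RN = 1;       -- keep a single row per customer
```

Because neither the window nor the outer query has a meaningful ORDER BY, which customers and which memos come back is arbitrary. If a specific pick matters, order the window by a real column such as ID_ and add an outer ORDER BY.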

EDIT: Adding a nonclustered index on state and customer (including memo) will greatly improve performance.

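CREATE NONCLUSTERED INDEX [custom_index] ON table1
(
    [state] ASC,
    [customer] ASC
)
INCLUDE ([memo])  -- covers the query, so no lookup into the base table is needed
WITH (SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF)
ON [DATA]         -- filegroup name; replace with one that exists in your database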
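
One way to check whether the index actually helps is to compare the I/O and timing statistics of the query before and after creating it. A small sketch, assuming the `table1` name used in the query above; the numbers you see will depend entirely on your own data:

```sql
-- Report logical reads and CPU/elapsed time in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT memo
FROM   (SELECT memo,
               ROW_NUMBER() OVER (PARTITION BY customer ORDER BY (SELECT 0)) AS RN
        FROM   table1
        WHERE  state = 'abc') T
WHERE  RN = 1;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

Run it once without the index and once with it. A lower logical read count for table1 suggests the optimizer is seeking on the new index instead of scanning the whole table.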

Answer 1: (score: 1)

One way to get n distinct values per state/customer is to get one ID for each group:

SELECT MIN(ID_) ID
FROM   Table1
GROUP BY State, customer

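(MIN can be replaced with MAX; it is just a way to pick one of the values in each group.)
Then JOIN the result back to the table to add the other conditions: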

WITH getID AS (
  SELECT MIN(ID_) ID
  FROM   Table1
  GROUP BY State, customer
)
SELECT TOP 2
       t.ID_, t.State, t.Customer, t.memo
FROM   table1 t
       INNER JOIN getID g ON t.ID_ = g.ID
WHERE  t.state = 'abc'

SQLFiddle demo
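
The CTE above groups every state before the outer query filters on 'abc', and TOP 2 without ORDER BY returns an arbitrary pair. A possible variation (a sketch, not part of the original answer) filters inside the CTE and orders the final result:

```sql
WITH getID AS (
  SELECT MIN(ID_) AS ID
  FROM   table1
  WHERE  state = 'abc'   -- filter first, so only this state's rows are grouped
  GROUP BY customer
)
SELECT TOP (2)
       t.ID_, t.state, t.customer, t.memo
FROM   table1 t
       INNER JOIN getID g ON t.ID_ = g.ID
ORDER BY t.ID_;          -- makes the choice of the two rows deterministic
```

With the sample data from the question, this returns rows 1 and 3 ('memo text xyz' and 'memo text def'), which matches the expected output.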

If your SQL Server version does not support WITH, the CTE can be rewritten as a subquery:

SELECT TOP 2
       t.ID_, t.State, t.Customer, t.memo
FROM   table1 t
       INNER JOIN (SELECT MIN(ID_) ID
                   FROM   Table1
                   GROUP BY State, customer
                  ) g ON t.ID_ = g.ID
WHERE  t.state = 'abc'

Another way is to use CROSS APPLY to get the distinct IDs:

SELECT TOP 2
       t.ID_, t.State, t.Customer, t.memo
FROM   table1 t
       CROSS APPLY (SELECT TOP 1
                           ID_
                    FROM   table1 t1
                    WHERE  t1.State = t.State AND t1.Customer = t.Customer
                    ORDER BY t1.ID_) c  -- without ORDER BY, TOP 1 could pick a different row for each outer row
WHERE  t.state = 'abc'
  AND  c.ID_ = t.ID_;

SQLFiddle demo