Using UNION and ORDER BY without duplicates

Asked: 2018-07-31 09:47:50

Tags: sql sql-server tsql union

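I am using UNION to combine the results of an exact search and a fuzzy search. I want the exact matches at the top and the remaining results sorted by a column. I found this solution, which works fine with UNION ALL, but by adding the Rank column I lose the property of UNION (without ALL) that removes the duplicates of the exact match from the result set.

Is there an elegant way to solve this, or do I have to remove the duplicates manually?

My simplified query, for reference: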

SELECT 1 AS [Rank], [CallerID] 
FROM [PHONE]
WHERE [CallerID] = '12345'

UNION

SELECT 2 AS [Rank], [CallerID] 
FROM [PHONE]
WHERE [CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3
ORDER BY [Rank] ASC, [CallerID] ASC

The result looks like this:

Rank        CallerID
----------- --------------------
1           12345
2           123
2           1233
2           1234
2           12345     <- I don't want this line
2           1236

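Note: applying DISTINCT to CallerID does not solve the problem, because my real query has more columns. I really only want to remove the duplicates between the two result sets that the UNION combines.

2 answers:

Answer 0: (score: 2)

Put your existing query in a CTE (here I have put the sample data there instead), then use ROW_NUMBER() and an additional WHERE to filter the results: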

with OriginalQuery as (
select 1 as Rank,  12345 as CallerID union all
select 2 ,123 union all
select 2,1233 union all
select 2,1234 union all
select 2,12345 union all
select 2,1236
), Preferred as (
    select *,ROW_NUMBER() OVER (
        PARTITION BY CallerID /* other columns too? */
        ORDER BY Rank
        ) as rn
    from OriginalQuery
)
select
    *
from
    Preferred
where
    rn = 1
order by Rank,CallerID

As noted in the comment, you may have to add more columns to the PARTITION BY (or adjust them) if CallerID by itself is not a key for this data.
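To make the PARTITION BY remark concrete, here is a minimal sketch that applies the same CTE pattern to the query from the question instead of the sample data. [CallerName] is a hypothetical extra column, introduced only to illustrate a multi-column key; substitute whatever columns identify a row in your real query.

```sql
-- Sketch only: [CallerName] is a hypothetical column, not part of the original query.
WITH OriginalQuery AS (
    SELECT 1 AS [Rank], [CallerID], [CallerName]
    FROM [PHONE]
    WHERE [CallerID] = '12345'

    UNION ALL

    SELECT 2 AS [Rank], [CallerID], [CallerName]
    FROM [PHONE]
    WHERE [CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3
), Preferred AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY [CallerID], [CallerName]  -- all columns that identify a row
               ORDER BY [Rank]                        -- lowest Rank (the exact match) gets rn = 1
           ) AS rn
    FROM OriginalQuery
)
SELECT [Rank], [CallerID], [CallerName]
FROM Preferred
WHERE rn = 1
ORDER BY [Rank], [CallerID];
```

Keep in mind that this also collapses rows that are already duplicated in the underlying table on the partition columns, not only the duplicates introduced by combining the two searches.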


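Of course, if your underlying data does not contain any duplicates, and the only reason you are getting duplicates is that you are running two searches and combining the results, it is far simpler to do this: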

SELECT [CallerID] 
FROM [PHONE]
WHERE
    CallerID = '12345' OR
    ([CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3)
ORDER BY CASE WHEN CallerID='12345' THEN 0 ELSE 1 END, [CallerID] ASC

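Here we combine the two searches rather than combining their results, and then use a CASE in the ORDER BY to pick out the best matches.

Answer 1: (score: 1)

A variant: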
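If the Rank column from the question is still needed in the output, the same CASE expression can be moved into the SELECT list. This is a sketch under the same assumption as above, namely that [PHONE] itself holds no duplicate rows:

```sql
-- Single pass over [PHONE]: each row appears once, ranked by whether it is the exact match.
SELECT CASE WHEN [CallerID] = '12345' THEN 1 ELSE 2 END AS [Rank],
       [CallerID]
FROM [PHONE]
WHERE [CallerID] = '12345'
   OR ([CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3)
ORDER BY [Rank] ASC, [CallerID] ASC;
```

Because every row is read only once, an exact match can never show up a second time with Rank 2.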

WITH DataSource AS
(
    SELECT 1 AS [Rank], [CallerID] 
    FROM [PHONE]
    WHERE [CallerID] = '12345'
)
SELECT [Rank]
      ,[CallerID]
FROM DataSource

UNION ALL

SELECT 2 AS [Rank], [CallerID] 
FROM [PHONE]
WHERE [CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3
    AND [CallerID] NOT IN (SELECT [CallerID] FROM DataSource)
ORDER BY [Rank] ASC, [CallerID] ASC
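
The same idea can also be written with NOT EXISTS, which may be easier to extend when the real query has to compare several key columns rather than only [CallerID]. A minimal sketch; the aliases p and d are introduced here for illustration:

```sql
WITH DataSource AS
(
    SELECT 1 AS [Rank], [CallerID]
    FROM [PHONE]
    WHERE [CallerID] = '12345'
)
SELECT [Rank]
      ,[CallerID]
FROM DataSource

UNION ALL

SELECT 2 AS [Rank], p.[CallerID]
FROM [PHONE] AS p
WHERE p.[CallerID] LIKE '12%' AND ABS(LEN(p.[CallerID]) - LEN('12345')) < 3
    -- Skip rows already returned by the exact-match branch.
    AND NOT EXISTS (SELECT 1 FROM DataSource AS d WHERE d.[CallerID] = p.[CallerID])
ORDER BY [Rank] ASC, [CallerID] ASC;
```

Unlike NOT IN, NOT EXISTS does not return an empty result when the subquery produces a NULL. That cannot happen with the equality filter used here, but it can matter if the first branch is replaced by a different search.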