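I'm using UNION to combine the results of an exact search and a fuzzy search. I want the exact matches at the top and the remaining results sorted by a column. I found this solution with UNION ALL, which works, but by adding the Rank column I lose the property of UNION (without ALL) that removes duplicates from the result set, so the exact match now shows up twice.
Is there an elegant way to solve this, or do I have to remove the duplicates manually?
My simplified query, for reference: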
SELECT 1 AS [Rank], [CallerID]
FROM [PHONE]
WHERE [CallerID] = '12345'
UNION
SELECT 2 AS [Rank], [CallerID]
FROM [PHONE]
WHERE [CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3
ORDER BY [Rank] ASC, [CallerID] ASC
The result looks like this:
Rank CallerID
----------- --------------------
1 12345
2 123
2 1233
2 1234
2 12345 <- I don't want this line
2 1236
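Note: applying DISTINCT to CallerID doesn't solve the problem, because my real query has more columns. I really only want to remove the duplicates between the two result sets combined by the UNION.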
Answer 0: (score: 2)
Put your existing query in a CTE (here I've put the sample data there instead), then use ROW_NUMBER() and an additional WHERE to filter the results:
with OriginalQuery as (
select 1 as Rank, 12345 as CallerID union all
select 2 ,123 union all
select 2,1233 union all
select 2,1234 union all
select 2,12345 union all
select 2,1236
), Preferred as (
select *,ROW_NUMBER() OVER (
PARTITION BY CallerID /* other columns too? */
ORDER BY RANK
) as rn
from OriginalQuery
)
select
*
from
Preferred
where
rn = 1
order by Rank,CallerID
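As the comment in the query indicates, you may have to add more columns to the PARTITION BY, or adjust them, if CallerID isn't itself a key for this data.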
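To connect the sample-data version above to the query in the question, here is a minimal sketch that wraps the original two searches (joined with UNION ALL) in the same CTE pattern. The table variable @Phone and its rows are hypothetical stand-ins for the real [PHONE] table, and partitioning by [CallerID] alone assumes it identifies a row; a real query with more columns would likely need them in the PARTITION BY too.

```sql
-- Hypothetical sample data standing in for the real [PHONE] table.
DECLARE @Phone TABLE ([CallerID] varchar(20));
INSERT INTO @Phone ([CallerID])
VALUES ('123'), ('1233'), ('1234'), ('12345'), ('1236'), ('999');

WITH OriginalQuery AS (
    -- Exact match, ranked first.
    SELECT 1 AS [Rank], [CallerID]
    FROM @Phone
    WHERE [CallerID] = '12345'
    UNION ALL
    -- Fuzzy match, ranked second; it also returns the exact match.
    SELECT 2 AS [Rank], [CallerID]
    FROM @Phone
    WHERE [CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3
), Preferred AS (
    -- Number the rows for each CallerID, best (lowest) rank first.
    SELECT *,
           ROW_NUMBER() OVER (PARTITION BY [CallerID] ORDER BY [Rank]) AS rn
    FROM OriginalQuery
)
SELECT [Rank], [CallerID]
FROM Preferred
WHERE rn = 1            -- keep only the best-ranked copy of each CallerID
ORDER BY [Rank], [CallerID];
```

On this sample data, 12345 should appear once with Rank 1, followed by 123, 1233, 1234 and 1236 with Rank 2.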
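Of course, if there aren't any duplicates in your underlying data, and the only reason you're getting duplicates is that you're running two searches and combining the results, then this is far simpler: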
SELECT [CallerID]
FROM [PHONE]
WHERE
CallerID = '12345' OR
([CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3)
ORDER BY CASE WHEN CallerID='12345' THEN 0 ELSE 1 END, [CallerID] ASC
Here we combine the two searches rather than combining their results, and then use a CASE in the ORDER BY to pick out the best matches.
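If the Rank column from the question is still wanted in the output, the same CASE expression can also go in the SELECT list. This is only a sketch of that idea: @Phone and its rows are hypothetical stand-ins for the real [PHONE] table, and it relies on the same assumption as above, namely that the underlying data has no duplicates.

```sql
-- Hypothetical sample data standing in for the real [PHONE] table.
DECLARE @Phone TABLE ([CallerID] varchar(20));
INSERT INTO @Phone ([CallerID])
VALUES ('123'), ('1233'), ('1234'), ('12345'), ('1236'), ('999');

SELECT
    -- 1 for the exact match, 2 for everything the fuzzy search found.
    CASE WHEN [CallerID] = '12345' THEN 1 ELSE 2 END AS [Rank],
    [CallerID]
FROM @Phone
WHERE [CallerID] = '12345'
   OR ([CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3)
ORDER BY [Rank], [CallerID];   -- ORDER BY can refer to the column alias
```

Because each row is read once and ranked by the CASE, there is nothing to deduplicate afterwards.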
Answer 1: (score: 1)
A variation:
WITH DataSource AS
(
SELECT 1 AS [Rank], [CallerID]
FROM [PHONE]
WHERE [CallerID] = '12345'
)
SELECT [Rank]
,[CallerID]
FROM DataSource
UNION ALL
SELECT 2 AS [Rank], [CallerID]
FROM [PHONE]
WHERE [CallerID] LIKE '12%' AND ABS(LEN([CallerID]) - LEN('12345')) < 3
AND [CallerID] NOT IN (SELECT [CallerID] FROM DataSource)
ORDER BY [Rank] ASC, [CallerID] ASC