Calculate the percentage of matched words

Time: 2018-11-15 08:17:22

Tags: sql-server sql-server-2008-r2

CREATE TABLE tbl_pat
(
    id int,
    name varchar(100),
    [address] varchar(500)
);
INSERT INTO tbl_pat VALUES(1,'Jack','Lane 1, 90 Road Street, SL');
INSERT INTO tbl_pat VALUES(2,'Will','SA, Lane 10, Street road');
INSERT INTO tbl_pat VALUES(3,'White','Lane 1 ZIM');
INSERT INTO tbl_pat VALUES(4,'Shaw','Street Road');
INSERT INTO tbl_pat VALUES(5,'Steve','Road Street');
INSERT INTO tbl_pat VALUES(6,'Brown','Nz Road 10');

Expected result:

The search string is: Street Road

Name    Address                     Percentage
---------------------------------------------
Shaw    Street Road                 100
Steve   Road Street                 100
Will    SA, Lane 10, Street road    20
Jack    Lane 1, 90 Road Street, SL  17

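Note: the percentages shown are hypothetical, but the first two should be 100% because those addresses match the search string exactly.

I am using PATINDEX to search for the words.

Query: search for Street Road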

SELECT [Name],[Address] 
FROM tbl_pat 
WHERE PATINDEX('%Street%',[Address])>=1 AND PATINDEX('%Road%',[Address])>=1 

How can I calculate the percentage of matched words in a single SELECT statement?

2 answers:

Answer 0: (score: 1)

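I am using string_split() from SQL Server 2017 here (the function is available from SQL Server 2016 onward). You can replace it with any string-splitting function that is available to you; just search for one.

It is not perfect, but it works for your sample.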

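-- Split each address into words (commas become spaces first), then
-- left join those words to the words of the search string.
select  p.id, p.name, p.address, count(k.value) * 100.0 / count(*) as percentage
from    tbl_pat p
        cross apply string_split(replace([address], ',', ' '), ' ') w
        left join
        (
            select  value
            from    string_split ('Road Street', ' ') 
        ) k on  w.value     = k.value
-- count(k.value) counts matched words only; count(*) counts every token.
group by p.id, p.name, p.address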
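
Since the question is tagged sql-server-2008-r2, where string_split() does not exist, here is one possible substitute that splits via XML. Treat it as a minimal sketch under assumptions, not a tested solution: it assumes the addresses contain no XML-special characters (&, <, >), that the column collation is case-insensitive, and that the search words are distinct. The names @search, words, search_words and match_pct are made up for this illustration. It also drops the empty tokens that appear after a comma and keeps only the rows that contain every search word, as the PATINDEX query in the question does.

```sql
-- Sketch only: XML-based word split for versions without string_split().
-- Assumption: no &, < or > in the data, otherwise the cast to xml fails.
declare @search varchar(100) = 'Road Street';

;with words as
(
    -- One row per word of each address.
    select  p.id, p.name, p.[address],
            n.x.value('.', 'varchar(100)') as word
    from    tbl_pat p
            cross apply (select cast('<w>'
                                     + replace(replace(p.[address], ',', ' '), ' ', '</w><w>')
                                     + '</w>' as xml) as doc) d
            cross apply d.doc.nodes('/w') as n(x)
),
search_words as
(
    -- One row per word of the search string.
    select  n.x.value('.', 'varchar(100)') as word
    from    (select cast('<w>' + replace(@search, ' ', '</w><w>') + '</w>' as xml) as doc) d
            cross apply d.doc.nodes('/w') as n(x)
)
select  w.id, w.name, w.[address],
        cast(count(s.word) * 100.0 / count(*) as decimal(5,2)) as match_pct
from    words w
        left join search_words s on s.word = w.word
where   w.word <> ''                      -- skip empty tokens left by ", "
group by w.id, w.name, w.[address]
having  count(distinct s.word) = (select count(distinct word) from search_words)
order by match_pct desc;
```

With the sample data this should keep only the rows for Shaw, Steve, Will and Jack, because the HAVING clause requires both Street and Road to be present. The percentage is matched words divided by all non-empty words in the address.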

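Answer 1: (score: 1)

I think there is a problem with the expected percentages you provided. For example, Jack has 2 matches out of 6 words, so I would expect the result to be 33%.

Please test the following SQL query, in which I use the String_Split function to split the text into words.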

declare @str nvarchar(max) = 'Road Street'

; with tbl as (
select *, count(*) over (partition by id) word_count
from tbl_pat t
cross apply STRING_SPLIT(replace(t.address,',',' '), ' ')
where trim([value]) <> ''  
)
select distinct id, [name], word_count, count(search.[value]) over (partition by id) as match_count,
    convert( decimal(5,2), (100.0 * (count(search.[value]) over (partition by id)) / word_count)) as percentage
from tbl 
left join (
    select * from STRING_SPLIT(@str, ' ')
) search
    on search.[value] = tbl.[value]
order by id

The output is:

[Image: screenshot of the query output, not available]