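Every time a user searches for text on the website, the search text is logged to search_table. Partial searches (the prefixes typed on the way to the final text) are logged as well; they are recorded with a trailing asterisk.
The goal is to find the most complete search text for each thing the user searched for.
Ideally, the rows would be grouped like this:
Group ids 1, 4, 6 and obtain id = 6
Group ids 2, 5, 7 and obtain id = 7
Group id 3 and obtain id = 3
Group ids 8, 9 and obtain id = 9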
SEARCH_TABLE
id  user   search_text
---------------------------
1   user1  data manag*
2   user1  confer*
3   user1  incomplete sear*
4   user1  data managem*
5   user1  conference c*
6   user1  data management
7   user1  conference call
8   user1  status in*
9   user1  status information
The output should be:
user   search_text
-------------------------
user1  data management
user1  conference call
user1  incomplete sear*
user1  status information
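To make the rule concrete, here is a minimal Python sketch of the grouping I have in mind. It is an illustration only, not a solution in SQL: the helper name `most_complete` is hypothetical, and it assumes that "more complete" means another search by the same user starts with this search's text (minus the trailing asterisk) and is longer.

```python
# Sample rows from SEARCH_TABLE: (id, user, search_text).
ROWS = [
    (1, "user1", "data manag*"),
    (2, "user1", "confer*"),
    (3, "user1", "incomplete sear*"),
    (4, "user1", "data managem*"),
    (5, "user1", "conference c*"),
    (6, "user1", "data management"),
    (7, "user1", "conference call"),
    (8, "user1", "status in*"),
    (9, "user1", "status information"),
]


def most_complete(rows):
    """Keep a row unless another row of the same user extends its text.

    Hypothetical helper: a trailing '*' marks a partial search, and a row
    is "extended" when another text starts with its prefix and is longer.
    """
    kept = []
    for row_id, user, text in rows:
        prefix = text.rstrip("*")
        extended = any(
            other_id != row_id
            and other_user == user
            and other_text.rstrip("*").startswith(prefix)
            and len(other_text.rstrip("*")) > len(prefix)
            for other_id, other_user, other_text in rows
        )
        if not extended:
            kept.append((user, text))
    return kept


for user, text in most_complete(ROWS):
    print(user, text)
```

The rows come back in id order rather than in the order of the output table, which does not matter for my purposes.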
Can you help?
Answer 0 (score: 0)
Something like the following should do the job:
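SELECT *
FROM SEARCH_TABLE st
WHERE NOT EXISTS (
    SELECT 1
    FROM SEARCH_TABLE st2
    -- compare only against other rows; without this check every row
    -- matches itself and the query returns nothing
    WHERE st2.id <> st.id
      -- same user only ("user" is a reserved word in PostgreSQL, hence the quotes)
      AND st2."user" = st."user"
      -- remove the asterisk and add %
      AND st2.search_text LIKE replace(st.search_text, '*', '') || '%'
);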
This filters out every search that is only the beginning of another search.
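For anyone who wants to try the idea without a database server, here is a minimal sketch that runs the NOT EXISTS approach against the sample data in an in-memory SQLite database through Python's `sqlite3` module. The column name `usr` and the variable names are my own assumptions, and the subquery skips the row being tested and rows of other users so that a row cannot eliminate itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search_table (id INTEGER PRIMARY KEY, usr TEXT, search_text TEXT)"
)
conn.executemany(
    "INSERT INTO search_table VALUES (?, ?, ?)",
    [
        (1, "user1", "data manag*"),
        (2, "user1", "confer*"),
        (3, "user1", "incomplete sear*"),
        (4, "user1", "data managem*"),
        (5, "user1", "conference c*"),
        (6, "user1", "data management"),
        (7, "user1", "conference call"),
        (8, "user1", "status in*"),
        (9, "user1", "status information"),
    ],
)

QUERY = """
SELECT st.usr, st.search_text
FROM search_table st
WHERE NOT EXISTS (
    SELECT 1
    FROM search_table st2
    WHERE st2.usr = st.usr
      AND st2.id <> st.id   -- a row must not match itself
      AND st2.search_text LIKE replace(st.search_text, '*', '') || '%'
)
ORDER BY st.id
"""

# Each surviving row is a search that no other search of the same user extends.
not_exists_rows = conn.execute(QUERY).fetchall()
for usr, text in not_exists_rows:
    print(usr, text)
```

One caveat with this sketch: if the same text is logged twice under different ids, the two rows eliminate each other, so duplicates would need to be removed first.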
Answer 1 (score: 0)
This may not be the most elegant way, but here is one approach:
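-- add a column to hold the group each search belongs to
alter table your_table
add group_id int;

-- one group per user and 5-character prefix
select [user], left(search_text, 5) as Group_Text, IDENTITY(int, 1,1) as Group_ID
into #group_id_table
from your_table
group by [user], left(search_text, 5)
order by [user], left(search_text, 5);

-- tag every row with its group (match on user as well as on the prefix)
update a
set a.group_id = b.group_id
from your_table as a
join #group_id_table as b
on a.[user] = b.[user]
and left(a.search_text, 5) = b.group_text;

-- max() returns the text that sorts last within each group
select [user], max(search_text) as search_text, group_id
from your_table
group by [user], group_id
order by [user], group_id;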
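This produced the expected result when I ran it, but because group_id is based on a prefix length that you choose yourself (5 characters here), there can be problems: two unrelated searches that share their first 5 characters end up in the same group. I hope this helps.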
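To illustrate that caveat, here is a small sketch in Python with an in-memory SQLite database. It is not the T-SQL above: it collapses the grouping into a single query, uses `substr` in place of `left`, and the helper name `longest_per_prefix`, the column name `usr`, and the extra 'data mining' row are all assumptions made for the demonstration.

```python
import sqlite3

SAMPLE = [
    (1, "user1", "data manag*"),
    (2, "user1", "confer*"),
    (3, "user1", "incomplete sear*"),
    (4, "user1", "data managem*"),
    (5, "user1", "conference c*"),
    (6, "user1", "data management"),
    (7, "user1", "conference call"),
    (8, "user1", "status in*"),
    (9, "user1", "status information"),
]


def longest_per_prefix(rows, prefix_len=5):
    """Group by user and a fixed-length prefix, returning max(search_text)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE your_table (id INTEGER, usr TEXT, search_text TEXT)")
    conn.executemany("INSERT INTO your_table VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT usr, substr(search_text, 1, ?) AS grp, max(search_text) AS longest
        FROM your_table
        GROUP BY usr, grp
        ORDER BY usr, grp
        """,
        (prefix_len,),
    ).fetchall()
    conn.close()
    return result


# On the sample data the 5-character prefix separates the groups cleanly.
print(longest_per_prefix(SAMPLE))

# A hypothetical unrelated search sharing the prefix 'data ' lands in the
# same group and displaces 'data management'.
print(longest_per_prefix(SAMPLE + [(10, "user1", "data mining")]))
```

A longer prefix makes collisions less likely, but it also splits any search whose partial text is shorter than the prefix into a group of its own, so no single length is safe for all data.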
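Answer 2 (score: 0)
Give this a shot. I separate out the completed texts (together with their shorter partial versions), then find the longest text among the remaining incomplete records. Tested in Oracle because I don't have access to PostgreSQL right now, but I didn't use anything exotic, so it should work.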
with
--Contains all completed searches
COMPLETE as (select * from SEARCH_TABLE where SEARCH_TEXT not like '%*'),
--Contains all searches that are incomplete and don't have a completed match
INCOMPLETE as (
    select S.*
    from SEARCH_TABLE S
    left join COMPLETE C
        on S.USR = C.USR
        and C.SEARCH_TEXT like replace(S.SEARCH_TEXT, '*', '%')
    where C.ID is null
),
--Chains every incomplete search with any matching pattern shorter than it.
--Left join, so an incomplete search with no shorter match still gets a row
--(otherwise a lone entry such as 'incomplete sear*' would be dropped).
CHAINED_INC as (
    select LONGER.USR, LONGER.ID, LONGER.SEARCH_TEXT, SHORTER.SEARCH_TEXT SEARCH_TEXT_SHORT
    from INCOMPLETE LONGER
    left join INCOMPLETE SHORTER
        on LONGER.USR = SHORTER.USR
        and LONGER.SEARCH_TEXT like replace(SHORTER.SEARCH_TEXT, '*', '%')
        and LONGER.ID <> SHORTER.ID
)
--If a text is not the shorter text for a different record, it is the longest text for that pattern.
select distinct T1.USR, T1.SEARCH_TEXT
from CHAINED_INC T1
left join CHAINED_INC T2
    on T1.USR = T2.USR
    and T1.SEARCH_TEXT = T2.SEARCH_TEXT_SHORT
where T2.SEARCH_TEXT_SHORT is null
--Finally, union back to the completed texts.
union all
select USR, SEARCH_TEXT from COMPLETE
;
Edit: removed ID from the select.
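Since I could only test in Oracle, here is a hedged way to try the same CTE structure locally: a minimal sketch using Python's `sqlite3` with an in-memory database. The two `user2` rows are hypothetical additions so that the chaining step has something to do, and the chaining CTE uses a left join restricted to the same user so that an incomplete search with no shorter sibling is kept. Behaviour in PostgreSQL may still differ in details such as the case sensitivity of LIKE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE search_table (id INTEGER PRIMARY KEY, usr TEXT, search_text TEXT)"
)
conn.executemany(
    "INSERT INTO search_table VALUES (?, ?, ?)",
    [
        (1, "user1", "data manag*"),
        (2, "user1", "confer*"),
        (3, "user1", "incomplete sear*"),
        (4, "user1", "data managem*"),
        (5, "user1", "conference c*"),
        (6, "user1", "data management"),
        (7, "user1", "conference call"),
        (8, "user1", "status in*"),
        (9, "user1", "status information"),
        # Hypothetical rows: a chain of partial searches that never completes.
        (10, "user2", "quarterly re*"),
        (11, "user2", "quarterly repo*"),
    ],
)

CHAINED_QUERY = """
WITH
complete AS (
    SELECT * FROM search_table WHERE search_text NOT LIKE '%*'
),
incomplete AS (
    SELECT s.*
    FROM search_table s
    LEFT JOIN complete c
      ON s.usr = c.usr
     AND c.search_text LIKE replace(s.search_text, '*', '%')
    WHERE c.id IS NULL
),
chained_inc AS (
    SELECT longer.usr, longer.search_text,
           shorter.search_text AS search_text_short
    FROM incomplete longer
    LEFT JOIN incomplete shorter
      ON longer.usr = shorter.usr
     AND longer.id <> shorter.id
     AND longer.search_text LIKE replace(shorter.search_text, '*', '%')
)
SELECT DISTINCT t1.usr, t1.search_text
FROM chained_inc t1
LEFT JOIN chained_inc t2
  ON t1.usr = t2.usr
 AND t1.search_text = t2.search_text_short
WHERE t2.search_text_short IS NULL
UNION ALL
SELECT usr, search_text FROM complete
"""

# The query has no ORDER BY, so sort in Python before printing.
chained_rows = conn.execute(CHAINED_QUERY).fetchall()
for usr, text in sorted(chained_rows):
    print(usr, text)
```

With the sample rows, only 'incomplete sear*' and the hypothetical 'quarterly repo*' survive as incomplete searches; everything else is covered by a completed text.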