Say I have this:
page_url | canvas_url
---------------------------------------------------------------
http://www.google.com/ | http://www.google.com/barfoobaz
http://www.google.com/foo/bar | http://www.google.com/foo
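I want to find the row where the search string starts with one of the stored URLs, ordered by the longest match. The problem I'm facing is ranking by the length of the string that actually matched, not by the length of whatever strings happen to sit in a matching row. For example, http://www.google.com/foo matches page_url in row 1 and canvas_url in row 2, but if I order by the lengths of both columns rather than by the matched column, row 1 is considered the better match because canvas_url in row 1 is longer.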
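I can grab all the matches and then filter by length in code, e.g.:
SELECT *, LENGTH(canvas_url), LENGTH(page_url)
FROM app
WHERE
'http://www.google.com/foo' LIKE CONCAT(canvas_url, '%') OR
'http://www.google.com/foo' LIKE CONCAT(page_url, '%')
or do 2 sub-selects that grab the top match for canvas_url and page_url respectively and then filter that down to 1 in code, but I'd prefer (unless there are any ridiculous performance issues) to have the database return just what I need.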
I'm most concerned about MySQL, but I also need to target SQLite and PostgreSQL, so I'd be happy with an answer for any of them.
Suggestions?
Answer 0: (score: 3)
This will get the length of the longest actual match (not just the longest URL in the record):
-- Get page_url matches
SELECT *, LENGTH(page_url) AS MatchLen
FROM app
WHERE 'http://www.google.com/foo' LIKE CONCAT(page_url, '%') -- can't tell from question if this should be reversed
UNION ALL
-- Get canvas_url matches
SELECT *, LENGTH(canvas_url) AS MatchLen
FROM app
WHERE 'http://www.google.com/foo' LIKE CONCAT(canvas_url, '%')
-- Bring the longest matches to the top
ORDER BY MatchLen DESC -- May need to add a tie-breaker here
LIMIT 1
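To try the UNION ALL approach against the sample rows from the question, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite, so `||` stands in for CONCAT (which older SQLite versions lack), and the helper name `longest_match` is made up for illustration. The table name `app` and the sample rows are taken from the question.

```python
import sqlite3

# Sample rows copied from the question; the table name `app` follows its query.
ROWS = [
    ("http://www.google.com/", "http://www.google.com/barfoobaz"),
    ("http://www.google.com/foo/bar", "http://www.google.com/foo"),
]

# Same shape as the UNION ALL query above, with `||` in place of CONCAT
# because this sketch assumes SQLite.
QUERY = """
SELECT page_url, canvas_url, LENGTH(page_url) AS MatchLen
FROM app
WHERE :target LIKE page_url || '%'
UNION ALL
SELECT page_url, canvas_url, LENGTH(canvas_url) AS MatchLen
FROM app
WHERE :target LIKE canvas_url || '%'
ORDER BY MatchLen DESC
LIMIT 1
"""


def longest_match(target):
    """Return (page_url, canvas_url, MatchLen) for the longest prefix match, or None."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE app (page_url TEXT, canvas_url TEXT)")
        conn.executemany("INSERT INTO app VALUES (?, ?)", ROWS)
        return conn.execute(QUERY, {"target": target}).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(longest_match("http://www.google.com/foo"))
```

For the search string from the question this picks row 2, because its canvas_url (25 characters) is the longest stored URL that the search string starts with, even though row 1 contains a longer URL that does not match.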
Answer 1: (score: 1)
Maybe you just need something like this?
SELECT page_url as url, LENGTH(page_url) as len
FROM app WHERE 'http://www.google.com/foo' LIKE CONCAT(page_url, '%')
UNION
SELECT canvas_url as url, LENGTH(canvas_url) as len
FROM app WHERE 'http://www.google.com/foo' LIKE CONCAT(canvas_url, '%')
ORDER BY len DESC
LIMIT 1
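One caveat with building a LIKE pattern from a stored URL: any `_` or `%` inside that URL acts as a wildcard. The sketch below, again assuming SQLite via Python's sqlite3 module, compares the LIKE predicate with a plain prefix comparison built from SUBSTR and LENGTH. The row containing `f_o` and the helper name `matches` are hypothetical, chosen only to show the difference.

```python
import sqlite3

# Hypothetical row: the "_" in page_url is a single-character wildcard under LIKE.
ROWS = [("http://www.google.com/f_o", "http://www.google.com/bar")]

# Prefix test as written in the answers (with `||` instead of CONCAT for SQLite).
LIKE_SQL = "SELECT page_url FROM app WHERE :target LIKE page_url || '%'"

# Literal prefix test: compare the first LENGTH(page_url) characters of the target.
SUBSTR_SQL = (
    "SELECT page_url FROM app "
    "WHERE SUBSTR(:target, 1, LENGTH(page_url)) = page_url"
)


def matches(sql, target):
    """Run one of the predicates above and return the matching page_url values."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE app (page_url TEXT, canvas_url TEXT)")
        conn.executemany("INSERT INTO app VALUES (?, ?)", ROWS)
        return [row[0] for row in conn.execute(sql, {"target": target})]
    finally:
        conn.close()


if __name__ == "__main__":
    target = "http://www.google.com/foo"
    print(matches(LIKE_SQL, target))    # the "_" matches the middle "o"
    print(matches(SUBSTR_SQL, target))  # literal comparison, no wildcard
```

Whether this matters depends on the data. If stored URLs can contain underscores or percent-encoded characters, the literal comparison may be the safer predicate. Note also that SQLite's LIKE ignores ASCII case by default while `=` does not.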
Answer 2: (score: 0)
If you only need to find the first row, you need an ORDER BY. You have to be a bit clever about how you arrange it:
-- Note: this tests whether the stored URL starts with the search string,
-- the reverse of the direction used in the question's query
SELECT *, LENGTH(canvas_url), LENGTH(page_url)
FROM app
WHERE canvas_url LIKE CONCAT('http://www.google.com/foo', '%') OR
      page_url LIKE CONCAT('http://www.google.com/foo', '%')
-- Sort key: the length of the longer matching column
ORDER BY (CASE WHEN canvas_url LIKE CONCAT('http://www.google.com/foo', '%') AND
                    page_url LIKE CONCAT('http://www.google.com/foo', '%') AND
                    LENGTH(canvas_url) < LENGTH(page_url)
               THEN LENGTH(page_url)
               WHEN canvas_url LIKE CONCAT('http://www.google.com/foo', '%') AND
                    page_url LIKE CONCAT('http://www.google.com/foo', '%') AND
                    LENGTH(canvas_url) >= LENGTH(page_url)
               THEN LENGTH(canvas_url)
               WHEN canvas_url LIKE CONCAT('http://www.google.com/foo', '%')
               THEN LENGTH(canvas_url)
               ELSE LENGTH(page_url)
          END) DESC -- longest first
LIMIT 1
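This orders by the longer of the matching strings and then returns exactly one row. Note that LIMIT is not standard SQL, so different databases have different mechanisms for returning a single row (MySQL, SQLite, and PostgreSQL all accept LIMIT).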
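As a rough illustration of ordering by the longer of the matching strings in a single pass over the table, here is a sketch that assumes SQLite (via Python's sqlite3 module) and the match direction from the question, where the search string starts with the stored URL. The helper name `best_row` and the aliases `p_len`, `c_len`, and `match_len` are made up for this example.

```python
import sqlite3

# Sample rows copied from the question.
ROWS = [
    ("http://www.google.com/", "http://www.google.com/barfoobaz"),
    ("http://www.google.com/foo/bar", "http://www.google.com/foo"),
]

# Inner query: per-column match length (0 when the column is not a prefix).
# Middle query: the larger of the two lengths for each row.
# Outer query: drop rows with no match and keep the longest one.
QUERY = """
SELECT page_url, canvas_url, match_len
FROM (
    SELECT page_url, canvas_url,
           CASE WHEN p_len >= c_len THEN p_len ELSE c_len END AS match_len
    FROM (
        SELECT page_url, canvas_url,
               CASE WHEN :target LIKE page_url || '%'
                    THEN LENGTH(page_url) ELSE 0 END AS p_len,
               CASE WHEN :target LIKE canvas_url || '%'
                    THEN LENGTH(canvas_url) ELSE 0 END AS c_len
        FROM app
    ) AS per_column
) AS per_row
WHERE match_len > 0
ORDER BY match_len DESC
LIMIT 1
"""


def best_row(target):
    """Return (page_url, canvas_url, match_len) for the best row, or None."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE app (page_url TEXT, canvas_url TEXT)")
        conn.executemany("INSERT INTO app VALUES (?, ?)", ROWS)
        return conn.execute(QUERY, {"target": target}).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(best_row("http://www.google.com/foo"))
```

The nested CASE is used instead of a two-argument maximum function because those differ between engines (SQLite has a scalar MAX, while MySQL and PostgreSQL use GREATEST). On MySQL, `||` would need to be replaced with CONCAT.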