I am using Spark SQL. Let's say this is a snapshot of my big table:
ups store
ups store austin
ups store chicago
ups store bern
walmart
target
How can I find the longest prefixes of the above data in SQL? That is:
ups store
walmart
target
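I already have a Java program that does this, but I have a large file, so my question is: can this reasonably be done in SQL?
What about the following, more complex scenario? (I can live without this, but it would be nice to have if possible.)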
ups store austin
ups store chicago
ups store bern
walmart
target
That would return [ups store, walmart, target].
Answer 0 (score: 1)
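Assuming you are free to create another table that simply holds an ascending list of integers from zero up to the longest possible string length, the following should do the job using only ANSI SQL: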
SELECT
    id,
    SUBSTRING(name, 1, CASE WHEN number = 0 THEN LENGTH(name) ELSE number END) AS prefix
FROM
    -- Join all places to all possible substring lengths.
    (SELECT *
     FROM places p
     CROSS JOIN lengths l) subq
-- If number is zero then no prefix match was found elsewhere
-- (from the question it looked like you wanted to include these)
WHERE (subq.number = 0 OR
       -- Look for prefix match elsewhere
       EXISTS (SELECT * FROM places p
               WHERE SUBSTRING(p.name FROM 1 FOR subq.number)
                     = SUBSTRING(subq.name FROM 1 FOR subq.number)
                 AND p.id <> subq.id))
  -- Include as a prefix match if the whole string is being used
  AND (subq.number = LENGTH(name)
       -- Don't include trailing spaces in a prefix
       OR (SUBSTRING(subq.name, subq.number, 1) <> ' '
           -- Only include the longest prefix match
           AND NOT EXISTS (SELECT * FROM places p
                           WHERE SUBSTRING(p.name FROM 1 FOR subq.number + 1)
                                 = SUBSTRING(subq.name FROM 1 FOR subq.number + 1)
                             AND p.id <> subq.id)))
ORDER BY id;
Live demo: http://rextester.com/XPNRP24390
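The query relies on two tables that the answer does not spell out. Here is a minimal sketch of that setup, using Python's built-in sqlite3 module as a stand-in engine. That choice is an assumption: the question targets Spark SQL, and the demo link may use a different database. The schemas `places(id, name)` and `lengths(number)` are inferred from the column names in the query, the ids are made up for illustration, and `SUBSTR` replaces `SUBSTRING` because that is SQLite's spelling.

```python
import sqlite3

# Sample rows from the question; ids are assigned 1..n purely for illustration.
NAMES = ["ups store", "ups store austin", "ups store chicago",
         "ups store bern", "walmart", "target"]

# The answer's query, with SUBSTRING rewritten as SQLite's SUBSTR(x, start, len).
QUERY = """
SELECT id,
       SUBSTR(name, 1, CASE WHEN number = 0 THEN LENGTH(name) ELSE number END) AS prefix
FROM (SELECT * FROM places p CROSS JOIN lengths l) subq
WHERE (subq.number = 0
       OR EXISTS (SELECT * FROM places p
                  WHERE SUBSTR(p.name, 1, subq.number) = SUBSTR(subq.name, 1, subq.number)
                    AND p.id <> subq.id))
  AND (subq.number = LENGTH(name)
       OR (SUBSTR(subq.name, subq.number, 1) <> ' '
           AND NOT EXISTS (SELECT * FROM places p
                           WHERE SUBSTR(p.name, 1, subq.number + 1)
                                 = SUBSTR(subq.name, 1, subq.number + 1)
                             AND p.id <> subq.id)))
ORDER BY id
"""


def longest_prefixes(names):
    """Load names into an in-memory database and run the prefix query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE places (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
    conn.execute("CREATE TABLE lengths (number INTEGER PRIMARY KEY)")
    conn.executemany("INSERT INTO places (id, name) VALUES (?, ?)",
                     list(enumerate(names, start=1)))
    # The helper table holds every integer from 0 up to the longest name length.
    max_len = max((len(n) for n in names), default=0)
    conn.executemany("INSERT INTO lengths (number) VALUES (?)",
                     [(n,) for n in range(max_len + 1)])
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in longest_prefixes(NAMES):
        print(row)
```

The cross join produces one row per name and candidate length, so on a large table the `lengths` table should be kept no longer than the longest name.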
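The second aspect is, if we have (ups store austin, ups store chicago), can we use SQL to extract 'ups store' from it?
This should just be a case of using SUBSTRING in a similar way to the above, e.g.: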
SELECT SUBSTRING(name,
                 LENGTH('ups store ') + 1,
                 LENGTH(name) - LENGTH('ups store '))
FROM places
WHERE SUBSTRING(name,
                1,
                LENGTH('ups store ')) = 'ups store ';
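The query above returns what follows an already known prefix. To derive the shared prefix itself from a group of names, one option is a word-by-word comparison. Below is a rough sketch in Python rather than SQL; the helper name `common_word_prefix` is hypothetical, and it assumes that names are split on whitespace and that a prefix must end on a word boundary.

```python
def common_word_prefix(names):
    """Return the longest run of leading words shared by every name."""
    if not names:
        return ""
    split_names = [name.split() for name in names]
    shared = []
    # zip(*...) walks the names word by word and stops at the shortest name.
    for words in zip(*split_names):
        if all(word == words[0] for word in words):
            shared.append(words[0])
        else:
            break
    return " ".join(shared)


if __name__ == "__main__":
    print(common_word_prefix(["ups store austin", "ups store chicago"]))
```

Comparing whole words rather than characters avoids prefixes that cut a word in half or end in a trailing space.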
Answer 1 (score: 0)
Assuming your column is named "mycolumn", your big table is "mytable", and a single space is your field separator:
In PostgreSQL, you can do something as simple as:
select
    mycolumn
from
    mytable
order by
    length(split_part(mycolumn, ' ', 1)) desc
limit
    1
If you run this query often, I would probably try an ordered functional index on the table, like this:
create index prefix_index on mytable (length(split_part(mycolumn, ' ', 1)) desc);
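To see what the ORDER BY ... LIMIT 1 query in this answer selects, here is a small Python sketch of the same logic. The function names are hypothetical. `value.split(sep)[0]` stands in for `split_part(value, sep, 1)`, and `max` stands in for the descending sort with a limit of one. Ties are broken arbitrarily in SQL, whereas `max` keeps the first match.

```python
def first_word_length(value, sep=" "):
    """Length of the text before the first separator, like length(split_part(value, sep, 1))."""
    return len(value.split(sep)[0])


def longest_first_word(rows):
    """Return the single row whose first word is longest (ORDER BY ... DESC LIMIT 1)."""
    return max(rows, key=first_word_length)


if __name__ == "__main__":
    sample = ["ups store", "ups store austin", "ups store chicago",
              "ups store bern", "walmart", "target"]
    print(longest_first_word(sample))
```

On the question's sample rows this picks "walmart", whose first word has 7 characters, so the query returns one row rather than one prefix per group.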