Question

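I would like to perform a "special group by" in SQL on a list of strings, some of which end with "*". I use PostgreSQL. I find it hard to state the problem clearly, even though I have partially solved it with an inelegant mix of select, union and nested queries.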

For example:

1) INPUT: I have a list of strings:

thestrings
varchar(9)
--------------
1000
1000-0001
1000-0002
2000*
2000-0001
2000-0002
3000*
3000-00*
3000-0001
3000-0002

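2) OUTPUT: What I want my "special group by" to return:

1000
1000-0001
1000-0002
2000*
3000*

Because 2000-0001 and 2000-0002 are included in 2000*, and because 3000-00*, 3000-0001 and 3000-0002 are included in 3000*.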

3) The SQL query I use does this:

SELECT every string ending with *
UNION
SELECT every string whose beginning is NOT IN (SELECT every string ending with *)   <-- with multiple inelegant left() functions and NOT IN subqueries

4) What I am getting is:

1000
1000-0001
1000-0002
2000*
3000*
3000-00* <-- the problem

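The problem is: 3000-00* persists in my result.

So my question is: how can I generalize this? That is, how do I remove every string that begins with the same prefix as another string in the list that ends with *? I thought of regular expressions, but how do I pass a list coming from a select into a regular expression?

Thanks for your help.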

Answer 1

Select only the strings for which no master string exists in the table:

select str
from mytable
where not exists 
(
  select *
  from mytable master
  where master.str like '%*'
  and master.str <> mytable.str
  and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
);
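
To try this against the data from the question, here is a minimal self-contained sketch. It is the same query, with the sample rows supplied through a CTE instead of a real table. The names mytable and str follow this answer rather than the question, and the final order by is only added to make the output stable.

```sql
-- Sample rows from the question, exposed as a CTE named mytable
-- (an assumption for this demo: the real table may be named differently).
with mytable(str) as (
  values ('1000'), ('1000-0001'), ('1000-0002'),
         ('2000*'), ('2000-0001'), ('2000-0002'),
         ('3000*'), ('3000-00*'), ('3000-0001'), ('3000-0002')
)
select str
from mytable
where not exists
(
  -- a "master" is any other string ending with * whose prefix
  -- (the part before the *) starts the current string
  select *
  from mytable master
  where master.str like '%*'
  and master.str <> mytable.str
  and rtrim(mytable.str, '*') like rtrim(master.str, '*') || '%'
)
order by str;
```

On this sample it should return 1000, 1000-0001, 1000-0002, 2000* and 3000*. 3000-00* is dropped because its prefix 3000-00 starts with the prefix of 3000*.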

Answer 2

Assuming that only one general pattern can match any given string, the following should do what you want:

select coalesce(tpat.thestring, t.thestring) as thestring
from t left join
     t tpat
     on t.thestring like replace(tpat.thestring, '*', '%') and
        t.thestring <> tpat.thestring
group by coalesce(tpat.thestring, t.thestring);

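However, that is not your case: 3000-0001 and 3000-0002 match both 3000* and 3000-00*. You can adjust for this with distinct on, keeping the shortest matching pattern for each string: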

select distinct on (t.thestring) coalesce(tpat.thestring, t.thestring)
from t left join
     t tpat
     on t.thestring like replace(tpat.thestring, '*', '%') and
        t.thestring <> tpat.thestring
order by t.thestring, length(tpat.thestring);
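
Note that distinct on (t.thestring) keeps one row per input string, so a pattern such as 2000* comes back once for every string it covers. A possible way to collapse those repeats, sketched here as an assumption rather than as part of the original answer, is to wrap the query in a CTE and apply select distinct on top. The CTE names t and matched and the inline sample rows are only there for illustration.

```sql
-- Sample rows from the question, exposed as a CTE named t
-- (an assumption for this demo).
with t(thestring) as (
  values ('1000'), ('1000-0001'), ('1000-0002'),
         ('2000*'), ('2000-0001'), ('2000-0002'),
         ('3000*'), ('3000-00*'), ('3000-0001'), ('3000-0002')
),
matched as (
  -- one row per input string: its shortest matching pattern,
  -- or the string itself when no pattern matches
  select distinct on (t.thestring)
         coalesce(tpat.thestring, t.thestring) as thestring
  from t left join
       t tpat
       on t.thestring like replace(tpat.thestring, '*', '%') and
          t.thestring <> tpat.thestring
  order by t.thestring, length(tpat.thestring)
)
-- collapse the repeated patterns into a single row each
select distinct thestring
from matched
order by thestring;
```

On the sample data this should give 1000, 1000-0001, 1000-0002, 2000* and 3000*.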

SQL special group by on a list of strings ending with *

2 answers: