Question

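How can I create a list of all possible anagrams of a word/string in PostgreSQL?

For example, if the string is 'act', the desired output should be:

act, atc, cta, cat, tac, tca

I have a table 'tbl_words' that contains a million words.

I then want to check this anagram list against the database table and keep only the valid words.

For the anagram list above, the valid words are: act, cat.

Is there any way to do this?

Update 1:

I need output like this: (all permutations of the given word)

Any ideas?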

Answer 1

This query generates all permutations of a 3-element set:

with recursive numbers as (
    select generate_series(1, 3) as i
),
rec as (
    select i, array[i] as p
    from numbers
union all
    select n.i, p || n.i
    from numbers n
    join rec on cardinality(p) < 3 and not n.i = any(p)
)
select p as permutation
from rec
where cardinality(p) = 3
order by 1

 permutation 
-------------
 {1,2,3}
 {1,3,2}
 {2,1,3}
 {2,3,1}
 {3,1,2}
 {3,2,1}
(6 rows)

Modify the final query to generate the permutations of the letters of a given word:

with recursive numbers as (
    select generate_series(1, 3) as i
),
rec as (
    select i, array[i] as p
    from numbers
union all
    select n.i, p || n.i
    from numbers n
    join rec on cardinality(p) < 3 and not n.i = any(p)
)
select a[p[1]] || a[p[2]] || a[p[3]] as result
from rec
cross join regexp_split_to_array('act', '') as a
where cardinality(p) = 3
order by 1

 result 
--------
 act
 atc
 cat
 cta
 tac
 tca
(6 rows)
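
To cover the second half of the question (keeping only the valid words), the generated permutations can be joined against the word table. This is a minimal sketch, not a tested solution: the question does not say what the column in 'tbl_words' is called, so the name word below is an assumption, as is the CTE name anagrams.

```sql
-- Assumption: tbl_words has a text column named "word" (hypothetical name).
with recursive numbers as (
    select generate_series(1, 3) as i
),
rec as (
    select i, array[i] as p
    from numbers
union all
    select n.i, p || n.i
    from numbers n
    join rec on cardinality(p) < 3 and not n.i = any(p)
),
anagrams as (
    -- distinct drops duplicate strings that appear when a letter repeats
    select distinct a[p[1]] || a[p[2]] || a[p[3]] as candidate
    from rec
    cross join regexp_split_to_array('act', '') as a
    where cardinality(p) = 3
)
-- keep only the candidates that exist in the word table
select w.word
from tbl_words w
join anagrams on anagrams.candidate = w.word
order by 1;
```

If the word column is indexed, the join should amount to a few index lookups per candidate, but the number of candidates still grows factorially with the word length.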

Answer 2

Here is one solution:

with recursive params as (
      select *
      from (values ('cata')) v(str)
     ),
     nums as (
      select str, 1 as n
      from params
      union all
      select str, 1 + n
      from nums
      where n < length(str)
     ),
     pos as (
      select str, array[n] as poses, array_remove(array_agg(n) over (partition by str), n) as rests, 1 as lev
      from nums
      union all
      select pos.str, array_append(pos.poses, nums.n), array_remove(rests, nums.n), lev + 1
      from pos join
           nums
           on pos.str = nums.str and array_position(pos.rests, nums.n) > 0
      where cardinality(rests) > 0
     )
select distinct pos.str , string_agg(substr(pos.str, thepos, 1), '')
from pos cross join lateral
     unnest(pos.poses) thepos
where cardinality(rests) = 0 
group by pos.str, pos.poses;

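This is quite tricky, particularly when the string contains repeated letters. The approach taken here generates all permutations of the numbers from 1 to n, where n is the length of the string. These are then used as indexes to extract the characters from the original string.

Keen readers will notice that this uses select distinct together with group by. That seems to be the simplest way to avoid duplicates in the resulting strings.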
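
To illustrate why the de-duplication is needed, the sketch below counts index permutations and distinct resulting strings for the same input, 'cata'. It is a simplified reformulation rather than the query above: it builds the string during the recursion, and the names rec, p, and s are only illustrative.

```sql
with recursive params as (
    select 'cata'::text as str
),
rec as (
    -- start from each single position; s is the string built so far
    select array[i] as p, substr(str, i, 1) as s, str
    from params, generate_series(1, length(str)) as i
union all
    -- extend each partial permutation with a position not used yet
    select rec.p || i, rec.s || substr(rec.str, i, 1), rec.str
    from rec, generate_series(1, length(rec.str)) as i
    where not i = any(rec.p)
)
select count(*)          as index_permutations,
       count(distinct s) as distinct_strings
from rec
where cardinality(p) = length(str);
```

Because 'cata' contains the letter a twice, there are 4! = 24 permutations of the positions but only 4!/2! = 12 distinct strings, so every anagram is produced twice unless duplicates are removed.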
