I have many strings in my database (PostgreSQL), for example:
with mystrings as (
select 'H e l l o, how are you'::varchar string union all
select 'I am fine, t h a n k you'::varchar string union all
select 'This is s t r a n g e text'::varchar string union all
select 'With c r a z y space b e t w e e n characters'::varchar string
)
select * from mystrings
Is there a way to remove the spaces between the characters of a word? For my example, the result should be:
Hello, how are you
I am fine, thank you
This is strange text
With crazy space between characters
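I started with replace, but there are so many of these words with spaces between their characters that I cannot even find them all.
Because it may be hard to join the characters in a meaningful way, it would be better to just get a list of concatenation candidates. With the sample data, the result should be: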
H e l l o
t h a n k
s t r a n g e
c r a z y
b e t w e e n
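Such a query should find and return every substring of a string in which at least three individual characters are separated by two spaces, continuing for as long as the pattern [space][individual character] repeats: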
He l l o how are you --> llo
H e l l o how are you --> Hello
C r a z y space b e t w e e n --> {crazy, between}
Answer 0 (score: 1)
Based on your edited question, the following returns every candidate that has at least three individual characters separated by two spaces:
-- Assumes mystrings has a text column named data.
SELECT
    data || ' --> {' || replace_candidates || '}'
FROM (
    SELECT
        data,
        ( SELECT
              array_to_string( array_agg( data ), ',' )
          FROM (
              SELECT
                  data,
                  length( data )
              FROM (
                  -- Split on runs of 2+ non-space characters, then drop the spaces
                  SELECT
                      replace( data, ' ', '' ) AS data
                  FROM
                      regexp_split_to_table( data, '\S{2,}' ) AS data
              ) t
              -- Keep only pieces that collapse to at least three characters
              WHERE length( data ) > 2
          ) t ) AS replace_candidates
    FROM
        mystrings
) T
WHERE
    replace_candidates IS NOT NULL
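How it works

Start with the innermost query (regexp_split_to_table):

1. The regex \S{2,} matches every run of two or more consecutive characters, i.e. characters that are not separated by a space.
2. regexp_split_to_table returns the inverse of the match: the pieces of text between those runs.
3. replace swaps each space for an empty string, and only records with a length greater than 2 are kept.

What remains is the array aggregate function, which handles the formatting you asked for.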
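To make these steps concrete, here is a minimal sketch that runs only the innermost part of the query against a single literal string, so every piece produced by regexp_split_to_table is visible. The literal and the aliases piece, joined and joined_length are illustrative choices, not part of the query above.

```sql
-- Split one sample string on runs of two or more non-space characters
-- and show each leftover piece before and after its spaces are removed.
SELECT
    piece,
    replace( piece, ' ', '' ) AS joined,
    length( replace( piece, ' ', '' ) ) AS joined_length
FROM
    regexp_split_to_table( 'I am fine t h a n k you', '\S{2,}' ) AS piece;
```

Only the piece around t h a n k collapses to more than two characters, which is why the length( data ) > 2 filter keeps it and drops the lone I.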
Result
H e l l o how are you --> {Hello}
I am fine, t h a n k you --> {thank}
This is s t r a n g e text --> {strange}
With c r a z y space b e t w e e n characters --> {crazy,between}
SOME MORE TEST T E X T --> {TEXT}
Note: it treats a character as [space][char][space], but you can modify it as needed for [space][space][char][space] or [space][char][special_char][space] ...
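As one possible modification of that kind, the sketch below strips punctuation before splitting, so that the trailing comma in 'H e l l o, how are you' does not cut the candidate short. Using regexp_replace with the POSIX class [[:punct:]] is an assumption about what is acceptable for the data: it also removes punctuation you might want to keep. The inline mystrings CTE and the alias piece are only there to make the example self-contained.

```sql
-- Assumption: punctuation can be discarded before looking for candidates.
WITH mystrings AS (
    SELECT 'H e l l o, how are you'::varchar AS data
)
SELECT
    replace( piece, ' ', '' ) AS candidate
FROM
    mystrings,
    regexp_split_to_table(
        regexp_replace( data, '[[:punct:]]', '', 'g' ),  -- drop punctuation first
        '\S{2,}'                                         -- then split on whole words
    ) AS piece
WHERE
    length( replace( piece, ' ', '' ) ) > 2;
```

Without the regexp_replace call, the comma makes "o," look like a two-character word, so the candidate comes out as Hell instead of Hello.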
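Hope this helps ;p
Answer 1 (score: 0)
You could check each word against a resource such as an online dictionary: if the word exists there, you do not have to remove any spaces; otherwise, remove them. Alternatively, you could keep a table that contains all valid strings and check against that table. I hope you get what I mean.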
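A minimal sketch of the lookup-table idea: join the candidates against a table of known words and flag the ones that exist there. Both CTEs are hypothetical stand-ins. candidates would come from a candidate query such as the ones in the other answers, and dictionary would be a real table of valid words stored in lower case.

```sql
WITH candidates AS (          -- hypothetical: output of a candidate query
    SELECT 'thank'::varchar AS candidate UNION ALL
    SELECT 'llo'::varchar
),
dictionary AS (               -- hypothetical: table of known lower-case words
    SELECT 'thank'::varchar AS word UNION ALL
    SELECT 'hello'::varchar
)
SELECT
    c.candidate,
    d.word IS NOT NULL AS is_known_word   -- true only when the joined word exists
FROM
    candidates c
    LEFT JOIN dictionary d ON lower( c.candidate ) = d.word;
```

Candidates flagged as known can be joined automatically, while the rest are left for manual review.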
Answer 2 (score: 0)
The following finds possible concatenation candidates:
with mystrings as (
select 'H e l l o, how are you'::varchar string union all
select 'I am fine, t h a n k you'::varchar string union all
select 'This is s t r a n g e text'::varchar string union all
select 'With c r a z y space b e t w e e n characters'::varchar string
)
, u as (
select string, strpart[rn] as strpart, rn
from (
select *, generate_subscripts(strpart, 1) as rn
from (
select string, string_to_array(replace(string,',',''), ' ') as strpart
from mystrings
) x
) y
)
,w as (
select
string,strpart,rn,
case when length(strpart) = 1 then 1 else 0 end as indchar ,
case when coalesce(length(lag(strpart) over (partition by string order by rn)),0) <> 1 and length(strpart) = 1 then 1 else 0 end as strstart,
case when coalesce(length(lead(strpart) over (partition by string order by rn)),0) <> 1 and length(strpart) = 1 then 1 else 0 end as strend
from u
)
,x as (
select
string,rn,strpart,indchar,strstart,
sum(strstart) over (order by string, rn) as strid
from w
where indchar = 1 and not (strstart = 1 and strend = 1)
)
select string, array_to_string(array_agg(strpart order by rn),'') as candidate from x group by string, strid