I have a table of product names, like this:
Microsoft Word
Adobe Premiere
Paint
Mozila Firefox
Adobe Photoshop CS7
Windows Movie Maker
I want to select the data (table product, column name) so that it comes out like this:
Microsoft
Word
Microsoft Word
Adobe
Premiere
Adobe Premiere
Paint
Mozila firefox
Adobe
Photoshop
CS7
Adobe Photoshop
Photoshop CS7
Windows
Movie
Maker
I am using Postgres. Is it possible to do this?
Answer 0 (score: 1)
It is not clear to me what your expected result is.
For Adobe Photoshop CS7, your result is:
Adobe
Photoshop
CS7
Adobe Photoshop
Photoshop CS7
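What about the original string Adobe Photoshop CS7? For the solution I assume you want all sub-phrases, with the words kept in their original order. The solution should therefore include Adobe Photoshop CS7 in the result. Your other results, which do include the original string, suggest this.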
(1) First step: get all sub-phrases that start at the first word:
String: A B C D E
A
A B
A B C
A B C D
A B C D E
Query
WITH single_words AS (
SELECT *, row_number() OVER (PARTITION BY id) AS nth_word FROM ( -- B
SELECT id, regexp_split_to_table(phrase, '\s') as word FROM phrases -- A
)s
)
SELECT
array_agg(word) OVER (PARTITION BY id ORDER BY nth_word) as phrase_part -- C
FROM single_words;
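A: The WITH query keeps the query simple, because the subquery only has to be written once (it is reused in (2)). The regexp_split_to_table function splits the string at whitespace and puts each word on its own row.
B: The window function row_number adds a counter to the words that indicates their position in the original string (https://www.postgresql.org/docs/current/static/tutorial-window.html).
C: The window function array_agg() OVER (... ORDER BY nth_word) aggregates the words into a list. The ORDER BY produces a growing word list that follows the original word positions. Without the ORDER BY, array_agg would aggregate all words of the phrase, so every word row would get the complete word list.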
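The running aggregation in step (1) can also be sketched outside SQL. The Python snippet below is only an illustration of the logic, not part of the original answer; the function name prefix_phrases is hypothetical, and it assumes words are separated by whitespace.

```python
def prefix_phrases(phrase: str) -> list[str]:
    """Return every sub-phrase that starts at the first word.

    Mirrors array_agg(word) OVER (PARTITION BY id ORDER BY nth_word):
    each row sees the words up to and including its own position.
    """
    # str.split() splits on runs of whitespace, which is slightly more
    # lenient than the single-character '\s' pattern used in the query.
    words = phrase.split()
    return [" ".join(words[: n + 1]) for n in range(len(words))]


if __name__ == "__main__":
    for part in prefix_phrases("A B C D E"):
        print(part)
```

Each list element corresponds to one output row of the query, joined into a string instead of kept as an array.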
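(2) Second step: get all sub-phrases from all starting points:
String: A B C D E
A
B
C
D
E
A B
B C
C D
D E
A B C
B C D
C D E
A B C D
B C D E
A B C D E
Query
WITH single_words AS ( -- A
SELECT *, row_number() OVER (PARTITION BY id) AS nth_word FROM (
SELECT id, regexp_split_to_table(phrase, '\s') as word FROM phrases
)s
)
SELECT
*,
array_agg(b.word) OVER (PARTITION BY a.id, a.nth_word ORDER BY a.id, a.nth_word, b.nth_word) as phrase_part -- C
FROM single_words a -- B
JOIN single_words b
ON (a.id = b.id AND a.nth_word <= b.nth_word);
A: Same as in (1).
B: Joins the phrase's words with themselves. Put better: every word is joined with itself and with each word that follows it in the same phrase.
C: This window function aggregates the words of each starting point into the sub-phrases shown above.
If you do not like the array, you can convert it to a string with a function, for example array_to_string(phrase_part, ' ').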
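The self-join in step (2) pairs every starting word with each word at or after it. A rough Python equivalent, given here only as an illustration (the function name sub_phrases is hypothetical and not from the original answer), looks like this:

```python
def sub_phrases(phrase: str) -> list[str]:
    """Return every contiguous sub-phrase, grouped by starting word.

    The outer loop plays the role of alias a (the starting word), the
    inner loop the role of alias b with a.nth_word <= b.nth_word.
    """
    words = phrase.split()
    result = []
    for start in range(len(words)):
        for end in range(start, len(words)):
            # Words start..end, joined like array_agg ordered by b.nth_word.
            result.append(" ".join(words[start : end + 1]))
    return result


if __name__ == "__main__":
    for part in sub_phrases("Adobe Photoshop CS7"):
        print(part)
```

For a phrase of n words this yields n * (n + 1) / 2 sub-phrases, including the original string itself.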
Answer 1 (score: 0)
You can use regexp_split_to_array:
CREATE TABLE s(c TEXT);
INSERT INTO s(c) VALUES('Microsoft Word'), ('Adobe Premiere');
SELECT unnest(regexp_split_to_array(s.c, '\s+'))
FROM s
UNION ALL
SELECT c
FROM s;
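As a plain illustration of what this query returns, here is a small Python sketch (not part of the original answer; the function name words_then_phrases is hypothetical). It lists the single words of every phrase first and then the whole phrases, like the two branches of the UNION ALL. In SQL the row order is not guaranteed without an ORDER BY, so the fixed order here is a simplification.

```python
import re


def words_then_phrases(phrases: list[str]) -> list[str]:
    """Return all single words, followed by the unchanged phrases."""
    # First branch: unnest(regexp_split_to_array(c, '\s+'))
    words = [word for phrase in phrases for word in re.split(r"\s+", phrase)]
    # Second branch: SELECT c FROM s
    return words + list(phrases)


if __name__ == "__main__":
    for row in words_then_phrases(["Microsoft Word", "Adobe Premiere"]):
        print(row)
```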
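EDIT:
To get every combination, you can use the following query. Note that it reads an id column, which the CREATE TABLE statement above does not define, so it assumes that s also has an id column: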
WITH src AS (
SELECT id,name, rn::int, (MAX(rn) OVER(PARTITION BY id))::int AS m_rn
FROM s,
unnest(regexp_split_to_array(s.c, '\s+')) WITH ORDINALITY AS sub(name,rn)
)
SELECT id, string_agg(b.Name ,' ' ORDER BY rn) AS combination
FROM (SELECT p.id, p.Name, p.rn, RIGHT(o.n::bit(16)::text, m_rn) AS bitmap
FROM src AS p
CROSS JOIN generate_series(1, 100000) AS o(n)
WHERE o.n < 2 ^ m_rn) b
WHERE SUBSTRING(b.bitmap, b.rn, 1) = '1'
GROUP BY b.id, b.bitmap
ORDER BY id, b.bitmap;
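The bitmap idea in this query can be sketched in Python as well. This is only an illustration under the assumption that each bit position selects one word; the function name combinations_by_bitmask is hypothetical. Unlike the contiguous sub-phrases of answer 0, this approach also returns combinations that skip words, such as "A C".

```python
def combinations_by_bitmask(phrase: str) -> list[str]:
    """Return every non-empty word combination, in bitmap order."""
    words = phrase.split()
    m = len(words)
    result = []
    # n runs over 1 .. 2^m - 1, like generate_series filtered by n < 2 ^ m_rn.
    for n in range(1, 2 ** m):
        # Fixed-width bit string, like RIGHT(n::bit(16)::text, m_rn).
        bitmap = format(n, f"0{m}b")
        # Keep word number rn when the rn-th character of the bitmap is '1'.
        picked = [word for word, bit in zip(words, bitmap) if bit == "1"]
        result.append(" ".join(picked))
    return result


if __name__ == "__main__":
    for combination in combinations_by_bitmask("Adobe Photoshop CS7"):
        print(combination)
```

The SQL version casts to bit(16), so it handles at most 16 words per phrase; the Python sketch has no such limit, but the number of combinations still doubles with every additional word.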