Extract the string between a word and a character

Time: 2020-04-29 23:48:33

Tags: sql regex postgresql amazon-redshift

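This is for PostgreSQL.

I have a string that contains metadata for an ad campaign.

Example: date:20200429-category:phones-audience:youth-promo:nooffer

I'd like to extract the value for each key/value pair, e.g. phones for a category column and youth for an audience column.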

Edit:
Here is where I am right now: split_part(split_part(example_string_field, 'category:',2),'-',1) but it seems a bit messy.

Any help is appreciated, thanks.

2 answers:

Answer 0: (score: 0)

I think you can use a regular expression:

regexp_substr(str, 'category:([^-]+)', 1, 1, 'e')
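
The five-argument regexp_substr call with the 'e' flag is Redshift syntax. On stock PostgreSQL, a rough equivalent is substring(str from pattern), which returns the first parenthesized group when the pattern contains one. A minimal sketch, assuming plain PostgreSQL; campaign_meta is a made-up inline sample, not a table from the question:

```sql
-- Hypothetical inline sample standing in for the real table.
with campaign_meta(str) as (
  values ('date:20200429-category:phones-audience:youth-promo:nooffer')
)
select
  -- substring(... from pattern) returns the text matched by the
  -- first parenthesized subexpression, i.e. everything up to the next '-'.
  substring(str from 'date:([^-]+)')     as date,
  substring(str from 'category:([^-]+)') as category,
  substring(str from 'audience:([^-]+)') as audience,
  substring(str from 'promo:([^-]+)')    as promo
from campaign_meta;
```

Note the + sits inside the parentheses so the group captures the whole value. A key that is missing from the string yields NULL for that column.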

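Answer 1: (score: 0)

I largely agree with Bohemian that your solution is fine, but with a bit of string manipulation you can turn the data into a key/value table, which reduces the final step to a set of selects against a single column. It also makes parsing a new column slightly easier: just add another line to crosstabbed_data.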

testdb=# with input_rows as (
select 'date:20200429-category:phones-audience:youth-promo:nooffer' as data
UNION ALL
select 'date:20200430-category:tablet-audience:olds-promo:offer'
),
eav_data as (
SELECT rownum, k_v[1] part, k_v[2] val
FROM
  (
  SELECT rownum, string_to_array(item, ':') AS k_v
  FROM (select rownum, unnest(items) as item from (
    select row_number() over () as rownum, string_to_array(data, '-') as items from input_rows)_0
  )_1 )_2
),
rownums as (select rownum as num from eav_data group by rownum),
crosstabbed_data as (
select
  (select val from eav_data where rownum=num and part='date') as date,
  (select val from eav_data where rownum=num and part='category') as category,
  (select val from eav_data where rownum=num and part='audience') as audience,
  (select val from eav_data where rownum=num and part='promo') as promo
from rownums)
select * from crosstabbed_data;
   date   | category | audience |  promo  
----------+----------+----------+---------
 20200429 | phones   | youth    | nooffer
 20200430 | tablet   | olds     | offer
(2 rows)
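
If the correlated subqueries get slow on a larger table, the same key/value rows can probably be pivoted with conditional aggregation instead. This is only a sketch of that variant, assuming PostgreSQL 10 or later; the CTE name numbered is my own, the rest reuses the names from the query above:

```sql
with input_rows(data) as (
  values ('date:20200429-category:phones-audience:youth-promo:nooffer'),
         ('date:20200430-category:tablet-audience:olds-promo:offer')
),
numbered as (
  -- Tag each input string so its key/value pairs can be regrouped later.
  select row_number() over () as rownum, data from input_rows
),
eav_data as (
  -- One row per 'key:value' item; split_part takes the key and the value.
  select n.rownum,
         split_part(item, ':', 1) as part,
         split_part(item, ':', 2) as val
  from numbered n
  cross join lateral unnest(string_to_array(n.data, '-')) as item
)
select
  -- max() ignores NULLs, so each column keeps the one matching value per row.
  max(case when part = 'date'     then val end) as date,
  max(case when part = 'category' then val end) as category,
  max(case when part = 'audience' then val end) as audience,
  max(case when part = 'promo'    then val end) as promo
from eav_data
group by rownum
order by rownum;
```

Adding a new column is still one extra line in the final select, and eav_data is read once rather than once per column.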