PostgreSQL: splitting rows

Time: 2016-02-23 10:51:44

Tags: sql regex postgresql

I have a table that looks like this:

ID      |  name  | details
---------------------------
1.3.1-3 | Jack   | a
5.4.1-2 | John   | b
1.4.5   | Alex   | c

How can I split it into this:

ID      |  name  | details
---------------------------
1.3.1   | Jack   | a
1.3.2   | Jack   | a
1.3.3   | Jack   | a
5.4.1   | John   | b
5.4.2   | John   | b
1.4.5   | Alex   | c

How can I do this in PostgreSQL?

4 answers:

Answer 0 (score: 2)

CREATE TABLE tosplit
        ( id text NOT NULL
        , name text
        , details text
        );

INSERT INTO tosplit( id , name , details ) VALUES
 ( '1.3.1-3' , 'Jack' , 'a' )
,( '5.4.1-2' , 'John' , 'b' )
,( '1.4.5' , 'Alex' , 'c' );


WITH zzz AS (
        SELECT id
        , regexp_replace(id, '([0-9\.]+\.)([0-9]+)-([0-9]+)', e'\\1', e'g') AS one
        , regexp_replace(id, '([0-9\.]+\.)([0-9]+)-([0-9]+)', e'\\2', e'g') AS two
        , regexp_replace(id, '([0-9\.]+\.)([0-9]+)-([0-9]+)', e'\\3', e'g') AS three
        , name
        , details
        FROM tosplit
        )
    SELECT z1.id
        -- , z1.one
        , z1.one || generate_series( z1.two::integer, z1.three::integer)::text AS four
        , z1.name, z1.details
FROM zzz z1
WHERE z1.two <> z1.one
UNION ALL
SELECT z0.id
        -- , z0.one
        , z0.one AS four
        , z0.name, z0.details
FROM zzz z0
WHERE z0.two = z0.one
        ;

Result:

CREATE TABLE
INSERT 0 3
   id    | four  | name | details 
---------+-------+------+---------
 1.3.1-3 | 1.3.1 | Jack | a
 1.3.1-3 | 1.3.2 | Jack | a
 1.3.1-3 | 1.3.3 | Jack | a
 5.4.1-2 | 5.4.1 | John | b
 5.4.1-2 | 5.4.2 | John | b
 1.4.5   | 1.4.5 | Alex | c
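The query above applies the same pattern three times and needs a UNION ALL to handle IDs without a range. A possible single-pass variation is sketched below, assuming PostgreSQL 9.3 or later (for LATERAL). The column names prefix, first_n and last_n are illustrative and not part of the answer. substring(... from pattern) returns NULL when the pattern does not match, so a missing range falls back to a one-row series.

```sql
-- Sketch only: inline sample data instead of the tosplit table.
WITH tosplit(id, name, details) AS (
    VALUES ('1.3.1-3', 'Jack', 'a'),
           ('5.4.1-2', 'John', 'b'),
           ('1.4.5',   'Alex', 'c')
), parts AS (
    SELECT id, name, details,
           substring(id from '^(.*\.)[0-9]+-[0-9]+$')      AS prefix,   -- '1.3.' or NULL
           substring(id from '^.*\.([0-9]+)-[0-9]+$')::int AS first_n,  -- range start or NULL
           substring(id from '-([0-9]+)$')::int            AS last_n    -- range end or NULL
    FROM tosplit
)
SELECT p.id,
       -- prefix is NULL for IDs without a range, so COALESCE keeps the original id
       COALESCE(p.prefix || n::text, p.id) AS four,
       p.name,
       p.details
FROM parts p
CROSS JOIN LATERAL generate_series(COALESCE(p.first_n, 1),
                                   COALESCE(p.last_n, 1)) AS g(n);
```

For the three sample rows this should yield the same six values in the four column as the result shown above.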

Answer 1 (score: 1)

You can split the id on '-' and '.' and join the result with a generated series:

CREATE TABLE tab(
   ID      VARCHAR(18) NOT NULL PRIMARY KEY
  ,name    VARCHAR(8) NOT NULL
  ,details VARCHAR(11) NOT NULL
);
INSERT INTO tab(ID,name,details) VALUES ('1.3.1-3','Jack','a');
INSERT INTO tab(ID,name,details) VALUES ('5.4.1-2','John','b');
INSERT INTO tab(ID,name,details) VALUES ('1.4.5','Alex','c');
INSERT INTO tab(ID,name,details) VALUES ('1.7.11-13','Joe','d');
INSERT INTO tab(ID,name,details) VALUES ('1.7-13','Smith','e');

Main query:

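WITH cte AS
(
  SELECT *, 
    split_part(id, '-', 1) AS prefix,
    -- last dot-separated part of the prefix; the outer reverse() restores
    -- the digit order, otherwise '1.4.10' would yield '01' instead of '10'
    reverse(split_part(reverse(split_part(id, '-', 1)),'.',1))::int AS start,
    CASE WHEN split_part(id, '-',2) <> '' 
         THEN split_part(id, '-', 2):: int 
         ELSE NULL 
    END AS stop
  FROM tab
)
SELECT 
  -- strip the last component from the prefix, then append the series value
  LEFT(prefix, LENGTH(prefix) - strpos(reverse(prefix), '.')) || '.' || n::text AS id,
  name,
  details     
FROM cte
CROSS JOIN LATERAL generate_series(start,COALESCE(stop, start)) AS sub(n);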

Output:

╔═════════╦════════╦═════════╗
║   id    ║ name   ║ details ║
╠═════════╬════════╬═════════╣
║ 1.3.1   ║ Jack   ║ a       ║
║ 1.3.2   ║ Jack   ║ a       ║
║ 1.3.3   ║ Jack   ║ a       ║
║ 5.4.1   ║ John   ║ b       ║
║ 5.4.2   ║ John   ║ b       ║
║ 1.4.5   ║ Alex   ║ c       ║
║ 1.7.11  ║ Joe    ║ d       ║
║ 1.7.12  ║ Joe    ║ d       ║
║ 1.7.13  ║ Joe    ║ d       ║
║ 1.7     ║ Smith  ║ e       ║
║ 1.8     ║ Smith  ║ e       ║
║ 1.9     ║ Smith  ║ e       ║
║ 1.10    ║ Smith  ║ e       ║
║ 1.11    ║ Smith  ║ e       ║
║ 1.12    ║ Smith  ║ e       ║
║ 1.13    ║ Smith  ║ e       ║
╚═════════╩════════╩═════════╝
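A detail worth checking when reverse() and split_part() are used to pick the last dot-separated component: the extracted piece is still in reversed character order, so it has to be reversed back before the cast. The sample IDs here end in a single digit or in 11, which read the same in both directions. An illustrative value such as 1.4.10 (an assumption, not taken from the answer's data) shows the difference:

```sql
-- reverse('1.4.10') is '01.4.1', so the first part is '01', not '10'
SELECT split_part(reverse('1.4.10'), '.', 1)               AS still_reversed,  -- '01'
       reverse(split_part(reverse('1.4.10'), '.', 1))::int AS last_component;  -- 10
```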

Answer 2 (score: 1)

with elements as (
  select id, 
         regexp_split_to_array(id, '(\.)') as id_elements,
         name, 
         details
  from the_table
), bounds as (
  select id, 
         case 
           when strpos(id, '-') = 0 then 1
           else split_part(id_elements[cardinality(id_elements)], '-', 1)::int
         end as start_value,
         case 
           when strpos(id, '-') = 0 then 1
           else split_part(id_elements[cardinality(id_elements)], '-', 2)::int
         end as end_value,
         case 
           when strpos(id, '-') = 0 then id
           else array_to_string(id_elements[1:cardinality(id_elements)-1], '.')
         end as base_id,
         name, 
         details
  from elements
)
select b.base_id||'.'||c.cnt as new_id, 
       b.name,
       b.details, 
       count(*) over (partition by b.base_id) as num_rows
from bounds b 
  cross join lateral generate_series(b.start_value, b.end_value) as c (cnt)
order by num_rows desc, c.cnt;

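The first CTE simply splits the ID on the dot (.). The second CTE then computes the start and end value for each ID and "strips" the range definition from the actual ID value, to get the base that is concatenated with the row index in the final select statement.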
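To make the intermediate values concrete, the following sketch runs the same expressions on a single literal ID taken from the test data. It is only an illustration of what the two CTEs compute for one row, not part of the answer's query.

```sql
WITH e AS (
  SELECT regexp_split_to_array('10.11.12.1-5', '\.') AS id_elements  -- {10,11,12,1-5}
)
SELECT id_elements[cardinality(id_elements)]                             AS last_element, -- '1-5'
       split_part(id_elements[cardinality(id_elements)], '-', 1)::int    AS start_value,  -- 1
       split_part(id_elements[cardinality(id_elements)], '-', 2)::int    AS end_value,    -- 5
       array_to_string(id_elements[1:cardinality(id_elements)-1], '.')   AS base_id       -- '10.11.12'
FROM e;
```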

With this test data:

insert into the_table
values
('1.3.1-3',              'Jack',  'details 1'),
('5.4.1-2',              'John',  'details 2'),
('1.4.5',                'Alex',  'details 3'),
('10.11.12.1-5',         'Peter', 'details 4'),
('1.4.10-13',            'Arthur','details 5'),
('11.12.13.14.15.16.2-7','Zaphod','details 6');

it returns the following result:

new_id              | name   | details   | num_rows
--------------------+--------+-----------+---------
11.12.13.14.15.16.2 | Zaphod | details 6 |        6
11.12.13.14.15.16.3 | Zaphod | details 6 |        6
11.12.13.14.15.16.4 | Zaphod | details 6 |        6
11.12.13.14.15.16.5 | Zaphod | details 6 |        6
11.12.13.14.15.16.6 | Zaphod | details 6 |        6
11.12.13.14.15.16.7 | Zaphod | details 6 |        6
10.11.12.1          | Peter  | details 4 |        5
10.11.12.2          | Peter  | details 4 |        5
10.11.12.3          | Peter  | details 4 |        5
10.11.12.4          | Peter  | details 4 |        5
10.11.12.5          | Peter  | details 4 |        5
1.4.10              | Arthur | details 5 |        4
1.4.11              | Arthur | details 5 |        4
1.4.12              | Arthur | details 5 |        4
1.4.13              | Arthur | details 5 |        4
1.3.1               | Jack   | details 1 |        3
1.3.2               | Jack   | details 1 |        3
1.3.3               | Jack   | details 1 |        3
5.4.1               | John   | details 2 |        2
5.4.2               | John   | details 2 |        2
1.4.5.1             | Alex   | details 3 |        1
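Note that the ID without a range comes back as 1.4.5.1 in this result, whereas the question asks for 1.4.5 unchanged. One possible adjustment, sketched here under the assumption that unranged IDs should pass through untouched, is to carry a flag (the hypothetical column has_range) out of the second CTE and append the counter only when a range was present. The sample data is inlined and the num_rows column is left out to keep the sketch short.

```sql
with the_table (id, name, details) as (
  values ('1.3.1-3', 'Jack', 'details 1'),
         ('1.4.5',   'Alex', 'details 3')
), elements as (
  select id,
         regexp_split_to_array(id, '\.') as id_elements,
         name,
         details
  from the_table
), bounds as (
  select id,
         strpos(id, '-') > 0 as has_range,   -- hypothetical helper column
         case
           when strpos(id, '-') = 0 then 1
           else split_part(id_elements[cardinality(id_elements)], '-', 1)::int
         end as start_value,
         case
           when strpos(id, '-') = 0 then 1
           else split_part(id_elements[cardinality(id_elements)], '-', 2)::int
         end as end_value,
         case
           when strpos(id, '-') = 0 then id
           else array_to_string(id_elements[1:cardinality(id_elements)-1], '.')
         end as base_id,
         name,
         details
  from elements
)
select case
         when b.has_range then b.base_id || '.' || c.cnt
         else b.base_id                      -- unranged IDs stay as they are
       end as new_id,
       b.name,
       b.details
from bounds b
  cross join lateral generate_series(b.start_value, b.end_value) as c (cnt)
order by b.id, c.cnt;
```

With these two rows the new_id values should be 1.3.1, 1.3.2, 1.3.3 and 1.4.5.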

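Using cardinality(id_elements) requires Postgres 9.4. For earlier versions, replace it with array_length(id_elements, 1).

One final note:

This would be a lot easier if you stored the start and end values in separate (integer) columns instead of appending them to the ID itself. The current model violates basic database normalization (first normal form).

This solution (like every other solution in the answers given) will fail badly if a stored value such as 10.12.13.A-Z contains non-numeric parts. Properly normalizing the data would prevent that.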
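As a rough illustration of that suggestion, the sketch below stores the base and the range bounds in separate columns. The table and column names (items_normalized, base_id, range_start, range_end) are hypothetical, and the layout is only one of several ways to normalize this data.

```sql
CREATE TABLE items_normalized (       -- hypothetical table name
    base_id     text    NOT NULL,     -- e.g. '1.3'
    range_start integer NOT NULL,     -- e.g. 1
    range_end   integer NOT NULL,     -- e.g. 3; equal to range_start for a single row
    name        text,
    details     text,
    CHECK (range_end >= range_start)
);

INSERT INTO items_normalized (base_id, range_start, range_end, name, details) VALUES
    ('1.3', 1, 3, 'Jack', 'a'),
    ('5.4', 1, 2, 'John', 'b'),
    ('1.4', 5, 5, 'Alex', 'c');

-- Expanding the ranges no longer needs any string parsing.
SELECT base_id || '.' || n AS id, name, details
FROM items_normalized
CROSS JOIN LATERAL generate_series(range_start, range_end) AS g(n)
ORDER BY base_id, n;
```

Because the bounds are integers, a value like A-Z cannot be stored in them in the first place, and the CHECK constraint rejects reversed ranges.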

Answer 3 (score: 0)

You can use the following query:

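SELECT CASE 
          WHEN num = 0 THEN "ID"   -- no range: keep the ID unchanged
          ELSE CONCAT(LEFT("ID",   -- keep everything up to and including the last dot
                           LENGTH("ID") + 1 - 
                           position('.' IN REVERSE("ID"))),
                      num)
       END,
       "name", 
       "details"      
FROM (              
  SELECT split_part("ID", '-', 1) AS "ID", 
         "name", "details",
         -- num = 0 marks an ID without a range; otherwise count from 1 to the upper bound.
         -- Note: this assumes every range starts at 1, as in the sample data.
         generate_series(
           CASE 
             WHEN position('-' in "ID") = 0 THEN 0
             ELSE 1
           END, 
           CASE 
             WHEN position('-' in "ID") = 0 THEN 0
             ELSE CAST(split_part("ID", '-', 2)  AS INT)
           END) AS num  
  FROM mytable) AS t;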
