Split a row with an array into multiple rows with subarrays

Time: 2020-07-16 12:18:33

Tags: sql postgresql

Suppose I have a table like this:

+-----+-----------------+-------------+
| ID  |     Points      | BreakPoints |
+-----+-----------------+-------------+
| 123 | {6,8,1,3,7,9}   | {1,7}       |
| 456 | {16,9,78,96,33} | {78}        |
+-----+-----------------+-------------+

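I would like to "break" the Points sequences at the points contained in BreakPoints, while keeping the ID of the original row. The order of the elements matters, so I cannot sort them!

Also note that each breakpoint appears in both of the result rows produced by breaking the original sequence at it (at the end of one and at the start of the next). So the result should look like this: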


+-----+------------+
| ID  |   Points   |
+-----+------------+
| 123 | {6,8,1}    |
| 123 | {1,3,7}    |
| 123 | {7,9}      |
| 456 | {16,9,78}  |
| 456 | {78,96,33} |
+-----+------------+

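Of course, I could write a PL/pgSQL function, call it for each row, iterate over the array, and RETURN NEXT for each subsequence. But is there another way that does not require calling a function for every row?

2 Answers:

Answer 0: (score: 1)

I think this does what you want: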

select t.id,
       array_agg(point order by point)
from t cross join
     unnest(points) point cross join lateral
     (select lag(breakpoint) over (order by breakpoint) as prev_breakpoint, breakpoint
      from unnest(t.breakpoints) breakpoint
      union all
      select max(breakpoint), null
      from unnest(t.breakpoints) breakpoint
     ) b
where (point >= prev_breakpoint or prev_breakpoint is null) and
      (point <= breakpoint or breakpoint is null)
group by t.id, breakpoint;

Here is a db<>fiddle.

EDIT:

Here is revised code that addresses your actual problem:

select id, grp,
       -- each group starts at a breakpoint; append the next group's breakpoint (if any) to close the segment
       (case when lead(max(breakpoint)) over (partition by id order by grp) is not null
             then array_agg(point order by n) || lead(max(breakpoint)) over (partition by id order by grp)
             else array_agg(point order by n)
        end) as points
from (select t.id, p.*, breakpoint,
             -- running count of breakpoints seen so far, in original element order
             count(breakpoint) over (partition by t.id order by p.n) as grp
      from t cross join
           unnest(points) with ordinality as p(point, n) left join
           unnest(breakpoints) as breakpoint
           on p.point = breakpoint
     ) t
group by id, grp
order by id, grp;

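It is included in the db<>fiddle.

The idea is simple. Number each point with its position and match it against the breakpoints. Then use a window function to define the groups.

The only tricky part is the aggregation. You want the breakpoint in both records, so some extra manipulation is needed. I think appending to the array with lead() is simpler than the alternatives.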
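To make the grouping step easier to follow, here is a minimal sketch that runs only the inner subquery of the revised query, so the position, the matched breakpoint, and the running group number can be inspected row by row. The inline `t` CTE is an assumption that stands in for the real table and holds the sample data from the question.

```sql
-- Sample data from the question, inlined so the query runs on its own
-- (assumption: the real table is named t, as in the answer).
WITH t(id, points, breakpoints) AS (
    VALUES (123, ARRAY[6,8,1,3,7,9], ARRAY[1,7])
         , (456, ARRAY[16,9,78,96,33], ARRAY[78])
)
SELECT t.id,
       p.n,            -- position of the element in the original array
       p.point,
       b.breakpoint,   -- NULL unless this point is a breakpoint
       -- count() ignores NULLs, so the running count rises by one at each breakpoint
       count(b.breakpoint) OVER (PARTITION BY t.id ORDER BY p.n) AS grp
FROM t
CROSS JOIN unnest(t.points) WITH ORDINALITY AS p(point, n)
LEFT JOIN unnest(t.breakpoints) AS b(breakpoint)
       ON p.point = b.breakpoint
ORDER BY t.id, p.n;
```

With the sample data, the `grp` values for ID 123 should come out as 0, 0, 1, 1, 2, 2. Each group therefore begins at its own breakpoint but lacks the breakpoint that ends it, which is why the outer query appends the next group's breakpoint with lead().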

Answer 1: (score: 1)

WITH data(id, points, breakpoints) AS (
    VALUES (123, ARRAY [6,8,1,3,7,9], ARRAY [7, 1])
         , (456, ARRAY [16,9,78,96,33], ARRAY [78])
),
-- we'll map the breakpoints to the indices where they appear in `points` and sort this array
-- so, ARRAY[1, 7] -> ARRAY[3, 5] (the positions of 1 & 7 in `points`, arrays are 1-based)
-- and ARRAY[7, 1] -> ARRAY[3, 5] (since we sort this new 'breakpoint_indices' array)
sorted_breakpoint_indices(id, points, breakpoint_indices, number_of_breakpoints) AS (
    SELECT id
         , points
         , breakpoint_indices
         , number_of_breakpoints
    FROM data
    JOIN LATERAL (
        SELECT ARRAY_AGG(array_position(points, breakpoint) ORDER BY array_position(points, breakpoint))
             , COUNT(*) -- simply here to avoid multiple `cardinality(breakpoint_indices)` below
        FROM unnest(breakpoints) AS breakpoint
    ) AS f(breakpoint_indices, number_of_breakpoints)
    ON true
)
SELECT id
     , CASE i
         -- first segment, from start to breakpoint #1
         WHEN 0 THEN points[:breakpoint_indices[1]]
         -- last segment, from last breakpoint to end
         WHEN number_of_breakpoints THEN points[breakpoint_indices[number_of_breakpoints]:]
         -- default case, bp i to i+1
         ELSE points[breakpoint_indices[i]:breakpoint_indices[i+1]]
       END AS result
FROM sorted_breakpoint_indices
   , generate_series(0, number_of_breakpoints, 1) AS f(i)
ORDER BY id, i;

which returns

+---+----------+
|id |result    |
+---+----------+
|123|{6,8,1}   |
|123|{1,3,7}   |
|123|{7,9}     |
|456|{16,9,78} |
|456|{78,96,33}|
+---+----------+
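The query above rests on two array features: array_position() to turn a breakpoint value into a 1-based index, and slice syntax to cut the segments. A small standalone sketch using the first sample row (the column aliases are illustrative only):

```sql
-- Building blocks used by the answer, applied to the first sample row.
SELECT array_position(ARRAY[6,8,1,3,7,9], 1) AS index_of_1,      -- 3
       array_position(ARRAY[6,8,1,3,7,9], 7) AS index_of_7,      -- 5
       (ARRAY[6,8,1,3,7,9])[:3]              AS first_segment,   -- {6,8,1}
       (ARRAY[6,8,1,3,7,9])[3:5]             AS middle_segment,  -- {1,3,7}
       (ARRAY[6,8,1,3,7,9])[5:]              AS last_segment;    -- {7,9}
```

Note that array_position() returns the index of the first match only, so this approach assumes each breakpoint value occurs once in points. Slices with an omitted bound, such as `[:3]` and `[5:]`, require PostgreSQL 9.6 or later.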

Note: I wrote other versions while working on this answer; they can be seen in this post's edit history.
