Return an array of years as year ranges

Time: 2013-07-08 17:58:38

Tags: sql arrays postgresql postgresql-9.2 window-functions

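I am trying to query a table that has a character varying[] column of years and return those years as a string of comma-separated year ranges. The ranges are determined by the consecutive years present in the array; years or year ranges that are not consecutive should be separated by commas.

The data type is character varying[] rather than integer[] because a few of the values contain ALL instead of a list of years. Those rows can be omitted from the result.

So far I have had little luck, as I am not even sure where to start.

Could someone give me some guidance or a useful example of how to approach a problem like this?

Example years_table: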

+=========+============================+
| id      | years                      |
| integer | character varying[]        |
+=========+============================+
| 1       | {ALL}                      |
| 2       | {1999,2000,2010,2011,2012} |
| 3       | {1990,1991,2007}           |
+---------+----------------------------+

Desired output:

Example SQL query:

SELECT id, [year concat logic] AS year_ranges
FROM years_table WHERE 'ALL' NOT IN years

Result:

+====+======================+
| id | year_ranges          |
+====+======================+
| 2  | 1999-2000, 2010-2012 |
| 3  | 1990-1991, 2007      |
+----+----------------------+

2 answers:

Answer 0 (score: 4)

SELECT id, string_agg(year_range, ', ') AS year_ranges
FROM (
   SELECT id, CASE WHEN count(*) > 1
               THEN min(year)::text || '-' ||  max(year)::text 
               ELSE min(year)::text
              END AS year_range
   FROM  (
      SELECT *, row_number() OVER (ORDER BY id, year) - year AS grp
      FROM  (
         SELECT id, unnest(years) AS year
         FROM  (VALUES (2::int, '{1999,2000,2010,2011,2012}'::int[])
                      ,(3,      '{1990,1991,2007}')
               ) AS tbl(id, years)
         ) sub1
      ) sub2
   GROUP  BY id, grp
   ORDER  BY id, min(year)
   ) sub3
GROUP  BY id
ORDER  BY id

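This produces exactly the desired result.

If you are dealing with an array of varchar (varchar[]), just cast it to int[] before you proceed. Apart from the ALL rows, which are excluded, the values appear to be in a form that is perfectly legal for that cast: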

years::int[]
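
As a minimal sketch of that cast in context, the query below assumes the question's table years_table(id integer, years character varying[]) and drops the ALL rows first. The `<> ALL (...)` filter is one possible way to express the exclusion, not something taken from the answer.

```sql
-- Assumes the question's table: years_table(id integer, years varchar[]).
SELECT id
     , years::int[] AS years          -- varchar[] -> int[]
FROM   years_table
WHERE  'ALL' <> ALL (years);          -- true only if no element equals 'ALL'
```

A row such as {ALL} fails the filter and is left out, so the cast is only meant to see arrays that hold numeric strings. If the data may contain other non-numeric values, the cast raises an error and those rows need to be filtered as well.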

In production code, replace the innermost subselect (the VALUES list) with the name of your source table:

 FROM  (VALUES (2::int, '{1999,2000,2010,2011,2012}'::int[])
              ,(3,      '{1990,1991,2007}')
       ) AS tbl(id, years)

->

FROM  tbl

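Since we are dealing with a naturally ascending sequence of numbers (the years), we can use a shortcut to form groups of consecutive years, each group forming one range. I subtract the year itself from the row number (ordered by year). For consecutive years, the row number and the year both increase by one, so the difference grp stays the same. Otherwise, a new range starts.

More about window functions in the manual here and here.

A plpgsql function might be faster in this case. You would have to test. Examples in these related answers: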
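
To make the grouping trick concrete, here is a small sketch that prints the row number and the resulting grp for a single array from the question. It is only an illustration; the column names rn and grp are chosen here.

```sql
SELECT year
     , row_number() OVER (ORDER BY year)        AS rn
     , row_number() OVER (ORDER BY year) - year AS grp
FROM   unnest('{1999,2000,2010,2011,2012}'::int[]) AS year
ORDER  BY year;

-- year | rn |  grp
-- 1999 |  1 | -1998
-- 2000 |  2 | -1998
-- 2010 |  3 | -2007
-- 2011 |  4 | -2007
-- 2012 |  5 | -2007
```

The two distinct grp values correspond to the two ranges 1999-2000 and 2010-2012. The actual numbers carry no meaning; only whether neighbouring rows share the same value matters.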
Ordered count of consecutive repeats / duplicates
ROW_NUMBER() shows unexpected values
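
For comparison, a procedural version could look roughly like the sketch below. The function name f_year_ranges and all variable names are hypothetical, and it has not been benchmarked against the window-function query, so treat it as a starting point for your own test rather than a proven improvement.

```sql
-- Hypothetical helper: collapses an int[] of years into a string like '1990-1991, 2007'.
CREATE OR REPLACE FUNCTION f_year_ranges(_years int[])
  RETURNS text
  LANGUAGE plpgsql IMMUTABLE AS
$func$
DECLARE
   _year   int;
   _start  int;           -- first year of the range being built
   _prev   int;           -- most recent year seen
   _result text := '';
BEGIN
   FOR _year IN
      SELECT y FROM unnest(_years) AS y WHERE y IS NOT NULL ORDER BY y
   LOOP
      IF _start IS NULL THEN
         _start := _year;                        -- very first year
      ELSIF _year > _prev + 1 THEN
         -- gap found: emit the finished range and start a new one
         _result := _result
                 || CASE WHEN _start = _prev THEN _start::text
                         ELSE _start::text || '-' || _prev::text END
                 || ', ';
         _start := _year;
      END IF;
      _prev := _year;
   END LOOP;

   IF _start IS NOT NULL THEN                    -- emit the last open range
      _result := _result
              || CASE WHEN _start = _prev THEN _start::text
                      ELSE _start::text || '-' || _prev::text END;
   END IF;

   RETURN _result;
END
$func$;
```

Assuming the question's table, it could then be called as SELECT id, f_year_ranges(years::int[]) FROM years_table WHERE 'ALL' <> ALL (years). An empty array yields an empty string in this sketch.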

Answer 1 (score: 2)

SQL Fiddle

This is not the output format you asked for, but I think it may be more useful:

select id, g, min(year), max(year)
from (
    select id, year,
        count(not g or null) over(partition by id order by year) as g
    from (
        select id, year,
            lag(year, 1, 0) over(partition by id order by year) = year - 1 as g
        from (
            select id, unnest(years)::integer as year
            from years
            where years != '{ALL}'
        ) s
    ) s
) s
group by 1, 2
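
In the query above, the inner g is true when a year directly follows the previous one, and count(not g or null) counts the rows where it is false, which gives a running number of range starts per id. If the comma-separated string from the question is still wanted, the per-range rows can be folded with string_agg. The following is a sketch under the same assumptions as this answer (a table named years with a column years); the aliases lo, hi and r are made up for illustration.

```sql
SELECT id
     , string_agg(CASE WHEN lo = hi THEN lo::text
                       ELSE lo::text || '-' || hi::text END
                , ', ' ORDER BY lo) AS year_ranges
FROM  (
   SELECT id, g, min(year) AS lo, max(year) AS hi
   FROM  (
      SELECT id, year
             -- running count of range starts within each id
           , count(NOT g OR NULL) OVER (PARTITION BY id ORDER BY year) AS g
      FROM  (
         SELECT id, year
                -- true if this year directly follows the previous one
              , lag(year, 1, 0) OVER (PARTITION BY id ORDER BY year) = year - 1 AS g
         FROM  (
            SELECT id, unnest(years)::integer AS year
            FROM   years
            WHERE  years != '{ALL}'
         ) s
      ) s
   ) s
   GROUP  BY id, g
) r
GROUP  BY id
ORDER  BY id;
```

The ORDER BY inside string_agg keeps the ranges in ascending order within each row, independent of how the subquery happens to deliver them.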