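GROUP BY intervals of integer values

Time: 2014-06-02 14:34:26

Tags: sql postgresql group-by intervals

I need to sum the lengths of linear objects and group them by intervals of several years. I have a table that stores my objects, like this: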

- gid serial NOT NULL, 
- year INTEGER, 
- the_geom geometry(MULTILINESTRING) ;

I need a result like this:

period          | length
----------------+-----------
 2005 - 2014    | 18.6
 1995 - 2004    | 16.1
 1985 - 1994    | 7.6
 1975 - 1984    | 19.0
 1965 - 1974    | 28.2
 1945 - 1964    | 10.2
 before 1945    | 0.1 

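I couldn't find anything online about how to do this, other than generating each row with a separate query and combining them with UNION ALL, which is not very nice...

4 Answers:

Answer 0: (score: 0)

Assuming the following gets you the length: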

-- length() is assumed here; with PostGIS the function would be ST_Length()
select year, length(the_geom) as len
from tbl

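Then your problem is to find consecutive years that have the same length. I like the following trick for finding such sequences, assuming you have one value for every year: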

-- grp is constant within each run of consecutive years that share the same len
select min(year) || '-' || max(year), len
from  (select year, length(the_geom) as len,
              (row_number() over (order by year) -
               row_number() over (partition by length(the_geom) order by year)
              ) as grp
       from tbl
      ) t
group by grp, len
order by 1;
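
To see why the difference of the two row numbers identifies a run, here is a minimal Python sketch of the same bookkeeping. The sample (year, len) pairs are invented for illustration, and the sketch assumes one row per year, as the answer does.

```python
from collections import defaultdict

# Hypothetical sample data: one (year, len) row per year, standing in for
# the output of "select year, length(the_geom) as len".
rows = [(2000, 5), (2001, 5), (2002, 7), (2003, 7), (2004, 7), (2005, 5)]

def runs_of_equal_len(rows):
    """Group consecutive years sharing the same len, mimicking the SQL trick."""
    ordered = sorted(rows)               # order by year
    seen_per_len = defaultdict(int)      # row_number() partitioned by len
    groups = defaultdict(list)
    for overall_rn, (year, length) in enumerate(ordered, start=1):
        seen_per_len[length] += 1
        grp = overall_rn - seen_per_len[length]   # constant within one run
        groups[(grp, length)].append(year)
    # min(year) || '-' || max(year), len ... order by 1
    return sorted((f"{min(ys)}-{max(ys)}", length)
                  for (_, length), ys in groups.items())

result = runs_of_equal_len(rows)
```

Note that this groups runs of years with equal values; on its own it does not produce fixed ten-year windows.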

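Answer 1: (score: 0)

Group by the result of an integer division, which effectively truncates to multiples of the divisor, 10 in your case. Shift by 5 to get your partitions: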

SELECT min(year)::text || ' - ' || max(year) AS period
     , sum(length(the_geom)) AS "length"
FROM   tbl
GROUP BY (year + 5) / 10
ORDER  BY min(year) DESC;

Per the documentation:

  / division (integer division truncates the result)
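
As a quick check of the arithmetic, the following Python sketch applies the same expression to a few years. The sample years are made up, and `//` is used because for positive years Python floor division gives the same result as SQL integer division.

```python
def bucket(year: int) -> int:
    """Mirror of (year + 5) / 10 for positive years."""
    return (year + 5) // 10

def period_label(years):
    """Build the 'min - max' label the query produces for one bucket."""
    return f"{min(years)} - {max(years)}"

# Hypothetical sample years, not data from the question.
sample = [1995, 1999, 2004, 2005, 2010, 2014]

by_bucket = {}
for y in sample:
    by_bucket.setdefault(bucket(y), []).append(y)

labels = {b: period_label(ys) for b, ys in by_bucket.items()}
```

Because the label is built from min(year) and max(year), a bucket that lacks rows at its edges gets a narrower label than the nominal decade.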

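Answer 2: (score: 0)

You need a discriminating function, which you can build in several ways. For your purposes, a CASE expression is just the ticket, like: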

select case
         when t.year >= 2015 then '2015-present'
         when t.year >= 2005 then '2005-2014'
         when t.year >= 1995 then '1995-2004'
         when t.year >= 1985 then '1985-1994'
         when t.year >= 1975 then '1975-1984'
         when t.year >= 1965 then '1965-1974'
         when t.year >= 1955 then '1955-1964'
         when t.year >= 1945 then '1945-1954'
         when t.year <  1945 then 'before 1945'
         else                     'no year given'
       end as period ,
       sum( compute_length_from_geometry( t.geometry) ) as length
from some_table t
where .
      .
      .
group by case
           when t.year >= 2015 then '2015-present'
           when t.year >= 2005 then '2005-2014'
           when t.year >= 1995 then '1995-2004'
           when t.year >= 1985 then '1985-1994'
           when t.year >= 1975 then '1975-1984'
           when t.year >= 1965 then '1965-1974'
           when t.year >= 1955 then '1955-1964'
           when t.year >= 1945 then '1945-1954'
           when t.year <  1945 then 'before 1945'
           else                     'no year given'
         end
order by case
           when t.year >= 2015 then  1
           when t.year >= 2005 then  2
           when t.year >= 1995 then  3
           when t.year >= 1985 then  4
           when t.year >= 1975 then  5
           when t.year >= 1965 then  6
           when t.year >= 1955 then  7
           when t.year >= 1945 then  8
           when t.year <  1945 then  9
           else                     10
         end
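
One way to avoid repeating the CASE in three places is to compute it once in a derived table and group by the resulting columns. The sketch below runs against an in-memory SQLite database from Python purely so that it is self-contained. The `len` column (a stand-in for the computed geometry length), the `sort_key` alias, the reduced set of brackets and the sample rows are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table some_table (year integer, len real)")
# Hypothetical rows; len stands in for the computed geometry length.
conn.executemany(
    "insert into some_table values (?, ?)",
    [(2010, 2.0), (2006, 1.5), (1999, 4.0), (1930, 0.5), (None, 1.0)],
)

query = """
select period, sum(len) as length
from (select case
               when year >= 2005 then '2005 or later'
               when year >= 1995 then '1995-2004'
               when year <  1995 then 'before 1995'
               else                   'no year given'
             end as period,
             case
               when year >= 2005 then 1
               when year >= 1995 then 2
               when year <  1995 then 3
               else                   4
             end as sort_key,
             len
      from some_table) b
group by sort_key, period
order by sort_key
"""
result = conn.execute(query).fetchall()
```

The row with a NULL year falls through every comparison and lands in the 'no year given' bracket.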

You might also consider a bracketing table, either permanent or temporary, like:

-- one row per reporting period, with an inclusive year range
create table report_period
(
  period_id          int         not null ,
  year_from          int         not null ,
  year_thru          int         not null ,
  period_description varchar(32) not null ,

  primary key ( period_id ) ,
  unique ( year_from , year_thru )
) ;
insert into report_period values ( 1 , 2015 , 9999 , '2015-present' ) ;
insert into report_period values ( 2 , 2005 , 2014 , '2005-2014'    ) ;
insert into report_period values ( 3 , 1995 , 2004 , '1995-2004'    ) ;
insert into report_period values ( 4 , 1985 , 1994 , '1985-1994'    ) ;
insert into report_period values ( 5 , 1975 , 1984 , '1975-1984'    ) ;
insert into report_period values ( 6 , 1965 , 1974 , '1965-1974'    ) ;
insert into report_period values ( 7 , 1955 , 1964 , '1955-1964'    ) ;
insert into report_period values ( 8 , 1945 , 1954 , '1945-1954'    ) ;
insert into report_period values ( 9 ,    0 , 1944 , 'pre-1945'     ) ;

Then your query becomes something like

select p.period_description as period ,
       sum( compute_length_from_geometry( t.geometry ) ) as length
from report_period p
join some_table    t on t.year between p.year_from and p.year_thru
group by p.period_id ,
         p.period_description
order by p.period_id
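
The bracketing join can be tried end to end with the sketch below, which runs the same query shape against an in-memory SQLite database from Python. The `len` column is a stand-in for the computed geometry length, only three brackets are loaded, and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
create table report_period (
  period_id          integer primary key,
  year_from          integer not null,
  year_thru          integer not null,
  period_description text    not null
);
insert into report_period values (2, 2005, 2014, '2005-2014');
insert into report_period values (3, 1995, 2004, '1995-2004');
insert into report_period values (9,    0, 1944, 'pre-1945');

-- Hypothetical data; len stands in for the computed geometry length.
create table some_table (year integer, len real);
insert into some_table values
  (2010, 2.0), (2005, 1.5), (2004, 4.0), (1930, 0.5), (1950, 9.0);
""")

result = conn.execute("""
select p.period_description as period, sum(t.len) as length
from report_period p
join some_table    t on t.year between p.year_from and p.year_thru
group by p.period_id, p.period_description
order by p.period_id
""").fetchall()
```

The 1950 row matches no bracket in this reduced table, so the inner join drops it; the same happens to rows with a NULL year.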

You could even use a derived table to get the same effect

select p.period_description as period ,
       sum( compute_length_from_geometry( t.geometry ) ) as length
from (           select 1 as period_id , 2015 as year_from , 9999 as year_thru , '2015-present' as period_description
       UNION ALL select 2 as period_id , 2005 as year_from , 2014 as year_thru , '2005-2014' as period_description
       UNION ALL select 3 as period_id , 1995 as year_from , 2004 as year_thru , '1995-2004' as period_description
       ...
     ) p
join some_table    t on t.year between p.year_from and p.year_thru
group by p.period_id ,
         p.period_description
order by p.period_id

Or, you could simply do integer division, e.g.

period_id = ( 2014 - t.year ) / 10

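This gives you a period identifier in the domain

  • < 0: 2024 or later
  • 0: 2005-2014 (and, because integer division truncates toward zero, also 2015-2023)
  • 1: 1995-2004
  • 2: 1985-1994
  • 3: 1975-1984
  • 4: 1965-1974
  • 5: 1955-1964
  • 6: 1945-1954
  • ≥ 7: before 1945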

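Then just add or subtract an appropriate offset to shift the zero point (or change the year offset used in the calculation).

However, this will usually negate the use of any index on the column year, since it is now an expression.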
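
Since the integer-division identifier is easy to get wrong by one or by sign, here is a small Python sketch that evaluates ( 2014 - t.year ) / 10 for a few boundary years. It assumes integer division truncates toward zero, which is how PostgreSQL treats integer operands; `trunc_div` is a helper written only for this illustration.

```python
def trunc_div(a: int, b: int) -> int:
    """Integer division that truncates toward zero."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q

def period_id(year: int) -> int:
    """Mirror of ( 2014 - t.year ) / 10."""
    return trunc_div(2014 - year, 10)

# Boundary years chosen for illustration.
ids = {y: period_id(y)
       for y in (2023, 2015, 2014, 2005, 2004, 1995, 1945, 1944)}
```

Earlier decades get larger identifiers, and years shortly after 2014 collapse into identifier 0 because of the truncation, so a cutoff for "2015 or later" needs separate handling.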

Answer 3: (score: 0)

The bracketing table report_period solution is excellent, and for me the simplest. Many thanks to Nicholas (and everyone)!