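PostgreSQL query to display records every 45 days

Time: 2016-01-14 05:12:40

Tags: postgresql

I have a table containing user_id values and the timestamp at which each user joined. If I need to show the data by month, I can use: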

select 
 count(user_id), 
 date_trunc('month',(to_timestamp(users.timestamp))::timestamp)::date
from 
 users 
group by 2

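date_trunc accepts units such as 'second', 'day', and 'week', so I can group data by those periods. How can I group data by an "n-day" period, say 45 days? Basically, I need to show the number of users for each 45-day period. Any suggestion or guidance is appreciated!

Currently I get: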

Date           Users
2015-03-01      47
2015-04-01      72
2015-05-01      123
2015-06-01      132
2015-07-01      136
2015-08-01      166
2015-09-01      129
2015-10-01      189

I would like the data in 45-day periods instead. Something like:

Date           Users
2015-03-01      85
2015-04-15      157
2015-05-30      192
2015-07-14      229
2015-08-28      210
2015-10-12      294

Update:

I used the following to get the output, but one problem remains: the values come out offset.

with
new_window as (
select
  generate_series as cohort
  , lag(generate_series, 1) over () as cohort_lag

from
  (
    select
      *
    from
      generate_series('2015-03-01'::date, '2016-01-01', '45 day')
  )
  t
)
select
  --cohort
  cohort_lag -- This worked. !!!
  , count(*)
from
  new_window
join users on
  user_timestamp <= cohort
  and user_timestamp > cohort_lag
group by 1
order by 1

But the output I get is:

Date           Users
2015-04-15      85
2015-05-30      157
2015-07-14      193
2015-08-28      225
2015-10-12      210

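Basically, the users shown for 2015-03-01 should be the users who joined between 2015-03-01 and 2015-04-15, and so on.

But I seem to be getting the count of users up to each date instead, i.e. 85 users up to 2015-04-15, which is not the result I want. Any help here?

2 answers:

Answer 0: (score: 1)

Try this query: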

SELECT to_char(i::date,'YYYY-MM-DD') as date, 0 as users 
FROM generate_series('2015-03-01', '2015-11-30','45 day'::interval) as i;

Output:

date        users
2015-03-01    0
2015-04-15    0
2015-05-30    0
2015-07-14    0
2015-08-28    0
2015-10-12    0
2015-11-26    0
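
The query above only generates the 45-day period start dates with a placeholder count of 0. As a rough sketch of how the series could be joined to the table to get real counts, assuming a users table with a user_timestamp column of type timestamp (the column name from the question's update, not something this answer confirms):

```sql
-- Assumption: users(user_id, user_timestamp timestamp).
-- Each period is the half-open range [bucket_start, bucket_start + 45 days),
-- so a user is counted under the start date of the period they joined in.
SELECT s.bucket_start::date AS date,
       count(u.user_id)     AS users   -- counts only matched rows, so empty periods show 0
FROM generate_series('2015-03-01'::timestamp,
                     '2015-11-30'::timestamp,
                     interval '45 days') AS s(bucket_start)
LEFT JOIN users u
       ON u.user_timestamp >= s.bucket_start
      AND u.user_timestamp <  s.bucket_start + interval '45 days'
GROUP BY s.bucket_start
ORDER BY s.bucket_start;
```

The LEFT JOIN keeps periods with no users in the result. Labelling each row with the period start, rather than the period end, is what avoids the offset described in the question.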

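Answer 1: (score: 0)

This looks like a hot mess, and it would probably be better wrapped in a function where you could use some variables, but would something like this work?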

with number_of_intervals as (
  select
    min (timestamp)::date as first_date,
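    -- floor (...) + 1 instead of ceiling (...), so the newest row is still covered
    -- when the overall span is an exact multiple of 45 days
    floor (extract (day from max (timestamp) - min (timestamp)) / 45)::int + 1 as num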
  from users
),
intervals as (
  select
    generate_series(0, num - 1, 1) int_start,
    generate_series(1, num, 1) int_end
  from number_of_intervals
),
date_spans as (
  select
    n.first_date + 45 * i.int_start as interval_start,
    n.first_date + 45 * i.int_end as interval_end
  from
    number_of_intervals n
    cross join intervals i    
)
select
  d.interval_start, count (*) as user_count
from
  users u
  join date_spans d on
    u.timestamp >= d.interval_start and
    u.timestamp <  d.interval_end
group by
  d.interval_start
order by
  d.interval_start

Using this sample data:

User Id     timestamp       derived range   count
1           3/1/2015        3/1-4/15    
2           3/26/2015       "   
3           4/4/2015        "   
4           4/6/2015        "               (4)
5           5/6/2015        4/16-5/30   
6           5/19/2015       "               (2)
7           6/16/2015       5/31-7/14   
8           6/27/2015       "   
9           7/9/2015        "               (3)
10          7/15/2015       7/15-8/28   
11          8/8/2015        "   
12          8/9/2015        "   
13          8/22/2015       "   
14          8/27/2015       "               (5)

This is the output:

2015-03-01      4
2015-04-15      2
2015-05-30      3
2015-07-14      5
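
If periods with no users do not need to appear, the same grouping can probably be done with plain date arithmetic and no join. This is only a sketch under assumptions: the anchor date 2015-03-01 is taken from the question's desired output, and user_timestamp is assumed to be a timestamp (or date) column on users.

```sql
-- Assumption: users(user_id, user_timestamp); anchor date chosen by hand.
-- date - date yields an integer number of days, and integer division by 45
-- gives the zero-based index of the 45-day period the row falls in.
SELECT date '2015-03-01'
         + 45 * ((user_timestamp::date - date '2015-03-01') / 45) AS interval_start,
       count(*) AS user_count
FROM users
WHERE user_timestamp >= date '2015-03-01'   -- keeps the day difference non-negative
GROUP BY 1
ORDER BY 1;
```

For the sample data above this should give the same four rows. For example, 5/6/2015 is 66 days after 3/1/2015, 66 / 45 is 1 in integer division, and 2015-03-01 + 45 days is 2015-04-15. Unlike the generate_series approach, periods with no users simply do not appear in the result.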