BigQuery: selecting data within a time interval

Asked: 2015-04-27 14:53:50

Tags: group-by google-bigquery intervals

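My data looks like this:

    Name | From | To_City | Date of request
    Andy | Paris | London | 08/21/2014 12:00
    Lena | Koln | Berlin | 08/22/2014 18:00
    Andy | Paris | London | 08/22/2014 06:00
    Lisa | Rome | Neapel | 08/25/2014 18:00
    Lena | Rome | London | 08/21/2014 20:00
    Lisa | Rome | Neapel | 08/24/2014 18:00
    Andy | Paris | London | 08/25/2014 12:00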

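I want to know how many identical drive requests a person made within +/- one day. I would like to get a table like this:

    Name | From | To_City | Avg date of request | #requests
    Andy | Paris | London | 08/21/2014 21:00 | 2
    Lena | Koln | Berlin | 08/22/2014 18:00 | 1
    Lisa | Rome | Neapel | 08/25/2014 06:00 | 2
    Lena | Rome | London | 08/21/2014 20:00 | 1
    Andy | Paris | London | 08/25/2014 12:00 | 1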

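Can this be done with a GROUP BY clause? More generally, is it possible to write a condition that counts how many identical requests fall within 24 hours of the initial request? So far I have been downloading the data into Excel and doing it there, but there is a lot of data, so that is not efficient.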
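To make the intended grouping rule concrete, here is a minimal Python sketch (not BigQuery SQL) that reproduces the desired table from the sample rows. It assumes that a request belongs to a group when it falls within 24 hours (inclusive) of the first request in that group; the names `cluster_requests` and `summarize` are made up for this illustration.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Sample rows copied from the table above: (name, from, to, request time).
ROWS = [
    ("Andy", "Paris", "London", "2014-08-21 12:00"),
    ("Lena", "Koln", "Berlin", "2014-08-22 18:00"),
    ("Andy", "Paris", "London", "2014-08-22 06:00"),
    ("Lisa", "Rome", "Neapel", "2014-08-25 18:00"),
    ("Lena", "Rome", "London", "2014-08-21 20:00"),
    ("Lisa", "Rome", "Neapel", "2014-08-24 18:00"),
    ("Andy", "Paris", "London", "2014-08-25 12:00"),
]


def summarize(key, times):
    """Return (name, from, to, average request time, request count)."""
    first = times[0]
    avg = first + sum((t - first for t in times), timedelta()) / len(times)
    return (*key, avg.strftime("%m/%d/%Y %H:%M"), len(times))


def cluster_requests(rows, window=timedelta(hours=24)):
    """Group identical requests that fall within `window` of the first one."""
    parsed = sorted(
        (name, src, dst, datetime.strptime(ts, "%Y-%m-%d %H:%M"))
        for name, src, dst, ts in rows
    )
    result = []
    for key, group in groupby(parsed, key=lambda r: r[:3]):
        cluster = []
        for *_, ts in group:
            # Assumption: a new group starts once a request is more than
            # `window` after the first request of the current group.
            if cluster and ts - cluster[0] > window:
                result.append(summarize(key, cluster))
                cluster = []
            cluster.append(ts)
        result.append(summarize(key, cluster))
    return result


for row in cluster_requests(ROWS):
    print(" | ".join(str(col) for col in row))
```

With this rule, Andy's requests on 08/21 12:00 and 08/22 06:00 collapse into one row averaged at 08/21/2014 21:00, while his 08/25 request stays on its own.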

Sample data:

Let's first build a sample dataset:

select * from (select 'Andy' as name,'Paris' as f,'London' as to, '2014-08-21 12:00' as date),
(select 'Lena' as name,'Koln' as f,'Berlin' as to, '2014-08-22 18:00' as date),
(select 'Andy' as name,'Paris' as f,'London' as to, '2014-08-22 06:00' as date),
(select 'Lisa' as name,'Rome' as f,'Neapel' as to, '2014-08-25 18:00' as date),
(select 'Lena' as name,'Rome' as f,'London' as to, '2014-08-21 20:00' as date),
(select 'Lisa' as name,'Rome' as f,'Neapel' as to, '2014-08-24 18:00' as date),
(select 'Andy' as name,'Paris' as f,'London' as to, '2014-08-25 12:00' as date)

2 answers:

Answer 0 (score: 2)

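One way is to use a window function with a RANGE window. To do that, first convert the date to a day number, because RANGE requires the ordering column to be numeric. The PARTITION BY clause is similar to GROUP BY: it lists the columns that define "the same" drive request (in your case name, from and to). Then you simply use COUNT(*) to count the rows, that is the requests, that fall inside the window. Because the timestamps are reduced to whole days, the window covers the previous, current and next day number rather than an exact 24-hour span.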

select name, f, to, date, count(*) 
  over(partition by name, f, to
       order by day
       range between 1 preceding and 1 following) from (
select name, f, to, date, integer(timestamp(date)/1000000/60/60/24) day from
(select 'Andy' as name,'Paris' as f,'London' as to, '2014-08-21 12:00' as date),
(select 'Lena' as name,'Koln' as f,'Berlin' as to, '2014-08-22 18:00' as date),
(select 'Andy' as name,'Paris' as f,'London' as to, '2014-08-22 06:00' as date),
(select 'Lisa' as name,'Rome' as f,'Neapel' as to, '2014-08-25 18:00' as date),
(select 'Lena' as name,'Rome' as f,'London' as to, '2014-08-21 20:00' as date),
(select 'Lisa' as name,'Rome' as f,'Neapel' as to, '2014-08-24 18:00' as date),
(select 'Andy' as name,'Paris' as f,'London' as to, '2014-08-25 12:00' as date))
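For checking the logic on a small sample without running a query, the same windowed count can be sketched in plain Python. This is only an illustration of what the RANGE window computes, not BigQuery code; `day_number` and `range_window_counts` are hypothetical helper names, and the sketch assumes the day conversion truncates to whole days in UTC.

```python
from datetime import datetime, timezone

# Same sample rows as in the query: (name, f, to, date).
ROWS = [
    ("Andy", "Paris", "London", "2014-08-21 12:00"),
    ("Lena", "Koln", "Berlin", "2014-08-22 18:00"),
    ("Andy", "Paris", "London", "2014-08-22 06:00"),
    ("Lisa", "Rome", "Neapel", "2014-08-25 18:00"),
    ("Lena", "Rome", "London", "2014-08-21 20:00"),
    ("Lisa", "Rome", "Neapel", "2014-08-24 18:00"),
    ("Andy", "Paris", "London", "2014-08-25 12:00"),
]


def day_number(ts):
    """Whole days since 1970-01-01 UTC (assumes truncation, not rounding)."""
    dt = datetime.strptime(ts, "%Y-%m-%d %H:%M").replace(tzinfo=timezone.utc)
    return int(dt.timestamp()) // 86400


def range_window_counts(rows):
    """For each row, count rows of the same partition within +/- 1 day number."""
    out = []
    for name, src, dst, ts in rows:
        day = day_number(ts)
        count = sum(
            1
            for n, s, d, other in rows
            # Same partition (name, f, to) and day number within 1 of this row.
            if (n, s, d) == (name, src, dst) and abs(day_number(other) - day) <= 1
        )
        out.append((name, src, dst, ts, count))
    return out


for row in range_window_counts(ROWS):
    print(row)
```

Every input row keeps its own count, so two nearby requests appear as two rows that each report 2. Collapsing them into a single row with an average date, as in the table the question asks for, would need a further grouping step.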

Answer 1 (score: 0)

You can truncate the date to drop the hours, minutes and seconds, and then group by that column:

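-- date_of_request stands in for your request-date column (a hyphenated name would parse as subtraction).
-- SUBSTR counts from 1; the first 10 characters are the 'YYYY-MM-DD' part.
SELECT SUBSTR(STRING(date_of_request), 1, 10) AS day
FROM t1
GROUP BY day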
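A small Python sketch shows what grouping by the truncated date yields on the sample rows from the question. Unlike the query above, it also puts name, from and to in the grouping key so that only identical requests are counted together; that addition and the helper name `count_by_day` are assumptions made for this illustration.

```python
from collections import Counter

# Sample rows from the question: (name, from, to, request time).
ROWS = [
    ("Andy", "Paris", "London", "2014-08-21 12:00"),
    ("Lena", "Koln", "Berlin", "2014-08-22 18:00"),
    ("Andy", "Paris", "London", "2014-08-22 06:00"),
    ("Lisa", "Rome", "Neapel", "2014-08-25 18:00"),
    ("Lena", "Rome", "London", "2014-08-21 20:00"),
    ("Lisa", "Rome", "Neapel", "2014-08-24 18:00"),
    ("Andy", "Paris", "London", "2014-08-25 12:00"),
]


def count_by_day(rows):
    """Count requests per (name, from, to, calendar day)."""
    # ts[:10] keeps only the 'YYYY-MM-DD' part of the timestamp string.
    return Counter((name, src, dst, ts[:10]) for name, src, dst, ts in rows)


for key, count in sorted(count_by_day(ROWS).items()):
    print(key, count)
```

On this sample every group has a count of 1: Andy's two requests 18 hours apart fall on different calendar days, so truncating to the date alone does not capture a +/- one day window.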