BigQuery: finding records on 3 consecutive days

Time: 2018-06-22 08:38:31

Tags: sql google-bigquery

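Using the NOAA Global Surface Summary of the Day weather data on BigQuery, I am trying to find the percentage of weather stations in Kansas that recorded hail = 1 on 4 consecutive days in 2013. A weather station is defined as concat(stn, wban).

This is the query I have built so far: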

#standardSQL
select hail, concat(year, mo, da) as date, concat(a.stn, a.wban) as station, b.state
from `bigquery-public-data.noaa_gsod.gsod*` a
join `bigquery-public-data.noaa_gsod.stations` b
on a.stn=b.usaf AND a.wban=b.wban
where _TABLE_SUFFIX = '2013' and country = 'US' and state = 'KS'
order by date;

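It joins to the stations table so that I can restrict the rows to Kansas, but after researching how to detect the consecutive days I have come up short. I suspect I will have to join this result to itself again. Any help is appreciated.

Thanks!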

1 answer:

Answer 0: (score: 2)

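Here is the strategy:

  • Use a window function with a windowing clause to count the hail days in each 4-day span.
  • Aggregate at the station level to find out whether any span consists of 4 hail days.
  • Aggregate again to get the proportion of stations.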
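The three steps can be sketched on a tiny inline data set before touching the public tables. This is only an illustration: the `obs` rows, the station names `A` and `B`, and the column `obs_date` are made up, and the sketch assumes exactly one row per station per day.

```sql
#standardSQL
-- Hypothetical toy data: station A has hail on 4 days in a row, station B does not.
WITH obs AS (
  SELECT 'A' AS station, DATE '2013-05-01' AS obs_date, '1' AS hail UNION ALL
  SELECT 'A', DATE '2013-05-02', '1' UNION ALL
  SELECT 'A', DATE '2013-05-03', '1' UNION ALL
  SELECT 'A', DATE '2013-05-04', '1' UNION ALL
  SELECT 'A', DATE '2013-05-05', '0' UNION ALL
  SELECT 'B', DATE '2013-05-01', '1' UNION ALL
  SELECT 'B', DATE '2013-05-02', '0' UNION ALL
  SELECT 'B', DATE '2013-05-03', '1' UNION ALL
  SELECT 'B', DATE '2013-05-04', '1'
),
-- Step 1: hail days in the current row plus the next 3 rows of the same station.
windowed AS (
  SELECT station,
         SUM(CASE WHEN hail = '1' THEN 1 ELSE 0 END) OVER (
           PARTITION BY station ORDER BY obs_date
           ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS hail_4
  FROM obs
),
-- Step 2: best 4-row span per station.
per_station AS (
  SELECT station, MAX(hail_4) AS max_hail_4
  FROM windowed
  GROUP BY station
)
-- Step 3: share of stations whose best span is all hail.
SELECT AVG(CASE WHEN max_hail_4 = 4 THEN 1.0 ELSE 0 END) AS share_with_4_day_run
FROM per_station;
```

Station A reaches a window sum of 4 and station B peaks at 3, so the query returns 0.5.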

I don't think any such station exists, but the query looks like this:

-- Share of Kansas stations with at least one run of 4 hail days in 2013.
-- A window sum of 4 means all 4 rows in the span have hail = '1'.
select avg(case when max_hail_4 = 4 then 1.0 else 0 end) as share_of_stations
from (SELECT station, max(hail_4) as max_hail_4
      from (select hail,
                   concat(g.year, g.mo, g.da) as date, concat(g.stn, g.wban) as station, s.state,
                   -- hail days in this row plus the next 3 rows (assumes one row per station per day)
                   SUM(CASE WHEN hail = '1' THEN 1 else 0 END) OVER
                       (partition by g.stn, g.wban ORDER BY g.year, g.mo, g.da ROWS BETWEEN CURRENT ROW and 3 FOLLOWING) as hail_4
            from `bigquery-public-data.noaa_gsod.gsod*` g join
                 `bigquery-public-data.noaa_gsod.stations` s
                 on g.stn = s.usaf AND g.wban = s.wban
            where _TABLE_SUFFIX = '2013' and s.country = 'US' and s.state = 'KS'
           ) d
      group by station
     ) st;
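One caveat with a ROWS frame: it counts rows, not calendar days, so a station with a missing day can look like it has a 4-day run. A possible variant, shown here only as a sketch on made-up data, orders the window by a day number and uses a RANGE frame instead. The `obs` rows, station `C`, and the `obs_date` column are hypothetical; on the GSOD tables a date would first have to be built from `year`, `mo`, and `da`.

```sql
#standardSQL
-- Hypothetical toy data: station C has hail on four days but no row for 2013-05-03.
WITH obs AS (
  SELECT 'C' AS station, DATE '2013-05-01' AS obs_date, '1' AS hail UNION ALL
  SELECT 'C', DATE '2013-05-02', '1' UNION ALL
  SELECT 'C', DATE '2013-05-04', '1' UNION ALL
  SELECT 'C', DATE '2013-05-05', '1'
)
SELECT station,
       MAX(hail_4_rows) AS max_by_rows,  -- counts the next 3 rows, ignoring the gap
       MAX(hail_4_days) AS max_by_days   -- counts only rows within the next 3 calendar days
FROM (
  SELECT station,
         SUM(CASE WHEN hail = '1' THEN 1 ELSE 0 END) OVER (
           PARTITION BY station ORDER BY obs_date
           ROWS BETWEEN CURRENT ROW AND 3 FOLLOWING) AS hail_4_rows,
         SUM(CASE WHEN hail = '1' THEN 1 ELSE 0 END) OVER (
           PARTITION BY station ORDER BY UNIX_DATE(obs_date)
           RANGE BETWEEN CURRENT ROW AND 3 FOLLOWING) AS hail_4_days
  FROM obs
)
GROUP BY station;
```

Here `max_by_rows` is 4 while `max_by_days` is 3, so only the RANGE version correctly rejects station C.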