SQL - How to find days with no activity using start_date and end_date

Time: 2017-10-11 05:23:02

Tags: sql google-bigquery

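I have a table in my database with the following fields:

trip_id
start_date
end_date
start_station_name
end_station_name

I need to write a query that shows all the stations that had no activity on a particular day in 2015. I wrote the following query, but it does not give the correct output: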

select
    start_station_name,
    extract(date from start_date) as dt,
    count(*)
from
    trips_table
where
    (
        start_date >= timestamp('2015-01-01')
        and
        start_date < timestamp('2016-01-01')
    )
group by
    start_station_name,
    dt 
order by
    count(*)

Can someone help me with the correct query? Thanks in advance!

2 Answers:

Answer 0: (score: 1)

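The query below is for BigQuery Standard SQL.

It assumes that start_date and end_date are of type DATE.
It also assumes that every day between start_date and end_date is "dedicated" to the station in the start_station_name field. That is most likely not what is intended, but the question lacks detail on this point, hence the assumption.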

#standardSQL
WITH days AS (
  SELECT day
  FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
  SELECT DISTINCT start_station_name AS station
  FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`, 
           UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
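
To make the shape of that query easier to follow, here is a minimal Python sketch of the same idea: expand each trip into the days it covers, build every (station, day) pair for the calendar, and keep the pairs with no match. The helper names `date_range` and `inactive_days` and the sample rows are made up for illustration; this only mirrors the logic under the same assumptions as the query and is not BigQuery code.

```python
from datetime import date, timedelta


def date_range(start, end):
    """Yield every date from start to end inclusive (similar in spirit to GENERATE_DATE_ARRAY)."""
    for offset in range((end - start).days + 1):
        yield start + timedelta(days=offset)


def inactive_days(trips, first_day, last_day):
    """Return sorted (station, day) pairs in the calendar that no trip covers.

    trips: list of (start_date, end_date, start_station_name) tuples.
    """
    # Every (station, day) pair covered by at least one trip.
    active = {
        (station, day)
        for start, end, station in trips
        for day in date_range(start, end)
    }
    stations = {station for _, _, station in trips}
    # stations CROSS JOIN days, keeping only the pairs without a match (the anti-join).
    return sorted(
        (station, day)
        for station in stations
        for day in date_range(first_day, last_day)
        if (station, day) not in active
    )


# Hypothetical sample rows, not taken from any real table.
trips = [
    (date(2015, 1, 1), date(2015, 1, 3), "A"),
    (date(2015, 1, 6), date(2015, 1, 7), "A"),
]
gaps = inactive_days(trips, date(2015, 1, 1), date(2015, 1, 7))
print(gaps)
```

With these rows, station "A" is uncovered only on January 4 and 5, so those are the two pairs returned.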

You can test / play with it using the simple dummy data below:
#standardSQL
WITH `trips_table` AS (
  SELECT 1 AS trip_id, DATE '2015-01-01' AS start_date, DATE '2015-12-01' AS end_date, '111' AS start_station_name UNION ALL
  SELECT 2, DATE '2015-12-10', DATE '2015-12-31', '111'
),
days AS (
  SELECT day
  FROM UNNEST(GENERATE_DATE_ARRAY('2015-01-01', '2015-12-31')) AS day
),
stations AS (
  SELECT DISTINCT start_station_name AS station
  FROM `trips_table`
)
SELECT s.*
FROM (SELECT * FROM stations CROSS JOIN days) AS s
LEFT JOIN (SELECT * FROM `trips_table`, 
           UNNEST(GENERATE_DATE_ARRAY(start_date, end_date)) AS day) AS a
ON s.day = a.day AND s.station = a.start_station_name
WHERE a.day IS NULL
ORDER BY station, day   

The output is as follows:

station day  
111     2015-12-02   
111     2015-12-03   
111     2015-12-04   
111     2015-12-05   
111     2015-12-06   
111     2015-12-07   
111     2015-12-08   
111     2015-12-09   

Answer 1: (score: 0)

Use recursion for this. Try the following in SQL Server:

WITH sample AS (
  SELECT CAST('2015-01-01' AS DATETIME) AS dt
  UNION ALL
  SELECT DATEADD(dd, 1, dt)
  FROM sample s
  WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
) 
SELECT * FROM sample
Where CAST(sample.dt as date) NOT IN (
  SELECT CAST(start_date as date) 
  FROM tablename 
  WHERE start_date >= '2015-01-01 00:00:00'
  AND start_date < '2016-01-01 00:00:00' 
) 
Option(maxrecursion 0) 
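
For readers without a SQL Server instance at hand, the same recursive-calendar idea can be tried with Python's built-in `sqlite3` module. This is only a sketch: SQLite's syntax differs (`WITH RECURSIVE`, `date(dt, '+1 day')` instead of `DATEADD`, no `OPTION` clause), the calendar is shortened to five days, and the three rows in `tablename` are invented sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (trip_id INTEGER, start_date TEXT)")
# Invented sample trips on Jan 1, Jan 2 and Jan 4.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [
        (1, "2015-01-01 08:00:00"),
        (2, "2015-01-02 09:30:00"),
        (3, "2015-01-04 17:45:00"),
    ],
)

query = """
WITH RECURSIVE sample(dt) AS (
    SELECT '2015-01-01'
    UNION ALL
    -- Add one day at a time until the last calendar day is reached.
    SELECT date(dt, '+1 day') FROM sample WHERE dt < '2015-01-05'
)
SELECT dt
FROM sample
WHERE dt NOT IN (SELECT date(start_date) FROM tablename)
ORDER BY dt
"""
missing = [row[0] for row in conn.execute(query)]
print(missing)
```

Here the calendar runs from 2015-01-01 to 2015-01-05, and the days with no trip starting on them are 2015-01-03 and 2015-01-05.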

If you want the station data along with it, you can use a left join:

WITH sample AS (
  SELECT CAST('2015-01-01' AS DATETIME) AS dt
  UNION ALL
  SELECT DATEADD(dd, 1, dt)
  FROM sample s
  WHERE DATEADD(dd, 1, dt) < CAST('2016-01-01' AS DATETIME)
)
-- Days with no matching trip come back with NULL in the tablename columns
SELECT * FROM sample
LEFT JOIN tablename
  ON CAST(sample.dt AS date) = CAST(tablename.start_date AS date)
WHERE sample.dt >= '2015-01-01 00:00:00' AND sample.dt < '2016-01-01 00:00:00'
OPTION (maxrecursion 0)
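
The left join above returns every calendar day, whether or not a trip started on it. To get closer to what the question asks (stations with no trips on a given day), one possible variation is to cross join the calendar with the distinct stations and keep only the rows where the join finds nothing. The sketch below runs that variation in SQLite through Python's `sqlite3` module; the two-day calendar, the sample rows, and the choice to count a trip only on its start date at its start station are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (trip_id INTEGER, start_date TEXT, start_station_name TEXT)"
)
# Invented sample trips: station A is active on both days, station B only on Jan 1.
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?)",
    [
        (1, "2015-01-01 08:00:00", "A"),
        (2, "2015-01-02 09:30:00", "A"),
        (3, "2015-01-01 10:00:00", "B"),
    ],
)

query = """
WITH RECURSIVE sample(dt) AS (
    SELECT '2015-01-01'
    UNION ALL
    SELECT date(dt, '+1 day') FROM sample WHERE dt < '2015-01-02'
),
stations AS (
    SELECT DISTINCT start_station_name AS station FROM tablename
)
SELECT s.station, d.dt
FROM stations AS s
CROSS JOIN sample AS d
LEFT JOIN tablename AS t
    ON t.start_station_name = s.station
   AND date(t.start_date) = d.dt
-- No matching trip for this (station, day) pair.
WHERE t.trip_id IS NULL
ORDER BY s.station, d.dt
"""
idle = conn.execute(query).fetchall()
print(idle)
```

With this data the only idle pair is station "B" on 2015-01-02.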

For MySQL, see this fiddle (SQL Fiddle Demo). I think it will help you.