In SQL

Time: 2016-12-15 19:13:19

Tags: sql google-bigquery

I have the following SQL table:

start_time          end_time            value
2016-01-01 00:00:00 2016-01-01 08:59:59 1
2016-01-01 06:00:00 2016-01-01 14:59:59 2
2016-01-01 12:00:00 2016-01-01 17:59:59 1.5
2016-01-01 03:00:00 2016-01-01 17:59:59 3

I would like to convert it to:

start_time          end_time            min_value
2016-01-01 00:00:00 2016-01-01 08:59:59 1
2016-01-01 09:00:00 2016-01-01 11:59:59 2
2016-01-01 12:00:00 2016-01-01 17:59:59 1.5

where min_value is the minimum value in effect at any given point in time. Is it possible to do this in SQL?

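3 answers:

Answer 0: (score: 1)

Hmmm . . . This looks hard. I think the following strategy would work:

  1. Split the data into two parts, one for start times and one for end times.
  2. For each start time, calculate the minimum value in effect at that time.
  3. For each end time, calculate the minimum value that takes effect from that point on.
  4. Recombine the two parts using a gaps-and-islands approach.

I'm just not sure whether you can do this in BQ, because it involves non-equi joins. But . . .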

    with starts as (
          -- minimum value in effect at each start time
          select start_time as time,
                 (select min(t2.value)
                  from t t2
                  where t.start_time between t2.start_time and t2.end_time
                 ) as value
          from t
         ),
         ends as (
          -- minimum value among rows active at each end time that continue past it
          select end_time as time,
                 (select min(t2.value)
                  from t t2
                  where t2.end_time > t.end_time and
                        t2.start_time <= t.end_time
                 ) as value
          from t
         )
    select value, min(time), max(time)
    from (select time, value,
                 row_number() over (order by time) as seqnum,
                 row_number() over (partition by value order by time) as seqnum_v
          from ((select s.* from starts s) union all
                (select e.* from ends e)
               ) t
         ) t
    -- seqnum - seqnum_v is constant within each run of consecutive equal values
    group by value, (seqnum - seqnum_v);
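
To see why grouping by the difference of the two row numbers isolates consecutive runs, here is a minimal Python sketch of that gaps-and-islands step alone. It is only an illustration: the function name `island_keys` and the sample sequence are made up for this example, and it does not reproduce the rest of the query.

```python
from collections import defaultdict


def island_keys(values):
    """Return (value, seqnum - seqnum_v) for each item of a time-ordered sequence."""
    per_value_count = defaultdict(int)  # plays the role of seqnum_v
    keys = []
    for seqnum, value in enumerate(values, start=1):
        per_value_count[value] += 1
        keys.append((value, seqnum - per_value_count[value]))
    return keys


# Made-up sequence of minimum values, already ordered by time.
sample = [1, 1, 2, 2, 1, 1.5]
keys = island_keys(sample)
print(keys)
# [(1, 0), (1, 0), (2, 2), (2, 2), (1, 2), (1.5, 5)]
```

The value 1 occurs in two separate runs and gets two different keys, (1, 0) and (1, 2), which is why the grouping uses both the value and the difference.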
    

Answer 1: (score: 1)

Try the following. I think it does exactly what you want. As you can see, I added one more entry to your example to make it more interesting :o)

    WITH YourTable AS (
      SELECT TIMESTAMP '2016-01-01 00:00:00' AS start_time, TIMESTAMP '2016-01-01 08:59:59' AS end_time, 1 AS value UNION ALL
      SELECT TIMESTAMP '2016-01-01 06:00:00' AS start_time, TIMESTAMP '2016-01-01 14:59:59' AS end_time, 2 AS value UNION ALL
      SELECT TIMESTAMP '2016-01-01 12:00:00' AS start_time, TIMESTAMP '2016-01-01 17:59:59' AS end_time, 1.5 AS value UNION ALL
      SELECT TIMESTAMP '2016-01-01 03:00:00' AS start_time, TIMESTAMP '2016-01-01 17:59:59' AS end_time, 3 AS value UNION ALL
      SELECT TIMESTAMP '2016-01-01 12:30:00' AS start_time, TIMESTAMP '2016-01-01 12:40:59' AS end_time, 1 AS value
    ),
    -- Elementary intervals between consecutive distinct boundary timestamps
    Intervals AS (
      SELECT iStart AS start_time, LEAD(iStart) OVER(ORDER BY iStart) AS end_time
      FROM (
        SELECT DISTINCT iStart FROM (
          SELECT start_time AS iStart FROM YourTable UNION ALL
          SELECT end_time AS iStart FROM YourTable )
      )
    ),
    -- Minimum value among the rows that fully cover each elementary interval
    Intervals_Mins AS (
      SELECT b.start_time, b.end_time, MIN(value) AS min_value
      FROM YourTable AS a
      JOIN Intervals AS b
      ON b.start_time BETWEEN a.start_time AND a.end_time
      AND b.end_time BETWEEN a.start_time AND a.end_time
      GROUP BY b.start_time, b.end_time
    ),
    -- flag = 1 marks the last interval of a contiguous run;
    -- the running sum of earlier flags (ordered by start_time) numbers the runs
    Intervals_Group AS (
      SELECT start_time, end_time, min_value,
        IFNULL(SUM(flag) OVER(PARTITION BY CAST(min_value AS STRING) ORDER BY start_time
          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS time_group
      FROM (
        SELECT start_time, end_time, min_value,
          IF(end_time = LEAD(start_time) OVER(PARTITION BY CAST(min_value AS STRING) ORDER BY start_time), 0, 1) AS flag
        FROM Intervals_Mins
      )
    )
    SELECT MIN(start_time) AS start_time, MAX(end_time) AS end_time, min_value
    FROM Intervals_Group
    GROUP BY min_value, time_group
    -- ORDER BY start_time
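
As a way to cross-check the output of a query like this, here is a minimal Python sketch of the same general idea: cut the timeline at every boundary, take the minimum over the rows covering each slice, then merge adjacent slices that share a minimum. It assumes that end times are inclusive at one-second resolution, so each end time is shifted by one second to make the intervals half-open and shifted back at the end. That assumption and the names `ts`, `ROWS` and `min_value_intervals` are illustrative only. It is run on the four rows from the question, without the extra entry added above.

```python
from datetime import datetime, timedelta


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM:SS' string into a datetime."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S")


# The four rows from the question: (start_time, end_time, value).
ROWS = [
    (ts("2016-01-01 00:00:00"), ts("2016-01-01 08:59:59"), 1),
    (ts("2016-01-01 06:00:00"), ts("2016-01-01 14:59:59"), 2),
    (ts("2016-01-01 12:00:00"), ts("2016-01-01 17:59:59"), 1.5),
    (ts("2016-01-01 03:00:00"), ts("2016-01-01 17:59:59"), 3),
]


def min_value_intervals(rows, step=timedelta(seconds=1)):
    """Return merged (start, end, min_value) rows with inclusive end times."""
    # Assumption: ends are inclusive, so add one step to get half-open intervals.
    half_open = [(s, e + step, v) for s, e, v in rows]
    points = sorted({p for s, e, _ in half_open for p in (s, e)})
    merged = []
    for lo, hi in zip(points, points[1:]):
        active = [v for s, e, v in half_open if s <= lo and hi <= e]
        if not active:
            continue  # no row covers this slice
        low = min(active)
        if merged and merged[-1][1] == lo and merged[-1][2] == low:
            # Same minimum as the previous adjacent slice: extend that run.
            merged[-1] = (merged[-1][0], hi, low)
        else:
            merged.append((lo, hi, low))
    # Convert back to inclusive end times.
    return [(s, e - step, v) for s, e, v in merged]


result = min_value_intervals(ROWS)
for start, end, value in result:
    print(start, end, value)
```

For the question's four rows this produces the three intervals listed in the question's expected output.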

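Answer 2: (score: 0)

I'm not sure I understand how the expected output relates to the input, but if you just want to associate the minimum value with each distinct (start_time, end_time) pair, you can do something like this: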

    #standardSQL
    WITH T AS (
      SELECT TIMESTAMP '2016-01-01 00:00:00' AS start_time,
        TIMESTAMP '2016-01-01 08:59:59' AS end_time, 1 AS value UNION ALL
      SELECT TIMESTAMP '2016-01-01 06:00:00',
        TIMESTAMP '2016-01-01 14:59:59', 2 UNION ALL
      SELECT TIMESTAMP '2016-01-01 12:00:00',
        TIMESTAMP '2016-01-01 17:59:59', 1.5 UNION ALL
      SELECT TIMESTAMP '2016-01-01 03:00:00',
        TIMESTAMP '2016-01-01 17:59:59', 3
    )
    SELECT
      start_time,
      end_time,
      MIN(value) AS min_value
    FROM T
    GROUP BY start_time, end_time;