Checking a column's string value over a period of time

Asked: 2017-06-29 21:34:45

Tags: sql google-bigquery

I have a table with the following structure:

Edited:

    id         date           status
    1        2017-04-20        good
    1        2017-04-19        bad
    1        2017-04-18        bad
    2        2017-04-20        ok
    2        2017-04-19        ok
    2        2017-04-17        ok
    2        2017-04-16        bad

I need to check whether the status has stayed the same over a period of time, let's say the past 3 days. I tried

SELECT id, date CASE WHEN status over(partition by id order by date rows between 3 preceding and current row) = 'ok' THEN true ELSE false END as test FROM Table 

to get a result like the following:

    id         date           test
    1        2017-04-20        false
    1        2017-04-19        false
    1        2017-04-18        false
    2        2017-04-20        true
    2        2017-04-19        false
    2        2017-04-17        false
    2        2017-04-16        false

But of course it raises an error. Thanks!

3 Answers:

Answer 0: (score: 2)

Below is for BigQuery Standard SQL

  
#standardSQL
WITH yourTable AS ( 
  SELECT 1 AS id, DATE '2017-04-20' AS date, 'good' AS status UNION ALL
  SELECT 1, DATE '2017-04-19', 'bad' UNION ALL
  SELECT 1, DATE '2017-04-18', 'bad' UNION ALL
  SELECT 2, DATE '2017-04-20', 'ok' UNION ALL
  SELECT 2, DATE '2017-04-19', 'ok' UNION ALL
  SELECT 2, DATE '2017-04-17', 'ok' UNION ALL
  SELECT 2, DATE '2017-04-16', 'bad'
)
SELECT 
  id, 
  date, 
  MAX(status) OVER(win) = MIN(status) OVER(win) AND COUNT(status) OVER(win) = 3 AS test
FROM yourTable
WINDOW win AS (
  PARTITION BY id ORDER BY date ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
)
-- ORDER BY id, date DESC  

Note: this assumes you have a row for each and every day, so that 3 days is exactly 3 rows!
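
To make the row-versus-day caveat concrete, here is a minimal Python sketch (plain Python, not BigQuery code) that mimics the window logic above on the sample data and compares it with a stricter check on calendar days. The function names `same_over_rows` and `same_over_days` are hypothetical and exist only for this illustration.

```python
from datetime import date, timedelta

# Sample rows from the question: (id, date, status).
ROWS = [
    (1, date(2017, 4, 20), "good"),
    (1, date(2017, 4, 19), "bad"),
    (1, date(2017, 4, 18), "bad"),
    (2, date(2017, 4, 20), "ok"),
    (2, date(2017, 4, 19), "ok"),
    (2, date(2017, 4, 17), "ok"),
    (2, date(2017, 4, 16), "bad"),
]


def same_over_rows(rows, n=3):
    """Mimic ROWS BETWEEN n-1 PRECEDING AND CURRENT ROW, partitioned by id."""
    result = {}
    by_id = {}
    for rid, day, status in sorted(rows, key=lambda r: (r[0], r[1])):
        history = by_id.setdefault(rid, [])
        history.append(status)
        window = history[-n:]
        # MIN = MAX means one distinct value; COUNT = n means a full window.
        result[(rid, day)] = len(window) == n and len(set(window)) == 1
    return result


def same_over_days(rows, n=3):
    """Stricter variant: require a row for each of the last n calendar days."""
    status_at = {(rid, day): status for rid, day, status in rows}
    result = {}
    for rid, day, status in rows:
        window = [status_at.get((rid, day - timedelta(days=k))) for k in range(n)]
        # A missing day yields None, which can never equal a real status.
        result[(rid, day)] = all(s == status for s in window)
    return result


by_rows = same_over_rows(ROWS)
by_days = same_over_days(ROWS)
print(by_rows[(2, date(2017, 4, 20))])  # True: the 3 latest rows are all 'ok'
print(by_days[(2, date(2017, 4, 20))])  # False: 2017-04-18 has no row
```

For id 2 the row-based window reaches back to 2017-04-17 because 2017-04-18 is missing, which is why the two checks disagree on 2017-04-20.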

Answer 1: (score: 1)

The following example demonstrates checking whether all statuses are the same:

WITH Input AS (
  SELECT 1 AS id, DATE '2017-04-20' AS date, 'good' AS status UNION ALL
  SELECT 1, DATE '2017-04-19', 'bad' UNION ALL
  SELECT 1, DATE '2017-04-18', 'bad' UNION ALL
  SELECT 2, DATE '2017-04-20', 'ok' UNION ALL
  SELECT 2, DATE '2017-04-19', 'ok' UNION ALL
  SELECT 2, DATE '2017-04-17', 'ok'
)
SELECT
  id,
  date,
  MAX(status) OVER StatusWindow = MIN(status) OVER StatusWindow AS test
FROM Input
WINDOW StatusWindow AS (
  PARTITION BY id ORDER BY date ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
);

Note that checking MAX alone is not sufficient for some inputs.
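
As an illustration of why MAX alone is not enough, consider this minimal Python sketch (plain Python, not BigQuery). String MIN and MAX compare lexicographically, so a window can have the expected MAX while still containing a smaller value. The helper names `max_only_check` and `min_max_check` are hypothetical.

```python
def max_only_check(window, expected):
    """Flawed check: only looks at the largest value in the window."""
    return max(window) == expected


def min_max_check(window):
    """Check used in the query above: MIN = MAX means all values are equal."""
    return min(window) == max(window)


mixed = ["bad", "ok", "ok"]   # 'bad' sorts before 'ok'
uniform = ["ok", "ok", "ok"]

print(max_only_check(mixed, "ok"))  # True, although the window is mixed
print(min_max_check(mixed))         # False
print(min_max_check(uniform))       # True
```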

Answer 2: (score: 0)

from google.cloud import speech


def run_quickstart():
    speech_client = speech.Client()
    sample = speech_client.sample(source_uri="gs://linear-arena-2109/zoom0070.flac", encoding=speech.Encoding.FLAC)
    alternatives = sample.recognize('uk-UA')
    for alternative in alternatives:
        print(u'Transcript: {}'.format(alternative.transcript))

    # Open in binary mode because the transcript is written as UTF-8 bytes.
    with open("Output.txt", "wb") as text_file:
        for alternative in alternatives:
            text_file.write(alternative.transcript.encode('utf8'))

if __name__ == '__main__':
    run_quickstart()