Extracting date strings in BigQuery

Time: 2017-07-05 17:02:10

Tags: google-bigquery

I used the following code:

java.lang.Integer fridayFasting = null;

Here we need to extract the dates. I tried using \d, which gives me the numeric fields, but it does not give me the complete date.

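2 answers:

Answer 0: (score: 1)

Since I don't have access to your table, I used a WITH clause to simulate your input table from the sample input you provided. With that in place, the following query produces the desired dates. Note that this is a Standard SQL query: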

#standardSQL
WITH
  input AS (
  SELECT
    "Start -> (11/11/2016 08:24:24 AM) -> Akumar (11/11/2016 11:15:33 AM) -> Akumar (01/06/2017 08:08:44 PM) -> Akumar (01/30/2017 10:34:33 AM) -> Akumar (03/15/2017 02:10:12 PM) -> Akumar (03/23/2017 12:42:52 PM) -> Akumar (06/20/2017 12:52:27 PM) -> (06/27/2017 05:30:48 PM) -> Sneha Singh (06/28/2017 03:11:34 AM)" AS input_group),
  split_output AS (
  SELECT
    SPLIT(input_group, "->") AS output
  FROM
    input)
SELECT
  REGEXP_EXTRACT(output,r"(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d)") AS date
FROM
  split_output so,
  so.output

Output (the null is for the "Start" entry; note that the SQL output is not ordered):

Row date     
1   03/15/2017 02:10:12  
2   01/06/2017 08:08:44  
3   11/11/2016 08:24:24  
4   06/20/2017 12:52:27  
5   06/28/2017 03:11:34  
6   01/30/2017 10:34:33  
7   06/27/2017 05:30:48  
8   null     
9   11/11/2016 11:15:33  
10  03/23/2017 12:42:52
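
The regular expression above stops before the AM/PM marker, so afternoon times cannot be told apart from morning ones, and the "Start" item yields a null row. As a minimal sketch (not part of the original answer), the variant below assumes REGEXP_EXTRACT_ALL is applied to the whole string instead of splitting on "->", keeps the AM/PM marker, and sorts by the parsed value. The sample string is shortened for readability, and the column aliases date_string and date_as_datetime are illustrative only.

```sql
#standardSQL
-- Sketch: pull every date out of the string in one pass, then sort chronologically.
WITH
  input AS (
  -- Shortened version of the sample string from the question.
  SELECT
    "Start -> (11/11/2016 08:24:24 AM) -> Akumar (01/06/2017 08:08:44 PM) -> Sneha Singh (06/28/2017 03:11:34 AM)" AS input_group)
SELECT
  date_string,
  -- %I is the hour on a 12-hour clock and %p is the AM/PM marker.
  PARSE_DATETIME('%m/%d/%Y %I:%M:%S %p', date_string) AS date_as_datetime
FROM
  input,
  -- With no capturing group, REGEXP_EXTRACT_ALL returns each full match.
  UNNEST(REGEXP_EXTRACT_ALL(input_group, r'\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M')) AS date_string
ORDER BY
  date_as_datetime
```

Because only matching substrings are returned, this should give one row per date with no null row for "Start". The trade-off is that the name in front of each date is lost.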

Answer 1: (score: 1)

The following is for BigQuery Standard SQL:

  
#standardSQL
WITH yourTable AS ( 
  SELECT 1 AS id, 'Start -> (11/11/2016 08:24:24 AM) -> Akumar (11/11/2016 11:15:33 AM) -> Akumar (01/06/2017 08:08:44 PM) -> Akumar (01/30/2017 10:34:33 AM) -> Akumar (03/15/2017 02:10:12 PM) -> Akumar (03/23/2017 12:42:52 PM) -> Akumar (06/20/2017 12:52:27 PM) -> (06/27/2017 05:30:48 PM) -> Sneha Singh (06/28/2017 03:11:34 AM)' AS INDIVIDUAL_NAMES
) 
SELECT 
  id, 
  LTRIM(SPLIT(CONCAT(' ', item), ' (')[OFFSET(0)]) AS name, 
  REGEXP_EXTRACT(item, r'(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)') AS date_as_string, 
  PARSE_DATETIME('%m/%d/%Y %r', REGEXP_EXTRACT(item, r'(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)')) AS date_as_datetime,
  PARSE_TIMESTAMP('%m/%d/%Y %r', REGEXP_EXTRACT(item, r'(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)')) AS date_as_timestamp
FROM yourTable, UNNEST(SPLIT(INDIVIDUAL_NAMES, ' -> ')) AS item 
-- ORDER BY 3 

The query above can be simplified slightly to remove the redundant parts, for example:

#standardSQL
WITH yourTable AS ( 
  SELECT 1 AS id, 'Start -> (11/11/2016 08:24:24 AM) -> Akumar (11/11/2016 11:15:33 AM) -> Akumar (01/06/2017 08:08:44 PM) -> Akumar (01/30/2017 10:34:33 AM) -> Akumar (03/15/2017 02:10:12 PM) -> Akumar (03/23/2017 12:42:52 PM) -> Akumar (06/20/2017 12:52:27 PM) -> (06/27/2017 05:30:48 PM) -> Sneha Singh (06/28/2017 03:11:34 AM)' AS INDIVIDUAL_NAMES
) 
SELECT 
  id, 
  LTRIM(SPLIT(CONCAT(' ', item), ' (')[OFFSET(0)]) AS name, 
  date_as_string,
  PARSE_DATETIME('%m/%d/%Y %r', date_as_string) AS date_as_datetime,
  PARSE_TIMESTAMP('%m/%d/%Y %r', date_as_string) AS date_as_timestamp
FROM yourTable, 
  UNNEST(SPLIT(INDIVIDUAL_NAMES, ' -> ')) AS item, 
  UNNEST([REGEXP_EXTRACT(item, r'(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)')]) AS date_as_string 
-- ORDER BY 3   

The output is as follows:

id  name        date_as_string          date_as_datetime    date_as_timestamp    
1   Start       null                    null                null     
1   Akumar      01/06/2017 08:08:44 PM  2017-01-06T20:08:44 2017-01-06 20:08:44 UTC
1   Akumar      01/30/2017 10:34:33 AM  2017-01-30T10:34:33 2017-01-30 10:34:33 UTC
1   Akumar      03/15/2017 02:10:12 PM  2017-03-15T14:10:12 2017-03-15 14:10:12 UTC
1   Akumar      03/23/2017 12:42:52 PM  2017-03-23T12:42:52 2017-03-23 12:42:52 UTC
1   Akumar      06/20/2017 12:52:27 PM  2017-06-20T12:52:27 2017-06-20 12:52:27 UTC
1               06/27/2017 05:30:48 PM  2017-06-27T17:30:48 2017-06-27 17:30:48 UTC
1   Sneha Singh 06/28/2017 03:11:34 AM  2017-06-28T03:11:34 2017-06-28 03:11:34 UTC
1               11/11/2016 08:24:24 AM  2016-11-11T08:24:24 2016-11-11 08:24:24 UTC
1   Akumar      11/11/2016 11:15:33 AM  2016-11-11T11:15:33 2016-11-11 11:15:33 UTC  
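
The '%r' element in the format string stands for the 12-hour time with an AM/PM marker. As a small sketch (the column aliases with_r and spelled_out are illustrative and do not come from the answer), the query below parses one of the sample values both with '%r' and with the spelled-out form '%I:%M:%S %p', so the two can be compared side by side.

```sql
#standardSQL
-- Sketch: compare the %r shorthand with its spelled-out 12-hour form.
SELECT
  PARSE_DATETIME('%m/%d/%Y %r', sample) AS with_r,
  PARSE_DATETIME('%m/%d/%Y %I:%M:%S %p', sample) AS spelled_out
FROM
  -- One value taken from the sample data in the question.
  UNNEST(['01/06/2017 08:08:44 PM']) AS sample
```

Both columns should match the corresponding date_as_datetime value in the table above, 2017-01-06T20:08:44.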

It would be great if you could suggest a similar example in Legacy SQL.

You can try the following with BigQuery Legacy SQL:

#legacySQL
SELECT
  id,
  IF(name = CONCAT('(', date_as_string, ')'), '', name) AS name,
  date_as_string
FROM (
  SELECT 
    id, 
    NTH(1, SPLIT(item, ' (')) AS name,
    REGEXP_EXTRACT(item, r'(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)') AS date_as_string
  FROM FLATTEN((
    SELECT id, SPLIT(INDIVIDUAL_NAMES, ' -> ') AS item, INDIVIDUAL_NAMES
    FROM yourTable 
  ), item)
)
-- ORDER BY date_as_string 

You can test it with the dummy data you provided, as shown below:

#legacySQL
SELECT
  id,
  IF(name = CONCAT('(', date_as_string, ')'), '', name) AS name,
  date_as_string
FROM (
  SELECT 
    id, 
    NTH(1, SPLIT(item, ' (')) AS name,
    REGEXP_EXTRACT(item, r'(\d\d/\d\d/\d\d\d\d \d\d:\d\d:\d\d [AP]M)') AS date_as_string
  FROM FLATTEN((
    SELECT id, SPLIT(INDIVIDUAL_NAMES, ' -> ') AS item, INDIVIDUAL_NAMES
    FROM (
      SELECT 1 AS id, 'Start -> (11/11/2016 08:24:24 AM) -> Akumar (11/11/2016 11:15:33 AM) -> Akumar (01/06/2017 08:08:44 PM) -> Akumar (01/30/2017 10:34:33 AM) -> Akumar (03/15/2017 02:10:12 PM) -> Akumar (03/23/2017 12:42:52 PM) -> Akumar (06/20/2017 12:52:27 PM) -> (06/27/2017 05:30:48 PM) -> Sneha Singh (06/28/2017 03:11:34 AM)' AS INDIVIDUAL_NAMES
    ) AS yourTable 
  ), item)
)
ORDER BY date_as_string