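From a log table and a reference table

Time: 2017-04-27 22:13:39

Tags: sql google-bigquery

I am new to BigQuery and need to create a perm. table from a log file and a reference table. I am doing something wrong but can't figure out why. Please help.

Log file (sample)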

-Event_Time,USER_ID,Type

-1/1/2017,123_abc,a

-1/2/2017,123_bcd,b

-1/2/2017,123_abc,c

Reference table (sample)

-Type Partner

-a 1

-b 2

-c 3

-d 3

create table workarea.SummaryTable AS (
User_ID string,
TotalCount integer,
imps_time timestamp,
Partner integer)

insert into workarea.SummaryTable
select distinct User_ID,
COUNT(*) as TotalCount,
MIN(TIME) as imps_time,
SUM(case when Partner = '1' then 1 else 0 end) as 1,
SUM(case when Partner = '2' then 1 else 0 end) as 2,
SUM(case when Partner = '3' then 1 else 0 end) as 3

from workarea.logfile i
join workarea.referencetable r on i.Type=r.Type
where CID=10848805 
group by USER_ID

1 answer:

Answer 0: (score: 1)

..and doing something wrong but can't figure out why

Points of failure I have identified so far:

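  1. At the time of writing, the CREATE TABLE statement is not available in BigQuery: its Data Manipulation Language only lets you INSERT, DELETE, and UPDATE.
  2. You need to have the table pre-created before you can use it or insert data into it.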

  3. An alias cannot start with a digit, so the fragment below is incorrect:

      SUM(CASE WHEN Partner = '1' THEN 1 ELSE 0 END) AS 1,    
      SUM(CASE WHEN Partner = '2' THEN 1 ELSE 0 END) AS 2,    
      SUM(CASE WHEN Partner = '3' THEN 1 ELSE 0 END) AS 3    
      

      You should use something like:

      SUM(CASE WHEN Partner = '1' THEN 1 ELSE 0 END) AS Partner_1,    
      SUM(CASE WHEN Partner = '2' THEN 1 ELSE 0 END) AS Partner_2,    
      SUM(CASE WHEN Partner = '3' THEN 1 ELSE 0 END) AS Partner_3    
      
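  4. Some fields do not appear to exist in the referenced tables, yet you use them in the final query: time in MIN(time) as imps_time, and CID in WHERE CID=10848805.

  5. The destination table's schema has 4 fields, while the SELECT statement produces 6. You should clear this up: they must match!

  6. Possible "solution" (BigQuery Standard SQL)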

        I am assuming (just to make some progress here) that the destination table's schema actually looks like this:

        User_ID STRING,
        TotalCount INT64,
        imps_time TIMESTAMP,
        Partner_1 INT64,
        Partner_2 INT64,
        Partner_3 INT64
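For readers on a BigQuery release that accepts DDL in Standard SQL (support was added after this answer was written), pre-creating the destination table might be sketched as follows. This is only an illustration under that assumption: the column names and types are the six assumed fields listed above, not a schema confirmed by the question.

```sql
-- Sketch only: requires a BigQuery release with DDL support.
-- Columns mirror the six-field schema assumed above (an assumption, not confirmed by the asker).
CREATE TABLE IF NOT EXISTS `workarea.SummaryTable` (
  User_ID    STRING,
  TotalCount INT64,
  imps_time  TIMESTAMP,
  Partner_1  INT64,
  Partner_2  INT64,
  Partner_3  INT64
);
```

Without DDL support, the same table would have to be created beforehand through the web UI, the command-line tool, or the API.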
        

        With that schema, the query below should produce the correct result for the insert:

           
        #standardSQL
        SELECT 
          User_ID,
          COUNT(*) AS TotalCount,
          MIN(Event_Time) AS imps_time,
          SUM(CASE WHEN Partner = '1' THEN 1 ELSE 0 END) AS Partner_1,
          SUM(CASE WHEN Partner = '2' THEN 1 ELSE 0 END) AS Partner_2,
          SUM(CASE WHEN Partner = '3' THEN 1 ELSE 0 END) AS Partner_3
        FROM `workarea.logfile` i
        JOIN `workarea.referencetable` r ON i.Type=r.Type
        -- WHERE CID=10848805 
        GROUP BY USER_ID
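
To actually load the result, the SELECT can be wrapped in a DML INSERT with an explicit column list. The following is a minimal sketch, assuming `workarea.SummaryTable` already exists with exactly these six columns, that `Event_Time` is a TIMESTAMP, and that `Partner` is stored as a STRING; none of these are confirmed by the question. If `Partner` is an INT64, compare against 1, 2, 3 instead of '1', '2', '3'.

```sql
#standardSQL
-- Assumptions: the destination table is pre-created with the six columns named below,
-- Event_Time is a TIMESTAMP, and Partner is a STRING.
INSERT INTO `workarea.SummaryTable`
  (User_ID, TotalCount, imps_time, Partner_1, Partner_2, Partner_3)
SELECT
  i.User_ID,
  COUNT(*) AS TotalCount,
  MIN(i.Event_Time) AS imps_time,
  SUM(CASE WHEN r.Partner = '1' THEN 1 ELSE 0 END) AS Partner_1,
  SUM(CASE WHEN r.Partner = '2' THEN 1 ELSE 0 END) AS Partner_2,
  SUM(CASE WHEN r.Partner = '3' THEN 1 ELSE 0 END) AS Partner_3
FROM `workarea.logfile` i
JOIN `workarea.referencetable` r ON i.Type = r.Type
GROUP BY i.User_ID
```

Listing the target columns explicitly keeps the statement valid even if the table's column order differs from the SELECT list, and makes a field-count mismatch visible at a glance.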