BQ command line: Query too large

Date: 2015-08-31 08:55:48

Tags: google-bigquery

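I am running a query from the command line and storing the result in a new table. The query contains several subqueries, each of which uses TABLE_DATE_RANGE to read from multiple tables.

Each table stub has one table per day. There are 4 subqueries, and each reads 180 tables (90 days in each of its two TABLE_DATE_RANGE calls). That adds up to 720 tables, so I should not be exceeding the 1,000-table limit.

I have exceeded the 1,000-table limit before, and that produced a "too many tables" error or something similar.

This query, however, fails with the error "Query too large". As the command below shows, I do allow large results. Does anyone know how to solve this?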

bq query -n0 --allow_large_results --replace --destination_table="cdate-prod:crm_adhoc.tmp_email_details_event_date" 'select event_date
,contact_id
,message_name
,message_name_join
,message_id
,email
,REGEXP_EXTRACT(email,r'([^@]*$)') as email_domain
,REGEXP_EXTRACT(REGEXP_EXTRACT(email,r'([^@]*$)'),r'(^[^\.]*)') as email_provider
,sent
,sent_unique_hlp
,open
,open_unique_hlp
,open_unique_msg_hlp
,click
,click_unique_hlp
,click_unique_msg_hlp
,soft_bounce
,medium_bounce
,hard_bounce
,activity
,type
,case when type = 1 then 'PPM' 
      when type = 2 then 'NPM' 
      when type = 3 then 'PENDING' 
      when type = 4 then 'CB' 
      when type = 5 then 'REDEBIT' 
      when type = 6 then 'INTCO' 
      when type = 7 then 'EXTCO' 
      else 'XX' 
 end as type_str
from 

(select send_date  as event_date
,contact_id
,message_name
,substr(message_name,7) as message_name_join
,message_id
,email
, 1 as sent
, contact_id as sent_unique_hlp
, 0 as open
, string('') as open_unique_hlp
, string('') as open_unique_msg_hlp
, 0 as click
, string('') as click_unique_hlp 
, string('') as click_unique_msg_hlp
, 0 as soft_bounce
, 0 as medium_bounce
, 0 as hard_bounce
, IFNULL(activity,0) as activity
, IFNULL(type,0) as type
from TABLE_DATE_RANGE(crm_data.campaign_messages,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day")),
     TABLE_DATE_RANGE(crm_data.interface_messages,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day"))) ms,

(select open_date as event_date
,contact_id
,message_name
,substr(message_name,7) as message_name_join
,message_id
,email
, 0 as sent
, string('')as sent_unique_hlp
, 1 as open
, contact_id as open_unique_hlp
, concat(contact_id,string(TIMESTAMP_TO_MSEC(send_date))) open_unique_msg_hlp
, 0 as click
, string('') as click_unique_hlp 
, string('') as click_unique_msg_hlp
, 0 as soft_bounce
, 0 as medium_bounce
, 0 as hard_bounce
, IFNULL(activity,0) as activity
, IFNULL(type,0) as type
from TABLE_DATE_RANGE(crm_data.interface_openings,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day")),
     TABLE_DATE_RANGE(crm_data.campaign_openings,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day"))) op,

(select click_date as event_date 
,contact_id
,message_name
,substr(message_name,7) as message_name_join
,message_id
,email
, 0 as sent
, string('')as sent_unique_hlp
, 0 as open
, string('') as open_unique_hlp
, string('') as open_unique_msg_hlp
, 1 as click
, contact_id as click_unique_hlp 
, concat(contact_id,string(TIMESTAMP_TO_MSEC(send_date))) click_unique_msg_hlp
, 0 as soft_bounce
, 0 as medium_bounce
, 0 as hard_bounce
, IFNULL(activity,0) as activity
, IFNULL(type,0) as type
from TABLE_DATE_RANGE(crm_data.interface_clicks,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day")),
     TABLE_DATE_RANGE(crm_data.campaign_clicks,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day"))) cl,

(select bounce_date as event_date
,contact_id
,message_name
,substr(message_name,7) as message_name_join
,message_id
,email
, 0 as sent
, string('')as sent_unique_hlp
, 0 as open
, string('') as open_unique_hlp
, string('') as open_unique_msg_hlp
, 0 as click
, string('') as click_unique_hlp 
, string('') as click_unique_msg_hlp
,case when bounce_category = 1 then 1 end soft_bounce
,case when bounce_category = 2 then 1 end medium_bounce
,case when bounce_category in (3,4,5) then 1 end hard_bounce
, IFNULL(activity,0) as activity
, IFNULL(type,0) as type
from TABLE_DATE_RANGE(crm_data.interface_bounces,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day")),
     TABLE_DATE_RANGE(crm_data.campaign_bounces,date_add(CURRENT_DATE(),-90,"day"),date_add(CURRENT_DATE(),-1,"day"))) bo'

Waiting on bqjob_r71fbdcc95fa950e5_0000014f82f52d18_1 ... (0s) Current status: RUNNING 
Waiting on bqjob_r71fbdcc95fa950e5_0000014f82f52d18_1 ... (1s) Current status: RUNNING 
Waiting on bqjob_r71fbdcc95fa950e5_0000014f82f52d18_1 ... (1s) Current status: DONE   
Error in query string: Error processing job
'cdate-prod:bqjob_r71fbdcc95fa950e5_0000014f82f52d18_1': Query too large

1 Answer:

Answer 0 (score: 2)

My guess, as far as I know:

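  • A query can be at most x characters long. The query as written here is shorter than that, but...

  • TABLE_DATE_RANGE works by internally expanding the query so that it explicitly names every table in the range. That is usually fine, but...

  • This query references 720 tables, so the expansion spells out all 720 of them, adding roughly 720 * length(table_name) characters. That pushes the expanded query over the limit.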
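
To get a feel for the 720 * length(table_name) estimate, here is a rough back-of-the-envelope calculation in bash. It is only a sketch under assumptions: I assume the source tables live in the cdate-prod project, that each expanded reference looks like project:dataset.stubYYYYMMDD plus one separator character, and I use 20150831 as a sample date suffix. The real internal expansion may be considerably more verbose per table than the bare name.

```shell
#!/usr/bin/env bash
# Rough estimate of the characters contributed by table names alone if every
# daily table is spelled out in the expanded query.
# Assumptions (not confirmed by the question): project name, reference format,
# and one separator character per reference.

project="cdate-prod"   # assumed default project for the crm_data dataset
dataset="crm_data"
days=90
stubs=(campaign_messages interface_messages interface_openings campaign_openings
       interface_clicks campaign_clicks interface_bounces campaign_bounces)

total_tables=0
total_chars=0
for stub in "${stubs[@]}"; do
  name="${project}:${dataset}.${stub}20150831"   # sample name with a date suffix
  per_ref=$(( ${#name} + 1 ))                     # +1 for a separating comma
  total_tables=$(( total_tables + days ))
  total_chars=$(( total_chars + days * per_ref ))
done

echo "tables: ${total_tables}"
echo "approx. characters from table names: ${total_chars}"
```

Whatever the exact limit is, the point is that the expanded size grows linearly with the number of daily tables, so cutting the table count is the lever to pull.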

Suggestion: could you consolidate the older tables into monthly tables instead of daily ones?
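
As an illustration of that consolidation, the sketch below only prints (it does not run) one bq command per table stub that would copy a month of daily tables into a single monthly table. The flags are the ones already used in the question. The _monthly_YYYYMM naming, the June 2015 range, and the choice of two stubs are hypothetical, and the main query would still need to be changed to read the monthly tables instead of the daily ones for that period.

```shell
#!/usr/bin/env bash
# Sketch only: build and print bq commands that merge one month of daily
# tables into a single monthly table per stub. Nothing is executed.
# The "_monthly_YYYYMM" table naming and the date range are hypothetical.

dataset="crm_data"
month="201506"
start="2015-06-01"
end="2015-06-30"

build_monthly_cmd() {
  local stub="$1"
  # Legacy SQL: TABLE_DATE_RANGE selects the daily tables stubYYYYMMDD in range.
  local sql="SELECT * FROM TABLE_DATE_RANGE(${dataset}.${stub}, TIMESTAMP('${start}'), TIMESTAMP('${end}'))"
  printf 'bq query -n0 --allow_large_results --replace --destination_table="%s.%s_monthly_%s" "%s"\n' \
    "$dataset" "$stub" "$month" "$sql"
}

for stub in campaign_messages interface_messages; do
  build_monthly_cmd "$stub"
done
```

Review the printed commands before running any of them. With monthly tables in place, a 90-day window would touch about three or four tables per stub rather than 90.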