Multiple joins and writing to a destination table with BigQuery

Time: 2016-03-09 23:25:35

Tags: sql google-bigquery

The following query works fine if I DON'T set a destination table.

SELECT soi.customer_id
, p.department
, p.category
, p.subcategory
, p.tier1
, p.tier2
, pc.bucket as categorization
, SUM(soi.price) as demand
, COUNT(1) as cnt
FROM store.sales_item soi 
INNER JOIN datamart.product p ON (soi.product_id = p.product_id)
INNER JOIN daily_customer_fact.dcf_product_categorization pc 
ON (p.department = pc.department
    AND p.category = pc.category 
    AND p.subcategory = pc.subcategory 
    AND p.tier1 = pc.tier1 
    AND p.tier2 = pc.tier2)
    WHERE DATE(soi.created_timestamp) < current_date()
    GROUP EACH BY 1,2,3,4,5,6,7 LIMIT 10

However, if I set a destination table, it fails with

Error: Ambiguous field name 'app_version' in JOIN. Please use the table qualifier before field name.

That column exists in the store.sales_item table, but I am neither selecting it nor joining on it.

1 answer:

Answer 0: (score: 1)

I have seen this error message before. It points to both of the following being true:

  • When a destination table is specified, the query job runs with flattenResults set to false.
  • Both the store.sales_item and datamart.product tables contain a field named "app_version".
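
To make the first point concrete, here is a minimal sketch of the request body for a query job submitted through the BigQuery REST API (jobs.insert) with a destination table and flattenResults set to false. The helper name build_query_job_body and the project, dataset, and table names are hypothetical; only the configuration field names come from the BigQuery API. Whether your client sets exactly these values is an assumption. The code only builds and prints the body and does not call the service.

```python
import json


def build_query_job_body(sql, project_id, dataset_id, table_id, flatten_results=False):
    """Build the JSON body for a BigQuery query job (hypothetical helper)."""
    query_config = {
        "query": sql,
        # GROUP EACH BY is legacy SQL syntax, so the legacy dialect is requested.
        "useLegacySql": True,
        # Placeholder destination; replace with your own identifiers.
        "destinationTable": {
            "projectId": project_id,
            "datasetId": dataset_id,
            "tableId": table_id,
        },
        # The API requires allowLargeResults to be true when flattenResults is false.
        "allowLargeResults": True,
        "flattenResults": flatten_results,
    }
    return {"configuration": {"query": query_config}}


# Hypothetical names, for illustration only.
body = build_query_job_body("SELECT 1", "my-project", "scratch", "demand_by_bucket")
print(json.dumps(body, indent=2))
```

Passing flatten_results=True for the same destination table may be a quick way to check whether this flag is what triggers the error in your job.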

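If so, I suggest looking at this answer: https://stackoverflow.com/a/28996481/4001094

and at this issue report: https://code.google.com/p/google-bigquery/issues/detail?id=459

In your case, you should be able to get the query to succeed with the following, which applies suggestion #3 from the answer linked above: join the first two tables in a subquery that selects only the columns you need, so "app_version" never reaches the outer JOIN. I can't test it because I don't have access to your source tables, but it should be close to working with flattenResults set to false.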

SELECT soi_and_p.customer_id
, soi_and_p.department
, soi_and_p.category
, soi_and_p.subcategory
, soi_and_p.tier1
, soi_and_p.tier2
, pc.bucket as categorization
, SUM(soi_and_p.price) as demand
, COUNT(1) as cnt
FROM 
  (SELECT soi.customer_id AS customer_id
   , p.department AS department
   , p.category AS category
   , p.subcategory AS subcategory
   , p.tier1 AS tier1
   , p.tier2 AS tier2
   , soi.price AS price
   , soi.created_timestamp AS created_timestamp
  FROM store.sales_item soi 
  INNER JOIN datamart.product p ON (soi.product_id = p.product_id)
  ) as soi_and_p
INNER JOIN daily_customer_fact.dcf_product_categorization pc 
ON (soi_and_p.department = pc.department
    AND soi_and_p.category = pc.category 
    AND soi_and_p.subcategory = pc.subcategory 
    AND soi_and_p.tier1 = pc.tier1 
    AND soi_and_p.tier2 = pc.tier2)
    WHERE DATE(soi_and_p.created_timestamp) < current_date()
    GROUP EACH BY 1,2,3,4,5,6,7 LIMIT 10