Question

The following query flattens the "hit" results exported from Google Analytics to BigQuery:

SELECT
*
FROM flatten(flatten(flatten(flatten(flatten(flatten(flatten(flatten(flatten([ga_sessions_20171116], hits), hits.product), hits.product.customDimensions), hits.product.customMetrics), hits.promotion), hits.experiment), hits.customDimensions), hits.customVariables), hits.customMetrics)
Limit 20

How can I do the same thing across a range of tables, or is that even possible? I tried:

SELECT
*
FROM flatten(flatten(flatten(flatten(flatten(flatten(flatten(flatten(flatten([ga_sessions_2017111*], hits), hits.product), hits.product.customDimensions), hits.product.customMetrics), hits.promotion), hits.experiment), hits.customDimensions), hits.customVariables), hits.customMetrics)
WHERE _TABLE_SUFFIX BETWEEN '0' and '10'
Limit 20

But it didn't work. Does anyone know how to do this?

Answer 1

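Working with nested data is easier in Standard SQL than in Legacy SQL, because the query syntax and its behavior are more predictable.

That said, please consider using it. In Standard SQL your query becomes: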

SELECT
  fullvisitorid visitor_id,
  prods.productSku sku,
  custd.index index
FROM `project_id.dataset_id.ga_sessions_*`,
UNNEST(hits) hits,
UNNEST(hits.product) prods,
UNNEST(prods.customDimensions) custd
WHERE _TABLE_SUFFIX BETWEEN '20171110' and '20171111'
LIMIT 1000

This is just an example, but hopefully it is enough to convey the idea.
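To make the `_TABLE_SUFFIX` filter in the query above more concrete, here is a small Python sketch of how a wildcard table name and a string range select tables. It models only the string comparison: `_TABLE_SUFFIX` holds whatever the `*` matched, and `BETWEEN` compares it as a string. The table list and the helper name `match_suffix` are made up for illustration; BigQuery itself is not involved.

```python
# Hypothetical daily tables in a dataset (names invented for this sketch).
tables = [
    "ga_sessions_20171109",
    "ga_sessions_20171110",
    "ga_sessions_20171111",
    "ga_sessions_20171116",
]


def match_suffix(table_names, prefix, low, high):
    """Return tables whose name starts with `prefix` and whose remaining
    suffix falls in the inclusive string range [low, high]."""
    selected = []
    for name in table_names:
        if not name.startswith(prefix):
            continue  # the wildcard only matches tables sharing the prefix
        suffix = name[len(prefix):]  # what _TABLE_SUFFIX would contain
        if low <= suffix <= high:    # string comparison, not numeric
            selected.append(name)
    return selected


# Mirrors: FROM `...ga_sessions_*` WHERE _TABLE_SUFFIX BETWEEN '20171110' AND '20171111'
selected = match_suffix(tables, "ga_sessions_", "20171110", "20171111")
print(selected)
```

Because the comparison is on strings, the bounds must have the same shape as the part matched by `*`. With the full-date suffix used here, `'20171110'` and `'20171111'` are the natural bounds.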

hits is a repeated field of structs, so it looks something like this:

hits = [{'hitNumber': 1, 'product': [{'productSku': 'sku0'}]}, {'hitNumber': 2}, ...]

When you apply the operation unnest(hits) AS unnested_hits, it becomes:

unnested_hits = {'hitNumber': 1, 'product': [{'productSku': 'sku0'}]},
                {'hitNumber': 2}
                ...

So once you have named it "unnested_hits", referencing that alias gives you this flattened data. You can then go on to use the field product from unnested_hits.
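The same idea can be sketched in plain Python, treating one session row as a dict. This is a toy model of what `FROM t, UNNEST(hits) h, UNNEST(h.product) p` produces, not BigQuery code; the sample row and the function name `unnest_products` are assumptions made for the illustration.

```python
# One toy session row; field names follow the example above.
session = {
    "fullVisitorId": "v1",
    "hits": [
        {"hitNumber": 1, "product": [{"productSku": "sku0"}, {"productSku": "sku1"}]},
        {"hitNumber": 2, "product": []},  # a hit with no products
    ],
}


def unnest_products(row):
    """Emit one flat record per (hit, product) pair, like a comma
    (cross) join against UNNEST(hits) and then UNNEST(hit.product)."""
    flat = []
    for hit in row["hits"]:                    # UNNEST(hits) AS hit
        for prod in hit.get("product", []):    # UNNEST(hit.product) AS prod
            flat.append({
                "visitor_id": row["fullVisitorId"],
                "hitNumber": hit["hitNumber"],
                "sku": prod["productSku"],
            })
    return flat


flat_rows = unnest_products(session)
for record in flat_rows:
    print(record)
```

Note that hit 2 disappears from the output: a cross join against an empty array yields no rows. If hits without products should be kept, the Standard SQL query would need a LEFT JOIN against the UNNEST instead of a comma.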

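To understand these concepts in more depth, be sure to read the docs carefully. They are very well written, and you can learn everything you need to work effectively in BigQuery from them.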

Finally, note that you are selecting every field from GA. As the old saying goes, every time someone runs a SELECT * query, a panda dies somewhere in the world.

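You have to be very careful when running this kind of query in BQ, because you are charged by the amount of data processed; make sure you select only what is absolutely necessary.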

Getting Google Analytics hit data from BigQuery across a range of tables
