BigQuery: Scalar subquery produced more than one element - Custom Dimensions

Asked: 2017-07-04 18:50:17

Tags: google-bigquery standard-sql

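I'm trying to pull a custom dimension in one branch of my UNION, but I get the error "Scalar subquery produced more than one element". I think the problem is in the code below. I'm migrating to standard SQL, so please answer in standard SQL.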

(
SELECT
  d.value
FROM
  UNNEST(hits) AS hits,
  UNNEST(hits.customDimensions) AS d
WHERE
  d.index = 65) AS viewID,

Full example of the query:

#standardSQL
SELECT
  date,
  channelGrouping,
  viewID,
  SUM(Revenue) Revenue,
  SUM(Shipping) Shipping,
  SUM(bounces) bounces,
  SUM(transactions) transactions,
  COUNT(date) sessions
FROM (
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    (
    SELECT
      d.value
    FROM
      UNNEST(hits) AS hits,
      UNNEST(hits.customDimensions) AS d
    WHERE
      d.index = 65) AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703'
  UNION ALL
  SELECT
    date,
    channelGrouping,
    'XXXXXXXXX' AS viewID,
    totals.totaltransactionrevenue / 1e6 Revenue,
    (
    SELECT
      SUM(hits.transaction.transactionshipping) / 1e6
    FROM
      UNNEST(hits) hits) Shipping,
    totals.bounces bounces,
    totals.transactions transactions
  FROM
    `XXXXXXXXX.ga_sessions_*`
  WHERE
    _TABLE_SUFFIX BETWEEN '20170625'
    AND '20170703' )
GROUP BY
  date,
  channelGrouping,
  viewID

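2 answers:

Answer 0 (score: 2)

The problem is that in some or all sessions more than one hit has a custom dimension with index 65, so the scalar subquery returns several rows. There are a few different ways to fix this. You can use an ARRAY subquery to get all of the values for that index: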

ARRAY(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65) AS viewIDs,

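This gives you all view IDs across hits, but you will then also need an array for viewID in the other queries of the union, because every branch of a UNION ALL must return matching column types. Another option is to take the view ID from the first hit: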

(
  SELECT
    d.value
  FROM
    UNNEST(hits[SAFE_OFFSET(0)].customDimensions) AS d
  WHERE
    d.index = 65) AS viewID

Or, if you don't care which view ID you get, you can use LIMIT to pick an arbitrary one:

(
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65
  LIMIT 1) AS viewID,
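
If an arbitrary value is not acceptable, one possible variation is to order the hits before applying the limit. The sketch below is an assumption layered on the answer, not part of it: the `data` CTE and the values 'view-A' and 'view-B' are made-up stand-ins for a ga_sessions table, and it relies on `hits.hitNumber` giving the hit order.

```sql
#standardSQL
-- Hypothetical mock session: two hits, each with a different value at index 65.
WITH data AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(65 AS index, 'view-A' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(65 AS index, 'view-B' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  (
  SELECT
    d.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS d
  WHERE
    d.index = 65
  ORDER BY
    hits.hitNumber   -- earliest hit that carries index 65 wins
  LIMIT 1) AS viewID
FROM
  data
```

With this mock data the subquery should return 'view-A', the value from the hit with the lowest hitNumber. Unlike the SAFE_OFFSET(0) version, it still finds a value when the first hit has no custom dimension at index 65.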

Answer 1 (score: 2)

You can mock some data in BigQuery to get a better idea of what is happening here.

For example, this data mimics the hits field of the ga_sessions schema:

WITH data AS (
  SELECT
    ARRAY<STRUCT<hitNumber INT64, customDimensions ARRAY<STRUCT<index INT64, value STRING>> >>[
      STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value), STRUCT(2 AS index, 'val2' AS value), STRUCT(3 AS index, 'val3' AS value)] AS customDimensions),
      STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
    ] AS hits
)

SELECT * FROM data

Now, if you run your scalar subquery against this mock data with index = 1, you get the same error, because index 1 appears in two different hits.
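
As a rough, self-contained reproduction of that failure (the mock data is shortened here, and the output alias `value` is only illustrative):

```sql
#standardSQL
-- Mock session with two hits; index 1 appears in both, index 2 only in the first.
WITH data AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value), STRUCT(2 AS index, 'val2' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  (
  SELECT
    custd.value
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS custd
  WHERE
    custd.index = 1) AS value  -- two matching rows, so the scalar subquery fails
FROM
  data
```

Changing the filter to `custd.index = 2` should make the same query succeed and return 'val2', since only one hit carries that index.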

To make this work, you have to wrap the subquery in ARRAY, like this:

SELECT
  ARRAY(SELECT custd.value FROM UNNEST(hits) hits, UNNEST(hits.customDimensions) custd WHERE custd.index = 1)
FROM data

The result is a single row containing the array ['val1', 'val1'], one entry for each hit that carries index 1.

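So in your query you either have to adapt to getting this value back as an ARRAY or, if the value for index = 65 is the same across all hits, you can do something like this: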

SELECT
  (SELECT custd.value FROM UNNEST(hits) hits, UNNEST(hits.customDimensions) custd WHERE custd.index = 1 LIMIT 1)
FROM data

This way the scalar subquery returns only one result.
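
Before relying on LIMIT 1, it may be worth checking that the value really is the same across hits. The query below is one way to sketch that check against mock data. The column name `distinct_values` is made up for illustration, and on real data you would replace the `data` CTE with your ga_sessions table and use index 65.

```sql
#standardSQL
-- Mock session: index 1 appears in both hits with the same value.
WITH data AS (
  SELECT [
    STRUCT(1 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value), STRUCT(2 AS index, 'val2' AS value)] AS customDimensions),
    STRUCT(2 AS hitNumber, [STRUCT(1 AS index, 'val1' AS value)] AS customDimensions)
  ] AS hits
)
SELECT
  (
  SELECT
    COUNT(DISTINCT custd.value)  -- an aggregate always yields exactly one row
  FROM
    UNNEST(hits) AS hits,
    UNNEST(hits.customDimensions) AS custd
  WHERE
    custd.index = 1) AS distinct_values
FROM
  data
```

For this mock data `distinct_values` should be 1. Any session where it is greater than 1 is one where LIMIT 1 would silently pick among different values.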