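Passing an ARRAY of STRUCTs to a user-defined function in BigQuery standard SQL

Time: 2017-01-17 10:53:53

Tags: google-bigquery

How do I pass an ARRAY of STRUCTs to my user-defined function (using standard SQL)?

First, a little background:

Table schema: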

id STRING
customer STRING
request STRUCT<
  headers STRING,
  body STRING,
  url STRING
>
response STRUCT<
  size INT64,
  body STRING
>
outgoing ARRAY<
  STRUCT<
    request STRUCT<
      url STRING,
      body STRING,
      headers STRING
    >,
    response STRUCT<
      size INT64,
      body STRING
    >
  >
>

User-defined function:

CREATE TEMPORARY FUNCTION extractDetailed(
  customer STRING,
  request STRUCT<
    headers STRING,
    body STRING
  >,
  outgoing ARRAY<
    STRUCT<
      request STRUCT<url STRING>,
      response STRUCT<body STRING>
    >
  >
)
RETURNS STRING
LANGUAGE js AS """

""";

SELECT extractDetailed(customer, STRUCT(request.headers, request.body), outgoing)
FROM request_logs

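Now to my problem: I can't figure out how to select only some fields from each element of the outgoing ARRAY and pass the result to the user-defined function as an array.

In effect, I am trying to emulate the following user-defined function call: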

extractDetailed(
  "customer id",
  { "headers": "", "body": "" },
  [
    {
      "request": { "url": "" },
      "response": { "body": "" }
    },
    {
      "request": { "url": "" },
      "response": { "body": "" }
    }
  ]
);

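I recently stumbled across some documentation that might help unlock this, but I can't figure out how to make it fit. I'm really struggling with this and would greatly appreciate any help solving it.

1 answer:

Answer 0: (score: 3)

Try the following. It extracts the needed pieces from each element of your array and puts them back into a new array before passing it to the function, so that the argument matches the function's signature: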

CREATE TEMPORARY FUNCTION extractDetailed(
  customer STRING,
  request STRUCT<headers STRING, body STRING>,
  outgoing ARRAY<STRUCT<request STRUCT<url STRING>, response STRUCT<body STRING>>>
)
RETURNS STRING
LANGUAGE js AS """

""";

SELECT 
  extractDetailed(
    customer, 
    STRUCT(request.headers, request.body), 
    ARRAY(
      SELECT STRUCT<request STRUCT<url STRING>,response STRUCT<body STRING>>
          (STRUCT(request.url), STRUCT(response.body)) 
      FROM UNNEST(outgoing)
    )
  ) AS details
FROM request_logs  
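
For readers who think more easily in JavaScript, the ARRAY(SELECT STRUCT<...>(...) FROM UNNEST(outgoing)) projection above is roughly analogous to mapping over an array and keeping only some fields. The sketch below is an illustration only: the sample data and the helper name projectOutgoing are invented here, and it runs in plain JavaScript rather than inside BigQuery.

```javascript
// Hypothetical stand-in for one row's `outgoing` column (values invented for illustration).
const outgoing = [
  {
    request: { url: "https://example.com/a", body: "{}", headers: "h1" },
    response: { size: 2, body: "ok" },
  },
  {
    request: { url: "https://example.com/b", body: "{}", headers: "h2" },
    response: { size: 4, body: "fail" },
  },
];

// Rough analogue of the SQL projection: keep only request.url and
// response.body from every element, and drop all other fields.
function projectOutgoing(rows) {
  return rows.map((o) => ({
    request: { url: o.request.url },
    response: { body: o.response.body },
  }));
}

const trimmed = projectOutgoing(outgoing);
console.log(JSON.stringify(trimmed, null, 2));
```

The trimmed array has the same shape as the outgoing parameter declared in the extractDetailed signature, which is why the SQL version type-checks.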

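To further "optimize" the query above and make it more portable, you can move the part that extracts the pieces from the original array and wraps them in a new array into a separate SQL UDF: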

CREATE TEMPORARY FUNCTION extractParts (
  outgoing ARRAY<STRUCT<request STRUCT<url STRING, body STRING, headers STRING>,
                        response STRUCT<size INT64, body STRING>>>
)
RETURNS ARRAY<STRUCT<request STRUCT<url STRING>, response STRUCT<body STRING>>>
AS ((
  SELECT ARRAY(
      SELECT STRUCT<request STRUCT<url STRING>,response STRUCT<body STRING>>
          (STRUCT(request.url), STRUCT(response.body))
      FROM UNNEST(outgoing)
    )
));

CREATE TEMPORARY FUNCTION extractDetailed(
  customer STRING,
  request STRUCT<headers STRING, body STRING>,
  outgoing ARRAY<STRUCT<request STRUCT<url STRING>, response STRUCT<body STRING>>>
)
RETURNS STRING
LANGUAGE js AS """
  // Return the element count as a string to match RETURNS STRING.
  return String(outgoing.length);
""";

SELECT 
  extractDetailed(
    customer, 
    STRUCT(request.headers, request.body),
    extractParts(outgoing)
  ) AS details
FROM request_logs
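
Because the body of a JavaScript UDF is ordinary JavaScript, and BigQuery passes a STRUCT as an object and an ARRAY as an array, the logic can be tried outside BigQuery with arguments shaped like the call in the question. The following is only a minimal sketch: the summary format and the sample URLs are assumptions made for illustration, not the original author's function body.

```javascript
// Hypothetical body for extractDetailed; the summary format is invented for illustration.
function extractDetailed(customer, request, outgoing) {
  // Defensive guard in case the array argument is null or undefined.
  const calls = outgoing || [];
  // Each element is an object with nested `request` and `response` objects.
  const urls = calls.map((o) => o.request.url);
  return customer + ": " + calls.length + " outgoing call(s) [" + urls.join(", ") + "]";
}

// Arguments shaped like the emulated call in the question (sample URLs invented).
const result = extractDetailed(
  "customer id",
  { headers: "", body: "" },
  [
    { request: { url: "https://example.com/a" }, response: { body: "" } },
    { request: { url: "https://example.com/b" }, response: { body: "" } },
  ]
);
console.log(result);
```

Once the logic behaves as expected locally, the statements inside the function can be pasted between the triple quotes of the CREATE TEMPORARY FUNCTION statement.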