Third substring from the end in Google BigQuery

Time: 2017-06-21 16:28:02

Tags: string google-bigquery sql-standards

I am using standard SQL and want to extract the third substring from the end.

Example Input: "Search-site-variable-brand-0-city-none-18053517"
Output: "city"

4 answers:

Answer 0: (score: 2)

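I just want to point out that if you plan to apply this transformation to multiple columns, it can be useful to pull the logic into a UDF. Here is an example of how to do that: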

-- Returns the third '-'-separated substring counted from the end,
-- or NULL when the string has fewer than three parts.
CREATE TEMP FUNCTION ThirdSubstringFromEnd(s STRING) AS ((
  SELECT arr[SAFE_OFFSET(ARRAY_LENGTH(arr) - 3)]
  FROM (
    SELECT SPLIT(s, '-') AS arr
  )
));

WITH Input AS (
  SELECT 'Search-site-variable-brand-0-city-none-18053517' AS str UNION ALL
  SELECT 'a-b' UNION ALL
  SELECT 'w-x-yyy-z'
)
SELECT
  str,
  ThirdSubstringFromEnd(str) AS third_substring_from_end
FROM Input;
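
If the position from the end varies between columns, the same idea can be parameterized. The following is only a sketch: the function name NthSubstringFromEnd and its parameter n are hypothetical and not part of the answer above. It reverses the split array so that the n-th element from the end sits at offset n - 1.

```sql
-- Hypothetical generalization: n = 1 is the last part, n = 3 the third from the end.
-- SAFE_OFFSET yields NULL when the string has fewer than n parts.
CREATE TEMP FUNCTION NthSubstringFromEnd(s STRING, n INT64) AS (
  ARRAY_REVERSE(SPLIT(s, '-'))[SAFE_OFFSET(n - 1)]
);

WITH Input AS (
  SELECT 'Search-site-variable-brand-0-city-none-18053517' AS str UNION ALL
  SELECT 'a-b'
)
SELECT
  str,
  NthSubstringFromEnd(str, 3) AS third_from_end,
  NthSubstringFromEnd(str, 1) AS last_part
FROM Input;
```

For the first row this should give "city" and "18053517"; for 'a-b' the third part from the end does not exist, so that column is NULL.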

Answer 1: (score: 1)

This might do the trick:

WITH data AS (
  SELECT "Search-site-variable-brand-0-city-none-18053517" AS Input
)

SELECT
  -- >= 3 so that a string with exactly three parts still returns its first part
  CASE
    WHEN ARRAY_LENGTH(SPLIT(Input, '-')) >= 3
      THEN SPLIT(Input, '-')[OFFSET(ARRAY_LENGTH(SPLIT(Input, '-')) - 3)]
  END AS word
FROM data

It returns NULL when the string does not split into enough parts, for example an empty string.
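
To see the edge cases side by side, here is a small sketch with made-up test rows (the rows and the parts column are illustrative only). It uses SAFE_OFFSET instead of the CASE guard, which returns NULL for an out-of-range position rather than raising an error.

```sql
WITH data AS (
  SELECT '' AS Input UNION ALL     -- SPLIT yields one empty element
  SELECT 'a-b' UNION ALL           -- only two parts
  SELECT 'a-b-c'                   -- exactly three parts
)
SELECT
  Input,
  ARRAY_LENGTH(SPLIT(Input, '-')) AS parts,
  -- A negative or too-large offset gives NULL under SAFE_OFFSET
  SPLIT(Input, '-')[SAFE_OFFSET(ARRAY_LENGTH(SPLIT(Input, '-')) - 3)] AS word
FROM data
```

Only the 'a-b-c' row has a third part from the end ('a'); the other two rows return NULL.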

Answer 2: (score: 0)

A few more variants for BigQuery Standard SQL:

  
#standardSQL
WITH YourTable AS(
  SELECT 'Search-site-variable-brand-0-city-none-18053517' AS Input UNION ALL
  SELECT 'Second-substring-from-the-end-in-Google-BigQuery' UNION ALL
  SELECT 'bigQuery-assign-a-value-to-table-1-based-on-table-2' UNION ALL
  SELECT 'Error-Message-Too-many-sources-provided-15285-Limit-is-10000' UNION ALL
  SELECT 'Google-Bigquery-data-import-from-Google-Analytics-360' UNION ALL
  SELECT 'Bigquery-Partitioning-data-past-2000-limit'
)
SELECT
  Input,
  REVERSE(SPLIT(REVERSE(Input), '-')[SAFE_ORDINAL(3)]) AS Output_1,
  ARRAY_REVERSE(SPLIT(Input, '-'))[SAFE_ORDINAL(3)] AS Output_2
FROM YourTable
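
Note that SAFE_ORDINAL counts from 1 while SAFE_OFFSET counts from 0, so position 3 in one corresponds to position 2 in the other. A minimal sketch of the equivalence, with the column names by_ordinal and by_offset chosen only for illustration:

```sql
#standardSQL
WITH YourTable AS (
  SELECT 'Search-site-variable-brand-0-city-none-18053517' AS Input UNION ALL
  SELECT 'a-b'
)
SELECT
  Input,
  ARRAY_REVERSE(SPLIT(Input, '-'))[SAFE_ORDINAL(3)] AS by_ordinal,  -- 1-based index
  ARRAY_REVERSE(SPLIT(Input, '-'))[SAFE_OFFSET(2)] AS by_offset     -- 0-based index
FROM YourTable
```

Both columns should hold the same value in every row, and both are NULL for 'a-b' because the SAFE_ prefix suppresses the out-of-range error.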

Answer 3: (score: 0)

The ARRAY_REVERSE function works wonders in this case.

with input AS 
(
 SELECT "Search-site-variable-brand-0-city-none-18053517" AS to_reverse_string 
)

SELECT ARRAY_REVERSE(SPLIT(to_reverse_string, "-"))[SAFE_OFFSET(2)]
FROM input
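
As an alternative that avoids arrays altogether, a regular expression can anchor on the end of the string. This is a sketch of a different technique from the answers above, assuming '-' is the only separator; the pattern captures the run of non-hyphen characters that is followed by exactly two more hyphen-separated parts.

```sql
WITH input AS (
  SELECT "Search-site-variable-brand-0-city-none-18053517" AS s UNION ALL
  SELECT "a-b"
)
SELECT
  s,
  -- ([^-]*) is the captured part; the rest matches the last two parts up to the end
  REGEXP_EXTRACT(s, r'([^-]*)-[^-]*-[^-]*$') AS third_from_end
FROM input
```

REGEXP_EXTRACT returns NULL when the pattern does not match, so strings with fewer than three parts yield NULL, matching the behavior of the SAFE_OFFSET variants.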