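I'm using BigQuery and would like to write this query using the special character '|' in a regular expression, or an equivalent in standard SQL. The idea is to use one regular expression instead of multiple conditions on the field hit.eventInfo.eventCategory for the values ("login", "Unknown", "registration", "login", "start", "null").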
#standardSQL
SELECT
visitNumber,
visitStartTime,
date,
totals.visits,
totals.hits,
totals.pageviews,
totals.timeOnSite,
hit.hitNumber,
hit.page.pagePath,
hit.page.hostname,
hit.page.pageTitle,
hit.eventInfo.eventCategory,
hit.eventInfo.eventAction,
hit.eventInfo.eventLabel,
cd.index,
cd.value
FROM
[bqdatasetnumber.ga_sessions_*],
WHERE
_TABLE_SUFFIX BETWEEN '20180905'
AND '20180911'
AND customDimensions.value != "null"
AND hit.eventInfo.eventCategory != "login"
AND hit.eventInfo.eventCategory != "null"
AND hit.eventInfo.eventCategory != "Unknown"
AND hit.eventInfo.eventCategory != "registration"
AND hit.eventInfo.eventAction != "start"
Thanks for your help and tips.
Sebastian
Answer 0 (score: 1)
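You can try:
AND REGEXP_CONTAINS(hit.eventInfo.eventCategory, r"^(Unknown|registration|login|start)$") != true
See the BigQuery documentation for the REGEXP_CONTAINS function for details.
The "or" operator is "|".
^ and $ anchor the pattern to the start and end of the value, so the value must match one of the alternatives exactly. The parentheses are needed because "|" binds more loosely than the anchors: without them, ^ would apply only to Unknown and $ only to start.
Rows where eventCategory is NULL are also dropped, because REGEXP_CONTAINS returns NULL for a NULL input and the WHERE clause keeps only rows where the condition is TRUE.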
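How ^, $ and | interact is easy to check outside BigQuery. The sketch below uses Python's re module as a stand-in for BigQuery's RE2 engine, on the assumption that the two behave the same for patterns this simple. The sample values are invented for illustration and the matches helper is hypothetical; it only mimics the "match anywhere" behaviour of REGEXP_CONTAINS.

```python
import re

# Ungrouped: parsed as (^Unknown) | (registration) | (login) | (start$)
UNGROUPED = re.compile(r"^Unknown|registration|login|start$")
# Grouped: both anchors apply to every alternative
GROUPED = re.compile(r"^(Unknown|registration|login|start)$")

# Hypothetical eventCategory values, not taken from the real dataset
samples = ["login", "login_error", "Unknown", "Unknown user", "restart", "video"]


def matches(pattern, value):
    """Mimic REGEXP_CONTAINS: True if the pattern matches anywhere in value."""
    return pattern.search(value) is not None


for value in samples:
    print(f"{value!r:16} ungrouped={matches(UNGROUPED, value)!s:5} "
          f"grouped={matches(GROUPED, value)}")
```

With the ungrouped pattern, values such as "login_error", "Unknown user" and "restart" also count as matches and would be filtered out; the grouped pattern matches only the exact values.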
Answer 1 (score: 1)
You can use
AND NOT LOWER(hit.eventInfo.eventCategory) in ("login", "unknown", "registration", "login", "start", "null")
or
AND NOT REGEXP_CONTAINS(hit.eventInfo.eventCategory, r"(?i)^(login|unknown|registration|login|start|null)$")
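The two conditions should keep the same rows. As a rough check, here is a small Python model of both, with None standing in for SQL NULL. The helper names (keep_with_in_list, keep_with_regex, where) and the sample values are made up for this sketch, and Python's re is assumed to agree with RE2 on this pattern.

```python
import re

EXCLUDED = ("login", "unknown", "registration", "start", "null")
PATTERN = re.compile(r"(?i)^(login|unknown|registration|login|start|null)$")


def keep_with_in_list(value):
    """Model of: NOT LOWER(value) IN (...). None stands in for SQL NULL."""
    if value is None:
        return None  # NULL propagates through LOWER, IN and NOT
    return value.lower() not in EXCLUDED


def keep_with_regex(value):
    """Model of: NOT REGEXP_CONTAINS(value, r"(?i)^(...)$")."""
    if value is None:
        return None  # REGEXP_CONTAINS(NULL, ...) is NULL, and NOT NULL is NULL
    return PATTERN.search(value) is None


def where(rows, predicate):
    """A WHERE clause keeps a row only when the predicate is TRUE, not NULL."""
    return [row for row in rows if predicate(row) is True]


# Hypothetical eventCategory values, including a NULL
samples = ["Login", "UNKNOWN", "registration", "start", "null",
           "video", "login page", None]

print(where(samples, keep_with_in_list))
print(where(samples, keep_with_regex))
```

In this model both filters keep only "video" and "login page". The row with a NULL category is dropped by either form, which matters if those rows are wanted in the result.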
Your query appears to have been truncated, so simply adding one of the conditions above will not work. I guess you need something like this:
#standardSQL
SELECT
visitNumber,
visitStartTime,
DATE,
totals.visits,
totals.hits,
totals.pageviews,
totals.timeOnSite,
hit.hitNumber,
hit.page.pagePath,
hit.page.hostname,
hit.page.pageTitle,
hit.eventInfo.eventCategory,
hit.eventInfo.eventAction,
hit.eventInfo.eventLabel,
cd.index,
cd.value
FROM `bqdatasetnumber.ga_sessions_*` a,
UNNEST(hits) hit,
UNNEST(a.customDimensions) cd
WHERE _TABLE_SUFFIX BETWEEN '20180905' AND '20180911'
AND cd.value != "null"
AND NOT REGEXP_CONTAINS(hit.eventInfo.eventCategory, r"(?i)^(login|unknown|registration|login|start|null)$")
AND hit.eventInfo.eventAction != "start"
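I'm not sure where cd comes from, so I guessed that it refers to customDimensions at the root (session) level rather than the one inside hits. But it could just as well be the one in hits :o)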
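To see what the comma joins with UNNEST do to the row count, here is a minimal Python model of the FROM clause. The session data is invented and only loosely shaped like the GA export; flatten is a hypothetical helper, not part of any library.

```python
# Hypothetical session rows, loosely shaped like the GA export
sessions = [
    {
        "visitNumber": 1,
        "hits": [{"hitNumber": 1}, {"hitNumber": 2}],
        "customDimensions": [{"index": 1, "value": "a"},
                             {"index": 2, "value": "b"}],
    },
    {
        "visitNumber": 2,
        "hits": [{"hitNumber": 1}],
        "customDimensions": [],  # no session-level custom dimensions
    },
]


def flatten(rows):
    """Model of: FROM t a, UNNEST(hits) hit, UNNEST(a.customDimensions) cd."""
    return [
        (s["visitNumber"], hit["hitNumber"], cd["index"], cd["value"])
        for s in rows
        for hit in s["hits"]
        for cd in s["customDimensions"]
    ]


flat = flatten(sessions)
for row in flat:
    print(row)
```

Each session produces one row per combination of hit and custom dimension, so session-level fields such as totals.hits repeat on every row. A session with an empty customDimensions array produces no rows at all, which is worth keeping in mind when choosing between the session-level and hit-level arrays.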