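I'm using the script below (from a previous question) to query AWS Redshift, but I'm having trouble excluding some referrer_url values.
For example, some of them are structured like https://google.com/erfkhjg/facebook/sdfdfd or https://bing.com/erfkhjg/facebook/sdfdfd. These should be counted as search, but they are being counted as social.
How can I exclude these links from the social count and have them counted as search instead? I tried adding several AND ... NOT LIKE conditions to the first SUM block, but it didn't work. Thanks for your help!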
SELECT A.page_id, A.views,
SUM(CASE WHEN referrer_url LIKE '%%facebook%%' OR
referrer_url LIKE '%%instagram%%' OR
referrer_url LIKE '%%twitter%%'
THEN 1 ELSE 0
END) AS social,
SUM(CASE WHEN referrer_url LIKE '%%google%%' OR
referrer_url LIKE '%%bing%%' OR
referrer_url LIKE '%%yahoo%%'
THEN 1 ELSE 0
END) AS search
FROM table1 A LEFT JOIN
table2 B
ON B.page_id = A.page_id
WHERE B.dt BETWEEN '20190401' AND '20190430'
GROUP BY A.page_id, A.views;
Answer 0 (score: 0)
I think that, at least for your examples, looking for the periods before and after the domain name is sufficient:
SELECT A.page_id, A.views,
SUM(CASE WHEN referrer_url LIKE '%.facebook.%' OR
referrer_url LIKE '%.instagram.%' OR
referrer_url LIKE '%.twitter.%'
THEN 1 ELSE 0
END) AS social,
SUM(CASE WHEN referrer_url LIKE '%.google.%' OR
referrer_url LIKE '%.bing.%' OR
referrer_url LIKE '%.yahoo.%'
THEN 1 ELSE 0
END) AS search
FROM table1 A LEFT JOIN
table2 B
ON B.page_id = A.page_id
WHERE B.dt BETWEEN '20190401' AND '20190430'
GROUP BY A.page_id, A.views;
Note that you don't need the doubled wildcards (%%) in the search patterns; a single % on each side is enough.
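One caveat with the period-delimited patterns: a referrer such as https://google.com/erfkhjg/facebook/sdfdfd has no period before google, so '%.google.%' would not match it. A possible variation, sketched here, compares only the host part of the URL. It assumes Redshift's SPLIT_PART function and that every referrer_url has the form scheme://host/path; the CTE name hits and the column alias host are made up for illustration.

```sql
-- Sketch only: classify by host rather than by the whole URL.
-- Assumption: referrer_url looks like scheme://host/path, so splitting on '/'
-- gives 'https:', '', 'google.com', ... and the host is piece number 3.
WITH hits AS (
    SELECT A.page_id,
           A.views,
           SPLIT_PART(referrer_url, '/', 3) AS host  -- hypothetical alias
    FROM table1 A LEFT JOIN
         table2 B
         ON B.page_id = A.page_id
    WHERE B.dt BETWEEN '20190401' AND '20190430'
)
SELECT page_id, views,
       SUM(CASE WHEN host LIKE '%facebook.%' OR
                     host LIKE '%instagram.%' OR
                     host LIKE '%twitter.%'
                THEN 1 ELSE 0
           END) AS social,
       SUM(CASE WHEN host LIKE '%google.%' OR
                     host LIKE '%bing.%' OR
                     host LIKE '%yahoo.%'
                THEN 1 ELSE 0
           END) AS search
FROM hits
GROUP BY page_id, views;
```

Because the path is never inspected, https://google.com/erfkhjg/facebook/sdfdfd would count toward search only. A pattern like '%facebook.%' would still match a host such as notfacebook.com, so it may need tightening. If the query is sent through a client library that treats % as a placeholder marker, the doubled %% from the question may still be required.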