There are two main tables. I am listing only the relevant columns.
Designers
Category | designId
Impressions
recId | impressionType | impressionId | impressionAction | session_id
Below is a sample session. impressionId is a polymorphic column (I think that is the term; I may be wrong) that holds different data types and values.
recId impressionType impressionId impressionAction session_id
73790 USER 11182 LOGIN acbd1234
73791 UNDOCKED abcd1234 UNDOCKED acbd1234
73792 PRODUCT 1446 TAPPED-WALL acbd1234
73793 CARTS 3586 ADDED acbd1234
73794 CART-PRODUCT 14941 ADDED acbd1234
73801 PRODUCT 1465 TAPPED-RECOMMENDATION acbd1234
73802 CART-PRODUCT 14942 ADDED acbd1234
73811 PRODUCT 1465 TAPPED-RECOMMENDATION acbd1234
73818 PRODUCT 1446 TAPPED-RECOMMENDATION acbd1234
73828 PRODUCT 1965 TAPPED-WALL acbd1234
73829 CART-PRODUCT 14944 ADDED acbd1234
73836 PRODUCT 1952 TAPPED-WALL acbd1234
73837 CART-PRODUCT 14945 ADDED acbd1234
73882 PRODUCT 502 TAPPED-WALL acbd1234
73883 CART-PRODUCT 14949 ADDED acbd1234
73897 CART-PRODUCT 14951 ADDED acbd1234
73942 EMAILED_RECOMM 904 SEND acbd1234
73943 EMAILED_RECOMM 1586 SEND acbd1234
73944 EMAIL-NOTIFICATION abcd@amazon.com SENDMAIL acbd1234
I joined the two tables with the following query:
SELECT
d.category
,d.designId
,i.session_id
,COUNT(IF(i.impressionAction IN ('TAPPED', 'TAPPED-LISTPAGE',
'TAPPED-WALL', 'TAPPED-RECOMMENDATION'), 1, NULL)) AS SCANS_total
,COUNT(IF(i.impressionAction = 'TAPPED', 1, NULL)) AS TAPPED
,COUNT(IF(i.impressionAction = 'TAPPED-LISTPAGE', 1, NULL)) AS TAPPED_LISTPAGE
,COUNT(IF(i.impressionAction = 'TAPPED-WALL', 1, NULL)) AS TAPPED_WALL
,COUNT(IF(i.impressionAction = 'TAPPED-RECOMMENDATION', 1, NULL)) AS TAPPED_RECOMMENDATION
,COUNT(IF(i.impressionAction = 'SEND', 1, NULL)) AS IS_ITEM_SENT_BY_EMAIL
FROM
Impressions i
INNER JOIN Designers d
ON i.impressionId = d.designId
WHERE
i.createDate >= '2014-06-18'
AND HOUR(i.createDate) >= 10
AND HOUR(i.createDate) < 21
AND i.impressionId not like '%amazon.com%'
GROUP BY
i.session_id, i.impressionId
HAVING
SCANS_total <> 0
ORDER BY
d.category, i.impressionId, SCANS_total desc
Basically, I am counting how many times each product was scanned, broken down by scan type, Category, and designId, per session.
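My main problem: I cannot filter out certain email domains, such as amazon.com, with the condition i.impressionId not like '%amazon.com%'.
For each session in which the user sends an email, there is a row with impressionAction = SENDMAIL and impressionType = EMAIL-NOTIFICATION whose impressionId holds the email address.
I tried where ... i.impressionId not like '%amazon.com%' to exclude sessions with certain email domains from the calculation entirely, but it does not work.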
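One way to see why the row-level condition has no effect: the email address sits on its own EMAIL-NOTIFICATION row, so NOT LIKE removes only that row and leaves the session's PRODUCT rows in place. Below is a minimal sketch that uses Python's built-in sqlite3 as a stand-in for MySQL, with a few rows copied from the sample session. The second session zzzz9999 and its row are made up for contrast.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Impressions ("
    "recId INTEGER, impressionType TEXT, impressionId TEXT, "
    "impressionAction TEXT, session_id TEXT)"
)
conn.executemany(
    "INSERT INTO Impressions VALUES (?, ?, ?, ?, ?)",
    [
        (73792, "PRODUCT", "1446", "TAPPED-WALL", "acbd1234"),
        (73801, "PRODUCT", "1465", "TAPPED-RECOMMENDATION", "acbd1234"),
        (73944, "EMAIL-NOTIFICATION", "abcd@amazon.com", "SENDMAIL", "acbd1234"),
        # Hypothetical second session that never emailed an amazon.com address.
        (80001, "PRODUCT", "502", "TAPPED-WALL", "zzzz9999"),
    ],
)

# Row-level filter: only the email row itself is dropped,
# so the remaining rows of session acbd1234 are still counted.
row_filtered = conn.execute(
    "SELECT session_id, COUNT(*) FROM Impressions "
    "WHERE impressionId NOT LIKE '%amazon.com%' "
    "GROUP BY session_id ORDER BY session_id"
).fetchall()
print(row_filtered)
```

Both sessions survive the filter, including the one that emailed an amazon.com address, which suggests the exclusion has to be decided per session rather than per row.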
Is there a way to filter out certain email domains while still doing what I am trying to do?
Any ideas would be much appreciated!
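Update
I woke up today and realized a subquery could solve the problem. Here is the subquery I wrote to filter the sessions; however, the query times out and fails.
Essentially, the subquery returns every session that has an email address from a given domain, and I then exclude those sessions. Any ideas on how to optimize this and make it work?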
where i.session_id NOT IN(
SELECT session_id from Impressions
where impressionId LIKE '%amazon.com%')
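A possible alternative to NOT IN is a correlated NOT EXISTS combined with an index on session_id. Whether this avoids the timeout here is an assumption that would need checking with EXPLAIN on the real data. The sketch below uses Python's built-in sqlite3 as a stand-in for MySQL; the index name idx_impressions_session and the session zzzz9999 are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Impressions ("
    "recId INTEGER, impressionType TEXT, impressionId TEXT, "
    "impressionAction TEXT, session_id TEXT)"
)
conn.executemany(
    "INSERT INTO Impressions VALUES (?, ?, ?, ?, ?)",
    [
        (73792, "PRODUCT", "1446", "TAPPED-WALL", "acbd1234"),
        (73944, "EMAIL-NOTIFICATION", "abcd@amazon.com", "SENDMAIL", "acbd1234"),
        # Hypothetical session without an amazon.com email.
        (80001, "PRODUCT", "502", "TAPPED-WALL", "zzzz9999"),
    ],
)
# Hypothetical index so the per-session lookup need not scan the whole table.
conn.execute("CREATE INDEX idx_impressions_session ON Impressions (session_id)")

# Keep a session only if no row of that session holds an amazon.com address.
kept_sessions = [
    row[0]
    for row in conn.execute(
        "SELECT DISTINCT i.session_id FROM Impressions i "
        "WHERE NOT EXISTS ("
        "  SELECT 1 FROM Impressions e "
        "  WHERE e.session_id = i.session_id "
        "    AND e.impressionId LIKE '%amazon.com%') "
        "ORDER BY i.session_id"
    )
]
print(kept_sessions)
```

Unlike NOT IN, NOT EXISTS also behaves predictably if session_id can be NULL, since a NULL in a NOT IN list makes the predicate unknown for every row.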
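Answer 0 (score: 0)
I have been working hard on this problem and tried several different solutions (see the update for one that failed).
I think I have it. If there is a way to improve this, please let me know.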
SELECT *
FROM
(
SELECT
d.category
,d.designId
,i.session_id
,i.createDate
,d.name
,d.productTitle
,d.colors
,COUNT(IF(i.impressionAction IN ('TAPPED', 'TAPPED-LISTPAGE',
'TAPPED-WALL', 'TAPPED-RECOMMENDATION'), 1, NULL)) AS SCANS_total
,COUNT(IF(i.impressionAction = 'TAPPED', 1, NULL)) AS TAPPED
,COUNT(IF(i.impressionAction = 'TAPPED-LISTPAGE', 1, NULL)) AS TAPPED_LISTPAGE
,COUNT(IF(i.impressionAction = 'TAPPED-WALL', 1, NULL)) AS TAPPED_WALL
,COUNT(IF(i.impressionAction = 'TAPPED-RECOMMENDATION', 1, NULL)) AS TAPPED_RECOMMENDATION
,COUNT(IF(i.impressionAction = 'SEND', 1, NULL)) AS IS_ITEM_SENT_BY_EMAIL
FROM
Impressions i
INNER JOIN Designers d
ON i.impressionId = d.designId
WHERE
i.createDate >= '2014-06-18'
AND HOUR(i.createDate) >= 10
AND HOUR(i.createDate) < 21
GROUP BY
i.session_id, i.impressionId
HAVING
SCANS_total <> 0
ORDER BY
d.category, i.impressionId, SCANS_total desc
) AS P
LEFT JOIN
(
SELECT session_id as sessionsToBeRemoved from Impressions
where impressionId LIKE '%amazon.com%'
GROUP BY session_id
) AS U on P.session_id = U.sessionsToBeRemoved
WHERE U.sessionsToBeRemoved IS NULL
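To check the LEFT JOIN ... IS NULL pattern above on a small data set, here is a minimal sketch using Python's built-in sqlite3 as a stand-in for MySQL. It assumes simplified tables: the Designers rows, the category names, and the session zzzz9999 are invented for illustration, and SUM(CASE ...) replaces COUNT(IF(...)) only for portability across engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Designers (category TEXT, designId INTEGER);
CREATE TABLE Impressions (
    recId INTEGER, impressionType TEXT, impressionId TEXT,
    impressionAction TEXT, session_id TEXT);
-- Hypothetical designer rows and categories.
INSERT INTO Designers VALUES ('shoes', 1446), ('bags', 502);
INSERT INTO Impressions VALUES
    (73792, 'PRODUCT', '1446', 'TAPPED-WALL', 'acbd1234'),
    (73944, 'EMAIL-NOTIFICATION', 'abcd@amazon.com', 'SENDMAIL', 'acbd1234'),
    (80001, 'PRODUCT', '502', 'TAPPED-WALL', 'zzzz9999'),
    (80002, 'PRODUCT', '502', 'TAPPED-RECOMMENDATION', 'zzzz9999');
""")

query = """
SELECT p.category, p.designId, p.session_id, p.scans_total
FROM (
    -- Per-session scan counts for each design.
    SELECT d.category AS category, d.designId AS designId,
           i.session_id AS session_id,
           SUM(CASE WHEN i.impressionAction IN
                   ('TAPPED', 'TAPPED-LISTPAGE',
                    'TAPPED-WALL', 'TAPPED-RECOMMENDATION')
               THEN 1 ELSE 0 END) AS scans_total
    FROM Impressions i
    INNER JOIN Designers d ON i.impressionId = CAST(d.designId AS TEXT)
    GROUP BY d.category, d.designId, i.session_id
) AS p
LEFT JOIN (
    -- Sessions that contain an amazon.com email address.
    SELECT DISTINCT session_id AS sessionsToBeRemoved
    FROM Impressions
    WHERE impressionId LIKE '%amazon.com%'
) AS u ON p.session_id = u.sessionsToBeRemoved
WHERE u.sessionsToBeRemoved IS NULL
  AND p.scans_total <> 0
"""
result = conn.execute(query).fetchall()
print(result)
```

Only the session without an amazon.com address remains in the output, while the scan counts for the surviving session are unchanged.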