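We need to calculate how many users bounce on our website. We are given a table that stores the GUID (user ID), hostname, and path of every page view. A visit counts as a bounce if a GUID has only 1 page view for a given hostname.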
Table Pageviews:
- GUID
- Hostname
- Path
I managed to write the query below, but I think it could be improved, especially in terms of performance.
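SELECT 1, (Bounces / `All`) * 100 AS `Bounce Rate`
FROM (
    SELECT
        COUNT(*) AS `All`,  -- ALL is a reserved word, so it must be quoted
        (
            SELECT COUNT(*)
            FROM (
                SELECT GUID
                FROM pageviews
                GROUP BY GUID
                HAVING COUNT(GUID) = 1  -- GUIDs with exactly one page view
            ) AS single_view_guids  -- derived tables need an alias
        ) AS Bounces
    FROM pageviews
) AS totals;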
Answer 0 (score: 2)
Group the data by guid and hostname twice (the second time keeping only the bounces), then join the two results together :)
SELECT count(bounces.guid) `bounces`,
count(uniqueUsers.guid) `total unique users`,
count(bounces.guid) / count(uniqueUsers.guid) * 100 `global bounce rate`
FROM (
SELECT guid, hostname
FROM PageViews
GROUP BY guid, hostname
) uniqueUsers
LEFT JOIN (
SELECT guid, hostname
FROM PageViews
GROUP BY guid, hostname
HAVING COUNT(1) = 1
) bounces
ON uniqueUsers.guid = bounces.guid
AND uniqueUsers.hostname = bounces.hostname
Sample result:
bounces  unique users  global bounce rate
-------  ------------  ------------------
3        6             50.0000
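Note that all 4 hits by 'guid 3' on host1 count as just 1 unique user, whereas 'guid 1' hits both host1 and host2 and therefore counts as 2 unique users (I assume this is the desired logic).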
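On the performance concern raised in the question, the same figures can probably be obtained with a single grouping pass instead of two grouped subqueries joined together. The following is only a sketch under assumptions: it uses SQLite through Python's `sqlite3` module so that it runs in memory (the queries in this answer use MySQL-style quoting), the rows mirror this answer's test data, and the names `hits` and `per_user_host` are hypothetical. Whether it is actually faster depends on the database engine and the available indexes.

```python
import sqlite3

# Rows (guid, hostname, path) mirroring the test data in this answer.
rows = (
    [(1, "host1", "irrelevant")]
    + [(2, "host1", "irrelevant")]
    + [(3, "host1", "irrelevant")] * 4
    + [(4, "host1", "irrelevant")] * 2
    + [(1, "host2", "irrelevant")] * 2
    + [(2, "host3", "irrelevant")]
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PageViews (guid INTEGER, hostname TEXT, path TEXT)")
conn.executemany("INSERT INTO PageViews VALUES (?, ?, ?)", rows)

# Group once per (guid, hostname), then count the groups with exactly one hit.
# "hits" and "per_user_host" are illustrative names, not from the original query.
single_pass = """
SELECT SUM(CASE WHEN hits = 1 THEN 1 ELSE 0 END) AS bounces,
       COUNT(*) AS unique_users,
       100.0 * SUM(CASE WHEN hits = 1 THEN 1 ELSE 0 END) / COUNT(*) AS bounce_rate
FROM (
    SELECT guid, hostname, COUNT(*) AS hits
    FROM PageViews
    GROUP BY guid, hostname
) AS per_user_host
"""
bounces, unique_users, bounce_rate = conn.execute(single_pass).fetchone()
print(bounces, unique_users, bounce_rate)
```

Multiplying by `100.0` keeps the division in floating point; in SQLite, dividing two integers would otherwise truncate the result.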
The same query, but with a GROUP BY on the outer query :)
SELECT uniqueUsers.hostname,
count(bounces.guid) bounces,
count(uniqueUsers.guid) `unique users`,
count(bounces.guid) / count(uniqueUsers.guid) * 100 `global bounce rate`
FROM (
SELECT guid, hostname
FROM PageViews
GROUP BY guid, hostname
) uniqueUsers
LEFT JOIN (
SELECT guid, hostname
FROM PageViews
GROUP BY guid, hostname
HAVING COUNT(1) = 1
) bounces
ON uniqueUsers.guid = bounces.guid
AND uniqueUsers.hostname = bounces.hostname
GROUP BY uniqueUsers.hostname;
Sample result:
hostname  bounces  unique users  bounce rate
--------  -------  ------------  -----------
host1     2        4             50.0000
host2     0        1             0.0000
host3     1        1             100.0000
Test data used for the results above:
guid  hostname  path
----  --------  ----------
1     host1     irrelevant  => bounce 1
2     host1     irrelevant  => bounce 2
3     host1     irrelevant  => non-bounce 1 (visit 1/4)
3     host1     irrelevant
3     host1     irrelevant
3     host1     irrelevant
4     host1     irrelevant  => non-bounce 2 (visit 1/2)
4     host1     irrelevant
1     host2     irrelevant  => non-bounce 3 (visit 1/2)
1     host2     irrelevant
2     host3     irrelevant  => bounce 3
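To check the per-hostname figures against this data, the join-based query can be run on an in-memory database. This is a rough sketch, assuming SQLite via Python's `sqlite3` module rather than MySQL; the column aliases are simplified to plain identifiers and the rate is computed with `100.0 *` to avoid SQLite's integer division, so it is an adaptation and not the exact query from this answer.

```python
import sqlite3

# Rows (guid, hostname, path) taken from the test data listed above.
rows = (
    [(1, "host1", "irrelevant")]
    + [(2, "host1", "irrelevant")]
    + [(3, "host1", "irrelevant")] * 4
    + [(4, "host1", "irrelevant")] * 2
    + [(1, "host2", "irrelevant")] * 2
    + [(2, "host3", "irrelevant")]
)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PageViews (guid INTEGER, hostname TEXT, path TEXT)")
conn.executemany("INSERT INTO PageViews VALUES (?, ?, ?)", rows)

# Join all (guid, hostname) pairs to the pairs that have exactly one page view,
# then aggregate per hostname. COUNT(bounces.guid) ignores the NULLs that the
# LEFT JOIN produces for non-bouncing pairs.
per_host_query = """
SELECT uniqueUsers.hostname,
       COUNT(bounces.guid) AS bounces,
       COUNT(uniqueUsers.guid) AS unique_users,
       100.0 * COUNT(bounces.guid) / COUNT(uniqueUsers.guid) AS bounce_rate
FROM (
    SELECT guid, hostname FROM PageViews GROUP BY guid, hostname
) AS uniqueUsers
LEFT JOIN (
    SELECT guid, hostname FROM PageViews
    GROUP BY guid, hostname
    HAVING COUNT(1) = 1
) AS bounces
  ON uniqueUsers.guid = bounces.guid
 AND uniqueUsers.hostname = bounces.hostname
GROUP BY uniqueUsers.hostname
ORDER BY uniqueUsers.hostname
"""
per_host = conn.execute(per_host_query).fetchall()
for row in per_host:
    print(row)
```

The `ORDER BY` is added only to make the output order deterministic; it is not part of the original query.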