Using what is described at https://webmasters.stackexchange.com/a/87523 together with my own understanding, I came up with what I think would be considered "Returning Users".
1. First, a query that shows users whose first "latest visit" falls within the last two years:
SELECT
parsedDate,
CASE
# return fullVisitorId when the latest first visit falls between 2 years ago and today
WHEN parsedDate BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 2 YEAR) AND CURRENT_DATE() THEN fullVisitorId
END fullVisitorId
FROM (
SELECT
# convert the date field from string to date and get the latest date
PARSE_DATE('%Y%m%d',
MAX(date)) parsedDate,
fullVisitorId
FROM
`project.dataset.ga_sessions_*`
WHERE
# only show fullVisitorId if first visit
totals.newVisits = 1
GROUP BY
fullVisitorId)
2. Then a separate query that selects certain fields within a specific date range:
SELECT
PARSE_DATE('%Y%m%d',
date) parsedDate,
fullVisitorId,
visitId,
totals.newVisits,
totals.visits,
totals.bounces,
device.deviceCategory
FROM
`project.dataset.ga_sessions_*`
WHERE
_TABLE_SUFFIX = "20180118"
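The query above reads a single daily table. If an actual range of days is wanted, one possible sketch is to filter the wildcard table on a range of suffixes; the two dates below are placeholders chosen for illustration, not values from my report:

```sql
SELECT
  PARSE_DATE('%Y%m%d', date) AS parsedDate,
  fullVisitorId,
  visitId,
  totals.newVisits,
  totals.visits,
  totals.bounces,
  device.deviceCategory
FROM
  `project.dataset.ga_sessions_*`
WHERE
  -- placeholder range (assumption): adjust both ends to the period being compared with GA
  _TABLE_SUFFIX BETWEEN '20180101' AND '20180118'
```

Only the daily tables whose suffix falls inside the range are scanned, so the amount of data read grows with the number of days selected.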
3. Join the two queries together to find "Returning Users":
SELECT
q1.parsedDate date,
COUNT(DISTINCT q1.fullVisitorId) users,
# Default way to determine New Users
SUM(q1.newVisits) newVisits,
# Number of "New Users" based on my queries (matches with default way above)
COUNT(DISTINCT IF(q2.parsedDate < q1.parsedDate, NULL, q2.fullVisitorId)) newUsers,
# Number of "Returning Users" based on my queries
COUNT(DISTINCT IF(q2.parsedDate < q1.parsedDate, q2.fullVisitorId, NULL)) returningUsers
FROM (
(SELECT
PARSE_DATE('%Y%m%d',
date) parsedDate,
fullVisitorId,
visitId,
totals.newVisits,
totals.visits,
totals.bounces,
device.deviceCategory
FROM
`project.dataset.ga_sessions_*`
WHERE
_TABLE_SUFFIX = "20180118") q1
LEFT JOIN (
SELECT
parsedDate,
CASE
# return fullVisitorId when the latest first visit falls between 2 years ago and today
WHEN parsedDate BETWEEN DATE_SUB(CURRENT_DATE(), INTERVAL 2 YEAR) AND CURRENT_DATE() THEN fullVisitorId
END fullVisitorId
FROM (
SELECT
# convert the date field from string to date and get the latest date
PARSE_DATE('%Y%m%d',
MAX(date)) parsedDate,
fullVisitorId
FROM
`project.dataset.ga_sessions_*`
WHERE
# only show fullVisitorId if first visit
totals.newVisits = 1
GROUP BY
fullVisitorId)) q2
ON q1.fullVisitorId = q2.fullVisitorId)
GROUP BY
date
Result in BQ
Unsampled New vs Returning report, by users, in GA for the same period
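Questions/Issues:
Given that newVisits (the default field) and newUsers (my calculation) give the same result, and that result agrees with the New Visitor users in the GA report, why do GA's Returning Visitor users not match my returningUsers calculation in BQ? Are the two even comparable, and what am I missing?
Is my approach the most efficient and least verbose way of doing this?
Is there a better way to get the data that I am missing?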
Solution
Based on Martin's answer, I managed to create a "Returning Users" metric/field in the context of the query I was running:
SELECT
date,
deviceCategory,
# newUsers - SUM result if it's a new user
SUM(IF(userType="New Visitor", 1, 0)) newUsers,
# returningUsers - COUNT DISTINCT fullvisitorId if it's a returning user
COUNT(DISTINCT IF(userType="Returning Visitor", fullvisitorid, NULL)) returningUsers,
COUNT(DISTINCT fullvisitorid) users,
SUM(visits) sessions
FROM (
SELECT
date,
fullVisitorId,
visitId,
totals.visits,
device.deviceCategory,
IF(totals.newVisits IS NOT NULL, "New Visitor", "Returning Visitor") userType
FROM
`project.dataset.ga_sessions_20180118` )
GROUP BY
deviceCategory,
date
Answer 0 (score: 0)
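Google Analytics uses approximations for user counts (fullvisitorid). Even when a report says it is based on 100% of sessions, you can get better user numbers from an unsampled report.
Another thing worth mentioning: fullvisitorids are counted even when totals.visits != 1, whereas sessions are only counted where totals.visits = 1.
A user who is new and then comes back is counted twice, once as new and once as returning. With that in mind, this should give you the correct numbers: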
SELECT
totals.newVisits IS NOT NULL AS isNew,
COUNT(DISTINCT fullvisitorid) AS visitors,
SUM(totals.visits) AS sessions
FROM
`project.dataset.ga_sessions_20180214`
GROUP BY
1
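To see the double counting concretely, here is a minimal sketch that runs the same grouping on three made-up sessions instead of the export table. The toy_sessions name and its inline rows are assumptions for illustration only:

```sql
WITH toy_sessions AS (
  -- hypothetical data: visitor A has a first visit and a repeat visit, visitor B only a repeat visit
  SELECT 'A' AS fullVisitorId, 1 AS newVisits, 1 AS visits
  UNION ALL SELECT 'A', CAST(NULL AS INT64), 1
  UNION ALL SELECT 'B', CAST(NULL AS INT64), 1
)
SELECT
  newVisits IS NOT NULL AS isNew,
  COUNT(DISTINCT fullVisitorId) AS visitors,
  SUM(visits) AS sessions
FROM
  toy_sessions
GROUP BY
  1
```

Visitor A lands in both groups, so the per-group visitor counts add up to 3 although there are only 2 distinct visitors.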
If you want to avoid the double counting, you can use the following, where a user counts as new even if she comes back:
WITH
visitors AS (
SELECT
fullvisitorid,
-- check if any visit of this visitor was new - will be used for grouping later
MAX(totals.newVisits ) isNew,
SUM(totals.visits) as sessions
FROM
`project.dataset.ga_sessions_20180214`
GROUP BY 1
)
SELECT
isNew IS NOT NULL AS isNew,
COUNT(1) AS visitors,
sum(sessions) as sessions
FROM
visitors
GROUP BY 1
Of course, these numbers will only match on the totals.
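As a rough check of the de-duplicated version, the sketch below applies the same two-step aggregation to three made-up sessions. The toy_sessions name and its inline rows are assumptions for illustration, not data from the export:

```sql
WITH toy_sessions AS (
  -- hypothetical data: visitor A has a first visit and a repeat visit, visitor B only a repeat visit
  SELECT 'A' AS fullVisitorId, 1 AS newVisits, 1 AS visits
  UNION ALL SELECT 'A', CAST(NULL AS INT64), 1
  UNION ALL SELECT 'B', CAST(NULL AS INT64), 1
),
visitors AS (
  SELECT
    fullVisitorId,
    -- MAX ignores NULLs, so one new session is enough to mark the visitor as new
    MAX(newVisits) AS isNew,
    SUM(visits) AS sessions
  FROM
    toy_sessions
  GROUP BY 1
)
SELECT
  isNew IS NOT NULL AS isNew,
  COUNT(1) AS visitors,
  SUM(sessions) AS sessions
FROM
  visitors
GROUP BY 1
```

Each visitor now falls into exactly one group: A is counted once as new with both of her sessions, B once as returning, and the visitor counts add up to the 2 distinct visitors.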