BigQuery GA multiple values

Time: 2015-01-02 17:09:50

Tags: sql google-analytics google-bigquery

I am trying to isolate users who have two values for a field ([hits.customVariables.index]) within the same session in GA.

I suspect there is a simpler way to do this; if you know how, please let me know.

For example, in the SQL below, every session returns 2 in the 'matches' field except the last one listed, which returns 1.

SELECT CONCAT([fullVisitorId], STRING([visitId])) AS [Session],
(MAX(IF(SUM(CASE WHEN [hits.customVariables.index] IN (3) THEN 1 ELSE 0 END)>0,1,0)) + 
MAX(IF(SUM(CASE WHEN [hits.customVariables.index] IN (46) THEN 1 ELSE 0 END)>0,1,0))) 
AS [matches]
FROM FLATTEN([1271835.ga_sessions_20141216], [hits.customVariables.index])
WHERE CONCAT([fullVisitorId], STRING([visitId])) IN 
('50956211505979902631418751704',
 '86512166255567372671418771317',
 '79580299450214242591418749991',
 '274962317238452051418783657')
GROUP BY [Session]

The output is here:

column_A  Session                         matches    
1         274962317238452051418783657       1    
2         50956211505979902631418751704     2    
3         86512166255567372671418771317     2    
4         79580299450214242591418749991     2   

2 answers:

Answer 0 (score: 1)

Are you asking for another way to compute 'matches'?

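If so, you can use nested SELECTs: an inner select to fetch the sessions you want, a middle select to filter down to the repeated index values you are interested in, and an outer select to total the matches.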

Something like the following should work for you:

SELECT
  Session,
  COUNT(hits.customVariables.index) AS match
FROM (
  SELECT
    Session,
    hits.customVariables.index
  FROM (
    SELECT
      CONCAT([fullVisitorId], STRING([visitId])) AS [Session],
      hits.customVariables.index
    FROM
      [1271835.ga_sessions_20141216]
    WHERE
      CONCAT([fullVisitorId], STRING([visitId])) IN
      ('50956211505979902631418751704',
        '86512166255567372671418771317',
        '79580299450214242591418749991',
        '274962317238452051418783657')
      )
  -- Drop index values we don't care about
  OMIT [hits.customVariables.index] IF
    NOT [hits.customVariables.index] IN (3, 46)
  -- Drop duplicates so we get a COUNT of unique values
  GROUP BY
    Session,
    hits.customVariables.index
    )
GROUP BY
  Session

If you want to restrict the output to rows where match > 1, you can wrap either query in the following:

SELECT Session, match FROM (
  ...
  )
WHERE match > 1
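
To make the three steps (filter, de-duplicate, count) concrete, here is a minimal Python sketch of the same logic. It is only an illustration: the session names and index values are made up and do not come from the GA export, and it assumes the data has already been flattened to one (session, index) pair per row.

```python
from collections import defaultdict

# Hypothetical flattened rows: (session, custom variable index).
rows = [
    ("session_a", 3), ("session_a", 46), ("session_a", 3),
    ("session_b", 3), ("session_b", 7),
    ("session_c", 12),
]

# The index values we care about, as in the IN (3, 46) filter.
WANTED = {3, 46}

def count_matches(rows, wanted):
    """Return {session: number of distinct wanted indexes seen}."""
    seen = defaultdict(set)
    for session, index in rows:
        if index in wanted:            # drop index values we don't care about
            seen[session].add(index)   # a set drops duplicates
    return {session: len(indexes) for session, indexes in seen.items()}

matches = count_matches(rows, WANTED)

# Equivalent of the outer "WHERE match > 1" wrapper.
both = sorted(s for s, n in matches.items() if n > 1)

print(matches)
print(both)
```

In this sketch, a session that contains none of the wanted indexes is simply absent from the result; whether the SQL version reports such sessions with a count of 0 is not something the sketch tries to reproduce.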

Answer 1 (score: 0)

I'm not entirely sure whether you want to restrict this to certain variable indexes and/or certain session IDs.

Here is an example where I restrict it:

SELECT 
  fullVisitorId, 
  CONCAT([fullVisitorId], STRING([visitId])) as sessionId,
  COUNT(DISTINCT customDimensions.index, 100000) AS unique_count
FROM [86276908.ga_sessions_20150109]
WHERE fullVisitorId IN ('947763264111444','505145409379903', '6201048528944990204')
AND customDimensions.index IN (1,2)
GROUP BY 
  fullVisitorId, 
  sessionId
HAVING unique_count > 1

If you want all visitors who have at least 2 distinct custom variables in a session, it would work as follows:

SELECT 
  fullVisitorId, 
  CONCAT([fullVisitorId], STRING([visitId])) as sessionId,
  COUNT(DISTINCT customDimensions.index, 100000) AS unique_count
FROM [86276908.ga_sessions_20150109]
GROUP BY 
  fullVisitorId, 
  sessionId
HAVING unique_count > 1
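
The COUNT(DISTINCT ...) plus HAVING pattern can be tried locally without BigQuery. The sketch below uses Python's sqlite3 module with a made-up table; the table name, column names, and rows are all assumptions for illustration. SQLite's COUNT(DISTINCT x) does not accept the extra numeric argument used in the legacy SQL above, so it is left out here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical flattened table: one row per (session, index) pair.
conn.execute("CREATE TABLE flat_hits (session_id TEXT, idx INTEGER)")
conn.executemany(
    "INSERT INTO flat_hits VALUES (?, ?)",
    [
        ("s1", 1), ("s1", 2), ("s1", 1),   # both indexes present
        ("s2", 1), ("s2", 1),              # only index 1, repeated
        ("s3", 2), ("s3", 5),              # index 2 plus an unrelated one
    ],
)

# Count distinct indexes of interest per session and keep sessions with more than one.
query = """
    SELECT session_id, COUNT(DISTINCT idx) AS unique_count
    FROM flat_hits
    WHERE idx IN (1, 2)
    GROUP BY session_id
    HAVING COUNT(DISTINCT idx) > 1
    ORDER BY session_id
"""
result = conn.execute(query).fetchall()
print(result)
conn.close()
```

Dropping the WHERE idx IN (1, 2) line gives the unrestricted variant: every session with at least two distinct indexes of any kind.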