Google Analytics sampling despite few sessions

Time: 2017-06-04 08:24:05

Tags: google-analytics google-analytics-api

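I am using the Google Analytics Reporting API, and I get sampled results even though the number of sessions in the specified date range is far below the 500K limit. I only have ~4K sessions a month.

I have also set "samplingLevel" to "LARGE".

Here is the Python query: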

    response = analytics.reports().batchGet(
        body={
            "reportRequests": [
                {
                    "viewId": myViewID,
                    "dateRanges": [
                        {
                            "startDate": "2017-05-01",
                            "endDate": "2017-05-30"
                        }
                    ],
                    "samplingLevel": "LARGE",
                    "metrics": [
                        {"expression": "ga:sessions"}
                    ],
                    "dimensions": [
                        {"name": "ga:browser"},
                        {"name": "ga:city"}
                    ]
                }
            ]
        }
    ).execute()

As shown below, the sampling space size is 4365 sessions, far below the 500K limit:

    response.get('reports', [])[0].get('data',[]).get('samplesReadCounts',[])
    Out[31]: [u'2051']

    response.get('reports', [])[0].get('data',[]).get('samplingSpaceSizes',[])
    Out[32]: [u'4365']
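To make the two fields above easier to read together, here is a minimal sketch that turns them into a sampling rate per date range. The helper name `sampling_rates` and the hand-built `report` dict are assumptions for illustration; only the `samplesReadCounts` and `samplingSpaceSizes` field names and the two values come from the output above.

```python
def sampling_rates(report):
    """Return the fraction of sessions read, one entry per date range.

    An empty list means the report carried no sampling fields.
    """
    data = report.get("data", {})
    read_counts = data.get("samplesReadCounts", [])
    space_sizes = data.get("samplingSpaceSizes", [])
    # The API returns these counts as strings, so convert before dividing.
    return [float(r) / float(s) for r, s in zip(read_counts, space_sizes)]


# Hand-built stand-in for response["reports"][0], using the values shown above.
report = {"data": {"samplesReadCounts": ["2051"], "samplingSpaceSizes": ["4365"]}}
rates = sampling_rates(report)
print(["{:.1%}".format(rate) for rate in rates])
```

With the values from the question this comes out at roughly 47%, which agrees with the "based on 47% of sessions" line in the R log further down.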

Breaking the request into smaller date ranges does not help either. I tried this with the googleAnalyticsR library in R, using anti_sample = TRUE.

    > web_data <- google_analytics_4(view_id, 
    +                                 date_range = c("2017-05-01", "2017-05-30"),
    +                                 dimensions = c("city","browser"),
    +                                 metrics = c("hits"),
    +                                samplingLevel="LARGE",
    +                                 anti_sample = TRUE)
    2017-06-04 11:54:51> anti_sample set to TRUE. Mitigating sampling via multiple API calls.
    2017-06-04 11:54:51> Finding how much sampling in data request...
    2017-06-04 11:54:52> Downloaded [10] rows from a total of [15].
    2017-06-04 11:54:52> Data is sampled, based on 47% of sessions.
    2017-06-04 11:54:52> Finding number of sessions for anti-sample calculations...
    2017-06-04 11:54:53> Downloaded [30] rows from a total of [30].
    2017-06-04 11:54:53> Calculated [3] batches are needed to download approx. [18] rows unsampled.
    2017-06-04 11:54:53> Anti-sample call covering 14 days: 2017-05-01, 2017-05-14
    2017-06-04 11:54:54> Downloaded [7] rows from a total of [7].
    2017-06-04 11:54:54> Data is sampled, based on 53.2% of sessions.
    2017-06-04 11:54:54> Anti-sampling failed
    2017-06-04 11:54:54> Anti-sample call covering 9 days: 2017-05-15, 2017-05-23
    2017-06-04 11:54:54> Downloaded [4] rows from a total of [4].
    2017-06-04 11:54:54> Data is sampled, based on 55.7% of sessions.
    2017-06-04 11:54:54> Anti-sampling failed
    2017-06-04 11:54:54> Anti-sample call covering 7 days: 2017-05-24, 2017-05-30
    2017-06-04 11:54:55> Downloaded [10] rows from a total of [10].
    2017-06-04 11:54:55> Data is sampled, based on 52.3% of sessions.
    2017-06-04 11:54:55> Anti-sampling failed
    Joining, by = c("city", "browser")
    Joining, by = c("city", "browser")
    2017-06-04 11:54:55> Finished unsampled data request, total rows [13]
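For reference, the date-range splitting that the log describes can be sketched in Python as well. This is only an illustration of the splitting step, assuming fixed-size chunks; `split_date_range` is a hypothetical helper, not part of any Google library, and it does not reproduce how googleAnalyticsR sizes its batches. As the log shows, each sub-range was still sampled in this case, so splitting alone is not a fix here.

```python
from datetime import date, timedelta


def split_date_range(start, end, days_per_chunk):
    """Split the inclusive range [start, end] into consecutive chunks.

    Each chunk covers at most days_per_chunk days and is returned in the
    {"startDate": ..., "endDate": ...} shape used by "dateRanges".
    """
    if days_per_chunk < 1:
        raise ValueError("days_per_chunk must be at least 1")
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        chunk_end = min(chunk_start + timedelta(days=days_per_chunk - 1), end)
        chunks.append({
            "startDate": chunk_start.isoformat(),
            "endDate": chunk_end.isoformat(),
        })
        chunk_start = chunk_end + timedelta(days=1)
    return chunks


# The same May 2017 range as in the question, in 7-day chunks.
date_ranges = split_date_range(date(2017, 5, 1), date(2017, 5, 30), 7)
for date_range in date_ranges:
    print(date_range["startDate"], date_range["endDate"])
```

Each resulting dict could be sent as its own request and the rows summed afterwards, but additive metrics such as sessions are the only ones that combine safely that way.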

When I check the same data in a custom report, I see similar sampling:

Custom report snapshot

Any idea why I am getting sampled results even though the number of sessions is far below the limit?

3 Answers:

Answer 0: (score: 1)

Google has an open ticket about sampling despite a low number of sessions: https://issuetracker.google.com/issues/62525952

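Answer 1: (score: 0)

The 500k limit applies to default reports.

Edit: for ad hoc queries, the threshold is 500k sessions at the property level for the date range you are using.

Default reports, explained: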

  

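Analytics has a set of preconfigured default reports listed in the left pane under Audience, Acquisition, Behavior, and Conversions.

It looks like you are working with an ad hoc report that has a secondary dimension, so the 500k threshold may no longer apply and the effective threshold could be much lower. There is more information on the page you originally linked to here.

Answer 2: (score: 0)

You only have 4k sessions in that view, but the view may be using filters. Check how much traffic the property has by looking at a view without filters. The 500k-session threshold applies at the property level, not at the view level.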
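One way to run that check from the API is to request only the session total, with no dimensions, against an unfiltered view. The sketch below just builds the request body; `build_sessions_total_request` is a hypothetical helper and `UNFILTERED_VIEW_ID` is a placeholder for a view ID that you would have to supply yourself.

```python
def build_sessions_total_request(view_id, start_date, end_date):
    """Build a batchGet body that asks only for total sessions (no dimensions)."""
    return {
        "reportRequests": [
            {
                "viewId": view_id,
                "dateRanges": [{"startDate": start_date, "endDate": end_date}],
                "metrics": [{"expression": "ga:sessions"}],
            }
        ]
    }


# UNFILTERED_VIEW_ID is a placeholder, not a real view ID.
body = build_sessions_total_request("UNFILTERED_VIEW_ID", "2017-05-01", "2017-05-30")

# With an authorized client like the one in the question, the call would be:
# response = analytics.reports().batchGet(body=body).execute()
```

If the total for the unfiltered view is far above the ~4K sessions seen in the filtered view, that would be consistent with the property-level explanation given in this answer.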