Google Analytics API pagination is missing records

Time: 2017-10-25 06:59:34

Tags: c# google-analytics google-api google-analytics-api google-api-dotnet-client

We have a system that queries the GA API for a large number of tokens, mostly for website visits and session data.

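We recently noticed strange results when querying the API: records are missing from the result set. More specifically, when a result spans several pages of rows, rows at the beginning of a new page appear to be skipped.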

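The behavior is not consistent. Each run shows the error for a different set of sites/tokens, and when I tried to debug the code manually I never ran into it.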

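At first I thought the problem was in our code, perhaps some kind of race condition or shared memory, but the problem seems to be in the API access itself. I checked the TotalResults property returned with the query, and when the error occurs it reports fewer total rows than I see when I run the query manually.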

For example, we query a site with date and country dimensions, and the logged behavior is:

domain | year | month | day | country | metrics
-----------------------------------------------
X.com 2017 09 22 IT ..... // metrics
// finished result page
X.com 2017 09 24 BW ..... // metrics
....
Total rows - 1295

When we ran the same code again, we got the rows for 2017-09-23 for this site, and a total row count of 1368.
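A discrepancy like 1295 versus 1368 could be caught at run time by comparing TotalResults across the pages of a single paginated query. The following is only a minimal sketch under stated assumptions: `Page`, `ConsistentPager`, `FetchAll` and `FakePage` are hypothetical names that do not appear in the code of this question, and `fetchPage` stands in for the real `request.Execute()` call. The start index is treated as 1-based, which is how the Core Reporting API v3 `start-index` parameter is defined.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for one page of a GA response; only the two
// fields used below are modeled.
public sealed class Page
{
    public int TotalResults { get; set; }
    public List<string[]> Rows { get; set; } = new List<string[]>();
}

public static class ConsistentPager
{
    // fetchPage takes a 1-based start index and returns one page.
    // Returns null when TotalResults changes between pages, or when the
    // number of collected rows does not equal TotalResults, so the caller
    // can retry the whole query instead of keeping a partial result.
    public static List<string[]> FetchAll(Func<int, Page> fetchPage)
    {
        var rows = new List<string[]>();
        int? expectedTotal = null;
        int startIndex = 1;

        while (true)
        {
            Page page = fetchPage(startIndex);

            if (expectedTotal == null)
                expectedTotal = page.TotalResults;
            else if (page.TotalResults != expectedTotal.Value)
                return null; // the result set shifted between pages

            if (page.Rows.Count == 0)
                break;

            rows.AddRange(page.Rows);
            startIndex += page.Rows.Count;

            if (rows.Count >= expectedTotal.Value)
                break;
        }

        return rows.Count == expectedTotal.Value ? rows : null;
    }

    // Hypothetical in-memory data source used only for the demo below.
    public static Page FakePage(int startIndex, int total, int pageSize)
    {
        var page = new Page { TotalResults = total };
        for (int i = startIndex; i <= total && i < startIndex + pageSize; i++)
            page.Rows.Add(new[] { "row " + i });
        return page;
    }

    public static void Main()
    {
        // Stable source: every page reports the same total.
        var stable = FetchAll(s => FakePage(s, 5, 2));
        Console.WriteLine("stable rows: " + (stable == null ? "inconsistent" : stable.Count.ToString()));

        // Drifting source: the total changes after the first page.
        var drifting = FetchAll(s => FakePage(s, s == 1 ? 5 : 4, 2));
        Console.WriteLine("drifting rows: " + (drifting == null ? "inconsistent" : drifting.Count.ToString()));
    }
}
```

Returning null rather than a partial list is a design choice for the sketch: it makes an inconsistent run impossible to mistake for a complete one, and it leaves the retry policy to the caller.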

Is this a bug in the API? Or maybe in the way we access it? I haven't found any reports of a similar problem.

Edit: I have added the code of the method we use to call the API.

private GaDataFlat GetDataV3(string type, string profileID,
                List<Metric> v4metrics, List<MetricFilterClause> v4metricFilters,
                List<Dimension> v4dimensions, List<DimensionFilterClause> v4dimensionFilters,
                List<OrderBy> v4sorting, DateTime start, DateTime end, int maxResults)
    {
        List<string> metrics = (v4metrics == null ? null : v4metrics.Select(x => x.Expression).ToList());
        List<string> dimensions = (v4dimensions == null ? null : v4dimensions.Select(x => x.Name).ToList());
        List<string> sorting = (v4sorting == null ? null : v4sorting.Select(x => x.FieldName).ToList());
        List<string> filters = (v4dimensionFilters == null ? null : v4dimensionFilters.Select(x => deconstructFilter(x)).ToList());

        return ExponentialBackoff.Go(() =>
        {
            var gaData = new GaDataFlat { DataTable = new DataTable() };

            DataResource.GaResource.GetRequest request = service.Data.Ga.Get("ga:" + profileID,
                start.ToString("yyyy-MM-dd"), end.ToString("yyyy-MM-dd"), String.Join(",", metrics));

            // Set the quota user so that concurrent requests are not limited as a single user
            request.QuotaUser = profileID + Thread.CurrentThread.ManagedThreadId;

            if (dimensions != null)
            {
                request.Dimensions = string.Join(",", dimensions);
            }

            if (filters != null)
            {
                request.Filters = string.Join(";", filters);
            }

            if (sorting != null)
            {
                // Multiple sort fields are comma-separated; the "-" prefix sorts descending
                request.Sort = "-" + string.Join(",-", sorting);
            }

            request.SamplingLevel = DataResource.GaResource.GetRequest.SamplingLevelEnum.HIGHERPRECISION;

            bool hasNext;
            int rowCount = 0;
            int iteration = 0;
            do
            {
                iteration++;
                MetricsProvider.Counter("ga.iteration", 1, "type:" + type);
                if (iteration > 100)
                {
                    string error = "Too many iterations ";
                    LogFacade.Fatal(error);
                    throw new Exception(error);
                }

                if (!counter.IncrementAndCheckAvailablility(Constants.APIS.GA))
                {
                    Console.WriteLine("Daily Limit Exceeded - counter");
                    throw new QuotaExceededException();
                }

                GaData DataList = request.Execute();

                gaData.SampleSize = DataList.SampleSize;
                gaData.SampleSpace = DataList.SampleSpace;


                if (DataList.Rows != null)
                {
                    if (gaData.DataTable.Columns.Count == 0)
                    {
                        for (int j = 0; j < DataList.ColumnHeaders.Count; j++)
                        {
                            gaData.DataTable.Columns.Add(new DataColumn
                            {
                                ColumnName = DataList.ColumnHeaders[j].Name
                            });
                        }
                    }

                    foreach (var row in DataList.Rows.ToList())
                    {
                        var reportRow = new List<object>();
                        for (int j = 0; j < DataList.ColumnHeaders.Count; j++)
                        {
                            reportRow.Add(row[j]);
                        }

                        Console.WriteLine(string.Join(":", v4dimensionFilters.SelectMany(f => f.Filters.SelectMany(inner => inner.Expressions))) + "," +
                            string.Join(",", reportRow.Select(cell => cell.ToString())));

                        gaData.DataTable.Rows.Add(reportRow.ToArray());
                    }

                    rowCount += DataList.Rows.Count;
                    // start-index is 1-based, so the next page begins one past the rows already read
                    request.StartIndex = rowCount + 1;
                    Console.WriteLine(string.Join(":", v4dimensionFilters.SelectMany(f => f.Filters.SelectMany(inner => inner.Expressions))) + ", next page starts " + request.StartIndex);
                    hasNext = rowCount < DataList.TotalResults;
                }
                else
                {
                    hasNext = false;
                }
            } while (hasNext && (maxResults == 0 || rowCount < maxResults));

            return gaData;
        }, type, "GetData " + profileID + " " + Thread.CurrentThread.ManagedThreadId);
    }

Edit: The filters we use are consistent. For example, to get desktop visits for the site x.com, the filter would be:

ga:hostname=~x\.com(\/|)$;ga:deviceCategory==desktop
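To see which hostnames the regular expression in this filter accepts, it can be tried locally. This is a rough sketch only: it uses the .NET regex engine as an approximation of whatever GA uses, `HostnameFilterDemo` and `Matches` are hypothetical names, the sample hostnames are made up, and `RegexOptions.IgnoreCase` is an assumption about how GA evaluates `=~` filters.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HostnameFilterDemo
{
    // The regex part of the filter above, without the "ga:hostname=~" prefix.
    // IgnoreCase is an assumption about GA's matching behavior.
    private static readonly Regex HostnamePattern =
        new Regex(@"x\.com(\/|)$", RegexOptions.IgnoreCase);

    // True when the pattern matches anywhere in the hostname (partial match).
    public static bool Matches(string hostname)
    {
        return HostnamePattern.IsMatch(hostname);
    }

    public static void Main()
    {
        // Made-up hostnames for illustration only.
        var samples = new[] { "x.com", "www.x.com", "x.com/", "max.com", "x.com.evil.org" };
        foreach (var host in samples)
            Console.WriteLine(host + " -> " + Matches(host));
    }
}
```

Because the pattern is anchored only at the end, any hostname that merely ends in x.com (such as the made-up max.com) also matches in this local test. If only x.com and its subdomains are wanted, a start anchor such as `(^|\.)` before `x\.com` is one option.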

0 answers:

No answers