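I'm writing a script to pull data from the Google Analytics API v4. The script runs fine, but when I validate the extracted data against what GA shows, I see some discrepancies. They are small, but I don't understand why the numbers differ at all.
One thing worth mentioning: the script uses a dynamic segment whose conditions are exactly the same as the segment I have in my GA view. The segment simply filters out spam traffic by keeping only traffic with a session duration greater than 1 second.
This is the request I'm sending: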
# Assumption: `analytics` is the authorized Analytics Reporting API v4 service object.
no_spam_traffic = analytics.reports().batchGet(
    body={
        "reportRequests": [
            {
                "viewId": view_id,
                "dimensions": [
                    {"name": "ga:date"},
                    {"name": "ga:sourceMedium"},
                    {"name": "ga:campaign"},
                    {"name": "ga:adContent"},
                    {"name": "ga:channelGrouping"},
                    {"name": "ga:segment"},
                ],
                "dateRanges": [
                    {"startDate": "2018-12-16", "endDate": "2018-12-20"}
                ],
                "metrics": [{"expression": "ga:sessions", "alias": "sessions"}],
                "segments": [
                    {
                        "dynamicSegment": {
                            "name": "sessions_no_spam",
                            "userSegment": {
                                "segmentFilters": [
                                    {
                                        "simpleSegment": {
                                            "orFiltersForSegment": {
                                                "segmentFilterClauses": [
                                                    {
                                                        "metricFilter": {
                                                            "metricName": "ga:sessionDuration",
                                                            "operator": "GREATER_THAN",
                                                            "comparisonValue": "1",
                                                        }
                                                    }
                                                ]
                                            }
                                        }
                                    }
                                ]
                            },
                        }
                    }
                ],
            }
        ]
    }
).execute()
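I'm not sure whether the answer to my question is conceptual rather than technical, but just in case, here is also the function that stores the results in the database in bulk: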
import psycopg2

def print_results(no_spam_traffic):
    connection = psycopg2.connect(database='web_insights_data', user='XXXX', password='XXXXX', host='XXX', port='XXXXX')
    cursor = connection.cursor()
    for report in no_spam_traffic.get('reports', []):
        for row in report.get('data', {}).get('rows', []):
            # Dimensions arrive in the order they were requested.
            gadate = row['dimensions'][0]
            gadate = gadate[0:4] + '/' + gadate[4:6] + '/' + gadate[6:8]  # YYYYMMDD -> YYYY/MM/DD
            gasourcemedium = row['dimensions'][1]
            gacampaign = row['dimensions'][2]
            gaadcontent = row['dimensions'][3]
            gachannel = row['dimensions'][4]
            gasessions = row['metrics'][0]['values'][0]
            # Note: channel is stored but is not part of the lookup key below.
            cursor.execute("SELECT * from GA_no_spam_traffic where gadate = %s AND sourcemedium = %s AND campaign = %s AND adcontent = %s", (str(gadate), str(gasourcemedium), str(gacampaign), str(gaadcontent)))
            if len(cursor.fetchall()) > 0:  # update old entries
                cursor.execute("UPDATE GA_no_spam_traffic set sessions = %s where gadate = %s AND sourcemedium = %s AND campaign = %s AND adcontent = %s", (str(gasessions), str(gadate), str(gasourcemedium), str(gacampaign), str(gaadcontent)))
                connection.commit()
            else:  # insert new rows
                cursor.execute("INSERT INTO GA_no_spam_traffic (gadate,sourcemedium,campaign,adcontent,channel,sessions) VALUES (%s,%s,%s,%s,%s,%s)", (gadate, gasourcemedium, gacampaign, gaadcontent, gachannel, gasessions))
                connection.commit()
    connection.close()
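As an aside, the select-then-update-or-insert logic in the function above could probably be collapsed into a single PostgreSQL upsert. The following is only a sketch under assumptions that are not stated in the post: it assumes PostgreSQL 9.5 or later and a unique constraint on (gadate, sourcemedium, campaign, adcontent), and the names UPSERT_SQL, rows_to_params and upsert_rows are made up for illustration. The cursor is any DB-API cursor, such as the psycopg2 one used above.

```python
# Hypothetical upsert statement; it only works if the table has a unique
# constraint on (gadate, sourcemedium, campaign, adcontent) -- an assumption.
UPSERT_SQL = """
INSERT INTO GA_no_spam_traffic
    (gadate, sourcemedium, campaign, adcontent, channel, sessions)
VALUES (%s, %s, %s, %s, %s, %s)
ON CONFLICT (gadate, sourcemedium, campaign, adcontent)
DO UPDATE SET sessions = EXCLUDED.sessions
"""


def rows_to_params(response):
    """Flatten a batchGet response into tuples matching UPSERT_SQL."""
    params = []
    for report in response.get("reports", []):
        for row in report.get("data", {}).get("rows", []):
            # Same dimension order as in the request: date, sourceMedium,
            # campaign, adContent, channelGrouping (ga:segment is ignored).
            date, source_medium, campaign, ad_content, channel = row["dimensions"][:5]
            date = f"{date[0:4]}/{date[4:6]}/{date[6:8]}"  # YYYYMMDD -> YYYY/MM/DD
            sessions = row["metrics"][0]["values"][0]
            params.append((date, source_medium, campaign, ad_content, channel, sessions))
    return params


def upsert_rows(cursor, response):
    """Run the upsert for every row and return how many rows were sent."""
    params = rows_to_params(response)
    if params:
        cursor.executemany(UPSERT_SQL, params)
    return len(params)
```

With this shape, a single connection.commit() after upsert_rows(cursor, no_spam_traffic) would replace the per-row commits.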
Any idea what the problem might be? Thanks!
Answer 0 (score: 0)
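I managed to improve it, although the numbers still don't match exactly; the remaining difference is small enough to be acceptable. The problem was the page size, so I increased the pageSize parameter.
Here is the link to the pagination section of Google's guide: https://developers.google.com/analytics/devguides/reporting/core/v4/migration#pagination Thanks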
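For anyone hitting the same thing: a v4 report request returns 1,000 rows by default unless pageSize is set, and further rows are only returned when the nextPageToken from the response is sent back as pageToken. A rough sketch of that loop is below. The function name fetch_all_rows and the default page_size are my own choices, and analytics is assumed to be the authorized service object used for batchGet in the question.

```python
import copy


def fetch_all_rows(analytics, report_request, page_size=10000):
    """Collect all rows of one report request by following nextPageToken."""
    # Work on a copy so the caller's request dict is left untouched.
    request = copy.deepcopy(report_request)
    request["pageSize"] = page_size
    rows = []
    while True:
        response = analytics.reports().batchGet(
            body={"reportRequests": [request]}
        ).execute()
        # Only one request is sent, so only the first report is relevant.
        reports = response.get("reports") or [{}]
        report = reports[0]
        rows.extend(report.get("data", {}).get("rows", []))
        token = report.get("nextPageToken")
        if not token:
            return rows
        # Ask for the next page on the following iteration.
        request["pageToken"] = token
```

Here report_request would be the single dictionary inside "reportRequests" from the question. Comparing len(rows) with the rowCount field in the report's data is a quick way to confirm nothing was truncated.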