How can I combine or speed up multiple API calls to improve performance?

Date: 2015-08-03 18:43:21

Tags: python google-analytics google-api google-analytics-api

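Update: I found something that might be useful, but I'm still having trouble working out how to implement it. If I try to map get_data like this, I don't know how to assign the result of each call to the corresponding variable.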

parameters = [
    [service, profile_id, '30daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop'],
    [service, profile_id, '60daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop'],
    ...
    [service, profile_id, '90daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile']
]

with ThreadPoolExecutor(max_workers=4) as executor:
    executor.map(get_data, parameters)
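
One possible way to keep each result tied to its query is to give every parameter set a name and collect the results into a dict. The sketch below is only an illustration under stated assumptions: `fake_get_data` is a hypothetical stand-in for the real `get_data` so the example runs without credentials, and the `fetch_all` helper, the labels such as `desktop_browsers_30`, and the profile id `'12345'` are made-up names. Note that `executor.map(get_data, parameters)` passes each inner list as a single argument, so the arguments have to be unpacked. Sharing one `service` object across threads may also be unsafe, because httplib2's `Http` object is not thread-safe, so a real version may need a separate one per thread.

```python
from concurrent.futures import ThreadPoolExecutor


def fake_get_data(service, profile_id, days, dimensions, segment):
    # Hypothetical stand-in for get_data(); the real one calls the API.
    return {'days': days, 'dimensions': dimensions, 'segment': segment}


def fetch_all(fetch, queries, max_workers=4):
    """Run fetch(*args) for every named query and return {name: result}."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # submit() returns a Future right away; key each one by its label.
        futures = {name: executor.submit(fetch, *args)
                   for name, args in queries.items()}
        # result() blocks until that call finishes and re-raises its error.
        return {name: future.result() for name, future in futures.items()}


desktop = 'sessions::condition::ga:deviceCategory==desktop'
queries = {
    'desktop_browsers_30': (None, '12345', '30daysAgo', 'ga:browser', desktop),
    'desktop_browsers_60': (None, '12345', '60daysAgo', 'ga:browser', desktop),
    'desktop_browsers_90': (None, '12345', '90daysAgo', 'ga:browser', desktop),
}

results = fetch_all(fake_get_data, queries)
data1 = [results['desktop_browsers_30'],
         results['desktop_browsers_60'],
         results['desktop_browsers_90']]
```

Because the results are looked up by label, the order in which the calls finish does not matter.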

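I'm writing a Python application (using the Google Analytics API) that lets users get reports of the top 10 desktop browsers, desktop browsers broken down by version, mobile browsers, and mobile operating systems used to access a given website over the past 30, 60, and 90 days. So far, everything seems to be working correctly.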

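However, performance has been all over the place. There are 12 API requests: 3 for each of the 4 sets of data. Sometimes the application takes about 10 seconds to run, and sometimes it takes over a minute. It seems to depend entirely on how quickly the API responds. So my question is: is there a way to combine some of these requests, or to arrange them so that they execute concurrently?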

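I have tried a few ways of consolidating the requests so that I would only need one request per set of data, returning the 30-, 60-, and 90-day information together, but I haven't come across anything that works. As for running the requests concurrently, I'm just not sure how to go about it. The closest thing I could find was this question/answer, but I couldn't quite follow how the batching works.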

Here is the relevant code:

def get_data(service, profile_id, days, dimensions, segment):
    return service.data().ga().get(
        ids='ga:' + profile_id,
        start_date=days,
        end_date='today',
        metrics='ga:sessions',
        dimensions=dimensions,
        sort='-ga:sessions',
        segment=segment,
        max_results=10).execute()


def get_results(service, profile_id):
    global glob_startdate
    global glob_months

    # get top 10 desktop browsers
    print("Getting top 10 desktop browsers...")
    data_1a = get_data(service, profile_id, '30daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
    data_1b = get_data(service, profile_id, '60daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
    data_1c = get_data(service, profile_id, '90daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
    data1 = [data_1a, data_1b, data_1c]

    # get top 10 desktop browser versions
    print("Getting top 10 desktop browser versions...")
    data_2a = get_data(service, profile_id, '30daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==desktop')
    data_2b = get_data(service, profile_id, '60daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==desktop')
    data_2c = get_data(service, profile_id, '90daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==desktop')
    data2 = [data_2a, data_2b, data_2c]

    # get top 10 mobile OS's
    print("Getting top 10 mobile OS's...")
    data_3a = get_data(service, profile_id, '30daysAgo', 'ga:operatingSystem,ga:operatingSystemVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_3b = get_data(service, profile_id, '60daysAgo', 'ga:operatingSystem,ga:operatingSystemVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_3c = get_data(service, profile_id, '90daysAgo', 'ga:operatingSystem,ga:operatingSystemVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data3 = [data_3a, data_3b, data_3c]

    # get top 10 mobile browsers
    print("Getting top 10 mobile browsers...")
    data_4a = get_data(service, profile_id, '30daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_4b = get_data(service, profile_id, '60daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data_4c = get_data(service, profile_id, '90daysAgo', 'ga:browser,ga:browserVersion', 'sessions::condition::ga:deviceCategory==mobile')
    data4 = [data_4a, data_4b, data_4c]

Thanks!

1 Answer:

Answer 0 (score: 2)

Using API batch, you can send up to 10 requests at a time, subject to quota and limits.

from apiclient.http import BatchHttpRequest
import httplib2


def call_back(request_id, response, exception):
    """Do something with the response of each call"""
    pass

def get_request(service, profile_id, days, dimensions, segment):
    """Note I removed the execute() from the end of this method."""
    return service.data().ga().get(
        ids='ga:' + profile_id,
        start_date=days,
        end_date='today',
        metrics='ga:sessions',
        dimensions=dimensions,
        sort='-ga:sessions',
        segment=segment,
        max_results=10)

# Create a batch Http Request object.
# call_back is a module-level function, so it is passed without "self".
batch = BatchHttpRequest(callback=call_back)


# Construct your queries.
# get top 10 desktop browsers
print("Getting top 10 desktop browsers...")
request_1a = get_request(service, profile_id, '30daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
request_1b = get_request(service, profile_id, '60daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')
request_1c = get_request(service, profile_id, '90daysAgo', 'ga:browser', 'sessions::condition::ga:deviceCategory==desktop')

for request in [request_1a, request_1b, request_1c]:
    batch.add(request)

batch.execute(http=httplib2.Http())
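
To tell the responses apart inside the callback, one option is to pass a `request_id` when adding each request and store each response under that id. The following is a minimal sketch rather than a definitive implementation: `make_collector`, `chunks`, and the `query_N` ids are hypothetical names, and `FakeBatch` only mimics the `add(request, request_id=...)` and `execute()` methods of `BatchHttpRequest` so that the example runs offline. It also splits the 12 requests from the question into groups of at most 10, following the limit mentioned above.

```python
def make_collector():
    """Return (callback, results, errors) for use as a batch callback."""
    results, errors = {}, {}

    def callback(request_id, response, exception):
        # Same signature as call_back above: one call per batched request.
        if exception is not None:
            errors[request_id] = exception
        else:
            results[request_id] = response

    return callback, results, errors


def chunks(items, size=10):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


class FakeBatch:
    """Offline stand-in that mimics BatchHttpRequest.add() and execute()."""

    def __init__(self, callback):
        self.callback = callback
        self.pending = []

    def add(self, request, request_id=None):
        self.pending.append((request_id, request))

    def execute(self):
        # Echo each request back through the callback with no exception.
        for request_id, request in self.pending:
            self.callback(request_id, {'echo': request}, None)


callback, results, errors = make_collector()
# Placeholder requests; real code would build them with get_request().
named_requests = [('query_%d' % i, 'request %d' % i) for i in range(12)]

batch_sizes = []
for group in chunks(named_requests, size=10):
    # With the real API: batch = BatchHttpRequest(callback=callback)
    batch = FakeBatch(callback)
    for request_id, request in group:
        batch.add(request, request_id=request_id)
    batch.execute()
    batch_sizes.append(len(group))
```

With the real library, `results` would then map each id to the parsed response for that request, and `errors` would hold any per-request exceptions.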