Google App Engine and Google Sheets exceeding the memory limit

Time: 2016-10-03 20:44:29

Tags: python google-app-engine google-sheets-api

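I'm writing a simple service that pulls data from a few sources, combines it, and sends it to a Google Sheet using the Google API client. Simple enough, and the data isn't that large.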

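The problem is that calling .spreadsheets() after building the API service (i.e. build('sheets', 'v4', http=auth).spreadsheets()) causes a memory jump of roughly 30 megabytes (I did some profiling to isolate where the memory is being allocated). When deployed to GAE, these spikes stick around for long periods (sometimes hours), creep upward, and after several requests trigger GAE's "Exceeded soft private memory limit" error.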
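For readers who want to reproduce this kind of measurement, here is a minimal sketch of isolating the allocation made by a single call. It assumes Python 3's tracemalloc module, which is not what was used in the question and is not available on the Python 2.7 GAE runtime. The names measure_allocation and fake_build_service are hypothetical; the latter is only a stand-in for the real build(...).spreadsheets() call.

```python
import tracemalloc


def measure_allocation(func, *args, **kwargs):
    """Run func and return (result, net_bytes, peak_bytes) for that call."""
    tracemalloc.start()
    try:
        before, _ = tracemalloc.get_traced_memory()
        result = func(*args, **kwargs)
        after, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    # net: memory still held after the call; peak: highest point during it
    return result, after - before, peak - before


def fake_build_service():
    # Hypothetical stand-in for build('sheets', 'v4', http=auth).spreadsheets():
    # it simply holds on to roughly 30 MB.
    return bytearray(30 * 1024 * 1024)


if __name__ == "__main__":
    service, net, peak = measure_allocation(fake_build_service)
    print("net: %.1f MB, peak: %.1f MB" % (net / 1e6, peak / 1e6))
```

Swapping fake_build_service for the real call would show how much of the spike is still referenced once the call returns, as opposed to memory that is only needed temporarily.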

I'm using memcache for the discovery document and urlfetch to fetch the data, but those are the other services I'm using.

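I have tried manual garbage collection, changing threadsafe in app.yaml, and even changing the point at which .spreadsheets() is called, and none of it solves the problem. It's also possible that I'm simply misunderstanding something about the GAE architecture, but I do know the spike is caused by the call to .spreadsheets(), and I'm not storing anything in local caches.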

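Is there a way to either 1) reduce the size of the memory spike caused by calling .spreadsheets() or 2) keep the spike from staying in memory (or, preferably, both)? Below is a very simplified gist that gives an idea of the API calls and the request handler; I can provide fuller code if needed. I have asked a similar question before, but I was not able to resolve the issue.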

https://gist.github.com/chill17/18f1caa897e6a20201232165aca05239

1 answer:

Answer 0: (score: 1)

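I ran into this when using the Spreadsheets API on a small processor with only 20 MB of usable RAM. The problem is that the Google API client pulls in the entire API in string format and stores it in memory as a resource object.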

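If free memory is a concern, you should build your own http object and make the required requests manually. See my Spreadsheet() class for an example of how to create a new spreadsheet using this approach.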

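import json
import os

import httplib2
from oauth2client import client, tools
from oauth2client.file import Storage

try:
    # Command-line flags for the OAuth2 flow, as in the Sheets API quickstart
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

SCOPES = 'https://www.googleapis.com/auth/spreadsheets'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Google Sheets API Python Quickstart'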

class Spreadsheet:

    def __init__(self, title):

        #Get credentials from locally stored JSON file
        #If file does not exist, create it
        self.credentials = self.getCredentials()

        #HTTP service that will be used to push/pull data

        self.service = httplib2.Http()
        self.service = self.credentials.authorize(self.service)
        self.headers = {'content-type': 'application/json', 'accept-encoding': 'gzip, deflate', 'accept': 'application/json', 'user-agent': 'google-api-python-client/1.6.2 (gzip)'}        


        print("CREDENTIALS: "+str(self.credentials))


        self.baseUrl = "https://sheets.googleapis.com/v4/spreadsheets"
        self.spreadsheetInfo = self.create(title)   
        self.spreadsheetId = self.spreadsheetInfo['spreadsheetId']    



    def getCredentials(self):
        """Gets valid user credentials from storage.

        If nothing has been stored, or if the stored credentials are invalid,
        the OAuth2 flow is completed to obtain the new credentials.

        Returns:
            Credentials, the obtained credential.
        """
        home_dir = os.path.expanduser('~')
        credential_dir = os.path.join(home_dir, '.credentials')
        if not os.path.exists(credential_dir):
            os.makedirs(credential_dir)
        credential_path = os.path.join(credential_dir,
                                       'sheets.googleapis.com-python-quickstart.json')

        store = Storage(credential_path)
        credentials = store.get()
        if not credentials or credentials.invalid:
            flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
            flow.user_agent = APPLICATION_NAME
            if flags:
                credentials = tools.run_flow(flow, store, flags)
            else: # Needed only for compatibility with Python 2.6
                credentials = tools.run(flow, store)
            print('Storing credentials to ' + credential_path)
        return credentials

    def create(self, title):

        #Only put title in request body... We don't need anything else for now
        requestBody = {
            "properties":{
                "title":title
            },
        }


        print("BODY: "+str(requestBody))
        url = self.baseUrl

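        # Serialize the body as JSON; str() on a dict yields a Python repr
        # (single quotes), which is not valid JSON
        response, content = self.service.request(url,
                                        method="POST",
                                        headers=self.headers,
                                        body=json.dumps(requestBody))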
        print("\n\nRESPONSE\n"+str(response))
        print("\n\nCONTENT\n"+str(content))

        return json.loads(content)
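
Since the question is about sending data to a sheet, the same manual approach can be extended to writing rows. The following is a minimal sketch, not part of the original answer: it posts to the Sheets v4 spreadsheets.values.append endpoint with valueInputOption=RAW. The function name append_rows and the RecordingHttp stand-in are hypothetical. In real use you would pass an authorized httplib2.Http object such as self.service from the class above.

```python
import json

try:
    from urllib.parse import quote  # Python 3
except ImportError:
    from urllib import quote  # Python 2

BASE_URL = "https://sheets.googleapis.com/v4/spreadsheets"


def append_rows(http, spreadsheet_id, cell_range, rows, headers=None):
    """Append rows (a list of lists) using a plain, already authorized http object."""
    # The A1 range is part of the URL path, so characters like "!" are escaped
    url = "%s/%s/values/%s:append?valueInputOption=RAW" % (
        BASE_URL, spreadsheet_id, quote(cell_range, safe=""))
    body = json.dumps({"values": rows})
    response, content = http.request(
        url,
        method="POST",
        headers=headers or {"content-type": "application/json"},
        body=body)
    return json.loads(content)


class RecordingHttp(object):
    """Hypothetical stand-in for httplib2.Http that records calls, for a dry run."""

    def __init__(self):
        self.calls = []

    def request(self, url, method="GET", headers=None, body=None):
        self.calls.append({"url": url, "method": method, "body": body})
        return {"status": "200"}, json.dumps({"spreadsheetId": "abc123"})


if __name__ == "__main__":
    http = RecordingHttp()
    append_rows(http, "abc123", "Sheet1!A1", [["name", "value"], ["a", 1]])
    print(http.calls[0]["url"])
```

Because only a URL string and a small JSON body are built, nothing comparable to the client library's resource object is kept in memory between requests.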