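I'm writing a simple service that fetches data from a few sources, combines it, and sends it to a Google Sheet using the Google API client. It's straightforward, and the data isn't large.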
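The problem is that calling .spreadsheets() after building the API service (i.e. build('sheets', 'v4', http=auth).spreadsheets()) causes a memory jump of roughly 30 MB (I did some profiling to isolate where the memory is allocated). When deployed to GAE, these spikes stick around for a long time (sometimes hours), creep upward, and after a few requests trigger GAE's "Exceeded soft private memory limit" error.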
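The question does not say which profiler was used. As a minimal sketch of how an allocation like this could be isolated, assuming Python 3 (the standard-library tracemalloc module is not available on the Python 2.7 GAE runtime), the helper below measures the peak memory allocated while a single call runs. The names measure_peak and build_placeholder are hypothetical; build_placeholder only stands in for the real build(...).spreadsheets() call so the example runs without credentials.

```python
import tracemalloc


def measure_peak(func, *args, **kwargs):
    """Run func and return (result, peak_bytes) traced while it ran."""
    tracemalloc.start()
    try:
        result = func(*args, **kwargs)
        # get_traced_memory() returns (current, peak) in bytes
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return result, peak


def build_placeholder():
    # Hypothetical stand-in for build('sheets', 'v4', http=auth).spreadsheets();
    # it just allocates a noticeable amount of memory.
    return [str(i) for i in range(100000)]


if __name__ == "__main__":
    _, peak = measure_peak(build_placeholder)
    print("peak allocation: %.1f MB" % (peak / (1024.0 * 1024.0)))
```

Wrapping only the suspect call keeps unrelated allocations (request parsing, data fetching) out of the measurement.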
I use memcache for the discovery document and urlfetch to fetch the data, but those are the only other services I'm using.
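I've tried manual garbage collection, changing the threadsafe setting in app.yaml, and even moving the point at which .spreadsheets() is called, and none of it fixed the problem. I may simply be misunderstanding something about GAE's architecture, but I know the spike is caused by the call to .spreadsheets(), and I'm not storing anything in local caches.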
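Is there a way to either 1) reduce the size of the memory spike from calling .spreadsheets(), or 2) keep the spike from staying in memory (or, preferably, both)? Below is a very simplified gist that gives an idea of the API calls and the request handler; I can provide fuller code if needed. I've asked a similar question before, but I couldn't resolve the issue.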
https://gist.github.com/chill17/18f1caa897e6a20201232165aca05239
Answer 0 (score: 1)
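I ran into this when using the Spreadsheets API on a small processor with only 20 MB of usable memory. The problem is that the Google API client pulls in the entire API description as a string and keeps it in memory as a Resource object.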
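If free memory is a concern, you should build your own http object and make the requests you need manually. See my Spreadsheet() class below for an example of creating a new spreadsheet this way.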
import json
import os

import httplib2
from oauth2client import client, tools
from oauth2client.file import Storage

# `flags` was undefined in the original snippet; it is defined here the way
# oauth2client-based sample scripts usually do it.
try:
    import argparse
    flags = argparse.ArgumentParser(parents=[tools.argparser]).parse_args()
except ImportError:
    flags = None

SCOPES = 'https://www.googleapis.com/auth/spreadsheets'
CLIENT_SECRET_FILE = 'client_secret.json'
APPLICATION_NAME = 'Google Sheets API Python Quickstart'


class Spreadsheet:
    def __init__(self, title):
        # Get credentials from locally stored JSON file
        # If file does not exist, create it
        self.credentials = self.getCredentials()
        # HTTP service that will be used to push/pull data
        self.service = httplib2.Http()
        self.service = self.credentials.authorize(self.service)
        self.headers = {'content-type': 'application/json',
                        'accept-encoding': 'gzip, deflate',
                        'accept': 'application/json',
                        'user-agent': 'google-api-python-client/1.6.2 (gzip)'}
        print("CREDENTIALS: " + str(self.credentials))
        self.baseUrl = "https://sheets.googleapis.com/v4/spreadsheets"
        self.spreadsheetInfo = self.create(title)
        self.spreadsheetId = self.spreadsheetInfo['spreadsheetId']

    def getCredentials(self):
        """Gets valid user credentials from storage.

        If nothing has been stored, or if the stored credentials are invalid,
        the OAuth2 flow is completed to obtain the new credentials.

        Returns:
            Credentials, the obtained credential.
        """
        home_dir = os.path.expanduser('~')
        credential_dir = os.path.join(home_dir, '.credentials')
        if not os.path.exists(credential_dir):
            os.makedirs(credential_dir)
        credential_path = os.path.join(credential_dir,
                                       'sheets.googleapis.com-python-quickstart.json')
        store = Storage(credential_path)
        credentials = store.get()
        if not credentials or credentials.invalid:
            flow = client.flow_from_clientsecrets(CLIENT_SECRET_FILE, SCOPES)
            flow.user_agent = APPLICATION_NAME
            if flags:
                credentials = tools.run_flow(flow, store, flags)
            else:  # Needed only for compatibility with Python 2.6
                credentials = tools.run(flow, store)
            print('Storing credentials to ' + credential_path)
        return credentials

    def create(self, title):
        # Only put title in request body... We don't need anything else for now
        requestBody = {
            "properties": {
                "title": title
            },
        }
        print("BODY: " + str(requestBody))
        url = self.baseUrl
        # Serialize with json.dumps: str() on a dict produces a Python repr
        # (single quotes), which is not valid JSON.
        response, content = self.service.request(url,
                                                 method="POST",
                                                 headers=self.headers,
                                                 body=json.dumps(requestBody))
        print("\n\nRESPONSE\n" + str(response))
        print("\n\nCONTENT\n" + str(content))
        return json.loads(content)
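The same manual approach extends to other endpoints. As a rough sketch, not part of the original class, the helper below builds the URL and JSON body for appending rows through the Sheets v4 values append endpoint. The function name build_append_request and its parameters are assumptions made for illustration, and the import targets Python 3 (urllib.parse). It only constructs the request, so it runs without credentials or network access.

```python
import json
from urllib.parse import quote

BASE_URL = "https://sheets.googleapis.com/v4/spreadsheets"


def build_append_request(spreadsheet_id, cell_range, rows,
                         value_input_option="RAW"):
    """Return (url, body) for a values append call (hypothetical helper).

    rows is a list of rows, each row a list of cell values.
    """
    url = "{}/{}/values/{}:append?valueInputOption={}".format(
        BASE_URL,
        quote(spreadsheet_id, safe=""),
        # Keep '!' and ':' readable in A1 notation such as Sheet1!A1:B2
        quote(cell_range, safe="!:"),
        value_input_option,
    )
    body = json.dumps({"values": rows})
    return url, body


if __name__ == "__main__":
    url, body = build_append_request("abc123", "Sheet1!A1", [["a", 1]])
    print(url)
    print(body)
```

The resulting pair could then be sent with the authorized http object from the class above, for example self.service.request(url, method="POST", headers=self.headers, body=body), which avoids constructing the discovery-based Resource object at all.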