Python BigQuery: really strange timeout

Date: 2014-05-15 14:12:01

Tags: python google-api google-oauth google-bigquery

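I'm building a service that streams data into BigQuery. The following code works perfectly if I remove the part that takes 4-5 minutes to load (I'm precomputing some mappings):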

import httplib2  # needed for httplib2.Http() below

from googleapiclient import discovery
from oauth2client import file
from oauth2client import client
from oauth2client import tools

from oauth2client.client import SignedJwtAssertionCredentials

## load email and key
credentials = SignedJwtAssertionCredentials(email, key, scope='https://www.googleapis.com/auth/bigquery')

if credentials is None or credentials.invalid:
        raw_input('invalid key')
        exit(0)

http = httplib2.Http()
http = credentials.authorize(http)

service = discovery.build('bigquery', 'v2', http=http)


## this does not hang, because it is before the long operation
service.tabledata().insertAll(...)


## some code that takes 5 minutes to execute
r = load_mappings()
## aka long operation

## this hangs
service.tabledata().insertAll(...)

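If I leave in the part that takes 5 minutes to execute, the Google API stops responding to any request I make afterwards. It just hangs there and doesn't even return an error. I have left it for 10-20 minutes to see what happens, and it just sits there. If I hit Ctrl+C, I get this: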

^CTraceback (most recent call last):
  File "./to_bigquery.py", line 116, in <module>
    main(sys.argv)
  File "./to_bigquery.py", line 101, in main
    print service.tabledata().insertAll(projectId=p_n, datasetId="XXX", tableId="%s_XXXX" % str(shop), body=_mybody).execute()
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 716, in execute
    body=self.body, headers=self.headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 490, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()   
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)

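I have managed to work around it temporarily by placing the long loading operation before the credentials are authorized, but this looks like a bug to me. What am I missing?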

Edit: after waiting long enough, I got an error:

  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/googleapiclient/http.py", line 716, in execute
    body=self.body, headers=self.headers)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/util.py", line 132, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/usr/local/lib/python2.7/dist-packages/oauth2client/client.py", line 490, in new_request
    redirections, connection_type)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1593, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1335, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1291, in _conn_request
    response = conn.getresponse()
  File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
    response.begin()
  File "/usr/lib/python2.7/httplib.py", line 407, in begin
    version, status, reason = self._read_status()
  File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
    line = self.fp.readline()
  File "/usr/lib/python2.7/socket.py", line 430, in readline
    data = recv(1)
  File "/usr/lib/python2.7/ssl.py", line 241, in recv
    return self.read(buflen)
  File "/usr/lib/python2.7/ssl.py", line 160, in read
    return self._sslobj.read(len)
socket.error: [Errno 110] Connection timed out

It says the connection timed out. This seems to happen on cold tables.
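The traceback shows the call blocked in a socket read until the OS gave up with Errno 110. As a minimal sketch of how a client-side timeout turns that kind of silent hang into a quick, catchable error, the standard-library example below connects to a peer that never replies. The local silent server and the helper name `read_with_timeout` are illustrative assumptions and have nothing to do with BigQuery itself. With httplib2, the analogous setting would be the `timeout` argument, for example `httplib2.Http(timeout=30)`.

```python
import socket


def read_with_timeout(host, port, timeout_seconds):
    """Connect, then try to read one byte.

    Returns "timeout" if the peer stays silent for timeout_seconds,
    "closed" if the peer closed the connection, "data" otherwise.
    """
    conn = socket.create_connection((host, port), timeout=timeout_seconds)
    try:
        data = conn.recv(1)
        return "data" if data else "closed"
    except socket.timeout:
        return "timeout"
    finally:
        conn.close()


# Stand-in for an unresponsive peer: the listening socket lets the
# connection complete (via the backlog) but never sends anything back.
server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
host, port = server.getsockname()

result = read_with_timeout(host, port, 0.5)
server.close()
print(result)
```

Without the timeout, the `recv` call would block indefinitely, which matches the behavior described in the question.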

1 Answer:

Answer 0: (score: 0)

def refresh_bq(self):
    credentials = SignedJwtAssertionCredentials(email, key, scope='https://www.googleapis.com/auth/bigquery')

    if credentials is None or credentials.invalid:
        raw_input('invalid key')
        exit(0)

    http = httplib2.Http()
    http = credentials.authorize(http)

    service = discovery.build('bigquery', 'v2', http=http)
    self.service = service

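I run self.refresh_bq() every time I do some inserts that don't need preprocessing, and it works perfectly. It's a messy hack, but I needed to get it working as soon as possible. There is definitely a bug somewhere.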
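A variation on this workaround, sketched here under assumptions rather than taken from the answer, is to rebuild the client only when a request fails with a socket-level error instead of before every batch. The names `call_with_rebuild` and `FakeService` are hypothetical; in real use, `build_service` would do what `refresh_bq` does and `do_request` would call `insertAll(...).execute()`.

```python
import socket


def call_with_rebuild(build_service, do_request, max_attempts=2):
    """Run do_request(service); on a socket-level error, rebuild and retry."""
    service = build_service()
    for attempt in range(1, max_attempts + 1):
        try:
            return do_request(service)
        except (socket.error, socket.timeout):
            if attempt == max_attempts:
                raise  # out of attempts: surface the original error
            service = build_service()  # fresh credentials and connection


class FakeService(object):
    """Hypothetical stand-in: the first instance simulates a stale connection."""

    built = 0  # counts how many times a service has been constructed

    def __init__(self):
        FakeService.built += 1
        self.stale = FakeService.built == 1

    def insert(self, rows):
        if self.stale:
            raise socket.error(110, "Connection timed out")
        return {"inserted": len(rows)}


result = call_with_rebuild(FakeService, lambda s: s.insert([{"a": 1}, {"a": 2}]))
print(result)
```

Because a retried streaming insert may be sent twice, it would be worth giving each row an `insertId` so that BigQuery can attempt to deduplicate it.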