Unable to load data into BigQuery (BadStatusLine)

Time: 2013-05-01 20:47:37

Tags: google-bigquery

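I am trying to load a local file into an existing table in BigQuery. I have tried three times, on different days. The file has 1.1 million rows. I cannot identify any specific error in the output. Here are the details that bq printed: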

== Platform ==
CPython:2.7.4:Linux-2.6.18-308.11.1.el5.centos.plus-x86_64-with-redhat-5.8-Final
== bq version ==
v2.0.12
== Command line ==
    ['/opt/./python2.7.4/bin/bq', 'load', '395733598146:apache_l1.sjc_web_201304', 'x.2013-04-23']
== UTC timestamp ==
    2013-05-01 18:48:17
== Error trace ==

File "build/bdist.linux-x86_64/egg/bq.py", line 652, in RunSafely
  return_value = self.RunWithArgs(*args, **kwds)
File "build/bdist.linux-x86_64/egg/bq.py", line 880, in RunWithArgs
  job = client.Load(table_reference, source, schema=schema, **opts)
File "build/bdist.linux-x86_64/egg/bigquery_client.py", line 1634, in Load
  upload_file=upload_file, **kwds)
File "build/bdist.linux-x86_64/egg/bigquery_client.py", line 1366, in ExecuteJob
  job_id=job_id)
File "build/bdist.linux-x86_64/egg/bigquery_client.py", line 1352, in RunJobSynchronously
  upload_file=upload_file, job_id=job_id)
File "build/bdist.linux-x86_64/egg/bigquery_client.py", line 1346, in StartJob
  projectId=project_id).execute()
File "build/bdist.linux-x86_64/egg/bigquery_client.py", line 274, in execute
  return super(BigqueryHttp, self).execute(**kwds)
File "build/bdist.linux-x86_64/egg/oauth2client/util.py", line 120, in positional_wrapper
  return wrapped(*args, **kwargs)
File "build/bdist.linux-x86_64/egg/apiclient/http.py", line 656, in execute
  _, body = self.next_chunk(http=http)
File "build/bdist.linux-x86_64/egg/oauth2client/util.py", line 120, in positional_wrapper
  return wrapped(*args, **kwargs)
File "build/bdist.linux-x86_64/egg/apiclient/http.py", line 784, in next_chunk
  headers=headers)
File "build/bdist.linux-x86_64/egg/oauth2client/util.py", line 120, in positional_wrapper
  return wrapped(*args, **kwargs)
File "build/bdist.linux-x86_64/egg/oauth2client/client.py", line 428, in new_request
  redirections, connection_type)
File "/opt/python2.7.4/lib/python2.7/site-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1570, in request
  (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
File "/opt/python2.7.4/lib/python2.7/site-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1317, in _request
  (response, content) = self._conn_request(conn, request_uri, method, body, headers)
File "/opt/python2.7.4/lib/python2.7/site-packages/httplib2-0.8-py2.7.egg/httplib2/__init__.py", line 1286, in _conn_request
  response = conn.getresponse()
File "/opt/python2.7.4/lib/python2.7/httplib.py", line 1045, in getresponse
  response.begin()
File "/opt/python2.7.4/lib/python2.7/httplib.py", line 409, in begin
  version, status, reason = self._read_status()
File "/opt/python2.7.4/lib/python2.7/httplib.py", line 373, in _read_status
  raise BadStatusLine(line)

2 answers:

Answer 0 (score: 0)

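BigQuery does not handle direct uploads of large local files well. Try uploading the file to a Google Cloud Storage bucket (gs://) first, then import it into BigQuery from there.
To do the upload, you can use gsutil from the command line (see the gsutil installation directions), or use the Google Developers Console in your web browser.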
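As a rough sketch of that two-step approach, assuming gsutil and bq are both installed and authenticated: the bucket name `my-bucket` is hypothetical (use a bucket you own), the `run` helper and the `DRY_RUN` switch are only there for illustration, and the file and table names are taken from the command line in the question. By default the script only prints the commands; set `DRY_RUN=0` to actually run them.

```shell
#!/bin/sh
# Hypothetical bucket name: replace with a Cloud Storage bucket you own.
BUCKET="my-bucket"
# Local file and destination table, as given in the question.
FILE="x.2013-04-23"
TABLE="395733598146:apache_l1.sjc_web_201304"

# Illustrative helper: with DRY_RUN=1 (the default) it prints the command,
# with DRY_RUN=0 it executes the command.
DRY_RUN="${DRY_RUN:-1}"
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Step 1: copy the local file to Cloud Storage.
run gsutil cp "$FILE" "gs://$BUCKET/$FILE"

# Step 2: load from the gs:// URI instead of uploading the local file via bq.
run bq load "$TABLE" "gs://$BUCKET/$FILE"
```

This may not remove the underlying cause of the BadStatusLine error, but it takes the large upload out of the bq request, which is what this answer suggests.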

Answer 1 (score: 0)

You can load a local file into an existing BigQuery table.

All rows:

bq load --source_format=CSV mydataset.mytable myfile.csv col1:INTEGER,col2:STRING

Skip the first row:

bq load --skip_leading_rows=1 --source_format=CSV mydataset.mytable myfile.csv col1:INTEGER,col2:STRING