Loading the NYS all-voters file into BigQuery crashes

Time: 2014-06-11 15:27:34

Tags: google-bigquery

Thanks, BigQuery.

When we try to load the NYS voter file into BigQuery, the bq CLI crashes with the output quoted below.

The file is:

-rw-rw-r-- 1 fedex1 fedex1 5.3G Jun  3 06:43 AllNYSVoters.txt

Is 5.3 GB too large to load from the command line?

It has only 15,486,253 rows (about 15 million).

It seems Stack Overflow doesn't like code-only questions, so I'll provide more detail.

> You have encountered a bug in the BigQuery CLI. Google engineers
> monitor and answer questions on Stack Overflow, with the tag
> google-bigquery:
> http://stackoverflow.com/questions/ask?tags=google-bigquery Please
> include a brief description of the steps that led to this issue, as
> well as the following information:
> 
> ========================================
> == Platform ==
>   CPython:2.7.3:Linux-3.8.11-x86_64-with-Ubuntu-12.04-precise
> == bq version ==
>   2.0.21
> == Command line ==
>   ['/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bq.py',
> '--credential_file',
> '/home/x1/.config/gcloud/legacy_credentials/x7/singlestore.json',
> '--project', 'personal-real-estate', 'load', 'nys.all_voter_file',
> 'AllNYSVoters.txt',
> 'LASTNAME:string,FIRSTNAME:string,MIDDLENAME:string,NAMESUFFIX:string,RADDNUMBER:string,RHALFCODE:string,RAPARTMENT:string,RPREDIRECTION:string,RSTREETNAME:string,RPOSTDIRECTION:string,RCITY:string,RZIP5:string,RZIP4:string,MAILADD1:string,MAILADD2:string,MAILADD3:string,MAILADD4:string,DOB:string,GENDER:string,ENROLLMENT:string,OTHERPARTY:string,COUNTYCODE:integer,ED:integer,LD:integer,TOWNCITY:string,WARD:string,CD:integer,SD:integer,AD:integer,LASTVOTEDDATE:timestamp,PREVYEARVOTED:string,PREVCOUNTY:string,PREVADDRESS:string,PREVNAME:string,COUNTYVRNUMBER:string,REGDATE:timestamp,VRSOURCE:string,IDREQUIRED:string,IDMET:string,STATUS:string,REASONCODE:string,INACT_DATE:timestamp,PURGE_DATE:timestamp,SBOEID:string,VoterHistory:string']
> == UTC timestamp ==
>   2014-06-11 11:42:34
> == Error trace ==
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bq.py", line 806, in RunSafely
>     return_value = self.RunWithArgs(*args, **kwds)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bq.py", line 1047, in RunWithArgs
>     job = client.Load(table_reference, source, schema=schema, **opts)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bigquery_client.py", line 2045, in Load
>     upload_file=upload_file, **kwds)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bigquery_client.py", line 1642, in ExecuteJob
>     job_id=job_id)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bigquery_client.py", line 1627, in RunJobSynchronously
>     upload_file=upload_file, job_id=job_id)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bigquery_client.py", line 1535, in StartJob
>     projectId=project_id).execute()
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/platform/bq/bigquery_client.py", line 308, in execute
>     return super(BigqueryHttp, self).execute(**kwds)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/oauth2client/util.py", line 132, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/apiclient/http.py", line 688, in execute
>     _, body = self.next_chunk(http=http, num_retries=num_retries)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/oauth2client/util.py", line 132, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/apiclient/http.py", line 867, in next_chunk
>     headers=headers)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/oauth2client/util.py", line 132, in positional_wrapper
>     return wrapped(*args, **kwargs)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/oauth2client/client.py", line 490, in new_request
>     redirections, connection_type)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/httplib2/__init__.py", line 1586, in request
>     (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/httplib2/__init__.py", line 1333, in _request
>     (response, content) = self._conn_request(conn, request_uri, method, body, headers)
>   File "/var/host/media/removable/USB_Drive/google-cloud-sdk/bin/bootstrapping/../../lib/httplib2/__init__.py", line 1289, in _conn_request
>     response = conn.getresponse()
>   File "/usr/lib/python2.7/httplib.py", line 1030, in getresponse
>     response.begin()
>   File "/usr/lib/python2.7/httplib.py", line 407, in begin
>     version, status, reason = self._read_status()
>   File "/usr/lib/python2.7/httplib.py", line 365, in _read_status
>     line = self.fp.readline()
>   File "/usr/lib/python2.7/socket.py", line 430, in readline
>     data = recv(1)
>   File "/usr/lib/python2.7/ssl.py", line 241, in recv
>     return self.read(buflen)
>   File "/usr/lib/python2.7/ssl.py", line 160, in read
>     return self._sslobj.read(len)
> ========================================
> 
> Unexpected exception in load operation: [Errno 104] Connection reset by peer

1 answer:

Answer 0: (score: 0)

The last line of the log seems to show the problem:

[Errno 104] Connection reset by peer

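Errno 104 means the remote end closed the connection, and the trace shows this happened while the client was waiting to read a response over SSL. Did your network connection drop during the upload?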
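If a dropped connection is indeed the cause, one possible workaround (not something the error output itself prescribes) is to upload the data in smaller pieces, so that a reset only costs one piece rather than the whole 5.3 GB transfer. Here is a minimal sketch that splits a text file on line boundaries. The function name `split_by_lines`, the `.partNNNN` naming, and the 512 MB default are all made up for illustration, and it assumes that no record spans more than one line (no quoted newlines inside fields).

```python
import os


def split_by_lines(src_path, out_dir, max_bytes=512 * 1024 * 1024):
    """Split src_path into pieces of at most max_bytes each, never breaking a line.

    Hypothetical helper: assumes every record sits on exactly one line.
    A single line longer than max_bytes still gets a piece of its own.
    Returns the list of piece paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(src_path)
    parts = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Open a new piece at the start, or when this line would
            # push the current piece past the size limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                part_path = os.path.join(out_dir, "%s.part%04d" % (base, len(parts)))
                out = open(part_path, "wb")
                parts.append(part_path)
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return parts


if __name__ == "__main__":
    # File name taken from the question; the output directory is arbitrary.
    for path in split_by_lines("AllNYSVoters.txt", "voter_parts"):
        print(path)
```

Each piece could then be passed to the same `bq load` command and schema from the question in place of `AllNYSVoters.txt`. Before relying on this, check how the load treats a table that already exists, so that later pieces add to the earlier ones instead of replacing them.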