I'm trying to upload a CSV file to an existing table on BigQuery. According to BigQuery's documentation, this is how you do it:
ROWS_TO_INSERT = [
(u'Phred Phlyntstone', 32),
(u'Wylma Phlyntstone', 29),
]
table.insert_data(ROWS_TO_INSERT)
Here is my code:
from google.cloud import bigquery
# enter credentials
bigquery_client = bigquery.Client()
dataset = bigquery_client.dataset('my_dataset')
table = dataset.table('my_table')
# open csv file and get a list of rows in the form of tuples
with open('my_data.csv') as f:
    content = f.readlines()
ROWS_TO_INSERT = [tuple(x.split(",")) for x in content]
table.reload()
# everything above worked well, but the line below raised a utf8 "can't decode" error
table.insert_data(ROWS_TO_INSERT)
Here is the traceback:
Traceback (most recent call last):
File "<input>", line 6, in <module>
File "/Users/layla.zhang/Library/Python/2.7/lib/python/site-packages/google/cloud/bigquery/table.py", line 770, in insert_data
data=data)
File "/Users/layla.zhang/Library/Python/2.7/lib/python/site-packages/google/cloud/_http.py", line 294, in api_request
data = json.dumps(data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 243, in dumps
return _default_encoder.encode(obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd9 in position 68: invalid continuation byte
I've spent a lot of time looking into similar questions and tried a few encoding/decoding workarounds, but so far nothing has worked. What should I do?
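For context, here is a minimal sketch of what I believe is happening (the byte string below is made up for illustration, not my actual file): my CSV contains a raw byte such as `0xd9` that is not valid UTF-8 on its own, so `json.dumps` fails when the client tries to serialize the rows.

```python
# A byte like 0xd9 is a continuation/lead byte in other encodings,
# but by itself it is invalid UTF-8.
raw = b"Phred Phlyntstone,32,\xd9"

try:
    raw.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as err:
    decode_failed = True
    bad_byte = raw[err.start]  # the offending byte (0xd9)

# Decoding with an explicit single-byte codec such as latin-1 always
# succeeds, because every possible byte maps to a code point -- though
# the resulting characters may not be what the file actually intended.
text = raw.decode("latin-1")
```

So the question is really what encoding my file is actually in, and how to decode it correctly before calling `insert_data`.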