I'm trying to upload a CSV file to an existing table on BigQuery. According to BigQuery's documentation, this is how you do it:
ROWS_TO_INSERT = [
(u'Phred Phlyntstone', 32),
(u'Wylma Phlyntstone', 29),
]
table.insert_data(ROWS_TO_INSERT)
Here is my code:
from google.cloud import bigquery
# enter credentials
bigquery_client = bigquery.Client()
dataset = bigquery_client.dataset('my_dataset')
table = dataset.table('my_table')
# open csv file and get a list of rows in the form of tuples
with open('my_data.csv') as f:
    content = f.readlines()
ROWS_TO_INSERT = [tuple(x.split(",")) for x in content]
table.reload()
# everything above worked well, but the line below raised a utf8 "can't decode" error
table.insert_data(ROWS_TO_INSERT)
Here is the traceback:
Traceback (most recent call last):
File "<input>", line 6, in <module>
File "/Users/layla.zhang/Library/Python/2.7/lib/python/site-packages/google/cloud/bigquery/table.py", line 770, in insert_data
data=data)
File "/Users/layla.zhang/Library/Python/2.7/lib/python/site-packages/google/cloud/_http.py", line 294, in api_request
data = json.dumps(data)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 243, in dumps
return _default_encoder.encode(obj)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 207, in encode
chunks = self.iterencode(o, _one_shot=True)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/encoder.py", line 270, in iterencode
return _iterencode(o, 0)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xd9 in position 68: invalid continuation byte
I've spent a lot of time looking into similar questions and tried a few encoding/decoding workarounds, but so far nothing has worked. What should I do?
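For context, here is a minimal sketch of what I believe is happening (the byte string below is made up for illustration, not my actual file): my CSV contains a raw byte such as `0xd9` that is not valid UTF-8 on its own, so `json.dumps` fails when the client tries to serialize the rows.

```python
# A byte like 0xd9 is a continuation/lead byte in other encodings,
# but by itself it is invalid UTF-8.
raw = b"Phred Phlyntstone,32,\xd9"

try:
    raw.decode("utf-8")
    decode_failed = False
except UnicodeDecodeError as err:
    decode_failed = True
    bad_byte = raw[err.start]  # the offending byte (0xd9)

# Decoding with an explicit single-byte codec such as latin-1 always
# succeeds, because every possible byte maps to a code point -- though
# the resulting characters may not be what the file actually intended.
text = raw.decode("latin-1")
```

So the question is really what encoding my file is actually in, and how to decode it correctly before calling `insert_data`.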