I have a very large JSON file with many fields. I want to extract some of them and write them to a CSV file.
Here is my code:
#!/usr/bin/python3
# -*- coding: utf-8 -*-
import json
import csv
data_file = open("book_data.json", "r")
values = json.load(data_file)
data_file.close()
with open("book_data.csv", "wb") as f:
    wr = csv.writer(f)
    for data in values:
        value = data["identifier"]
        value = data["authors"]
        for key, value in data.iteritems():
            wr.writerow([key, value])
It gives me this error:
  File "json_to_csv.py", line 22, in <module>
    wr.writerow([key, value])
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 8: ordinal not in range(128)
But I declared the utf-8 encoding at the top, so I don't know what is wrong there.
Thanks
Answer 0 (score: 3)
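You need to encode the data yourself. The coding line at the top only declares the encoding of the source file; it does not change how unicode values are converted when they are written:
wr.writerow([key.encode("utf-8"), value.encode("utf-8")])
The difference comes down to this: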
In [8]: print u'\u2019'.encode("utf-8")
’
In [9]: print str(u'\u2019')
---------------------------------------------------------------------------
UnicodeEncodeError Traceback (most recent call last)
<ipython-input-9-4e3ad09ee31b> in <module>()
----> 1 print str(u'\u2019')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u2019' in position 0: ordinal not in range(128)
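For readers on Python 3, here is a minimal sketch of the same distinction. There `str` is already Unicode, so the ASCII failure only shows up when you ask for that codec explicitly. The variable names are illustrative and do not come from the question.

```python
# Python 3 sketch: the character from the traceback, encoded two ways.
text = "\u2019"  # RIGHT SINGLE QUOTATION MARK

# UTF-8 can represent the character (it takes three bytes).
utf8_bytes = text.encode("utf-8")

# ASCII cannot, so an explicit ASCII encode raises UnicodeEncodeError.
try:
    text.encode("ascii")
    ascii_failed = False
except UnicodeEncodeError:
    ascii_failed = True
```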
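If you have a mix of strings, lists, and other values, you can use isinstance to check what you have, and if it is a list, iterate over it and encode each item: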
with open("book_data.csv", "wb") as f:
    wr = csv.writer(f)
    for data in values:
        for key, value in data.iteritems():
            wr.writerow([key, ",".join([v.encode("utf-8") for v in value]) if isinstance(value, list) else value.encode("utf8")])
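On Python 3 the per-value `.encode()` calls are not needed, because the text file object does the encoding. The following is only a sketch under that assumption: `to_cell` and `write_pairs` are hypothetical helper names, the sample record is made up, and an in-memory buffer stands in for the CSV file.

```python
import csv
import io


def to_cell(value):
    """Join list values with commas; turn anything else into text."""
    if isinstance(value, list):
        return ",".join(str(v) for v in value)
    return str(value)


def write_pairs(records, fileobj):
    """Write one (key, value) row per field of every record."""
    wr = csv.writer(fileobj)
    for data in records:
        for key, value in data.items():
            wr.writerow([key, to_cell(value)])


# Made-up sample data containing the non-ASCII character from the question.
sample = [{"identifier": "id\u20191", "tags": ["a", "b"]}]
buf = io.StringIO()
write_pairs(sample, buf)
```

With a real file you would pass something like `open("book_data.csv", "w", newline="", encoding="utf-8")` instead of the buffer.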
To write just the three columns creator, contributor and identifier, pull the data out using the keys:
import csv
with open("book_data.csv", "wb") as f:
    wr = csv.writer(f)
    for dct in values:
        authors = dct["authors"]
        wr.writerow((",".join(authors["creator"]).encode("utf-8"),
                     "".join(authors["contributor"]).encode("utf-8"),
                     dct["identifier"].encode("utf-8")))
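A rough Python 3 equivalent of the three-column version might look like the sketch below. The record in `values` is invented to mirror the keys used above, the header row is an addition, and joining both author lists with commas is an assumption about the intended output.

```python
import csv
import io

# Made-up record with the same keys as the data in the question.
values = [
    {
        "identifier": "book-1",
        "authors": {"creator": ["Ann", "Bob"], "contributor": ["Cy"]},
    }
]

# In-memory stand-in for:
# open("book_data.csv", "w", newline="", encoding="utf-8")
buf = io.StringIO()
wr = csv.writer(buf)
wr.writerow(["creator", "contributor", "identifier"])  # header row (an addition)
for dct in values:
    authors = dct["authors"]
    wr.writerow([
        ",".join(authors["creator"]),
        ",".join(authors["contributor"]),
        dct["identifier"],
    ])
```

No `.encode()` calls appear here because, on Python 3, the encoding is chosen once when the file is opened.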