Writing a numpy.ndarray containing Russian characters to a file

Time: 2016-07-08 10:52:39

Tags: python excel numpy pandas utf-8

I am trying to write a numpy.ndarray to a file. I build the data with

unique1 = np.unique(df['search_term'])
unique1 = unique1.tolist()

and then try 1)

edf = pd.DataFrame()
edf['term'] = unique1
writer = pd.ExcelWriter(r'term.xlsx', engine='xlsxwriter')
edf.to_excel(writer)
writer.close()

and 2)

thefile = codecs.open('domain.txt', 'w', encoding='utf-8')
for item in unique1:
    thefile.write("%s\n" % item)
thefile.close()

but both attempts raise UnicodeDecodeError: 'utf8' codec can't decode byte 0xd7 in position 9: invalid continuation byte

1 answer:

Answer 0: (score: 0)

The second example should work if you encode the strings as utf8.

The following works in Python 2 with a utf8-encoded source file:

# -*- coding: utf-8 -*-

import pandas as pd

edf = pd.DataFrame()
edf['term'] = ['foo', 'bar', u'русском']

writer = pd.ExcelWriter(r'term.xlsx', engine='xlsxwriter')
edf.to_excel(writer)

writer.save()

Output:

[image]
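For the second attempt from the question (the plain-text file), here is a minimal sketch of one possible fix. It rests on an assumption the question does not confirm: that the values in search_term are byte strings in a legacy Cyrillic encoding such as cp1251, where byte 0xd7 is the letter 'Ч'. Such bytes are not valid UTF-8, which would explain the "invalid continuation byte" message. The sample data, the helper to_text and the temporary output path are all hypothetical and only serve the illustration.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Hypothetical sample data: one cp1251 byte string (assumed source encoding),
# one ASCII byte string and one value that is already text.
raw_terms = [u'Черный'.encode('cp1251'), b'foo', u'bar']


def to_text(value, encoding='cp1251'):
    """Decode byte strings with the assumed source encoding; pass text through."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value


# Convert everything to text first, so the writer never has to guess an encoding.
terms = [to_text(t) for t in raw_terms]

# Write the text as UTF-8; a temporary directory stands in for 'domain.txt'.
path = os.path.join(tempfile.mkdtemp(), 'domain.txt')
with codecs.open(path, 'w', encoding='utf-8') as thefile:
    for item in terms:
        thefile.write(u"%s\n" % item)
```

If the real data uses a different encoding, the encoding argument of to_text has to be changed accordingly. The same decoded list could also be assigned to the DataFrame column before calling to_excel.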