UnicodeDecodeError when writing to Excel with Python

Time: 2016-06-28 20:59:10

Tags: python excel pandas

I am trying to add a sheet to an Excel file with the call

df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно'.decode('utf-8'), path='{}.xlsx'.format(x.name)))

which uses this function:

import pandas as pd
from openpyxl import load_workbook

def add_xlsx_sheet(df, sheet_name=u'Смартфоны кратко', index=True, digits=2, path=None):
    # Open the existing workbook and attach it to the pandas writer
    book = load_workbook(path)
    writer = pd.ExcelWriter(path, engine='openpyxl')
    writer.book = book
    writer.sheets = dict((ws.title, ws) for ws in book.worksheets)
    # Remove an existing sheet with the same name before writing the new one
    if sheet_name in list(writer.sheets.keys()):
        sh = book.get_sheet_by_name(sheet_name)
        book.remove_sheet(sh)
    df.to_excel(excel_writer=writer, sheet_name=sheet_name, startrow=0, startcol=0,
                float_format='%.{}f'.format(digits), index=index, encoding='utf-8')
    writer.save()

and I get this error:

Traceback (most recent call last):
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 142, in <module>
    df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно'.decode('utf-8'), path='{}.xlsx'.format(x.name)))
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 651, in apply
    return self._python_apply_general(f)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 655, in _python_apply_general
    self.axis)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 1527, in apply
    res = f(group)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 647, in f
    return func(g, *args, **kwargs)
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 142, in <lambda>
    df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно'.decode('utf-8'), path='{}.xlsx'.format(x.name)))
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 137, in add_xlsx_sheet
    float_format='%.{}f'.format(digits), index=index)
  File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 1425, in to_excel
    startrow=startrow, startcol=startcol)
  File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 1257, in write_cells
    xcell.value = _conv_value(cell.val)
  File "C:\Python27\lib\site-packages\openpyxl\cell\cell.py", line 291, in value
    self._bind_value(value)
  File "C:\Python27\lib\site-packages\openpyxl\cell\cell.py", line 190, in _bind_value
    value = self.check_string(value)
  File "C:\Python27\lib\site-packages\openpyxl\cell\cell.py", line 149, in check_string
    value = unicode(value, self.encoding)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xc4 in position 0: invalid continuation byte

Why does this happen? And when I try to add the sheets without overwriting the file,

df1.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы кратко', path='{}.xlsx'.format(x.name)))
df.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы полно', path='{}.xlsx'.format(x.name)))

it returns this error:

Traceback (most recent call last):
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 141, in <module>
    df1.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы кратко', path='{}.xlsx'.format(x.name)))
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 651, in apply
    return self._python_apply_general(f)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 655, in _python_apply_general
    self.axis)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 1527, in apply
    res = f(group)
  File "C:\Python27\lib\site-packages\pandas\core\groupby.py", line 647, in f
    return func(g, *args, **kwargs)
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 141, in <lambda>
    df1.groupby('member_id').apply(lambda x: add_xlsx_sheet(x, u'Десктопы кратко', path='{}.xlsx'.format(x.name)))
  File "C:/Users/�����/PycharmProjects/14-27/desktop.py", line 138, in add_xlsx_sheet
    writer.save()
  File "C:\Python27\lib\site-packages\pandas\io\excel.py", line 732, in save
    return self.book.save(self.path)
  File "C:\Python27\lib\site-packages\openpyxl\workbook\workbook.py", line 294, in save
    save_workbook(self, filename)
  File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 270, in save_workbook
    writer.save(filename)
  File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 251, in save
    self.write_data()
  File "C:\Python27\lib\site-packages\openpyxl\writer\excel.py", line 94, in write_data
    archive.writestr(ARC_WORKBOOK, write_workbook(self.workbook))
  File "C:\Python27\lib\site-packages\openpyxl\writer\workbook.py", line 85, in write_workbook
    active = get_active_sheet(wb)
  File "C:\Python27\lib\site-packages\openpyxl\writer\workbook.py", line 59, in get_active_sheet
    sheet = wb.active
  File "C:\Python27\lib\site-packages\openpyxl\workbook\workbook.py", line 115, in active
    return self._sheets[self._active_sheet_index]
IndexError: list index out of range

1 answer:

Answer 0 (score: 0)

The error occurs because of this code snippet:

u'Десктопы полно'.decode('utf-8')

The prefix 'u' makes the string a Unicode string. A Unicode string is not encoded at all; it is already in decoded form.

For example,

>>> s='Десктопы полно'
>>> u=u'Десктопы полно'
>>> s
'\xd0\x94\xd0\xb5\xd1\x81\xd0\xba\xd1\x82\xd0\xbe\xd0\xbf\xd1\x8b \xd0\xbf\xd0\xbe\xd0\xbb\xd0\xbd\xd0\xbe'
>>> u
u'\u0414\u0435\u0441\u043a\u0442\u043e\u043f\u044b \u043f\u043e\u043b\u043d\u043e'
>>> s.decode('utf-8')
u'\u0414\u0435\u0441\u043a\u0442\u043e\u043f\u044b \u043f\u043e\u043b\u043d\u043e'
>>> u.encode('utf-8')
'\xd0\x94\xd0\xb5\xd1\x81\xd0\xba\xd1\x82\xd0\xbe\xd0\xbf\xd1\x8b \xd0\xbf\xd0\xbe\xd0\xbb\xd0\xbd\xd0\xbe'

We can see that s == u.encode('utf-8').
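The session above is Python 2. As a rough sketch of the same round trip in Python 3 (an assumption on my part, since the question runs Python 2.7), text is `str`, encoded data is `bytes`, and `str` has no `decode` method at all, so the mistake cannot be written in the first place. The variable names `text` and `data` are only illustrative.

```python
# Python 3 sketch of the encode/decode round trip shown above.
text = 'Десктопы полно'          # str: already-decoded Unicode text
data = text.encode('utf-8')      # bytes: the UTF-8 encoded form

print(type(text).__name__, len(text))   # str 14  (14 characters)
print(type(data).__name__, len(data))   # bytes 27 (13 two-byte letters + 1 space)

# Decoding the bytes gives back the original text.
round_trip = data.decode('utf-8')
print(round_trip == text)               # True

# In Python 3 only bytes can be decoded; str has no .decode method.
print(hasattr(text, 'decode'))          # False
```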

For a more detailed explanation of why, see http://pythoncentral.io/python-unicode-encode-decode-strings-python-2x/

So, basically, a Unicode string has to be encoded rather than decoded, i.e.

u'Десктопы полно'.encode('utf-8')
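
One more observation, offered as a possible reading rather than a confirmed diagnosis: the traceback in the question is raised while openpyxl converts a cell value, and the byte it reports, 0xc4, is what the letter 'Д' becomes in the Windows-1251 code page. The sketch below (Python 3, with a hypothetical cell value `cp1251_bytes`) reproduces the same error message by decoding such bytes as UTF-8. Whether the DataFrame in the question really holds cp1251 byte strings is an assumption.

```python
# Python 3 sketch; in Python 2 these bytes would be a plain `str` object.
text = 'Десктопы полно'

# Hypothetical cell value: the text stored as Windows-1251 bytes (an assumption).
cp1251_bytes = text.encode('cp1251')

# Decoding those bytes as UTF-8 fails on the very first byte.
try:
    cp1251_bytes.decode('utf-8')
    message = None
except UnicodeDecodeError as exc:
    message = str(exc)

print(message)

# Decoding with the codec the bytes were written in recovers the text.
recovered = cp1251_bytes.decode('cp1251')
print(recovered == text)
```

If that reading applies, decoding the byte-string columns with the right codec before calling `to_excel` would be the thing to try. For the sheet name itself, passing u'Десктопы полно' directly, as the question's second snippet already does, avoids calling `.decode` on a string that is already decoded.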