Python UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81

Time: 2014-07-04 21:39:59

Tags: python utf-8

I am trying to use Arelle to read a zipped XBRL file.

This is done with the command:

C:\a>python arelleCmdLine.py -f C:\Python33\sec\2010\03\0000002809-0001047469-10
-002778-xbrl.zip

I get a UnicodeDecodeError:

C:\a>python arelleCmdLine.py -f C:\Python33\sec\2010\03\0000002809-0001047469-10
-002778-xbrl.zip
[xmlSchema:syntax] Unrecoverable error: 'utf-8' codec can't decode byte 0x81 in
position 11: invalid start byte, 0000002809-0001047469-10-002778-xbrl.zip, impor
ting source element - 0000002809-0001047469-10-002778-xbrl.zip
Traceback (most recent call last):
  File "C:\a\arelle\ModelDocument.py", line 131, in load
    xmlDocument = etree.parse(file,parser=_parser,base_url=filepath)
  File "lxml.etree.pyx", line 3239, in lxml.etree.parse (src\lxml\lxml.etree.c:6
9970)
  File "parser.pxi", line 1770, in lxml.etree._parseDocument (src\lxml\lxml.etre
e.c:102272)
  File "parser.pxi", line 1790, in lxml.etree._parseFilelikeDocument (src\lxml\l
xml.etree.c:102531)
  File "parser.pxi", line 1685, in lxml.etree._parseDocFromFilelike (src\lxml\lx
ml.etree.c:101457)
  File "parser.pxi", line 1134, in lxml.etree._BaseParser._parseDocFromFilelike
(src\lxml\lxml.etree.c:97084)
  File "parser.pxi", line 582, in lxml.etree._ParserContext._handleParseResultDo
c (src\lxml\lxml.etree.c:91290)
  File "parser.pxi", line 679, in lxml.etree._handleParseResult (src\lxml\lxml.e
tree.c:92441)
  File "lxml.etree.pyx", line 327, in lxml.etree._ExceptionContext._raise_if_sto
red (src\lxml\lxml.etree.c:10196)
  File "parser.pxi", line 373, in lxml.etree._FileReaderContext.copyToBuffer (sr
c\lxml\lxml.etree.c:89098)
  File "C:\Python33\lib\codecs.py", line 301, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 in position 11: invalid
 start byte

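It has something to do with the utf-8 encoding and the characters it represents, but I can't figure out what should be done. I found some guides, but they did not help me solve this problem.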

2 answers:

Answer 0: (score: 0)

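The problem is that the program expects to parse not the whole zip archive but a specific file located inside it (in this case, in the instance folder). Given only the path to the archive, it tries to read the raw zip bytes as XML, and those bytes are not valid UTF-8.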

To access a file inside the zip archive:

If our file inside the zip directory is 1.xml
C:\a>python arelleCmdLine.py -f C:\Python33\sec\2010\03\0000002809-0001047469-10
-002778-xbrl.zip\1.xml

Verdict:

The error UnicodeDecodeError: 'utf-8' codec can't decode byte 0x81 was caused by the reason above.
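
The explanation above can be reproduced with a minimal sketch that uses only the Python standard library, not Arelle itself. The member name 1.xml matches the example above; the XML content and the timestamp are hypothetical, chosen so that the archive's local file header happens to contain byte 0x81 at position 11 (the modification-time field). Whether the original archive failed for the same reason is an assumption, but the position in the traceback is consistent with it.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical member and timestamp, for illustration only.
# 16:10:00 is stored as the DOS time 0x8140, i.e. bytes 0x40 0x81.
info = zipfile.ZipInfo("1.xml", date_time=(2010, 3, 1, 16, 10, 0))
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(info, "<?xml version='1.0' encoding='utf-8'?><root/>")
raw = buffer.getvalue()

# Treating the archive itself as UTF-8 text fails in the binary header.
try:
    raw.decode("utf-8")
    error = None
except UnicodeDecodeError as exc:
    error = exc
    print(exc)

# Reading the member inside the archive and parsing that works.
with zipfile.ZipFile(io.BytesIO(raw)) as archive:
    root = ET.fromstring(archive.read("1.xml"))
print(root.tag)
```

The fix is the same in both cases: point the parser at the XML file inside the archive rather than at the archive.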

Answer 1: (score: -1)

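After a quick search in a Unicode byte database, the misbehaving byte appears to be a control character. Since it exists only as a hex number and has no glyph of its own, I think UTF-8 cannot render it as a visible character, hence the error above.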
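
As a hedged check of this claim, the short sketch below inspects byte 0x81 with the standard library. It suggests the decoder rejects the byte not because it is unprintable, but because 0x81 on its own is a UTF-8 continuation byte and cannot start a character. The code point U+0081 is indeed a control character, but UTF-8 encodes it as the two bytes 0xC2 0x81.

```python
import unicodedata

bad = bytes([0x81])

# A lone 0x81 is a continuation byte, so UTF-8 decoding rejects it.
try:
    bad.decode("utf-8")
    reason = None
except UnicodeDecodeError as exc:
    reason = exc.reason

# Interpreted as Latin-1, the same byte maps to code point U+0081.
as_latin1 = bad.decode("latin-1")
category = unicodedata.category(as_latin1)  # Unicode general category

# The valid UTF-8 encoding of U+0081 is two bytes long.
encoded = "\u0081".encode("utf-8")

print(reason, category, encoded)
```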