Problem with encoded .265 files: a Python script that splits them into NAL units raises UnicodeDecodeError

Time: 2016-04-27 12:08:15

Tags: python encoding

While working on a project, I've hit a dead end.

Whenever I try to execute the following Python script

https://gist.github.com/anonymous/5393d6ec4d2c7f8431e2a97fd750a76d

with the arguments

-i Bitstreams/BasketballDrive.265

where Bitstreams/BasketballDrive.265 is an encoded video file, I get a UnicodeDecodeError:

Traceback (most recent call last):
  File "C:/Users/Mathieu/Documents/Deel-4--Video-3/extractor.py", line 84, in <module>
    main()
  File "C:/Users/Mathieu/Documents/Deel-4--Video-3/extractor.py", line 79, in main
    extractLayers(args['inputFile'], args['outputFile'], args['temporalLayer'])
  File "C:/Users/Mathieu/Documents/Deel-4--Video-3/extractor.py", line 17, in extractLayers
    gesplit = split_file(voorsplit, "0x00".encode("cp1252"))
  File "C:/Users/Mathieu/Documents/Deel-4--Video-3/extractor.py", line 41, in split_file
    for block in iter(lambda: fp.read(BLOCKSIZE), ''):
  File "C:/Users/Mathieu/Documents/Deel-4--Video-3/extractor.py", line 41, in <lambda>
    for block in iter(lambda: fp.read(BLOCKSIZE), ''):
  File "C:\Users\Mathieu\AppData\Local\Programs\Python\Python35-32\lib\encodings\cp1252.py", line 23, in decode
    return codecs.charmap_decode(input,self.errors,decoding_table)[0]
UnicodeDecodeError: 'charmap' codec can't decode byte 0x8d in position 192: character maps to <undefined>

(No encoding was specified on open(INPUTFILENAME) when this error was generated.)

If I include

sys.getdefaultencoding()

I get

>>> utf-8

Adding encoding="utf-8" to open(INPUTFILENAME) doesn't work either.

Python version: 3.5

Windows version: 8.1

1 answer:

Answer 0: (score: 1)

Open the file in binary mode:

open(INPUTFILENAME, 'rb')

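By default, Python 3 opens files in text mode. This means it tries to decode the contents into a str as it reads, using the platform's locale encoding when none is given (cp1252 in the traceback above). That is usually not what you want for binary files, since arbitrary bytes such as 0x8d need not be valid in that encoding.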
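Two other things in the traceback would probably need to change once the file is opened with 'rb'. In binary mode read() returns b'' at end of file, so the iter() sentinel has to be b'' rather than ''; otherwise the loop never ends. Also, "0x00".encode("cp1252") gives the four ASCII bytes b'0x00', not a zero byte. Below is a minimal sketch of how the read and split could look, assuming the file is an Annex B byte stream where NAL units are separated by the start code 00 00 01. The names read_bytes and split_nal_units and the block size are my own placeholders, not taken from the linked script.

```python
BLOCKSIZE = 4096                 # hypothetical block size, not from the original script
START_CODE = b"\x00\x00\x01"     # assumed Annex B start code prefix


def read_bytes(path, blocksize=BLOCKSIZE):
    """Read a file block by block in binary mode and return its bytes."""
    chunks = []
    with open(path, "rb") as fp:
        # read() returns b"" at EOF in binary mode, so the sentinel is b"".
        for block in iter(lambda: fp.read(blocksize), b""):
            chunks.append(block)
    return b"".join(chunks)


def split_nal_units(data):
    """Split a byte string on the start code and return the non-empty pieces."""
    units = []
    for piece in data.split(START_CODE):
        # A 4-byte start code (00 00 00 01) leaves a zero byte at the end of
        # the previous piece; strip such trailing zero padding.
        piece = piece.rstrip(b"\x00")
        if piece:
            units.append(piece)
    return units
```

With this, something like split_nal_units(read_bytes(INPUTFILENAME)) returns a list of bytes objects, and no decoding step is involved, so a byte such as 0x8d cannot raise UnicodeDecodeError. Reading the whole file into memory is a simplification; for large streams you would have to handle start codes that straddle block boundaries.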