Question

I am using Python 3.3. I have been trying to decode a string that looks like this:

b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xed:\xf9w\xdaH\xd2?\xcf\xbc....

It goes on like that. However, whenever I try to decode this string with str.decode('utf-16'), I get this error:

'utf16' codec can't decode bytes in position 54-55: illegal UTF-16 surrogate

I am not sure how to decode this string.

Answer 1

Gzip data begins with \x1f\x8b\x08, so my guess is that your data is gzip-compressed. Try gunzipping the data before decoding it.

import io
import gzip

# this raises IOError because `buf` is incomplete. It may work if you supply the complete buf
buf = b'\x1f\x8b\x08\x00\x00\x00\x00\x00\x00\x03\xed:\xf9w\xdaH\xd2?\xcf\xbc'
with gzip.GzipFile(fileobj=io.BytesIO(buf)) as f:
    content = f.read()
    print(content.decode('utf-16'))
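
Since the question's bytes are truncated, here is a minimal round-trip sketch on made-up sample data that can actually be run. The helper name `decode_maybe_gzipped` and the sample text are hypothetical, and 'utf-16' is only the encoding the question assumes; the real payload may use a different one.

```python
import gzip

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b'\x1f\x8b'


def decode_maybe_gzipped(data, encoding='utf-16'):
    """Gunzip `data` if it starts with the gzip magic bytes, then decode it.

    Hypothetical helper for illustration; `encoding` is an assumption
    taken from the question, not something the data guarantees.
    """
    if data[:2] == GZIP_MAGIC:
        data = gzip.decompress(data)
    return data.decode(encoding)


# Build a complete gzip payload from sample text (not the asker's data).
original = 'hello, world'
compressed = gzip.compress(original.encode('utf-16'))

# The compressed bytes start with the gzip header, not with UTF-16 text,
# which is why decoding them directly as UTF-16 is the wrong first step.
print(compressed[:3])
print(decode_maybe_gzipped(compressed))
```

Checking the magic bytes first lets the same function handle input that is already uncompressed, since UTF-16 text written with a byte order mark does not start with \x1f\x8b.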
