Question

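I am trying to decode and then parse a JSON file of about 9 MB. But when I try to decode the file so I can turn it into a Python dictionary, I get this error:

'utf8' codec can't decode bytes in position 3161744-3161747: invalid data

I think this is an encoding/decoding problem, but I am not entirely sure. I don't know what encoding the file uses because I got it from a third party, and unfortunately I can't post the file because it contains sensitive information.

Also, the person who provided the file says it is valid JSON and passes JSON Lint. Here is my code: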

import json

""" JSON Parser """
class parser:
    json_file = None

    """ The JSON File name"""
    def json_object(self, file):
        self.json_file = file

    """ Open up file and parse it """
    def json_encode(self):
        try:
            json_data = open(self.json_file)
            data = json_data.read().decode('utf8')
            result = json.loads(data)
        except Exception as e:
            result = e
        return result

""" Instantiate parser and begin parsing the file"""
p = parser()
p.json_object('file.js')
print p.json_encode()

Answer 1

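I don't think you should decode the data as UTF-8 yourself before parsing it. JSON handling should be transparent with respect to encoding, because the file may contain some strings that are UTF-8 and others that are latin-9, and so on. Try: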

json.load(open(self.json_file))
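
If that still raises the same error, it may help to look at the bytes that fail to decode and try another encoding. The following is only a sketch under assumptions: the helper name load_json_with_fallback is made up for illustration, latin-1 as the fallback is a guess rather than something known about your file, and it is written for Python 3 rather than the Python 2 used in the question.

```python
import json


def load_json_with_fallback(path, encodings=("utf-8", "latin-1")):
    """Parse a JSON file, trying each candidate encoding in turn.

    Hypothetical helper: the list of encodings is an assumption,
    not something known about the real file.
    """
    # Read raw bytes so that no implicit decoding happens.
    with open(path, "rb") as f:
        raw = f.read()

    for encoding in encodings:
        try:
            text = raw.decode(encoding)
        except UnicodeDecodeError as err:
            # Print the bytes around the failure to help guess the encoding.
            context = raw[max(0, err.start - 20):err.end + 20]
            print("%s failed at bytes %d-%d: %r"
                  % (encoding, err.start, err.end, context))
            continue
        return json.loads(text)

    raise ValueError("no candidate encoding could decode %s" % path)
```

Note that latin-1 accepts every byte sequence, so it never raises. If the real encoding is something else, the file will parse but some characters will come out wrong, so check the printed bytes before trusting the result.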
