Python can't decode a JSON file, even though it appears to be valid

Time: 2015-03-12 11:57:01

Tags: python json

I'm trying to load and read a JSON file with the following code:

try:
    json_data = open('sample3.json')
    data = load(json_data)
    json_data.close()
    insert_data(data)
except Exception as e:
    print "Finished with error %s" % (repr(e))

This is the JSON file:

{"competitions":
    [
    {"name":"Premiership","nation":"ENG","id":32711,"matches": 
        [
        {"id":7245940,"when":"28.02.2015 12:45",
            "home_team": {"id":430934, "name":"West Ham"},
            "away_team": {"id":430936, "name":"Crystal Palace"},
            "played":1,
            "play_off":0,
            "round":27
                ,"score":{"t1_score":1,"t2_score":3 },
            "score_ht":{"t1_score":0,"t2_score":1}
        }
        ]
    }
    ]
}

This is the error I get: Finished with error ValueError('No JSON object could be decoded',)

I tried the file in JSONLint and it says it is valid.

What am I doing wrong?

Update: this is the output of print repr(json_data.read()):

'\xef\xbb\xbf{"competitions":\n    [\n    {"name":"Premiership","nation":"ENG","id":32711,"matches": \n        [\n        {"id":7245940,"when":"28.02.2015 12:45",\n            "home_team": {"id":430934, "name":"West Ham"},\n            "away_team": {"id":430936, "name":"Crystal Palace"},\n            "played":1,\n            "play_off":0,\n            "round":27\n                ,"score":{"t1_score":1,"t2_score":3 },\n            "score_ht":{"t1_score":0,"t2_score":1}\n        }\n        ]\n    }\n    ]\n}\n'
Finished with error ValueError('No JSON object could be decoded',)

1 answer:

Answer 0: (score: 5)

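Your JSON file starts with a UTF-8 BOM (Byte Order Mark) character; JSON doesn't support such a character. It is typically added by Microsoft tools (such as Notepad) to make the encoding detectable, but since UTF-8 has no byte-order variants, the character carries no meaning there.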

You have to skip those bytes yourself before parsing, because even passing the utf-8-sig encoding to the json module doesn't help here.
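To see why the three leading bytes matter, here is a minimal sketch. It is written for Python 3 (the thread itself uses Python 2), and the small payload is a made-up stand-in for the real file. It shows that text which still carries the BOM is rejected, while the same bytes parse once the BOM is dropped.

```python
import codecs
import json

# Hypothetical payload: the three BOM bytes followed by ordinary JSON.
raw = codecs.BOM_UTF8 + b'{"played": 1}'

# Decoding as plain UTF-8 keeps the BOM as U+FEFF at the start of the text.
text = raw.decode("utf-8")
try:
    json.loads(text)
    parsed_with_bom = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    parsed_with_bom = False

# Dropping the BOM bytes first leaves ordinary JSON that parses normally.
data = json.loads(raw[len(codecs.BOM_UTF8):].decode("utf-8"))
print(parsed_with_bom, data)
```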

You can use codecs.BOM_UTF8 to detect it:

import codecs
import json

with open('sample3.json') as json_data:
    # read the first three bytes and compare them to the UTF-8 BOM
    bom_maybe = json_data.read(3)
    if bom_maybe != codecs.BOM_UTF8:
        # no BOM at the start, rewind
        json_data.seek(0)
    data = json.load(json_data)
insert_data(data)

Alternatively, use io.open() to load and decode the data before passing it to json.loads():

import io
import json

with io.open('sample3.json', encoding='utf-8-sig') as json_data:
    data = json.loads(json_data.read())

Demo:

>>> import codecs
>>> import io
>>> import json
>>> open('/tmp/test.json', 'wb').write('\xef\xbb\xbf{"competitions":\n    [\n    {"name":"Premiership","nation":"ENG","id":32711,"matches": \n        [\n        {"id":7245940,"when":"28.02.2015 12:45",\n            "home_team": {"id":430934, "name":"West Ham"},\n            "away_team": {"id":430936, "name":"Crystal Palace"},\n            "played":1,\n            "play_off":0,\n            "round":27\n                ,"score":{"t1_score":1,"t2_score":3 },\n            "score_ht":{"t1_score":0,"t2_score":1}\n        }\n        ]\n    }\n    ]\n}\n')
>>> with open('/tmp/test.json') as json_data:
...     bom_maybe = json_data.read(3)
...     if bom_maybe != codecs.BOM_UTF8:
...         json_data.seek(0)
...     data = json.load(json_data)
... 
>>> data
{u'competitions': [{u'id': 32711, u'matches': [{u'score_ht': {u't2_score': 1, u't1_score': 0}, u'home_team': {u'id': 430934, u'name': u'West Ham'}, u'away_team': {u'id': 430936, u'name': u'Crystal Palace'}, u'played': 1, u'when': u'28.02.2015 12:45', u'round': 27, u'score': {u't2_score': 3, u't1_score': 1}, u'play_off': 0, u'id': 7245940}], u'name': u'Premiership', u'nation': u'ENG'}]}
>>> with io.open('/tmp/test.json', encoding='utf-8-sig') as json_data:
...     data = json.loads(json_data.read())
... 
>>> data
{u'competitions': [{u'id': 32711, u'matches': [{u'score_ht': {u't2_score': 1, u't1_score': 0}, u'home_team': {u'id': 430934, u'name': u'West Ham'}, u'away_team': {u'id': 430936, u'name': u'Crystal Palace'}, u'played': 1, u'when': u'28.02.2015 12:45', u'round': 27, u'score': {u't2_score': 3, u't1_score': 1}, u'play_off': 0, u'id': 7245940}], u'name': u'Premiership', u'nation': u'ENG'}]}
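
The demo above is Python 2. On Python 3, the same idea could be sketched roughly as follows. The helper name load_json_tolerating_bom and the temporary file names are made up for illustration. The sketch relies on the built-in open() accepting encoding='utf-8-sig', which strips a leading BOM and otherwise behaves like utf-8.

```python
import json
import os
import tempfile


def load_json_tolerating_bom(path):
    """Load JSON from path, ignoring a leading UTF-8 BOM if one is present."""
    # utf-8-sig drops the BOM while decoding; files without one decode as usual.
    with open(path, encoding="utf-8-sig") as json_file:
        return json.load(json_file)


# Usage: write one file with a BOM and one without, then load both.
with tempfile.TemporaryDirectory() as tmp_dir:
    with_bom = os.path.join(tmp_dir, "with_bom.json")
    without_bom = os.path.join(tmp_dir, "without_bom.json")
    with open(with_bom, "wb") as f:
        f.write(b'\xef\xbb\xbf{"round": 27}')
    with open(without_bom, "wb") as f:
        f.write(b'{"round": 27}')
    result_with_bom = load_json_tolerating_bom(with_bom)
    result_without_bom = load_json_tolerating_bom(without_bom)

print(result_with_bom, result_without_bom)
```

Because the helper does not care whether the BOM is present, it can be used for files from mixed sources without a separate detection step.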