I have a large 350 MB JSON file and I want to extract items from it. The code I am using is:
import json

with open("commitsJson3.json", "r", encoding="utf-8-sig") as json_file:
    data = json.load(json_file)

for elem in data['items']:
    for e in elem['commit']:
        if 'message' in e:
            print(elem['commit'][e])
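The error I get is:
json.decoder.JSONDecodeError: Expecting value: line 1 column 2180 (char 2179)
I went to that specific line and column but did not find anything wrong. I tried to validate my JSON with some online validators, but they crash because the file is too large. I can show you a sample, but only part of it, since the whole thing is too big. I hope you understand.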
{"total_count": 3, "incomplete_results": "False", "items": c "site_admin": False}, "committer": {"login": "acosding", "id": 1539, "node_id": "ASJKDHASAD", "avatar_url": "https://gits-5.s.fe.se/avatars/u/1329?", "gravatar_id": "", "url": "https://gits-5.s.fe.se/api/v3/users/acollden", "html_url": "https://gits-5.s.fe.se/acollden", "followers_url": "https://https://gits-5.s.fe.se/api/v3/users/acollden/followers", "following_url": "https://gits-5.s.fe.se/api/v3/users/acollden/following{/other_user}", "gists_url": "https://gits-5.s.fe.se/api/v3/users/acollden/gists{/gist_id}", "starred_url": "https://https://gits-5.s.fe.se/api/v3/users/acollden/starred{/owner}{/repo}", "subscriptions_url": "https://https://gits-5.s.fe.se/api/v3/users/acollden/subscriptions", "organizations_url": "https://gits-5.s.fe.se/api/v3/users/acollden/orgs", "repos_url": "https://https://gits-5.s.fe.se/api/v3/users/acollden/repos", "events_url": "https://https://gits-5.s.fe.se/api/v3/users/acollden/events{/privacy}", "received_events_url": "https://https://gits-5.s.fe.se/api/v3/users/acollden/received_events", "type": "User"
Any help would be greatly appreciated. Also, is there a way to validate a JSON file when it is this large?
Thanks.
Answer 0 (score: 2)
As far as I can tell, the sample you provided is not well-formed JSON. If I take only the first part of it:
my_json = '{"total_count": 3, "incomplete_results": "False", "items": c "site_admin": False}'
and then try to parse it, I get:
import json
json.loads(my_json)
>>> JSONDecodeError: Expecting value: line 1 column 60 (char 59)
which refers to the c, which is missing its quotes. If I then fix that:
my_json = '{"total_count": 3, "incomplete_results": "False", "items": "c" "site_admin": False}'
print(json.loads(my_json))
>>> JSONDecodeError: Expecting ',' delimiter: line 1 column 64 (char 63)
which refers to the comma (,) missing after the value of the items key. After fixing that:
my_json = '{"total_count": 3, "incomplete_results": "False", "items": "c", "site_admin": False}'
print(json.loads(my_json))
>>> JSONDecodeError: Expecting value: line 1 column 79 (char 78)
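which refers to the last False. JSON has no capitalized False literal, so this can be fixed by using false (a boolean) or "False" (a string), depending on which type you want to end up with.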
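The difference between the two fixes can be seen in a minimal sketch like the one below. The key name site_admin comes from the sample; the variable names as_bool and as_text are only illustrative.

```python
import json

# The JSON literal false becomes a Python bool.
as_bool = json.loads('{"site_admin": false}')
# The JSON string "False" stays a Python str.
as_text = json.loads('{"site_admin": "False"}')

print(type(as_bool["site_admin"]))
print(type(as_text["site_admin"]))

# A non-empty string is truthy, so the string "False" passes an `if` check.
print(bool(as_text["site_admin"]))

# In the other direction, json.dumps writes Python's False as the literal false.
print(json.dumps({"site_admin": False}))
```

If the value is later used in a condition, the boolean form is usually the safer choice, because the string form is truthy regardless of what it spells.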
But given that your first False is already stored as a string, I did the same here:
my_json = '{"total_count": 3, "incomplete_results": "False", "items": "c", "site_admin": "False"}'
print(json.loads(my_json))
>>> {'total_count': 3, 'incomplete_results': 'False', 'items': 'c', 'site_admin': 'False'}
And with that it finally parses.
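For the full file, where scrolling to column 2180 by hand is impractical, one possible approach is to let Python print the text around the failing position. The sketch below uses only the standard json module; the helper name show_json_error and the context parameter are made up for illustration, and it assumes the text fits in memory (which json.load requires anyway).

```python
import json

def show_json_error(text, context=40):
    """Return None if text is valid JSON, otherwise a short report of the first error."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # err.pos is the 0-based character index where parsing failed.
        start = max(err.pos - context, 0)
        snippet = text[start:err.pos + context]
        pointer = " " * (err.pos - start) + "^"
        return f"{err.msg} at line {err.lineno}, column {err.colno}\n{snippet}\n{pointer}"
    return None

sample = '{"total_count": 3, "incomplete_results": "False", "items": c "site_admin": False}'
print(show_json_error(sample))
```

To run it against the real file, read the contents first, for example with open("commitsJson3.json", encoding="utf-8-sig") and f.read(), and pass the resulting string to the helper. The caret only lines up when the snippet contains no line breaks, which holds for a single-line file like this one. The parser stops at the first error, so the check has to be repeated after each fix.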