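I'm a complete beginner with JSON, so I'm struggling with what should be a simple task: extracting Twitter screen_names from a JSON file that I received via tweepy.
Trying to load the file with json.loads(file) returns the following error: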
Traceback (most recent call last):
  File "<stdin>", line 2, in <module>
  File "\Python3.5\lib\json\__init__.py", line 268, in load
    parse_constant=parse_constant, object_pairs_hook=object_pairs_hook, **kw)
  File "\lib\json\__init__.py", line 319, in loads
    return _default_decoder.decode(s)
  File "D:\Programme\Python3.5\lib\json\decoder.py", line 342, in decode
    raise JSONDecodeError("Extra data", s, end)
json.decoder.JSONDecodeError: Extra data: line 2 column 1 (char 7250)
What exactly does the "Extra data" error mean? Is the file malformed?
A snippet from the JSON file (the whole file is roughly 7000 lines long):
{"in_reply_to_user_id_str": null, "source": "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", "retweeted": false, "in_reply_to_screen_name": null, "geo": null, "contributors": null, "in_reply_to_status_id": null, "retweet_count": 0, "coordinates": null, "id": 802931232334114816, "lang": "de", "id_str": "802931232334114816", "possibly_sensitive": false, "text": "39'788 mal #Danke!\nDie Medienmitteilung zum Resultat der #Regierungsratswahlen finden Sie hier: #WahlAG16 #rrag16", "in_reply_to_status_id_str": null, "favorite_count": 2, "in_reply_to_user_id": null, "favorited": false, "entities": {"urls": [{"display_url": "goo.gl/CkKjHO", "expanded_url": "shortened", "url": "shortened", "indices": [96, 119]}], "hashtags": [{"indices": [11, 17], "text": "Danke"}, {"indices": [57, 78], "text": "Regierungsratswahlen"}, {"indices": [121, 130], "text": "WahlAG16"}, {"indices": [131, 138], "text": "rrag16"}], "symbols": [], "user_mentions": []}, "is_quote_status": false, "created_at": "Sun Nov 27 17:44:58 +0000 2016", "place": null, "truncated": false, "user": {"profile_banner_url": "pbs.twimg.com/profile_banners/85668573/1480374176", "listed_count": 38, "friends_count": 237, "geo_enabled": true, "profile_background_tile": true, "protected": false, "default_profile": false, "profile_link_color": "0084B4", "favourites_count": 359, "has_extended_profile": false, "screen_name": "aargauer_bdp", "translator_type": "none", "default_profile_image": false, "profile_image_url": "http://pbs.twimg.com/profile_images/493457946/bdp_normal.png", "description": "Es zwitschern f\u00fcr Sie @BernhardGuhl und @PhTschopp", "url": "shortened", "follow_request_sent": false, "profile_sidebar_border_color": "D9D9D9", "contributors_enabled": false, "id": 85668573, "lang": "de", "profile_background_image_url": "http://pbs.twimg.com/profile_background_images/48362618/bgbdp.png", "id_str": "85668573", "profile_use_background_image": false, "profile_text_color": "333333", 
"time_zone": "Bern", "is_translation_enabled": false, "location": "Kanton Aargau", "name": "BDP Kanton Aargau", "profile_background_image_url_https": "pbs.twimg.com/profile_background_images/48362618/bgbdp.png", "utc_offset": 3600, "following": false, "verified": false, "profile_image_url_https": "pbs.twimg.com/profile_images/493457946/bdp_normal.png", "entities": {"description": {"urls": []}, "url": {"urls": [{"display_url": "aargauer-bdp.ch", "expanded_url": "http://www.aargauer-bdp.ch", "url": "shortened", "indices": [0, 22]}]}}, "statuses_count": 425, "followers_count": 368, "is_translator": false, "profile_sidebar_fill_color": "EBEBEB", "profile_background_color": "FFE640", "notifications": false, "created_at": "Tue Oct 27 21:35:17 +0000 2009"}}
Edit: My current idea is to use code that follows this structure:
import json

with open("file.json", "r") as f:
    for line in f:
        # parse each line as its own JSON document
        tweet = json.loads(line)
        # MISSING: code to access the user entity's name and id, then print them
Does this look like the right idea to pursue?
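To make the idea concrete, here is a minimal sketch of the line-by-line approach. It assumes the file holds one complete tweet object per line, which is what the error position (line 2, column 1) suggests but which I have not verified for the whole file. The helper name extract_users and the second sample record are made up for illustration; the "user", "screen_name" and "id" keys are taken from the snippet above.

```python
import json


def extract_users(lines):
    """Return (screen_name, id) pairs from an iterable of JSON lines."""
    users = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        tweet = json.loads(line)  # each line is parsed as its own document
        user = tweet["user"]
        users.append((user["screen_name"], user["id"]))
    return users


# Two tiny stand-in records, one per line. The first name/id come from the
# snippet above; the second record is invented for this example.
sample = (
    '{"text": "first", "user": {"screen_name": "aargauer_bdp", "id": 85668573}}\n'
    '{"text": "second", "user": {"screen_name": "example_user", "id": 1}}\n'
)

# Parsing the whole text in one call fails like the traceback above,
# because a second JSON value follows the first one.
try:
    json.loads(sample)
except json.JSONDecodeError as err:
    print(err.msg)

print(extract_users(sample.splitlines()))
```

With the real file, the same helper could presumably be fed the open file object directly, e.g. `with open("file.json", encoding="utf-8") as f: print(extract_users(f))`, since iterating over a file yields its lines.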