我以前看过这个问题,但是答案并没有帮助我,所以我将在这里给出答案:
我正在使用Twython从Twitter API传输实时推文。这些推文通过以下方式附加到json文件中:
with open('fetched_tweets.json','a') as tf:
y = json.dump(data,tf)
我的问题:
这是来自twitter API的JSON响应的示例:
{
"text": "RT @PostGradProblem: In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during ...",
"truncated": true,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://twitter.com/\" rel=\"nofollow\">Twitter for iPhone</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54691802283900928",
"entities": {
"user_mentions": [
{
"indices": [
3,
19
],
"screen_name": "PostGradProblem",
"id_str": "271572434",
"name": "PostGradProblems",
"id": 271572434
}
],
"urls": [ ],
"hashtags": [ ]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 23:48:36 +0000 2011",
"retweeted_status": {
"text": "In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during company time. #PGP",
"truncated": false,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://www.hootsuite.com\" rel=\"nofollow\">HootSuite</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54640519019642881",
"entities": {
"user_mentions": [ ],
"urls": [ ],
"hashtags": [
{
"text": "PGP",
"indices": [
130,
134
]
}
]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 20:24:49 +0000 2011",
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 31,
"profile_background_color": "C0DEED",
"followers_count": 3066,
"profile_image_url": "http://a2.twimg.com/profile_images/1285770264/PGP_normal.jpg",
"listed_count": 6,
"profile_background_image_url": "http://a3.twimg.com/a/1301071706/images/themes/theme1/bg.png",
"description": "",
"screen_name": "PostGradProblem",
"default_profile": true,
"verified": false,
"time_zone": null,
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "",
"id_str": "271572434",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 21,
"protected": false,
"favourites_count": 0,
"created_at": "Thu Mar 24 19:45:44 +0000 2011",
"profile_link_color": "0084B4",
"name": "PostGradProblems",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 271572434,
"contributors_enabled": false,
"following": null,
"utc_offset": null
},
"id": 54640519019642880,
"coordinates": null,
"geo": null
},
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 351,
"profile_background_color": "C0DEED",
"followers_count": 48,
"profile_image_url": "http://a1.twimg.com/profile_images/455128973/gCsVUnofNqqyd6tdOGevROvko1_500_normal.jpg",
"listed_count": 0,
"profile_background_image_url": "http://a3.twimg.com/a/1300479984/images/themes/theme1/bg.png",
"description": "watcha doin in my waters?",
"screen_name": "OldGREG85",
"default_profile": true,
"verified": false,
"time_zone": "Hawaii",
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "Texas",
"id_str": "80177619",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 81,
"protected": false,
"favourites_count": 0,
"created_at": "Tue Oct 06 01:13:17 +0000 2009",
"profile_link_color": "0084B4",
"name": "GG",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 80177619,
"contributors_enabled": false,
"following": null,
"utc_offset": -36000
},
"id": 54691802283900930,
"coordinates": null,
"geo": null
} {
"text": "RT @PostGradProblem: In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during ...",
"truncated": true,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://twitter.com/\" rel=\"nofollow\">Twitter for iPhone</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54691802283900928",
"entities": {
"user_mentions": [
{
"indices": [
3,
19
],
"screen_name": "PostGradProblem",
"id_str": "271572434",
"name": "PostGradProblems",
"id": 271572434
}
],
"urls": [ ],
"hashtags": [ ]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 23:48:36 +0000 2011",
"retweeted_status": {
"text": "In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during company time. #PGP",
"truncated": false,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://www.hootsuite.com\" rel=\"nofollow\">HootSuite</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54640519019642881",
"entities": {
"user_mentions": [ ],
"urls": [ ],
"hashtags": [
{
"text": "PGP",
"indices": [
130,
134
]
}
]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 20:24:49 +0000 2011",
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 31,
"profile_background_color": "C0DEED",
"followers_count": 3066,
"profile_image_url": "http://a2.twimg.com/profile_images/1285770264/PGP_normal.jpg",
"listed_count": 6,
"profile_background_image_url": "http://a3.twimg.com/a/1301071706/images/themes/theme1/bg.png",
"description": "",
"screen_name": "PostGradProblem",
"default_profile": true,
"verified": false,
"time_zone": null,
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "",
"id_str": "271572434",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 21,
"protected": false,
"favourites_count": 0,
"created_at": "Thu Mar 24 19:45:44 +0000 2011",
"profile_link_color": "0084B4",
"name": "PostGradProblems",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 271572434,
"contributors_enabled": false,
"following": null,
"utc_offset": null
},
"id": 54640519019642880,
"coordinates": null,
"geo": null
},
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 351,
"profile_background_color": "C0DEED",
"followers_count": 48,
"profile_image_url": "http://a1.twimg.com/profile_images/455128973/gCsVUnofNqqyd6tdOGevROvko1_500_normal.jpg",
"listed_count": 0,
"profile_background_image_url": "http://a3.twimg.com/a/1300479984/images/themes/theme1/bg.png",
"description": "watcha doin in my waters?",
"screen_name": "OldGREG85",
"default_profile": true,
"verified": false,
"time_zone": "Hawaii",
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "Texas",
"id_str": "80177619",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 81,
"protected": false,
"favourites_count": 0,
"created_at": "Tue Oct 06 01:13:17 +0000 2009",
"profile_link_color": "0084B4",
"name": "GG",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 80177619,
"contributors_enabled": false,
"following": null,
"utc_offset": -36000
},
"id": 54691802283900930,
"coordinates": null,
"geo": null
} {{
"text": "RT @PostGradProblem: In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during ...",
"truncated": true,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://twitter.com/\" rel=\"nofollow\">Twitter for iPhone</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54691802283900928",
"entities": {
"user_mentions": [
{
"indices": [
3,
19
],
"screen_name": "PostGradProblem",
"id_str": "271572434",
"name": "PostGradProblems",
"id": 271572434
}
],
"urls": [ ],
"hashtags": [ ]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 23:48:36 +0000 2011",
"retweeted_status": {
"text": "In preparation for the NFL lockout, I will be spending twice as much time analyzing my fantasy baseball team during company time. #PGP",
"truncated": false,
"in_reply_to_user_id": null,
"in_reply_to_status_id": null,
"favorited": false,
"source": "<a href=\"http://www.hootsuite.com\" rel=\"nofollow\">HootSuite</a>",
"in_reply_to_screen_name": null,
"in_reply_to_status_id_str": null,
"id_str": "54640519019642881",
"entities": {
"user_mentions": [ ],
"urls": [ ],
"hashtags": [
{
"text": "PGP",
"indices": [
130,
134
]
}
]
},
"contributors": null,
"retweeted": false,
"in_reply_to_user_id_str": null,
"place": null,
"retweet_count": 4,
"created_at": "Sun Apr 03 20:24:49 +0000 2011",
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 31,
"profile_background_color": "C0DEED",
"followers_count": 3066,
"profile_image_url": "http://a2.twimg.com/profile_images/1285770264/PGP_normal.jpg",
"listed_count": 6,
"profile_background_image_url": "http://a3.twimg.com/a/1301071706/images/themes/theme1/bg.png",
"description": "",
"screen_name": "PostGradProblem",
"default_profile": true,
"verified": false,
"time_zone": null,
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "",
"id_str": "271572434",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 21,
"protected": false,
"favourites_count": 0,
"created_at": "Thu Mar 24 19:45:44 +0000 2011",
"profile_link_color": "0084B4",
"name": "PostGradProblems",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 271572434,
"contributors_enabled": false,
"following": null,
"utc_offset": null
},
"id": 54640519019642880,
"coordinates": null,
"geo": null
},
"user": {
"notifications": null,
"profile_use_background_image": true,
"statuses_count": 351,
"profile_background_color": "C0DEED",
"followers_count": 48,
"profile_image_url": "http://a1.twimg.com/profile_images/455128973/gCsVUnofNqqyd6tdOGevROvko1_500_normal.jpg",
"listed_count": 0,
"profile_background_image_url": "http://a3.twimg.com/a/1300479984/images/themes/theme1/bg.png",
"description": "watcha doin in my waters?",
"screen_name": "OldGREG85",
"default_profile": true,
"verified": false,
"time_zone": "Hawaii",
"profile_text_color": "333333",
"is_translator": false,
"profile_sidebar_fill_color": "DDEEF6",
"location": "Texas",
"id_str": "80177619",
"default_profile_image": false,
"profile_background_tile": false,
"lang": "en",
"friends_count": 81,
"protected": false,
"favourites_count": 0,
"created_at": "Tue Oct 06 01:13:17 +0000 2009",
"profile_link_color": "0084B4",
"name": "GG",
"show_all_inline_media": false,
"follow_request_sent": null,
"geo_enabled": false,
"profile_sidebar_border_color": "C0DEED",
"url": null,
"id": 80177619,
"contributors_enabled": false,
"following": null,
"utc_offset": -36000
},
"id": 54691802283900930,
"coordinates": null,
"geo": null
}{"Another JSON response}{"Another JSON response"}
如您所见,这些推文随后被转储到json文件中,并产生多个JSON根元素错误,这使该文件无效。
我需要什么
我需要在API流推文时分离每个对象,因为之后我会将这个文件放在Bigquery中。
答案 0 :(得分:1)
我将文件附加逻辑更改为:
with open('fetched_tweets.json','a') as tf:
json.dump(data, tf)
tf.write("\n")
然后,当您读取文件时:
with open('fetched_tweets.json', 'r') as tf:
for line in tf:
data = json.loads(line)
# save the `data` dictionary to BigQuery
具有上述逻辑将确保每个json对象位于单独的行中,因此用“ \ n”分隔。