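How do I iterate over nested JSON data and extract values only if the key exists?
Using the Twitter API, if a tweet contains a hashtag, it is nested under tweet -> entities -> hashtags -> 0 (this is the bit that changes) -> text.
If a tweet has more than one hashtag, the API creates a new entry under hashtags, numbered from 0. So you can end up with tweet -> entities -> hashtags: {0: foo}, {1: bar}, {2: sup}, and so on.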
{
"_id" : ObjectId("593adcb1b27be5eb5daa7e66"),
"created_at" : "Fri Jun 09 14:54:55 +0000 2017",
"id" : NumberLong(873191685915906049),
"id_str" : "873191685915906049",
"text" : "RT @NiamhMannion_: A beautiful day in Kinvara! Really impressed by the delicious produce @kinvaramarket #FoodieHeaven URL",
"truncated" : false,
"entities" : {
"hashtags" : [
{
"text" : "FoodieHeaven",
"indices" : [
104,
117
]
}
],
"symbols" : [],
"user_mentions" : [
{
"screen_name" : "NiamhMannion_",
"name" : "Niamh Mannion",
"id" : NumberLong(2178812961),
"id_str" : "2178812961",
"indices" : [
3,
17
]
},
{
"screen_name" : "kinvaramarket",
"name" : "KinvaraFarmersMarket",
"id" : NumberLong(3064922836),
"id_str" : "3064922836",
"indices" : [
89,
103
]
}
],
"urls" : []
},
"source" : "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>",
"in_reply_to_status_id" : null,
"in_reply_to_status_id_str" : null,
"in_reply_to_user_id" : null,
"in_reply_to_user_id_str" : null,
"in_reply_to_screen_name" : null,
"user" : {
"id" : 37690003,
"id_str" : "37690003",
"name" : "Gabriela Guedez H",
"screen_name" : "GabyGuedezH",
"location" : "Ireland",
"description" : "Award-wining food and drinks journalist at @TheTaste_ie Passionate about wines, whiskey, rum, spirits and craft beer. WSET Certified",
"url" : "URL",
"entities" : {
"url" : {
"urls" : [
{
"url" : "URL",
"expanded_url" : "URL",
"display_url" : "URL",
"indices" : [
0,
22
]
}
]
},
"description" : {
"urls" : []
}
},
"protected" : false,
"followers_count" : 7900,
"friends_count" : 5047,
"listed_count" : 159,
"created_at" : "Mon May 04 16:02:34 +0000 2009",
"favourites_count" : 8905,
"utc_offset" : 3600,
"time_zone" : "Dublin",
"geo_enabled" : true,
"verified" : false,
"statuses_count" : 12727,
"lang" : "en",
"contributors_enabled" : false,
"is_translator" : false,
"is_translation_enabled" : false,
"profile_sidebar_border_color" : "5ED4DC",
"profile_sidebar_fill_color" : "95E8EC",
"profile_text_color" : "3C3940",
"profile_use_background_image" : true,
"has_extended_profile" : false,
"default_profile" : false,
"default_profile_image" : false,
"following" : true,
"follow_request_sent" : false,
"notifications" : false,
"translator_type" : "none"
},
"geo" : null,
"coordinates" : null,
"place" : null,
"contributors" : null,
"is_quote_status" : false,
"retweet_count" : 4,
"favorite_count" : 14,
"favorited" : true,
"retweeted" : true,
"possibly_sensitive" : false,
"lang" : "en",
"has_hashtags" : false,
"is_retweet" : false
}
I want to get all the hashtags that exist, extract their text, and store it in my new dictionary.
for tweet in tweets:
    thistweet = {
        'text': tweet.text,
        'created_at': str(tweet.created_at),
        'retweet_count': tweet.retweet_count,
        'favorite_count': tweet.favorite_count,
        'geo': tweet.geo,
        'coordinates': tweet.coordinates
    }
    for i in tweet[entities][hashtags][i]:
        thistweet = thistweet.update({'hashtag'[i]: tweet.entities.hashtags.i.text})
This code doesn't work. I get: 'dict' object has no attribute 'hashtags'.
I'm not entirely sure how to even start tackling this, to be honest.
Answer 0 (score: 1)
Try this:
for tweet in tweets:
    thistweet = {
        'text': tweet.text,
        'created_at': str(tweet.created_at),
        'retweet_count': tweet.retweet_count,
        'favorite_count': tweet.favorite_count,
        'geo': tweet.geo,
        'coordinates': tweet.coordinates,
        'hashtags': {}
    }
    # The error message shows tweet.entities is a plain dict, so use key
    # access; .get() returns an empty list when 'hashtags' is missing.
    for index, item in enumerate(tweet.entities.get('hashtags', [])):
        # Key each hashtag by its position. dict.update() returns None,
        # so never assign its result back to thistweet.
        thistweet['hashtags'][index] = item['text']
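If the tweets are plain dicts, as in the JSON dump from the question, the same idea can be sketched without any attribute access. This is only a minimal illustration under that assumption: `extract_hashtags` is a hypothetical helper name, and the sample `tweets` list is made up for the example. Note that in the dump `hashtags` is a list, so 0, 1, 2 are list positions rather than keys, and a tweet without hashtags simply has an empty list.

```python
def extract_hashtags(tweet):
    """Return the hashtag texts of a tweet dict, or [] if there are none."""
    # 'entities' or 'hashtags' may be missing or None, so fall back to empties.
    entities = tweet.get("entities") or {}
    hashtags = entities.get("hashtags") or []
    # Skip any malformed entry that has no 'text' key.
    return [tag["text"] for tag in hashtags if "text" in tag]


# Made-up sample data shaped like the dump in the question.
tweets = [
    {
        "text": "A beautiful day in Kinvara! #FoodieHeaven",
        "entities": {"hashtags": [{"text": "FoodieHeaven", "indices": [104, 117]}]},
    },
    {"text": "no hashtags here", "entities": {"hashtags": []}},
    {"text": "no entities key at all"},
]

summaries = [
    {"text": tweet["text"], "hashtags": extract_hashtags(tweet)}
    for tweet in tweets
]
```

Storing the hashtags as a list instead of a dict keyed by 0, 1, 2 keeps the order and avoids inventing key names; whether that suits your new dictionary depends on how you read it back later.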