How to assign key:value pairs to a dict while iterating over a dynamically nested dict

Time: 2017-06-12 12:00:29

Tags: python json dictionary twitter

How can I iterate over nested JSON data and extract a value only if its key exists?

With the Twitter API, if a tweet contains a hashtag, it is nested under tweet -> entities -> hashtags -> 0 (this is the bit that changes) -> text.

If a tweet has more than one hashtag, the API adds a new entry under hashtags, numbered from 0. So you can end up with tweet -> entities -> hashtags: {0: foo}, {1: bar}, {2: sup}, and so on.
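To make that shape concrete, here is a minimal sketch. It assumes that hashtags is a list of dicts, as the sample document below suggests, so the 0, 1, 2 in the path are list positions rather than keys. The payload is a hypothetical, trimmed-down one; only the field names come from the sample.

```python
# Hypothetical, trimmed-down entities payload; only the field names
# ("hashtags", "text", "indices") come from the sample tweet.
entities = {
    "hashtags": [
        {"text": "foo", "indices": [0, 4]},
        {"text": "bar", "indices": [5, 9]},
        {"text": "sup", "indices": [10, 14]},
    ]
}

# The 0, 1, 2 in the path are list positions, not dict keys.
first_tag = entities["hashtags"][0]["text"]

# Iterating the list directly avoids tracking the position by hand.
all_tags = [tag["text"] for tag in entities["hashtags"]]
```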

{ "_id" : ObjectId("593adcb1b27be5eb5daa7e66"), "created_at" : "Fri Jun 09 14:54:55 +0000 2017", "id" : NumberLong(873191685915906049), "id_str" : "873191685915906049", "text" : "RT @NiamhMannion_: A beautiful day in Kinvara! Really impressed by the delicious produce @kinvaramarket #FoodieHeaven URL", "truncated" : false, "entities" : { "hashtags" : [ { "text" : "FoodieHeaven", "indices" : [ 104, 117 ] } ], "symbols" : [], "user_mentions" : [ { "screen_name" : "NiamhMannion_", "name" : "Niamh Mannion", "id" : NumberLong(2178812961), "id_str" : "2178812961", "indices" : [ 3, 17 ] }, { "screen_name" : "kinvaramarket", "name" : "KinvaraFarmersMarket", "id" : NumberLong(3064922836), "id_str" : "3064922836", "indices" : [ 89, 103 ] } ], "urls" : [] }, "source" : "<a href=\"http://twitter.com\" rel=\"nofollow\">Twitter Web Client</a>", "in_reply_to_status_id" : null, "in_reply_to_status_id_str" : null, "in_reply_to_user_id" : null, "in_reply_to_user_id_str" : null, "in_reply_to_screen_name" : null, "user" : { "id" : 37690003, "id_str" : "37690003", "name" : "Gabriela Guedez H", "screen_name" : "GabyGuedezH", "location" : "Ireland", "description" : "Award-wining food and drinks journalist at @TheTaste_ie Passionate about wines, whiskey, rum, spirits and craft beer. 
WSET Certified", "url" : "URL", "entities" : { "url" : { "urls" : [ { "url" : "URL", "expanded_url" : "URL", "display_url" : "URL", "indices" : [ 0, 22 ] } ] }, "description" : { "urls" : [] } }, "protected" : false, "followers_count" : 7900, "friends_count" : 5047, "listed_count" : 159, "created_at" : "Mon May 04 16:02:34 +0000 2009", "favourites_count" : 8905, "utc_offset" : 3600, "time_zone" : "Dublin", "geo_enabled" : true, "verified" : false, "statuses_count" : 12727, "lang" : "en", "contributors_enabled" : false, "is_translator" : false, "is_translation_enabled" : false, "profile_sidebar_border_color" : "5ED4DC", "profile_sidebar_fill_color" : "95E8EC", "profile_text_color" : "3C3940", "profile_use_background_image" : true, "has_extended_profile" : false, "default_profile" : false, "default_profile_image" : false, "following" : true, "follow_request_sent" : false, "notifications" : false, "translator_type" : "none" }, "geo" : null, "coordinates" : null, "place" : null, "contributors" : null, "is_quote_status" : false, "retweet_count" : 4, "favorite_count" : 14, "favorited" : true, "retweeted" : true, "possibly_sensitive" : false, "lang" : "en", "has_hashtags" : false, "is_retweet" : false }

I want to take all the hashtags that exist, extract their text, and store it in my new dictionary.

for tweet in tweets:
    thistweet = {
        'text': tweet.text,
        'created_at': str(tweet.created_at),
        'retweet_count': tweet.retweet_count,
        'favorite_count': tweet.favorite_count,
        'geo': tweet.geo,
        'coordinates': tweet.coordinates
    }
    for i in tweet[entities][hashtags][i]:
        thistweet = thistweet.update({'hashtag'[i]: tweet.entities.hashtags.i.text})

This code doesn't work. I get: 'dict' object has no attribute 'hashtags'.
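That message suggests that tweet.entities is a plain dict, so its contents have to be read with key lookup rather than attribute access. A small reproduction, using a hypothetical entities value shaped like the sample above:

```python
# Hypothetical entities value, shaped like the one in the sample tweet.
entities = {"hashtags": [{"text": "FoodieHeaven", "indices": [104, 117]}]}

try:
    entities.hashtags  # attribute access on a plain dict fails
except AttributeError as exc:
    message = str(exc)

# Key lookup is what a dict supports.
tags = entities["hashtags"]
```

Here message holds the same text as the reported error, while tags is the list of hashtag dicts.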

To be honest, I'm not entirely sure where to start in trying to solve this.

1 Answer:

Answer 0 (score: 1)

Try this:

for tweet in tweets:
    thistweet = {
        'text': tweet.text,
        'created_at': str(tweet.created_at),
        'retweet_count': tweet.retweet_count,
        'favorite_count': tweet.favorite_count,
        'geo': tweet.geo,
        'coordinates': tweet.coordinates,
        'hashtags':{}
    }
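    # tweet.entities is a plain dict (hence the AttributeError), so use key lookup;
    # .get() with a default also covers tweets without a 'hashtags' key.
    for index, item in enumerate(tweet.entities.get('hashtags', [])):
        # dict.update() returns None, so assign into the nested dict
        # instead of rebinding thistweet to the result.
        thistweet['hashtags'][index] = item['text']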
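
For anyone who wants to try the idea without Twitter credentials, here is a self-contained sketch of the same loop. The summarize function and the SimpleNamespace objects are hypothetical stand-ins, assuming a tweet object with attributes on top and a plain dict for entities. Two points it relies on: dict.update() returns None, so its result should not be assigned back, and a dict such as a hashtag entry cannot be used as a dict key, so the position is used instead.

```python
from types import SimpleNamespace


def summarize(tweet):
    """Build a summary dict with hashtag texts keyed by their position."""
    summary = {
        "text": tweet.text,
        "retweet_count": tweet.retweet_count,
        "hashtags": {},
    }
    # .get() with a default covers tweets that have no hashtags entry at all.
    for index, item in enumerate(tweet.entities.get("hashtags", [])):
        summary["hashtags"][index] = item["text"]
    return summary


# Hypothetical stand-ins for tweet objects (not real API responses).
tweet = SimpleNamespace(
    text="Lunch #foo #bar",
    retweet_count=4,
    entities={"hashtags": [{"text": "foo"}, {"text": "bar"}]},
)
no_tags = SimpleNamespace(text="Plain", retweet_count=0, entities={})

with_tags = summarize(tweet)
without_tags = summarize(no_tags)
```

If the positions are not needed, storing a plain list of texts under 'hashtags' would be simpler than a dict keyed by index.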