How do I fix a TypeError in Django related to parsing JSON?

Time: 2012-10-15 00:08:13

Tags: python django json

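When I run the code below, a TypeError is shown in the browser. The error occurs on the last line and says 'NoneType' object is not subscriptable (I am trying to get the URLs of all items). This is strange, though, because on the command line all of the URLs in the feed are printed. Any ideas why the items print on the command line but produce an error in the browser? How can I fix this?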

#reddit parse
try:
    f = urllib.urlopen("http://www.reddit.com/r/videos/top/.json");
except Exception:
    print("ERROR: malformed JSON response from reddit.com")
reddit_posts = json.loads(f.read().decode("utf-8"))["data"]["children"]
reddit_feed=[]
for post in reddit_posts:
    if "oembed" in post['data']['media']:
        print post["data"]["media"]["oembed"]["url"]
        reddit_feed.append(post["data"]["media"]["oembed"]["url"])  
print reddit_feed

Edit:

if post["data"]["media"]["oembed"]["url"]:
    print post["data"]["media"]["oembed"]["url"]

1 Answer:

Answer 0 (score: 2)

Some posts in the returned JSON have media=null, so for those posts post['data']['media'] has no oembed field (and therefore no url field):

     {
        "kind" : "t3",
        "data" : {
           "downs" : 24050,
           "link_flair_text" : null,
           "media" : null,
           "url" : "http://youtu.be/aNJgX3qH148?t=4m20s",
           "link_flair_css_class" : null,
           "id" : "rymif",
           "edited" : false,
           "num_reports" : null,
           "created_utc" : 1333847562,
           "banned_by" : null,
           "name" : "t3_rymif",
           "subreddit" : "videos",
           "title" : "An awesome young man",
           "author_flair_text" : null,
           "is_self" : false,
           "author" : "Lostinfrustration",
           "media_embed" : {},
           "permalink" : "/r/videos/comments/rymif/an_awesome_young_man/",
           "author_flair_css_class" : null,
           "selftext" : "",
           "domain" : "youtu.be",
           "num_comments" : 2260,
           "likes" : null,
           "clicked" : false,
           "thumbnail" : "http://a.thumbs.redditmedia.com/xUDtCtRFDRAP5gQr.jpg",
           "saved" : false,
           "ups" : 32312,
           "subreddit_id" : "t5_2qh1e",
           "approved_by" : null,
           "score" : 8262,
           "selftext_html" : null,
           "created" : 1333847562,
           "hidden" : false,
           "over_18" : false
        }
     },

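Your exception message also does not fit: when urlopen fails it can raise several kinds of exception, such as IOError. It does not check whether the response is valid JSON, as your error message implies.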
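To make that distinction concrete, here is a minimal sketch (Python 3 syntax, whereas the question uses Python 2) that reports a failed request and an unparseable body separately. The names `load_listing` and `fetch` are hypothetical and not part of the original code; `fetch` stands in for whatever performs the HTTP request, so the sketch can be tried without network access.

```python
import json


def load_listing(fetch):
    """Return the decoded JSON document, or None if fetching or parsing fails.

    `fetch` is a hypothetical zero-argument callable that returns the raw
    response body as bytes (for example a thin wrapper around urlopen).
    """
    try:
        raw = fetch()
    except IOError as exc:
        # Network-level failure: the request itself did not succeed.
        print("ERROR: request to reddit.com failed: %s" % exc)
        return None
    try:
        return json.loads(raw.decode("utf-8"))
    except ValueError as exc:
        # Raised when the body cannot be decoded or is not valid JSON.
        print("ERROR: malformed JSON response from reddit.com: %s" % exc)
        return None
```

Returning None on failure keeps the caller from touching an undefined response object, which is what would happen in the original code if urlopen raised.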

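Now, to mitigate the problem, you need to check whether "oembed" in post['data']['media'], and only then access post['data']['media']['oembed']['url']. Note that I am assuming every oembed blob has a url (mainly because you need a URL to embed media on reddit).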

Update: that is, something like this should solve your problem:

for post in reddit_posts:
    if isinstance(post['data']['media'], dict) \
           and "oembed" in post['data']['media'] \
           and isinstance(post['data']['media']['oembed'], dict) \
           and 'url' in post['data']['media']['oembed']:
        print post["data"]["media"]["oembed"]["url"]
        reddit_feed.append(post["data"]["media"]["oembed"]["url"])
print reddit_feed

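You get this error because for some posts post["data"]["media"] is None, so you are effectively evaluating None["oembed"] here. Hence the error: 'NoneType' object is not subscriptable. I also realized that post['data']['media']['oembed'] might not be a dict, so you also need to verify that it is a dict before checking whether url is in it.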
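The failure can be reproduced in isolation. The sketch below (Python 3 syntax) builds a hypothetical post shaped like the media=null entry in the feed, triggers the same kind of TypeError, and then shows a shorter guard based on `or {}`. It assumes each value is either a dict or null; for anything else the isinstance checks are the safer choice.

```python
post = {"data": {"media": None}}  # shape of a post without embedded media

try:
    post["data"]["media"]["oembed"]
    error_message = None
except TypeError as exc:
    error_message = str(exc)  # subscripting None is what raises here

# `or {}` replaces None (or any falsy value) with an empty dict, so the
# chained .get() calls never run on None.
media = post["data"].get("media") or {}
oembed = media.get("oembed") or {}
url = oembed.get("url")  # None here, because this post has no media
```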

Update 2:

It looks like data is sometimes missing as well, so the fix is:

import json
import urllib

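try:
    f = urllib.urlopen("http://www.reddit.com/r/videos/top/.json")
    reddit_posts = json.loads(f.read().decode("utf-8"))
except (IOError, ValueError):
    # IOError: the request failed; ValueError: the body was not valid JSON.
    print("ERROR: could not fetch or parse the response from reddit.com")
    reddit_posts = None  # fails the isinstance check below, so nothing else runs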

if isinstance(reddit_posts, dict) and "data" in reddit_posts \
   and isinstance(reddit_posts['data'], dict) \
   and 'children' in reddit_posts['data']:
    reddit_posts = reddit_posts["data"]["children"]
    reddit_feed = []
    for post in reddit_posts:
        if isinstance(post['data']['media'], dict) \
               and "oembed" in post['data']['media'] \
               and isinstance(post['data']['media']['oembed'], dict) \
               and 'url' in post['data']['media']['oembed']:
            print post["data"]["media"]["oembed"]["url"]
            reddit_feed.append(post["data"]["media"]["oembed"]["url"])
    print reddit_feed
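The repeated isinstance chains can also be folded into one small helper. This is only a sketch under assumptions: `dig` and `collect_oembed_urls` are hypothetical names, the code uses Python 3 syntax, and the sample listing is made up to mirror the shapes discussed above (a post with media=null, a post with an oembed URL, and a post with no data at all).

```python
def dig(obj, *keys):
    """Follow `keys` through nested dicts; return None if any step is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return None
        obj = obj[key]
    return obj


def collect_oembed_urls(listing):
    """Return the oembed URL of every post in `listing` that has one."""
    urls = []
    for post in dig(listing, "data", "children") or []:
        url = dig(post, "data", "media", "oembed", "url")
        if url:
            urls.append(url)
    return urls


sample = {
    "data": {
        "children": [
            {"data": {"media": None}},
            {"data": {"media": {"oembed": {"url": "http://example.com/v"}}}},
            {"kind": "t3"},
        ]
    }
}
reddit_feed = collect_oembed_urls(sample)
```

Because `dig` returns None at the first missing or non-dict step, a post without data is skipped as well, a case the loop above would still fail on.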