How do I extract an article using python-readability?

Time: 2016-05-18 06:34:45

Tags: python django

    import urllib.request

    from readability import Document

    url = 'http://wired.com/'
    html = urllib.request.urlopen(url).read()
    readable_article = Document(html).summary()
    print(readable_article)
    readable_title = Document(html).short_title()
    # This is the call that raises the TypeError
    data = Document(html).get_article(["p", "pre", "td"], ["h1"])
    print(data)

I get the following error:

File "/home/sayone/virtual/poster/lib/python3.4/site-packages/readability/readability.py", line 211, in get_article
    best_candidate['content_score'] * 0.2])
TypeError: list indices must be integers, not str
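Judging only from the traceback, get_article seems to index its second argument with the string key 'content_score', so passing a list such as ["h1"] there would fail in exactly this way. The sketch below reproduces that failure without the library. The function sibling_score_threshold is a hypothetical stand-in that mirrors only the expression visible in the traceback; it is not the library's code, and the constant 10 is an assumed placeholder.

```python
def sibling_score_threshold(best_candidate):
    # Hypothetical stand-in: mirrors the expression from the traceback,
    # best_candidate['content_score'] * 0.2. The 10 is an assumed placeholder.
    return max([10, best_candidate['content_score'] * 0.2])


# A mapping with a 'content_score' key can be indexed by that string.
ok = sibling_score_threshold({'content_score': 100})

# A list can only be indexed by integers or slices, so a string key fails.
try:
    sibling_score_threshold(["h1"])
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(ok)
print(error_message)
```

If that reading is right, the lists of tag names are not what get_article expects, and the summary() and short_title() calls already used above may be enough to get the article body and title.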

0 answers:

No answers yet.