Question

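I'm using feedparser to process an RSS feed from PubMed.

The feed link is https://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=1RGmO3jHeXUu8o2CWPinET6JLLik93hwR2IAJ5mU-YzoPeX1-O

The abstract of each article in the feed sits inside the HTML of the <(description)> element, and that abstract is what I want to display on a web page (using Django). I can access all the other elements easily.

I played around and wrote the code below to strip out everything that isn't the abstract and print it. However, even with solutions I found on Stack Overflow, I can't get the "abstract" variable back into the original feedparser dictionary.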

What exists:

<(item)>
    <(description)> loads of HTML

What I want:

<(item)>
    <(description)> abstract

or:

<(item)>
    <(description)>
    <(abstract)> abstract

Hope that makes sense.

The code is:


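import feedparser

rss = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/erss.cgi?rss_guid=1RGmO3jHeXUu8o2CWPinET6JLLik93hwR2IAJ5mU-YzoPeX1-O'
feed = feedparser.parse(rss)

# Markers that surround the abstract inside each entry's description HTML.
start_marker = "<p>Abstract<br/>"
end_marker = "</p><p>PMID:"

for post in feed.entries:
    try:
        start = post.description.index(start_marker) + len(start_marker)
        end = post.description.index(end_marker)
        # Slice between the markers, then trim the trailing 14 characters.
        abstract = post.description[start:end][:-14]
        print(abstract)
    except ValueError:
        # A marker is missing from this entry; skip it instead of stopping the loop.
        continue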
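
For what it's worth, here is a minimal sketch of one way the extracted abstract might be stored back on each entry. It assumes feedparser entries behave like ordinary dictionaries (so a new key can be assigned) and that the feed still uses the two markers from the code above. The names `extract_abstract`, `attach_abstracts`, and `load_feed_with_abstracts` are made up for illustration and are not part of feedparser. Unlike the attempt above, this version strips surrounding whitespace instead of trimming a fixed 14 characters.

```python
START_MARKER = "<p>Abstract<br/>"
END_MARKER = "</p><p>PMID:"


def extract_abstract(description_html):
    """Return the text between the two markers, or None if either is missing."""
    start = description_html.find(START_MARKER)
    if start == -1:
        return None
    start += len(START_MARKER)
    end = description_html.find(END_MARKER, start)
    if end == -1:
        return None
    return description_html[start:end].strip()


def attach_abstracts(entries):
    """Add an 'abstract' key to every entry; entries are mutated in place."""
    for entry in entries:
        description = entry.get("description", "")
        # Fall back to an empty string so the template never sees None.
        entry["abstract"] = extract_abstract(description) or ""
    return entries


def load_feed_with_abstracts(url):
    """Parse the feed with feedparser and attach abstracts to its entries."""
    # Imported lazily so the helpers above need only the standard library.
    import feedparser

    feed = feedparser.parse(url)
    attach_abstracts(feed.entries)
    return feed
```

If the assumption about dictionary-style entries holds, passing the result of `load_feed_with_abstracts(rss)` to the template as `feed` should make `{{ post.abstract }}` available next to `{{ post.description }}`.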

FWIW, this is the front-end code:

{% for post in feed.entries %}
<div class="panel panel-default">
    <div class="panel-heading">
        <h4><a href="{{ post.link }}" target="_blank"> {{ post.title }} </a></h4>
        <h5> {{ post.description }} </h5>
        <h5> {{ post.author }}, {{ post.category }} </h5>
    </div>
</div>
{% endfor %}

Thanks a million for any tips!

Modifying the <item> <description> element in feedparser

0 answers: