Python feedparser: get the URL of the first media item in the first entry

Date: 2015-11-16 01:32:58

Tags: python xml parsing rss

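This is my first time using Python and I'm a bit stuck.

I'm using feedparser to parse an RSS feed, and I want to get the URL of the first media item in entry 0 and store it in a variable.

The code below seems to work, but I have to press Enter twice to run it, and it returns the URLs of all media items in entry 0, whereas I only need the first (16x9) image URL.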

>>> import feedparser
>>> d = feedparser.parse("http://www.abc.net.au/news/feed/45910/rss")
>>> for content in d.entries[0].media_content: print(content['url'])

- Link to where I got the code above

RSS XML:

<media:group>
  <media:description>French fighter jets take off to drop bombs on the Islamic State stronghold of Raqqa in Syria. (Supplied)</media:description>
  <media:content url="http://www.abc.net.au/news/image/6943630-16x9-2150x1210.jpg" medium="image" type="image/jpeg" width="2150" height="1210"/>
  <media:content url="http://www.abc.net.au/news/image/6943630-4x3-940x705.jpg" medium="image" type="image/jpeg" width="940" height="705"/>
  <media:content url="http://www.abc.net.au/news/image/6943630-3x2-940x627.jpg" medium="image" type="image/jpeg" width="940" height="627" isDefault="true"/>
  <media:content url="http://www.abc.net.au/news/image/6943630-3x4-940x1253.jpg" medium="image" type="image/jpeg" width="940" height="1253"/>
  <media:content url="http://www.abc.net.au/news/image/6943630-1x1-1400x1400.jpg" medium="image" type="image/jpeg" width="1400" height="1400"/>
  <media:thumbnail url="http://www.abc.net.au/news/image/6943630-4x3-140x105.jpg" width="140" height="105"/>
</media:group>

When run in Python, it looks like this:

>>> for content in d.entries[0].media_content: print(content['url'])
... 
http://www.abc.net.au/news/image/6943630-16x9-2150x1210.jpg
http://www.abc.net.au/news/image/6943630-4x3-940x705.jpg
http://www.abc.net.au/news/image/6943630-3x2-940x627.jpg
http://www.abc.net.au/news/image/6943630-3x4-940x1253.jpg
http://www.abc.net.au/news/image/6943630-1x1-1400x1400.jpg
>>> 

1 Answer:

Answer 0 (score: 2)

Quick answer:

url = d.entries[0].media_content[0]['url']

d.entries[n].media_content is a list of dicts, so you just take the first item in that list and store its 'url' value in a variable.
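Direct indexing raises an error if the feed has no entries or if the entry has no media items. A minimal sketch of a more defensive lookup follows. The helper name `first_media_url` is hypothetical, and `sample` is a plain dict standing in for the result of `feedparser.parse(...)`, which I am assuming supports the same key-based `.get()` access.

```python
def first_media_url(parsed, entry_index=0):
    """Return the URL of the first media item of one entry, or None."""
    entries = parsed.get("entries", [])
    if not 0 <= entry_index < len(entries):
        return None  # the feed has no entry at that position
    media_items = entries[entry_index].get("media_content", [])
    if not media_items:
        return None  # the entry carries no media items
    return media_items[0].get("url")


# Hypothetical stand-in for the parsed feed: one entry holding a list of dicts.
sample = {
    "entries": [
        {
            "media_content": [
                {"url": "http://www.abc.net.au/news/image/6943630-16x9-2150x1210.jpg"},
                {"url": "http://www.abc.net.au/news/image/6943630-4x3-940x705.jpg"},
            ]
        }
    ]
}

url = first_media_url(sample)
print(url)
```

With a real feed you would pass the object returned by `feedparser.parse(...)` instead of `sample`.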

Here is how it looks in the Python shell:

>>> import feedparser
>>> d = feedparser.parse("http://www.abc.net.au/news/feed/45910/rss")
>>> url = d.entries[0].media_content[0]['url']
>>> print(url)
http://www.abc.net.au/news/image/6943798-16x9-2150x1210.jpg
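Taking index 0 works here because this feed happens to list the 16x9 image first. If you would rather not depend on the order, one option is to pick the item by its aspect ratio. This is only a sketch: the helper name `media_url_with_ratio` and the `tolerance` parameter are hypothetical, the sample list is copied from the XML in the question, and I am assuming `width` and `height` arrive as numeric strings. A tolerance is needed because 2150x1210 is only approximately 16:9.

```python
def media_url_with_ratio(media_items, ratio_w=16, ratio_h=9, tolerance=0.01):
    """Return the URL of the first item whose width/height is close to the ratio."""
    target = float(ratio_w) / ratio_h
    for item in media_items:
        try:
            width = int(item["width"])
            height = int(item["height"])
        except (KeyError, TypeError, ValueError):
            continue  # skip items without usable dimensions
        if height == 0:
            continue  # avoid dividing by zero
        if abs(float(width) / height - target) < tolerance:
            return item.get("url")
    return None


# Media items as they appear in the question's XML (values kept as strings).
media_items = [
    {"url": "http://www.abc.net.au/news/image/6943630-16x9-2150x1210.jpg", "width": "2150", "height": "1210"},
    {"url": "http://www.abc.net.au/news/image/6943630-4x3-940x705.jpg", "width": "940", "height": "705"},
    {"url": "http://www.abc.net.au/news/image/6943630-3x2-940x627.jpg", "width": "940", "height": "627"},
    {"url": "http://www.abc.net.au/news/image/6943630-3x4-940x1253.jpg", "width": "940", "height": "1253"},
    {"url": "http://www.abc.net.au/news/image/6943630-1x1-1400x1400.jpg", "width": "1400", "height": "1400"},
]

print(media_url_with_ratio(media_items))
```

With a parsed feed, the list to pass in would be `d.entries[0].media_content`.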