Question

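A site's RSS feed URLs are listed in its page metadata (if it has any feeds). Is there a way to extract a page's feed URL(s) using the urllib2 or HTMLParser modules? Or is there a better module for this?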

Thanks.

Answer 1

I prefer lxml. It has a very nice API, and its XPath support makes this easy to do:

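import lxml.html
# url_to_site is a placeholder for the address of the page to inspect
doc = lxml.html.parse(url_to_site)
# select the href of every <link> tag that advertises an RSS feed
feeds = doc.xpath('//link[@type="application/rss+xml"]/@href')  # list of feed URLs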
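
Since the question also asks about HTMLParser, here is a minimal sketch of the same idea using only the standard library. It assumes Python 3, where the HTMLParser module lives at html.parser. The names FeedLinkParser, find_feeds and FEED_TYPES are made up for illustration, and matching Atom feeds in addition to RSS is an extra assumption that goes beyond the XPath above.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# MIME types treated as feeds; including Atom is an assumption of this sketch
FEED_TYPES = {"application/rss+xml", "application/atom+xml"}


class FeedLinkParser(HTMLParser):
    """Collect the href of every <link> tag that advertises a feed."""

    def __init__(self, base_url=""):
        super().__init__()
        self.base_url = base_url
        self.feeds = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing tags such as <link ... />
        if tag != "link":
            return
        attr = dict(attrs)
        link_type = (attr.get("type") or "").strip().lower()
        href = attr.get("href")
        if link_type in FEED_TYPES and href:
            # Resolve relative hrefs against the page URL
            self.feeds.append(urljoin(self.base_url, href))


def find_feeds(html_text, base_url=""):
    """Return the feed URLs found in an HTML string (hypothetical helper)."""
    parser = FeedLinkParser(base_url)
    parser.feed(html_text)
    return parser.feeds
```

To use it, fetch the page yourself (for example with urllib.request.urlopen in Python 3), decode the body to text, and pass it to find_feeds together with the page URL so that relative links are resolved.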