I'm trying to get the thread titles from /r/AskReddit. The code below prints None instead of the thread title.
from BeautifulSoup import BeautifulSoup
import urllib2, json
site='http://www.reddit.com/r/AskReddit/'
soup=BeautifulSoup(urllib2.urlopen(site))
questions=soup.findAll('p',{"class":"title"})
for i in questions:
    print i.string
    break
Answer 0 (score: 1)
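The title is in the string attribute of the a tag, not the p tag. The p tag has several children (the link, a space, and the span), and .string is None unless a tag has a single string child.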
Also, note the trailing space after title in the class attribute:
questions=soup.findAll('a',{"class":"title "})
I found this by looking at the following HTML snippet:
<p class="title"><a class="title " href="http://www.reddit.com/r/AskReddit/comments/l5157/whats_the_best_face_you_can_pull_before_and_after/">What's the best face you can pull? Before and after please.</a> <span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">self.AskReddit</a>)</span></p>
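To check the point above without hitting the live site, here is a minimal sketch that parses the quoted snippet with Python 3's standard-library html.parser instead of BeautifulSoup. The names SNIPPET, TitleLinkParser and extract_titles are made up for this illustration, and it assumes the page markup still looks like the snippet, which may no longer be true. It splits the class attribute on whitespace, so the trailing space in "title " does not matter here.

```python
# Python 3, standard library only. A sketch, not the BeautifulSoup code from the answer.
from html.parser import HTMLParser

# The HTML snippet quoted in the answer, used as offline test input.
SNIPPET = (
    '<p class="title"><a class="title " '
    'href="http://www.reddit.com/r/AskReddit/comments/l5157/'
    'whats_the_best_face_you_can_pull_before_and_after/">'
    "What's the best face you can pull? Before and after please.</a> "
    '<span class="domain">(<a href="http://www.reddit.com/r/AskReddit/">'
    "self.AskReddit</a>)</span></p>"
)


class TitleLinkParser(HTMLParser):
    """Collect the text of every <a> whose class list contains 'title'."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title_link = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        # Split on whitespace so 'title ' and 'title' are treated the same.
        classes = (dict(attrs).get("class") or "").split()
        if "title" in classes:
            self._in_title_link = True
            self._buffer = []

    def handle_data(self, data):
        # Only text inside a matching <a> is kept; the <p> itself is ignored.
        if self._in_title_link:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._in_title_link:
            self.titles.append("".join(self._buffer))
            self._in_title_link = False


def extract_titles(html):
    """Return the text of all title links found in the given HTML string."""
    parser = TitleLinkParser()
    parser.feed(html)
    parser.close()
    return parser.titles


if __name__ == "__main__":
    for title in extract_titles(SNIPPET):
        print(title)
```

The domain link inside the span has no class attribute, so it is skipped and only the thread title is collected.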