No text returned by BeautifulSoup

Time: 2016-09-04 23:55:32

Tags: python beautifulsoup scraper

Why doesn't this work when the text contains <br>? I get empty text.

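import urllib2  # Python 2 standard library
from bs4 import BeautifulSoup  # assumes the bs4 package; the original post omitted its imports

opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
address = 'http://www.bbc.com'
response = opener.open(address)
html = response.read()
soup = BeautifulSoup(html)
snaptext = soup.find('p', attrs={'class': 'displaytext'})
print snaptext.string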

An example:

<p>blahblahblah<br/>blah2blah2blah2<br/></p>

If the text contains <br>, the result is None.

1 answer:

Answer 0: (score: -2)

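As you can see here, the <br> itself is not a parsing problem; the issue is your use of .string. .string only returns a value when the tag has a single child. When the tag contains several children, such as text nodes mixed with <br> tags, it is ambiguous which one to return, so .string is None. You probably want to use .getText() (also spelled .get_text()), which concatenates all the text inside the tag.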

>>> x = bs.find('div', attrs={'id': 'forum-post-body-183'})
>>> x
<div class="j-comment-body forum-post-body u-typography-format text" id="forum-post-body-183" itemprop="text">
<p>Let's try it! I will only replace Sir Finley with Ysera for late game pressing (and ev. win condition).<br>In edit of this comment i would report about results in casual battles (for start).</br></p>
</div>
>>> x.string
>>> print(x.string)
None
>>> x.getText()
"\nLet's try it! I will only replace Sir Finley with Ysera for late game pressing (and ev. win condition).In edit of this comment i would report about results in casual battles (for start).\n"
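
The difference between .string and .get_text() can be reproduced without a network request. The following is a minimal sketch, assuming the bs4 package with the built-in 'html.parser' backend; the HTML snippets and variable names are made up for illustration, modeled on the example in the question.

```python
from bs4 import BeautifulSoup

# Hypothetical markup modeled on the question's example.
html = '<p class="displaytext">blahblahblah<br/>blah2blah2blah2<br/></p>'
soup = BeautifulSoup(html, 'html.parser')
p = soup.find('p', attrs={'class': 'displaytext'})

# <p> has four children (text, <br/>, text, <br/>), so .string is None.
single = p.string

# get_text() concatenates every text node inside the tag.
joined = p.get_text()

# stripped_strings yields each text node separately, which keeps
# the line structure that the <br/> tags implied.
lines = list(p.stripped_strings)

# With exactly one child, .string does return the text.
plain = BeautifulSoup('<p>only text</p>', 'html.parser').p.string

print(single)
print(joined)
print(lines)
print(plain)
```

If the line breaks matter, p.get_text(separator='\n') is another option, since it joins the text nodes with the given separator instead of running them together.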