Why doesn't this work when the text contains <br>? I get empty text.
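import urllib2  # Python 2 standard library
from bs4 import BeautifulSoup  # assumes BeautifulSoup 4 is installed

# Fetch the page with a browser-like User-agent header
opener = urllib2.build_opener()
opener.addheaders = [('User-agent', 'Mozilla/5.0')]
address = 'http://www.bbc.com'
response = opener.open(address)
html = response.read()

# Parse the page and look up the paragraph of interest
soup = BeautifulSoup(html)
snaptext = soup.find('p', attrs={'class': 'displaytext'})
print snaptext.string  # prints None when the <p> contains <br>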
An example is:
<p>blahblahblah<br/>blah2blah2blah2<br/></p>
If the text contains <br>, the result is None.
Answer 0 (score: -2)
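As the interpreter session below shows, the <br> tag itself is not the problem; the problem is .string. It returns a value only when the tag has exactly one child (a single string, or a single nested tag that itself has a .string). When the tag has several children, as happens when a <br> splits the text, .string is None. You probably want .getText() (also spelled .get_text() in BeautifulSoup 4), which joins all the text inside the tag.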
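To make the difference concrete, here is a minimal sketch, assuming BeautifulSoup 4 (`bs4`) with the built-in `html.parser`. The markup is the question's example with the `displaytext` class from its code added, and the variable names are only illustrative.

```python
from bs4 import BeautifulSoup

# Sample markup modelled on the question (assumption: the class attribute
# is added so that the question's find() call matches it).
html = '<p class="displaytext">blahblahblah<br/>blah2blah2blah2<br/></p>'

soup = BeautifulSoup(html, "html.parser")
snaptext = soup.find("p", attrs={"class": "displaytext"})

# <p> has four children (two strings and two <br/> tags),
# so .string cannot pick a single one and is None.
string_value = snaptext.string

# get_text() concatenates every string found inside the tag.
joined = snaptext.get_text()

# stripped_strings yields each text fragment separately, whitespace-trimmed.
parts = list(snaptext.stripped_strings)

print(string_value)
print(joined)
print(parts)
```

If the line breaks matter, `get_text()` also accepts a `separator` argument, for example `snaptext.get_text(separator="\n")`, which puts a newline between the text fragments.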
>>> x = bs.find('div', attrs={'id': 'forum-post-body-183'})
>>> x
<div class="j-comment-body forum-post-body u-typography-format text" id="forum-post-body-183" itemprop="text">
<p>Let's try it! I will only replace Sir Finley with Ysera for late game pressing (and ev. win condition).<br>In edit of this comment i would report about results in casual battles (for start).</br></p>
</div>
>>> x.string
>>> print(x.string)
None
>>> x.getText()
"\nLet's try it! I will only replace Sir Finley with Ysera for late game pressing (and ev. win condition).In edit of this comment i would report about results in casual battles (for start).\n"