Python Beautiful Soup get_text returns ' - '

Time: 2014-11-17 22:19:27

Tags: python beautifulsoup

I am using Beautiful Soup. My code is:

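import urllib2  # Python 2 standard library; needed for Request and urlopen below

from bs4 import BeautifulSoup

web_address = ('xxxx')  # this part is fine; I don't want to provide the website.

# Download the page
req = urllib2.Request(web_address)
page = urllib2.urlopen(req)
content = page.read()

# Parse it and print the text of every <td> cell
soup = BeautifulSoup(content)
td = soup.findAll('td')
for line in td:
    print(line.get_text())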

The part of the HTML I am looking at is:

<td class="border_TopRight border_Left">
    Text - "TEST_NAME
<td class="border_TopRight">
    Text - TEST_NAME_1
<td class="border_TopRight">
    Text - TEST_NAME_2
<td class="apple dataCell border_TopRight font_green" id="Number of Apples" style="color: #333333; background-color: rgb(255, 255, 255);" rel="Apples ">
    Text - 999999.999

The output of my Python script is:

TEST_NAME
TEST_NAME_1
TEST_NAME_2
 - 

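I can't figure out why the last line of output is ' - '. I have read the BS4 documentation and still can't work out why the first three cells come back with the correct text while the last one comes back as ' - '.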
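One way to narrow this down is to look at the raw HTML the server sent for that cell, before Beautiful Soup touches it. The sketch below is only a diagnostic, not a confirmed explanation. The helper name `raw_context` and the `sample` string are hypothetical. The sample assumes the served HTML holds a ' - ' placeholder that the browser later replaces (for example via JavaScript), which the question does not confirm.

```python
def raw_context(html, marker, width=60):
    """Return the raw HTML around the first occurrence of marker, or None."""
    start = html.find(marker)
    if start == -1:
        return None
    return html[max(0, start - width):start + len(marker) + width]


# Hypothetical stand-in for the downloaded page (an assumption, not the real
# site): the last cell contains ' - ' in the served HTML.
sample = (
    '<tr><td class="border_TopRight">TEST_NAME_2</td>'
    '<td class="apple dataCell" id="Number of Apples"> - </td></tr>'
)

snippet = raw_context(sample, 'id="Number of Apples"')
print(repr(snippet))
```

In the script above, the equivalent check would be `raw_context(content, 'id="Number of Apples"')`. If the number is missing from that raw snippet too, then get_text is faithfully reporting what was downloaded and the difference lies in how the browser builds the page. If the number is present, the parser is the next thing to examine.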

0 answers:

No answers