Question

I am trying to extract data from an HTML page using BeautifulSoup and regex, but I cannot get it to work.

html_data:

<td class="col-a size a-update">200 MB<span class="next-size">1250</span></td>

I only want to extract 200 MB, not 1250.

I tried the following code:

from bs4 import BeautifulSoup

html_string = '<td class="coll-4 size mob-uploader">194.5 MB<span class="seeds">3422</span></td>'
soup = BeautifulSoup(html_string, 'html.parser')
# getText() concatenates the text of the <td> and of every nested tag
size = soup.find('td', {'class': 'size'}).getText()
print(size)

But I get both values run together: 194.5 MB3422

Any suggestions would be appreciated.

Answer 1

I solved it with the following code:

from bs4 import BeautifulSoup

html_string = '<td class="coll-4 size mob-uploader">194.5 MB<span class="seeds">3422</span></td>'
soup = BeautifulSoup(html_string, 'html.parser')
# contents[0] is the first child node of the <td>: the bare text before the <span>
size = soup.find('td', {'class': 'size'}).contents[0]
print(size)
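
Note that contents[0] relies on the wanted text being the first child of the cell. If the markup might place a tag or whitespace before it, a slightly more defensive variant is to keep only the text nodes that are direct children of the <td>. The sketch below is one possible way to do that, not the only one; the variable names cell and direct_text are just illustrative.

```python
from bs4 import BeautifulSoup, NavigableString

html_string = '<td class="coll-4 size mob-uploader">194.5 MB<span class="seeds">3422</span></td>'
soup = BeautifulSoup(html_string, 'html.parser')
cell = soup.find('td', {'class': 'size'})

# Join only the text nodes sitting directly inside the <td>,
# skipping nested tags such as <span class="seeds">.
direct_text = ''.join(
    child for child in cell.children if isinstance(child, NavigableString)
).strip()
print(direct_text)  # 194.5 MB
```

This gives the same result as contents[0] for the sample above, but does not depend on the position of the text inside the cell.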
