Question

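I am trying to fetch data from a URL. Part of the HTML is shown below, and I want to get the number "397", which changes constantly, like a stock index. My code is shown below; when I run the .py file, the result is <a class="p_total" name="p_bar_total"></a>, with no number.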

HTML:

<div id="p_bar_bottom" class="p_bar" style="display: inline;">
            <a name="p_bar_total" class="p_total">&nbsp;397&nbsp;</a>
            <a name="p_bar_min" class="p_redirect" style="display: none;">|‹ 1</a>

Code:

import requests
from bs4 import BeautifulSoup

with requests.session() as s:
    url = 'https://www.sth.com'
    page = s.get(url)
    soup = BeautifulSoup(page.text, 'html.parser')
    # Find the container first, then the link inside it
    total_list = soup.find(class_='p_bar')
    total_no_list = total_list.find(class_='p_total')
    print(total_no_list)

Is there something wrong with my code? Thanks.

Answer 1

You don't need to look up two tags to get the text; you can select the element and read its text directly!

from bs4 import BeautifulSoup
import requests
html = '''<div id="p_bar_bottom" class="p_bar" style="display: inline;">
            <a name="p_bar_total" class="p_total">&nbsp;397&nbsp;</a>
            <a name="p_bar_min" class="p_redirect" style="display: none;">|‹ 1</a>'''
soup = BeautifulSoup(html, 'html.parser')

total = soup.select('.p_total')[0].text
print(total)

Output: 397, surrounded by the non-breaking spaces that &nbsp; decodes to (call .strip() on the text to remove them).
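
A minimal sketch building on the answer above. `extract_total` is a hypothetical helper name that is not part of the original answer, and the code assumes the value is always a plain non-negative integer. It strips the non-breaking spaces and returns None when the tag is missing or empty. An empty tag, as in the question's result, means the number was not in the HTML the server returned; it may be filled in later by JavaScript, though that is only a guess about the site.

```python
from bs4 import BeautifulSoup

# Sample markup taken from the question (the div is left unclosed there too).
HTML = '''<div id="p_bar_bottom" class="p_bar" style="display: inline;">
            <a name="p_bar_total" class="p_total">&nbsp;397&nbsp;</a>'''


def extract_total(html):
    """Return the p_total value as an int, or None if it is absent or empty.

    Hypothetical helper for illustration only.
    """
    soup = BeautifulSoup(html, 'html.parser')
    tag = soup.select_one('.p_total')  # first match, or None if no match
    if tag is None:
        return None
    # strip=True removes surrounding whitespace, including the
    # non-breaking spaces that &nbsp; decodes to
    text = tag.get_text(strip=True)
    return int(text) if text.isdigit() else None


total = extract_total(HTML)
# The tag the asker actually received: present, but with no text inside
empty = extract_total('<a class="p_total" name="p_bar_total"></a>')
print(total, empty)
```

If `extract_total(page.text)` returns None for the live page, the parsing code is not the problem; the number has to come from somewhere other than the initial HTML response.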