Question

I have HTML that looks like this:

<li data-ng-repeat="sector in data.sectors"> <a target="_self" data-ng-href="/stocks/quotes/-382G/components/A" href="/stocks/quotes/-382G/components/A"><span>SIC-3826 Laboratory Analytical Instruments</span></a> </li>

I want to extract the text inside the span tag. Unfortunately, when I use the following code:

tags = soup.findAll("li",attrs={"data-ng-repeat":"sector in data.sectors"})
# tags = soup.find_all("a",attrs= {"target=","data-ng-href="})
# tags = soup.find_all("a")
for tag in tags:
    print(tag.text)

the result is [[sector.description]]. The text I want to extract is "SIC-3826 Laboratory Analytical Instruments".

Any help would be greatly appreciated. I have tried several alternatives, but I cannot get the information I need.

Thanks in advance!

Answer 1

Yes. All you need to do is:

from bs4 import BeautifulSoup

x = """<li data-ng-repeat="sector in data.sectors"> <a target="_self" data-ng-href="/stocks/quotes/-382G/components/A" href="/stocks/quotes/-382G/components/A"><span>SIC-3826 Laboratory Analytical Instruments</span></a> </li>"""

# .text concatenates every text node in the parsed document
print(BeautifulSoup(x, "lxml").text)
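
If the page has several such list items, it may help to target the span inside each one rather than printing all the text at once. The following is only a minimal sketch. It assumes the HTML you parse is the rendered markup shown in the question, and it uses the built-in "html.parser" instead of "lxml" purely to avoid an extra dependency. The names html and descriptions are illustrative. If the downloaded page still contains [[sector.description]], the text is probably filled in by JavaScript in the browser, and parsing alone will not recover it.

```python
from bs4 import BeautifulSoup

# Rendered HTML copied from the question (assumption: this is what you parse).
html = """<li data-ng-repeat="sector in data.sectors"> <a target="_self" data-ng-href="/stocks/quotes/-382G/components/A" href="/stocks/quotes/-382G/components/A"><span>SIC-3826 Laboratory Analytical Instruments</span></a> </li>"""

soup = BeautifulSoup(html, "html.parser")

descriptions = []
# Find every <li> with the data-ng-repeat attribute used in the question.
for li in soup.find_all("li", attrs={"data-ng-repeat": "sector in data.sectors"}):
    span = li.find("span")  # first <span> nested inside this <li>
    if span is not None:
        # strip=True drops surrounding whitespace from the extracted text
        descriptions.append(span.get_text(strip=True))

for description in descriptions:
    print(description)
```

Collecting the values into a list keeps each description separate, which is easier to work with than one concatenated string when there are many list items.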
