Question

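Is it possible to use Beautiful Soup on content that lies outside the html tag? A case in point is the following page:

http://dsalsrv02.uchicago.edu/cgi-bin/app/biswas-bangala_query.py?page=1

It has data after the closing html tag.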

Answer 1

From what I can see, you can use html.parser or html5lib for this particular page:

import requests
from bs4 import BeautifulSoup

response = requests.get("http://dsalsrv02.uchicago.edu/cgi-bin/app/biswas-bangala_query.py?page=1")

soup = BeautifulSoup(response.content, "html.parser")
# soup = BeautifulSoup(response.content, "html5lib")

The lxml parser does not handle this page well and parses it only partially.
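
To check how each parser treats markup that follows the closing html tag without hitting the network, here is a minimal sketch. SAMPLE_HTML is a made-up miniature and not the real page, and paragraph_texts and compare_parsers are hypothetical helper names. It may not reproduce the partial parse seen with lxml on the real page, so treat it as a way to compare parsers rather than as proof of how any one of them behaves.

```python
from bs4 import BeautifulSoup, FeatureNotFound

# Assumption: a made-up miniature of a page that keeps emitting data
# after the closing </html> tag. It is not the real page from the question.
SAMPLE_HTML = (
    "<html><body><p>inside</p></body></html>"
    "<p>trailing one</p><p>trailing two</p>"
)


def paragraph_texts(markup, parser):
    """Parse markup with the named parser and return the text of every <p>."""
    soup = BeautifulSoup(markup, parser)
    return [p.get_text(strip=True) for p in soup.find_all("p")]


def compare_parsers(markup, parsers=("html.parser", "html5lib", "lxml")):
    """Map each parser name to its <p> texts, or None if it is not installed."""
    results = {}
    for name in parsers:
        try:
            results[name] = paragraph_texts(markup, name)
        except FeatureNotFound:
            # Raised by Beautiful Soup when the requested parser is missing.
            results[name] = None
    return results


for name, texts in compare_parsers(SAMPLE_HTML).items():
    print(name, texts)
```

If a parser reports fewer paragraphs than the others, it is probably dropping or truncating the trailing content. Running the same comparison on response.content from the real page should show which parser to pick.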
