Question

I can't retrieve the green data.

import requests
import bs4 as bs

yo = requests.get('http://www.nfl.com/schedules/2019/REG11')
soup = bs.BeautifulSoup(yo.text, 'html.parser')
table = soup.find('ul', class_="schedules-table")
print(table)  # correctly gathers all the data, plus extraneous data

Answer 1

The green data you see are comments in the HTML. You can use the Comment class from bs4 to get them:

import requests
from bs4 import BeautifulSoup
from bs4 import Comment
yo = requests.get('http://www.nfl.com/schedules/2019/REG11')
soup = BeautifulSoup(yo.text, 'html.parser')
# Comment is a string subclass, so match every string node that is a Comment
comment = soup.find_all(string=lambda text: isinstance(text, Comment))

This returns every comment on the page. You will have to filter out the relevant ones yourself.
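
One way to do that filtering is to keep only the comments that contain a keyword. The snippet below is a minimal sketch, not the page's real markup: SAMPLE_HTML, the "awayAbbr"/"homeAbbr" comment text, and the find_comments helper are all made up for illustration, so adjust the keyword to whatever the actual comments contain.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical stand-in for the downloaded page; the real markup may differ.
SAMPLE_HTML = """
<ul class="schedules-table">
  <!-- awayAbbr: PIT -->
  <!-- homeAbbr: CLE -->
  <!-- unrelated note -->
  <li>Game</li>
</ul>
"""


def find_comments(html, keyword):
    """Return the stripped text of every HTML comment containing `keyword`."""
    soup = BeautifulSoup(html, "html.parser")
    comments = soup.find_all(string=lambda text: isinstance(text, Comment))
    # Comment behaves like a str, so `in` and .strip() work directly on it
    return [c.strip() for c in comments if keyword in c]


relevant = find_comments(SAMPLE_HTML, "Abbr")
print(relevant)
```

With the real page you would pass yo.text instead of SAMPLE_HTML.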

Answer 2

The content highlighted in green is comments. You can get them with the following code:

from bs4 import Comment
# `table` is the <ul class="schedules-table"> element found in the question
comments = table.find_all(string=lambda text: isinstance(text, Comment))

This puts them all in a list. It is up to you to parse them into a DataFrame or whatever else you need.
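
As a rough sketch of that last step, the comments could be turned into one dictionary per list item. This assumes, purely for illustration, that each comment has a "key: value" shape and that each game sits in its own <li>; SAMPLE_HTML and the key names are hypothetical and the real page may be structured differently.

```python
from bs4 import BeautifulSoup, Comment

# Hypothetical markup: one <li> per game, each holding "key: value" comments.
SAMPLE_HTML = """
<ul class="schedules-table">
  <li>
    <!-- awayAbbr: PIT -->
    <!-- homeAbbr: CLE -->
  </li>
  <li>
    <!-- awayAbbr: DAL -->
    <!-- homeAbbr: DET -->
  </li>
</ul>
"""

soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
table = soup.find("ul", class_="schedules-table")

rows = []
for item in table.find_all("li"):
    row = {}
    for c in item.find_all(string=lambda text: isinstance(text, Comment)):
        # Split on the first colon only; skip comments without one
        key, sep, value = c.partition(":")
        if sep:
            row[key.strip()] = value.strip()
    if row:
        rows.append(row)

print(rows)
```

If pandas is installed, pandas.DataFrame(rows) should then give one row per list item with the comment keys as columns.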

Python: Beautiful Soup parsing of </li> and <ul> with changing class_= names
