I'm trying to use Python's BeautifulSoup library to get the <li> elements from an HTML page.
The HTML I'm trying to parse is this:
https://ccnav6.com/ccna-4-chapter-1-exam-answers-2017-v5-0-3-v6-0-full-100.html
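It contains a series of questions and answers, and I'm trying to parse the questions.
My problem is that no matter how I parse the HTML, I only get the first <li>.
My code: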
from urllib.request import urlopen as uReq
from bs4 import BeautifulSoup as soup
url = 'https://ccnav6.com/ccna-4-chapter-1-exam-answers-2017-v5-0-3-v6-0-full-100.html'
uClient = uReq(url)
# getting html from connection
page_html = uClient.read()
# close connection
uClient.close()
# use beautifulSoup to parse html
page_soup = soup(page_html, "html.parser")
# get main content of page
contentBlock = page_soup.find("div",{"class":"post-single-content box mark-links entry-content"})
# get all questions and answers
questions = contentBlock.div.ol.li.ol.findAll("li")
# for some reason I'm only getting the first question
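
A minimal sketch of what may be going on, using a small made-up HTML snippet rather than the real page. The nested layout (an outer <ol> of questions, each holding an inner <ol> of answers) is only an assumption inferred from the `.div.ol.li.ol` chain above, and `SAMPLE_HTML`, `content_block`, and `question_items` are hypothetical names. In BeautifulSoup, accessing a tag name as an attribute (`.ol`, `.li`) returns only the first matching descendant, so the chain narrows down to the first question before `find_all` ever runs.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the real page (structure is an assumption):
# an outer <ol> of questions, each with a nested <ol> of answers.
SAMPLE_HTML = """
<div class="post-single-content box mark-links entry-content">
  <div>
    <ol>
      <li>Question 1
        <ol><li>Answer 1a</li><li>Answer 1b</li></ol>
      </li>
      <li>Question 2
        <ol><li>Answer 2a</li><li>Answer 2b</li></ol>
      </li>
    </ol>
  </div>
</div>
"""

page_soup = BeautifulSoup(SAMPLE_HTML, "html.parser")
content_block = page_soup.find("div", class_="entry-content")

# Same chain as in the question: .li is only the FIRST <li>, so this
# collects the answers nested inside question 1 and nothing else.
first_question_answers = content_block.div.ol.li.ol.find_all("li")

# Stop at the outer <ol> and ask for its direct <li> children instead.
outer_list = content_block.div.ol
question_items = outer_list.find_all("li", recursive=False)

# The first non-empty string inside each <li> is the question text here.
question_texts = [next(li.stripped_strings) for li in question_items]

print([li.get_text(strip=True) for li in first_question_answers])
print(question_texts)
```

With `recursive=False`, `find_all` looks only at direct children, so the answer items nested under each question are not mixed into the list of questions. Whether this carries over to the real page depends on its actual markup.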