the source code of the page is as in the picture
我想找到所有 div 类容器结果,但它不起作用我得到一个空列表
我的代码: `
select
manager_id,
manager_name,
employee_name,
employee_number from emp
WHERE employee_number = @EMPNum
`
答案 0 :(得分:0)
由于网页是由 javascript 呈现的,requests / BeautifulSoup 将无法检索所需的 DOM 元素,因为它们是在页面呈现一段时间后添加的。为此,您可以尝试使用硒,这是一个示例:
from selenium import webdriver
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
# delay for selenium web driver wait
DELAY = 30
# create selenium driver
chrome_options = webdriver.ChromeOptions()
#chrome_options.add_argument('--headless')
#chrome_options.add_argument('--no-sandbox')
driver = webdriver.Chrome('<<PATH TO chromedriver>>', options=chrome_options)
# iterate over pages
for page in range(0, 10, 1):
# open web page
driver.get(f'https://www.apec.fr/candidat/recherche-emploi.html/emploi?page={page}')
# wait for element with class 'container-result' to be added
container_result = WebDriverWait(driver, DELAY).until(EC.presence_of_element_located((By.CLASS_NAME, "container-result")))
# scroll to container-result
driver.execute_script("arguments[0].scrollIntoView();", container_result)
# get source HTML of the container-result element
source = container_result.get_attribute('innerHTML')
# print source
print(source)
# here you can continue work with the source variable either using selenium API or using BeautifulSoup API:
# soup = BeautifulSoup(source, "html.parser")
# quit webdriver
driver.quit()