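I'm trying to get the HTML that a web page ends up with after its JavaScript has run, the same markup that "Inspect Element" shows. But my attempt does not give the right result. I tried the following: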
from selenium import webdriver

url = 'xxx'
options = webdriver.ChromeOptions()
options.add_argument('--headless')
# Pass the options object as `options=`; `chrome_options=` is deprecated
driver = webdriver.Chrome(options=options)
driver.get(url)
# Intended: the initial HTML, before JavaScript runs
html1 = driver.page_source
# Intended: the HTML after the on-load JavaScript has run
html2 = driver.execute_script("return document.documentElement.innerHTML;")
print(html1)
print('\n\n')
print(html2)
I want the full markup that Inspect Element shows (html2 in this case). I found that this attempt reads the page before it has finished loading. What can I do to fix this?
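Answer 0: (score: 0)
You need to wait until the data you want is actually present on the page before reading the HTML. See Selenium's explicit waits: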
https://selenium-python.readthedocs.io/waits.html#explicit-waits
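If there is no single element you can wait for, one possible heuristic is to poll the serialized DOM until two consecutive snapshots are identical. The sketch below is my own illustration, not something Selenium provides: the name wait_for_stable_html and the timeout and poll defaults are assumptions, and it only relies on the driver's execute_script() method. A page that updates itself periodically may never settle, in which case it raises TimeoutError.

```python
import time


def wait_for_stable_html(driver, timeout=10.0, poll=0.5):
    """Return the serialized DOM once it stops changing between two polls.

    `driver` is any object with Selenium's execute_script() method.
    Raises TimeoutError if the markup keeps changing until `timeout` expires.
    """
    script = "return document.documentElement.outerHTML;"
    deadline = time.monotonic() + timeout
    previous = driver.execute_script(script)
    while time.monotonic() < deadline:
        time.sleep(poll)
        current = driver.execute_script(script)
        if current == previous:
            # Two consecutive snapshots match: treat the page as settled.
            return current
        previous = current
    raise TimeoutError("DOM was still changing after %.1f s" % timeout)
```

After driver.get(url), calling html2 = wait_for_stable_html(driver) would replace the direct execute_script call from the question. Waiting for a specific element is usually more reliable when you know which one to wait for.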
Answer 1: (score: 0)
Dependencies:
from selenium.common.exceptions import TimeoutException
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
This waits until the element whose ID equals [ID_OF_ELEMENT] is present in the DOM.
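timeout = 5
try:
    element = WebDriverWait(driver, timeout).until(
        EC.presence_of_element_located((By.ID, '[ID_OF_ELEMENT]'))
    )
    # Page ready: the element exists, so the rendered HTML can be read now
    html = driver.page_source
except TimeoutException:
    # Timeout: the element did not appear within `timeout` seconds
    print('Timed out waiting for the page to load')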