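I'm new to Python and am trying to learn web scraping. After following a tutorial, I tried to extract prices from a website, but nothing is printed. What is wrong with my code?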
from selenium import webdriver
chrome_path = r"C:\webdrivers\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://reservations.airarabia.com/service-app/ibe/reservation.html#/fare/en/AED/AE/SHJ/KHI/07-09-2019/N/1/0/0/Y//N/N")
price = driver.find_elements_by_class_name("fare-and-services-flight-select-fare-value ng-isolate-scope")
for post in price:
    print(post.text)
Answer 0 (score: 1)
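The first reason is that the page you are scraping builds its HTML with JavaScript, so the elements do not exist yet when your code looks for them. You need to wait until the element appears, which you can do with Selenium's WebDriverWait.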
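The second reason is that find_elements_by_class_name accepts only a single class name, and your argument contains two classes separated by a space. To match on several classes, use find_elements_by_css_selector or find_elements_by_xpath instead.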
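To make the second point concrete, here is a small sketch of how a multi-class attribute value maps to a CSS selector. The helper name class_attr_to_css is made up for illustration and is not part of Selenium; it also assumes the class names contain no characters that would need escaping in CSS.

```python
def class_attr_to_css(class_attr):
    """Turn a class attribute value such as 'a b' into the CSS selector '.a.b'.

    Hypothetical helper for illustration only; it is not a Selenium function.
    """
    # A class attribute is a whitespace-separated list of class names.
    # In CSS, each class is written with a leading dot and the parts are
    # chained without spaces to require all of them on the same element.
    return "".join("." + name for name in class_attr.split())


selector = class_attr_to_css(
    "fare-and-services-flight-select-fare-value ng-isolate-scope"
)
print(selector)
# prints: .fare-and-services-flight-select-fare-value.ng-isolate-scope
```

The resulting string is the kind of value you would pass to a CSS selector lookup rather than to a class-name lookup.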
Here is what your code should look like:
from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait

chrome_path = r"C:\webdrivers\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://reservations.airarabia.com/service-app/ibe/reservation.html#/fare/en/AED/AE/SHJ/KHI/07-09-2019/N/1/0/0/Y//N/N")

# Wait up to 10 seconds until the JavaScript-rendered price elements exist.
# Note: newer Selenium 4 releases removed the find_elements_by_* helpers;
# there, use x.find_elements(By.CSS_SELECTOR, "...") instead.
price = WebDriverWait(driver, 10).until(
    lambda x: x.find_elements_by_css_selector(".currency-value.fare-value.ng-scope.ng-isolate-scope"))

for post in price:
    # Read the element's innerText property
    print(post.get_attribute("innerText"))
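For readers wondering why passing a lambda to until() helps, the following is a simplified sketch of the polling idea behind an explicit wait. It is not Selenium's implementation: wait_until, fake_find_elements and the attempts counter are invented for this example, and it raises Python's built-in TimeoutError where Selenium raises its own TimeoutException.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Simplified illustration of an explicit wait, not Selenium's code.
    Raises TimeoutError if no truthy value is seen before the deadline.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:  # an empty list is falsy, so we keep polling
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Stand-in for a page whose elements only show up on the third lookup.
attempts = {"count": 0}


def fake_find_elements():
    attempts["count"] += 1
    return ["475"] if attempts["count"] >= 3 else []


prices = wait_until(fake_find_elements, timeout=2.0, poll_interval=0.01)
print(prices)
# prints: ['475']
```

This is why a lookup that returns an empty list right after driver.get() can still succeed a moment later: the wait simply retries until the list is non-empty or the timeout runs out.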
Answer 1 (score: 0)
To print the first title, you have to induce WebDriverWait for the desired visibility_of_element_located(), and you can use either of the following Locator Strategies:
Using CSS_SELECTOR:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "isa-flight-select button:first-child span.fare-and-services-flight-select-fare-value.ng-isolate-scope"))).get_attribute("innerHTML"))
Using XPATH:
print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//isa-flight-select//following::button[contains(@class, 'button')]//span[@class='fare-and-services-flight-select-fare-value ng-isolate-scope']"))).text)
Note: You have to add the following imports:
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
Console output of two back-to-back executions:
475
You can find a relevant discussion in How to retrieve the title attribute through Selenium using Python?
As per the documentation:
get_attribute() method: Gets the given attribute or property of the element.
text attribute: The text of the element.