How to get price information from the flight booking website https://reservations.airarabia.com

Time: 2019-08-31 16:20:22

Tags: python selenium xpath css-selectors webdriverwait

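I am new to Python and trying to learn web scraping. After following a tutorial, I tried to extract the prices from the website, but nothing is printed. What is wrong with my code?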

from selenium import webdriver

chrome_path = r"C:\webdrivers\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)
driver.get("https://reservations.airarabia.com/service-app/ibe/reservation.html#/fare/en/AED/AE/SHJ/KHI/07-09-2019/N/1/0/0/Y//N/N")
price = driver.find_elements_by_class_name("fare-and-services-flight-select-fare-value ng-isolate-scope")
for post in price:
    print(post.text)

2 answers:

Answer 0 (score: 1)

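The first reason is that the page you are scraping builds its HTML with JavaScript, so the price elements may not exist yet when driver.get() returns. You need to wait until the element appears, which you can do with Selenium's WebDriverWait.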
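
To make the waiting idea concrete, here is a minimal sketch of the kind of polling loop an explicit wait performs. It is not Selenium's implementation: `wait_until`, its parameters, and the fake `find_prices` lookup are hypothetical names used only for illustration, and the "page" is simulated so the example runs without a browser.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Hypothetical helper that mimics the idea behind an explicit wait;
    it is not part of Selenium.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition was not met within %.1f s" % timeout)
        time.sleep(poll_interval)


# Simulate a page whose prices only "appear" on the third lookup,
# the way JavaScript-rendered elements show up some time after load.
attempts = {"count": 0}


def find_prices():
    attempts["count"] += 1
    return ["475"] if attempts["count"] >= 3 else []


prices = wait_until(find_prices, timeout=5.0, poll_interval=0.01)
print(prices)
```

An empty list is falsy in Python, which is why a lookup that returns a list of elements can serve directly as the wait condition: the loop keeps polling until the list is non-empty.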

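The second reason is that find_elements_by_class_name accepts only a single class name, while your argument contains two class names separated by a space. To match several classes at once, use find_elements_by_css_selector or find_elements_by_xpath instead.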
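
As a small illustration of the second point, the sketch below turns a space-separated class attribute into a compound CSS selector, where each class is prefixed with a dot and the classes are chained without spaces. The helper name `classes_to_css_selector` is hypothetical and not part of Selenium.

```python
def classes_to_css_selector(class_attribute):
    """Turn a space-separated class attribute into a compound CSS selector."""
    class_names = class_attribute.split()
    if not class_names:
        raise ValueError("expected at least one class name")
    # ".a.b" matches elements that carry both class "a" and class "b".
    return "".join("." + name for name in class_names)


selector = classes_to_css_selector(
    "fare-and-services-flight-select-fare-value ng-isolate-scope"
)
print(selector)
# prints ".fare-and-services-flight-select-fare-value.ng-isolate-scope"
```

The resulting string is the kind of value you would pass to a CSS selector lookup in place of the original two-class string.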

Here is how your code should look:

from selenium import webdriver
from selenium.webdriver.support.wait import WebDriverWait

chrome_path = r"C:\webdrivers\chromedriver.exe"
driver = webdriver.Chrome(chrome_path)

driver.get("https://reservations.airarabia.com/service-app/ibe/reservation.html#/fare/en/AED/AE/SHJ/KHI/07-09-2019/N/1/0/0/Y//N/N")
# Poll for up to 10 seconds until the lookup returns a non-empty list.
# Each class is prefixed with "." so the selector matches elements carrying all of them.
price = WebDriverWait(driver, 10).until(
    lambda x: x.find_elements_by_css_selector(".currency-value.fare-value.ng-scope.ng-isolate-scope"))

# innerText returns the text content of each matched element.
for post in price:
    print(post.get_attribute("innerText"))

Answer 1 (score: 0)

To print the first price, you have to induce WebDriverWait for visibility_of_element_located() on the desired element, and you can use either of the following Locator Strategies:

  • Using CSS_SELECTOR:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.CSS_SELECTOR, "isa-flight-select button:first-child span.fare-and-services-flight-select-fare-value.ng-isolate-scope"))).get_attribute("innerHTML"))
    
  • Using XPATH:

    print(WebDriverWait(driver, 20).until(EC.visibility_of_element_located((By.XPATH, "//isa-flight-select//following::button[contains(@class, 'button')]//span[@class='fare-and-services-flight-select-fare-value ng-isolate-scope']"))).text)
    
  • Note: You have to add the following imports:

    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    
  • Console output of two back-to-back executions:

    475
    
  

You can find a relevant discussion in How to retrieve the title attribute through Selenium using Python?

Outro

As per the documentation: