Error when scraping content with Selenium in Python

Asked: 2020-07-14 08:36:46

Tags: python selenium selenium-webdriver web-scraping selenium-chromedriver

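I am using Selenium to scrape the job result titles from https://www.indeed.ae/jobs-in-dubai. I suspect that .text is not working properly. My script opens the main site, types a search keyword, and then tries to scrape every title from the results page, but it fails with an error. How can I fix it?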

Here is my code:

import time
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.keys import Keys

Path = "C:\Program Files (x86)\chromedriver.exe"
driver = webdriver.Chrome(Path)

driver.get("https://indeed.ae/")
print(driver.title)
search = driver.find_element_by_name("l")
search.send_keys("Dubai")
search.send_keys(Keys.RETURN)


try:
    td = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "resultsCol"))
    )
    divs = td.find_elements_by_tag_name("div")
    for div in divs:
        header = div.find_element_by_class_name("title")
        print(header)
finally:
    driver.quit()

driver.quit()

I get the following error:

Job Search | Indeed
Traceback (most recent call last):
  File "C:/Users/hp/Desktop/python projects/selenium-pycharm/selenium-bot.py", line 24, in <module>
    header = div.find_element_by_class_name("title")
  File "C:\Users\hp\Desktop\python projects\selenium-pycharm\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 398, in find_element_by_class_name
    return self.find_element(by=By.CLASS_NAME, value=name)
  File "C:\Users\hp\Desktop\python projects\selenium-pycharm\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 659, in find_element
    {"using": by, "value": value})['value']
  File "C:\Users\hp\Desktop\python projects\selenium-pycharm\venv\lib\site-packages\selenium\webdriver\remote\webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "C:\Users\hp\Desktop\python projects\selenium-pycharm\venv\lib\site-packages\selenium\webdriver\remote\webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "C:\Users\hp\Desktop\python projects\selenium-pycharm\venv\lib\site-packages\selenium\webdriver\remote\errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.NoSuchElementException: Message: no such element: Unable to locate element: {"method":"css selector","selector":".title"}
  (Session info: chrome=83.0.4103.116)


Process finished with exit code 1

Thanks in advance.

1 Answer:

Answer 0: (score: 0)

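The title is not found because you collect every div inside resultsCol. Some of those divs contain an element with the class title and some do not, and find_element_by_class_name raises NoSuchElementException on the first div that has none. Note also that print(header) prints the WebElement object itself; use header.text to get the title string.

Try this: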

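# Reuses driver, By, WebDriverWait and EC from the question's script.
from selenium.common.exceptions import NoSuchElementException

try:
    td = WebDriverWait(driver, 10).until(
        EC.presence_of_element_located((By.ID, "resultsCol"))
    )
    # find_element(s)(By..., value) is equivalent to the find_element(s)_by_* helpers.
    divs = td.find_elements(By.TAG_NAME, "div")
    for div in divs:
        try:
            header = div.find_element(By.CLASS_NAME, "title")
        except NoSuchElementException:
            # This div has no title element, so skip it.
            continue
        # .text returns the visible title string.
        print(header.text)
finally:
    # Close the browser even if the wait times out.
    driver.quit()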

This prints the titles as output:

Receptionist
Administrative Assistant/ Document Controller
RECEPTIONIST
ADMIN OFFICER IN UAE
Data Entry Assistant (Fresh Graduate)
Receptionist
Replenishment Associate - Light Household - Hypermarket
DOCUMENT CONTROLLER
School Administrative Assistant - Dubai
ACCOUNTANT
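
As an alternative to catching the exception for every div, here is a minimal sketch (not part of the original answer) that asks the container for the title elements directly. It relies on the plural find_elements, which returns an empty list instead of raising when nothing matches. The helper name collect_titles and the FakeElement / FakeContainer classes are hypothetical: the stand-ins exist only so the sketch runs without a browser. The string "class name" is the value of Selenium's By.CLASS_NAME.

```python
def collect_titles(container, by="class name", value="title"):
    """Return the non-empty text of every matching descendant of `container`.

    `container` is any object exposing Selenium's `find_elements(by, value)`,
    for example the `resultsCol` WebElement from the answer above.
    """
    titles = []
    # The plural lookup yields [] when nothing matches, so no try/except is needed.
    for element in container.find_elements(by, value):
        text = element.text.strip()
        if text:  # skip matches whose text is empty
            titles.append(text)
    return titles


# --- Hypothetical stand-ins so the sketch runs without a browser ---
class FakeElement:
    def __init__(self, text):
        self.text = text


class FakeContainer:
    def __init__(self, texts):
        self._texts = texts

    def find_elements(self, by, value):
        # Mimics the plural lookup: a list of matches, possibly empty.
        return [FakeElement(t) for t in self._texts]


if __name__ == "__main__":
    print(collect_titles(FakeContainer(["Receptionist", "  ACCOUNTANT  ", ""])))
    print(collect_titles(FakeContainer([])))
```

With a real driver, the call would be collect_titles(td) right after the WebDriverWait in the answer, assuming the results page still marks titles with the class title.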