Scraping Glassdoor using Selenium

Time: 2021-06-29 08:49:15

Tags: python-3.x selenium selenium-chromedriver

I am trying to scrape Glassdoor using the code provided here:
https://github.com/PlayingNumbers/ds_salary_proj/blob/master/glassdoor_scraper.py
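When I run the code there are no errors and the website opens, but nothing happens after that. I think they have changed the tags on the website. I have tried changing the tags, but it still does not work.
Here is the code snippet: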

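import time

from selenium import webdriver
from selenium.common.exceptions import ElementClickInterceptedException, NoSuchElementException

def get_jobs(keyword, num_jobs, verbose, path, slp_time):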

    options = webdriver.ChromeOptions()
    driver = webdriver.Chrome(executable_path=path, options=options)
    driver.set_window_size(1120, 1000)
    url = 'https://www.glassdoor.com/Job/jobs.htm?sc.keyword="' + keyword + '"&locT=C&locId=1147401&locKeyword=San%20Francisco,%20CA&jobType=all&fromAge=-1&minSalary=0&includeNoSalaryJobs=true&radius=100&cityId=-1&minRating=0.0&industryId=-1&sgocId=-1&seniorityType=all&companyId=-1&employerSizes=0&applicationType=0&remoteWorkType=0'
    driver.get(url)
    jobs = []

    while len(jobs) < num_jobs:  
        time.sleep(slp_time)
        try:
            driver.find_element_by_class_name("selected").click()
        except ElementClickInterceptedException:
            pass

        time.sleep(.1)

        try:
            driver.find_element_by_css_selector('[alt="Close"]').click() 
            print(' x out worked')
        except NoSuchElementException:
            print('x out failed')
            pass

You can find the full code at the link given above.
Any help would be greatly appreciated!

1 Answer:

Answer 0 (score: 0)

Can you check the generated URL?

url = 'https://www.glassdoor.com/Job/jobs.htm?sc.keyword="' + keyword + '"&locT=C&locId=1147401&locKeyword=San%20Francisco,%20CA&jobType=all&fromAge=-1&minSalary=0&includeNoSalaryJobs=true&radius=100&cityId=-1&minRating=0.0&industryId=-1&sgocId=-1&seniorityType=all&companyId=-1&employerSizes=0&applicationType=0&remoteWorkType=0'

Open it manually in a browser and verify that results actually show up on the page.
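
One way to run that check is to build the URL outside of Selenium and print it, so it can be pasted into a browser. The following is only a minimal sketch: `BASE_URL` and `build_search_url` are hypothetical names that are not part of the original scraper, and the parameter values are copied from the URL in the question. It uses `urllib.parse.urlencode` so that the quotes and spaces in the keyword are percent-encoded rather than concatenated raw.

```python
from urllib.parse import quote, urlencode

# Hypothetical constant: the path part of the URL used in the question.
BASE_URL = "https://www.glassdoor.com/Job/jobs.htm"


def build_search_url(keyword):
    """Rebuild the question's search URL with a percent-encoded keyword."""
    params = {
        # The question wraps the keyword in literal double quotes.
        "sc.keyword": f'"{keyword}"',
        "locT": "C",
        "locId": "1147401",
        "locKeyword": "San Francisco, CA",
        "jobType": "all",
        "fromAge": "-1",
        "minSalary": "0",
        "includeNoSalaryJobs": "true",
        "radius": "100",
        "cityId": "-1",
        "minRating": "0.0",
        "industryId": "-1",
        "sgocId": "-1",
        "seniorityType": "all",
        "companyId": "-1",
        "employerSizes": "0",
        "applicationType": "0",
        "remoteWorkType": "0",
    }
    # quote_via=quote encodes spaces as %20, matching the style of the original URL.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


if __name__ == "__main__":
    # Print the URL so it can be opened by hand in a browser.
    print(build_search_url("data scientist"))
```

If the printed URL shows no job listings when opened by hand, the problem is in the URL or on the site itself rather than in the element lookups in `get_jobs`.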