I am trying to scrape Glassdoor using the code provided here:
https://github.com/PlayingNumbers/ds_salary_proj/blob/master/glassdoor_scraper.py
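When I run the code there are no errors and the website opens, but then nothing happens. I think they have changed the tags on the website. I tried changing the tags, but it still does not work.
Here is the code snippet: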
def get_jobs(keyword, num_jobs, verbose, path, slp_time):
    options = webdriver.ChromeOptions()
    driver = webdriver.Chrome(executable_path=path, options=options)
    driver.set_window_size(1120, 1000)
    url = 'https://www.glassdoor.com/Job/jobs.htm?sc.keyword="' + keyword + '"&locT=C&locId=1147401&locKeyword=San%20Francisco,%20CA&jobType=all&fromAge=-1&minSalary=0&includeNoSalaryJobs=true&radius=100&cityId=-1&minRating=0.0&industryId=-1&sgocId=-1&seniorityType=all&companyId=-1&employerSizes=0&applicationType=0&remoteWorkType=0'
    driver.get(url)
    jobs = []
    while len(jobs) < num_jobs:
        time.sleep(slp_time)
        try:
            driver.find_element_by_class_name("selected").click()
        except ElementClickInterceptedException:
            pass
        time.sleep(.1)
        try:
            driver.find_element_by_css_selector('[alt="Close"]').click()
            print(' x out worked')
        except NoSuchElementException:
            print('x out failed')
            pass
You can find the complete code at the link given above.
Any help would be much appreciated!
Answer 0 (score: 0)
Can you check the generated URL?
url = 'https://www.glassdoor.com/Job/jobs.htm?sc.keyword="' + keyword + '"&locT=C&locId=1147401&locKeyword=San%20Francisco,%20CA&jobType=all&fromAge=-1&minSalary=0&includeNoSalaryJobs=true&radius=100&cityId=-1&minRating=0.0&industryId=-1&sgocId=-1&seniorityType=all&companyId=-1&employerSizes=0&applicationType=0&remoteWorkType=0'
Open it manually in a browser and verify whether results are actually shown on the page.
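One way to check the generated URL is to build it outside the scraper, print it, and paste it into a browser. The sketch below is only an illustration: `build_search_url` is a hypothetical helper that does not exist in the linked script, it uses only a few of the query parameters from the question, and it makes no claim about whether Glassdoor still honors them. It keeps the literal double quotes around the keyword, as the original string concatenation does, but percent-encodes them.

```python
from urllib.parse import parse_qs, quote, urlencode, urlsplit

BASE_URL = "https://www.glassdoor.com/Job/jobs.htm"


def build_search_url(keyword):
    """Hypothetical helper: build the search URL for a keyword.

    The keyword is wrapped in literal double quotes, mirroring the
    concatenation in the question. Only a subset of the original
    query parameters is included here.
    """
    params = {
        "sc.keyword": '"' + keyword + '"',
        "locT": "C",
        "locId": "1147401",
        "locKeyword": "San Francisco, CA",
        "jobType": "all",
    }
    # quote_via=quote encodes spaces as %20, as in the original URL.
    return BASE_URL + "?" + urlencode(params, quote_via=quote)


url = build_search_url("data scientist")
print(url)  # paste this into a browser and check the results by hand

# Decode the query string again to confirm what keyword is being sent.
sent_keyword = parse_qs(urlsplit(url).query)["sc.keyword"]
print(sent_keyword)
```

If the page opened by hand shows no job listings, the problem is in the URL rather than in the element selectors. If it does show listings, the selectors used in the loop are the more likely cause.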