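I have written a program that picks up the job title, company, location, salary, and description from the Python jobs page for FL. I'm having trouble getting the job link to work, and with using the next-page link to move to the next page of results.
I have tried different tags from the web page, tutorials, and other similar solutions on here, but none of them seem to work with my code. The code I have written at the bottom for the next-page link does not raise any error, but it gives no output either.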
from bs4 import BeautifulSoup
import requests

url = 'https://www.indeed.com/jobs?q=python+developer&l=Florida&explvl=entry_level&sort=date'

while True:
    response = requests.get(url)
    data = response.text
    soup = BeautifulSoup(data, 'html.parser')
    jobs = soup.find_all(attrs={"data-tn-component": "organicJob"})
    number = 0
    for job in jobs:
        number += 1
        title = job.find("div", {"class": "title"})
        print(number, ". ", "\nTitle:", title.text.strip())
        company = job.find("span", {"class": "company"})
        print("Company:", company.text.strip())
        location = job.find("span", {"class": "location"})
        print("Location:", location.text.strip())
        salary = job.find("div", {"class": "salarySnippet"})
        if salary in job:
            print("Salary:", salary.text.strip())
        else:
            print("Salary:", None)
        summary = job.find("div", {"class": "summary"})
        print("Summary:", summary.text.strip())
        link = [a['href'] for a in job.find_all('a', href=True) if a.text]
        print("link:", link)
        #link = job.find("a", attrs = {"class":"jobtitleturnstileLink"}).get("href")
        #print("Job link:", link)
        print()
        print()
    url_tag = soup.find("span", {"class": "np"})
    if url_tag.get("href"):
        url = 'https://www.indeed.com/jobs?q=python+developer' + url_tag.get("href")
        print("Next page:", url)
    else:
        break
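One likely reason the next-page block prints nothing: `soup.find("span", {"class": "np"})` returns a `<span>`, and a span normally carries no `href`, so `url_tag.get("href")` is `None` and the loop breaks after the first page. Below is a minimal sketch of one way around that. It assumes the span sits inside the `<a>` that holds the link and that the span's text contains the word "Next"; both are guesses about the page markup, and the sample HTML is made up for illustration. `next_page_url` and `BASE_URL` are hypothetical names, not part of the code above.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Site root, taken from the URL used in the question.
BASE_URL = "https://www.indeed.com"


def next_page_url(soup, base_url=BASE_URL):
    """Return the absolute URL of the next results page, or None."""
    # Assumption: pagination labels are <span class="np"> elements and the
    # one pointing forward contains the word "Next".
    for span in soup.find_all("span", class_="np"):
        if "Next" not in span.get_text():
            continue
        # The href lives on the enclosing <a>, not on the span itself.
        anchor = span.find_parent("a", href=True)
        if anchor is not None:
            # Resolve the relative href against the site root.
            return urljoin(base_url, anchor["href"])
    return None


# Made-up pagination markup, only to exercise the function.
sample_html = """
<div class="pagination">
  <a href="/jobs?q=python+developer&amp;start=0"><span class="np">&laquo; Previous</span></a>
  <a href="/jobs?q=python+developer&amp;start=10"><span class="np">Next &raquo;</span></a>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")
print("Next page:", next_page_url(soup))
```

Inside the `while True:` loop this could replace the last block: assign `url = next_page_url(soup)` and `break` when it returns `None`. Using `urljoin` with the site root also avoids gluing a relative path onto a URL that already has a query string.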
The commented-out link lines give me the same output. Link:
['/rc/clk?jk=441b435e5b088904&fccid=194a64c267cc7ab6&vjs=3', '/cmp/Foreground-Security', '/cmp/Foreground-Security/reviews', '#', '#', '/q-Foreground-Security-l-Melbourne,-FL-jobs.html', '/l-Melbourne,-FL-jobs.html', '/salaries/Test-Engineer-Salaries,-Melbourne-FL', '/cmp/Foreground-Security', '/forum/cmp/Foreground-Security.html', '/forum/loc/Melbourne-Florida.html']
This is what is currently returned as the link:
/rc/clk?jk=441b435e5b088904&fccid=194a64c267cc7ab6&vjs=3
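The value above is a relative path, so it will not open on its own; it has to be resolved against the site root. The list comprehension also collects every anchor inside the job card, not only the job link. Here is a rough sketch of picking a single link and making it absolute. It assumes the job anchor is the first `<a>` inside the `div` with class `title`, which matches the first entry of the list above but is still a guess about the markup. `job_link` and `BASE_URL` are hypothetical names, and the sample card is made up.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Site root, taken from the URL used in the question.
BASE_URL = "https://www.indeed.com"


def job_link(job, base_url=BASE_URL):
    """Return the absolute URL of the job posting, or None if not found."""
    # Assumption: the posting link is the first anchor inside the title div.
    title = job.find("div", {"class": "title"})
    anchor = title.find("a", href=True) if title else None
    if anchor is None:
        return None
    # urljoin turns "/rc/clk?..." into "https://www.indeed.com/rc/clk?...".
    return urljoin(base_url, anchor["href"])


# Made-up job card, only to exercise the function.
sample_card = """
<div data-tn-component="organicJob">
  <div class="title">
    <a href="/rc/clk?jk=441b435e5b088904&amp;fccid=194a64c267cc7ab6&amp;vjs=3">Test Engineer</a>
  </div>
  <span class="company"><a href="/cmp/Foreground-Security">Foreground Security</a></span>
</div>
"""

card = BeautifulSoup(sample_card, "html.parser").find(
    attrs={"data-tn-component": "organicJob"}
)
print("Job link:", job_link(card))
```

In the scraping loop this would be called as `job_link(job)` in place of the list comprehension.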