I am scraping the information as usual, but I want the output in key/value format. For example:
{'Base pay':'$140,000.00 - $160,000.00 /Year'},
{'Employment Type':'Full-Time'},
{'Job Type':'Information Technology, Engineering, Professional Services'}
Here is my code:
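from bs4 import BeautifulSoup
import urllib2  # Python 2: urlopen is provided by urllib2, not urllib

website = 'http://www.careerbuilder.com/jobseeker/jobs/jobdetails.aspx?APath=2.21.0.0.0&job_did=J3H7FW656RR51CLG5HC&showNewJDP=yes&IPath=RSKV'
html = urllib2.urlopen(website).read()
# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, 'html.parser')
# Print the text of the job snapshot section
for elm in soup.find_all('section', {"id": "job-snapshot-section"}):
    dn = elm.get_text()
    print dn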
Here is the output of my code:
Job Snapshot
Base Pay
$140,000.00 - $160,000.00 /Year
Employment Type
Full-Time
Job Type
Information Technology, Engineering, Professional Services
Education
4 Year Degree
Experience
At least 5 year(s)
Manages Others
Not Specified
Relocation
No
Industry
Computer Software, Banking - Financial Services, Biotechnology
Required Travel
Not Specified
Job ID
EE-1213256
I have edited the code as requested to include the required library imports.
Answer 0 (score: 1)
I suggest the following, where text is the string returned by get_text() (dn in your code):
dict(i.strip().split('\n') for i in text.split('\n\n') if len(i.strip().split('\n')) == 2)
Output:
{'Job ID': 'EE-1213256',
'Manages Others': 'Not Specified',
'Job Type': 'Information Technology, Engineering, Professional Services',
'Relocation': 'No',
'Education': '4 Year Degree',
'Base Pay': '$140,000.00 - $160,000.00 /Year',
'Experience': 'At least 5 year(s)',
'Industry': 'Computer Software, Banking - Financial Services, Biotechnology',
'Employment Type': 'Full-Time',
'Required Travel': 'Not Specified'}
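
The one-liner above can be harder to read than it needs to be, so here is a rough step-by-step sketch of the same idea. It assumes, as the one-liner does, that each label/value pair is separated from the next by a blank line. The function name parse_snapshot and the sample string are made up for illustration and are not taken from the real page.

```python
def parse_snapshot(text):
    """Turn 'label\\nvalue' blocks separated by blank lines into a dict."""
    result = {}
    for block in text.split('\n\n'):
        # Drop surrounding whitespace, then split the block into its lines.
        lines = block.strip().split('\n')
        # Keep only blocks that are exactly a label line plus a value line;
        # this skips the 'Job Snapshot' heading and any empty blocks.
        if len(lines) == 2:
            label, value = lines
            result[label] = value
    return result


# Hypothetical sample shaped like the get_text() output in the question.
sample = (
    "Job Snapshot\n\n"
    "Base Pay\n$140,000.00 - $160,000.00 /Year\n\n"
    "Employment Type\nFull-Time\n\n"
    "Job Type\nInformation Technology, Engineering, Professional Services\n"
)

snapshot = parse_snapshot(sample)
# One single-key dict per field, matching the format asked for in the question.
pairs = [{label: value} for label, value in snapshot.items()]
print(pairs)
```

If the real text uses different spacing between fields, the split on '\n\n' would need adjusting. Passing a separator to get_text() may help make the spacing predictable.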