Question

I'm retrieving the information as usual, but I want the output in key/value format. For example:

{'Base pay':'$140,000.00 - $160,000.00 /Year'},
{'Employment Type':'Full-Time'},
{'Job Type':'Information Technology,  Engineering,  Professional Services'}

Here is my code:

from bs4 import BeautifulSoup
import urllib2  # Python 2 module providing urlopen()

website = 'http://www.careerbuilder.com/jobseeker/jobs/jobdetails.aspx?APath=2.21.0.0.0&job_did=J3H7FW656RR51CLG5HC&showNewJDP=yes&IPath=RSKV'
html = urllib2.urlopen(website).read()
soup = BeautifulSoup(html)
# Grab the text of the "Job Snapshot" section
for elm in soup.find_all('section', {"id": "job-snapshot-section"}):
    dn = elm.get_text()
print dn

Here is the output of my code:

Job Snapshot


Base Pay
$140,000.00 - $160,000.00 /Year


Employment Type
Full-Time


Job Type
Information Technology,  Engineering,  Professional Services


Education
4 Year Degree


Experience
At least 5 year(s)


Manages Others
Not Specified


Relocation
No


Industry
Computer Software, Banking - Financial Services, Biotechnology


Required Travel
Not Specified


Job ID
EE-1213256

I have edited the code as requested to include the required library imports.

Answer 1

I suggest the following, where text is the string returned by get_text() (dn in your code):

dict(i.strip().split('\n') for i in text.split('\n\n') if len(i.strip().split('\n')) == 2)

Output:

{'Job ID': 'EE-1213256', 
 'Manages Others': 'Not Specified', 
 'Job Type': 'Information Technology,  Engineering,  Professional Services', 
 'Relocation': 'No', 
 'Education': '4 Year Degree', 
 'Base Pay': '$140,000.00 - $160,000.00 /Year', 
 'Experience': 'At least 5 year(s)', 
 'Industry': 'Computer Software, Banking - Financial Services, Biotechnology', 
 'Employment Type': 'Full-Time', 
 'Required Travel': 'Not Specified'}
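
If the one-liner is hard to read, the same idea can be unpacked into a small function. This is only a sketch in Python 3 syntax, not the answer's exact code: parse_snapshot and sample are hypothetical names, and sample is a shortened copy of the output shown in the question. It assumes each field is a label line followed by a value line, with blank lines between fields.

```python
def parse_snapshot(text):
    # Split the text into blocks at blank lines, then keep only blocks
    # that are exactly two lines long: a label followed by its value.
    result = {}
    for block in text.split('\n\n'):
        lines = block.strip().split('\n')
        # The "Job Snapshot" heading has one line, so it is skipped here.
        if len(lines) == 2:
            label, value = lines
            result[label] = value
    return result


# Shortened sample of the text printed in the question (assumed layout).
sample = (
    "Job Snapshot\n\n\n"
    "Base Pay\n$140,000.00 - $160,000.00 /Year\n\n\n"
    "Employment Type\nFull-Time\n\n\n"
    "Job ID\nEE-1213256\n"
)

snapshot = parse_snapshot(sample)

# One single-key dict per field, matching the format asked for in the question.
as_list = [{label: value} for label, value in snapshot.items()]

print(snapshot)
print(as_list)
```

The last step matters only if you really need a separate dict per field, as in the question's example; for lookups by label, the single dict is usually more convenient.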

Get scraped information in dictionary format
