I'm trying to extract the text from tags whose href contains a particular string. Here is part of my sample code:
Experience = soup.find_all(id='background-experience-container')
Exp = {}
for element in Experience:
    Exp['Experience'] = {}
for element in Experience:
    role = element.find(href=re.compile("title").get_text()
    Exp['Experience']["Role"] = role
for element in Experience:
    company = element.find(href=re.compile("exp-company-name").get_text()
    Exp['Experience']['Company'] = company
It doesn't like the syntax I use to define Exp['outer_key']['inner_key'] = value and returns a SyntaxError.
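I'm trying to build a Dict.dict (a dictionary of dictionaries) that holds the role and company information. I also plan to look at the dates for each entry, but I haven't got that far yet.
Can anyone spot any obvious mistakes in my code?
Any help is much appreciated!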
Answer 0 (score: 1)
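find_all can return multiple values (even when you search by id), so it is better to keep all of them in a list: Exp = []. The SyntaxError itself comes from the missing closing parenthesis after re.compile(...) in the find calls, not from the nested-key assignment, which is valid Python.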
import re  # needed for re.compile; soup is the BeautifulSoup object from the question

Experience = soup.find_all(id='background-experience-container')
# create empty list
Exp = []
for element in Experience:
    # create empty dictionary
    dic = {}
    # add elements to dictionary
    dic['Role'] = element.find(href=re.compile("title")).get_text()
    dic['Company'] = element.find(href=re.compile("exp-company-name")).get_text()
    # add dictionary to list
    Exp.append(dic)
# display (Exp[1] exists only if find_all returned at least two elements)
print(Exp[0]['Role'])
print(Exp[0]['Company'])
print(Exp[1]['Role'])
print(Exp[1]['Company'])
# or
for x in Exp:
    print(x['Role'])
    print(x['Company'])
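To see why a list is the safer container here, the sketch below compares the two approaches on made-up data. The rows list and its values are hypothetical stand-ins for the text that find and get_text might yield for each matched element. They are not taken from the original page.

```python
# Hypothetical stand-in for the (role, company) text pulled from each matched element.
rows = [("Engineer", "Acme"), ("Manager", "Globex")]

# Single nested dict: every iteration writes to the same key,
# so each entry overwrites the previous one.
as_dict = {}
for role, company in rows:
    as_dict["Experience"] = {"Role": role, "Company": company}

# List of dicts: every entry is appended and kept.
as_list = []
for role, company in rows:
    as_list.append({"Role": role, "Company": company})

print(as_dict["Experience"])  # holds only the last entry
print(len(as_list))           # one dictionary per row
```

With a single 'Experience' key only the last element survives the loop, which is acceptable only when find_all is known to return exactly one element.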
If you are sure that find_all gives you only one element (and you need the key 'Experience'), then you can do
Experience = soup.find_all(id='background-experience-container')
# create main dictionary
Exp = {}
for element in Experience:
    # create empty dictionary
    dic = {}
    # add elements to dictionary
    dic['Role'] = element.find(href=re.compile("title")).get_text()
    dic['Company'] = element.find(href=re.compile("exp-company-name")).get_text()
    # add dictionary to main dictionary
    Exp['Experience'] = dic
# display
print(Exp['Experience']['Role'])
print(Exp['Experience']['Company'])
or
Experience = soup.find_all(id='background-experience-container')
# create main dictionary
Exp = {}
for element in Experience:
    # build the inner dictionary in one literal (note the comma between entries)
    Exp['Experience'] = {
        'Role': element.find(href=re.compile("title")).get_text(),
        'Company': element.find(href=re.compile("exp-company-name")).get_text()
    }
# display
print(Exp['Experience']['Role'])
print(Exp['Experience']['Company'])
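One thing the snippets above do not guard against: find returns None when nothing matches, so chaining .get_text() onto it would raise AttributeError. Below is a minimal self-contained sketch, assuming the beautifulsoup4 package is installed. The HTML snippet, the helper name text_of, and the 'Missing' key are made up for illustration and do not come from the original page.

```python
import re
from bs4 import BeautifulSoup

# Made-up markup for illustration; the real page structure is an assumption.
html = """
<div id="background-experience-container">
  <a href="/search?title=Engineer">Engineer</a>
  <a href="/company?exp-company-name=Acme">Acme</a>
</div>
"""

def text_of(element, pattern):
    """Return the text of the first tag whose href matches pattern, or None."""
    tag = element.find(href=re.compile(pattern))
    return tag.get_text(strip=True) if tag is not None else None

soup = BeautifulSoup(html, "html.parser")
Exp = []
for element in soup.find_all(id="background-experience-container"):
    Exp.append({
        "Role": text_of(element, "title"),
        "Company": text_of(element, "exp-company-name"),
        "Missing": text_of(element, "no-such-link"),  # no matching href, so None
    })

print(Exp)
```

Returning None for a missing link keeps the loop running on incomplete entries. Whether a placeholder or an exception is preferable depends on how the data is used afterwards.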