我正在执行以下代码,这些代码将在value =
之后打印文本soup = BeautifulSoup(html, 'lxml')
name = soup.find('input')['value']
print(name)
但是页面上有多个具有相同类的div,我尝试了findAll,但出现错误,并且只能打印第一个字段值,即Name。
请参阅所附的屏幕截图
<div class="control-group"><label class="control-label required" for="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe">What date do you want to make a decision?</label>
<div class="controls"><input type="text" id="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe" name="client_appbundle_prospecttype[ProspectFirstContact][decision_timeframe]" required="required" class="input-small text-bound datepicker hasDatepicker"></div>
</div>
</div>
</div>
</div>
</div>
<div class="tab-pane active" id="prospect_consultation">
<div class="widget row-fluid">
<div class="span12">
<div class="navbar">
<div class="navbar-inner">
<h6>Personal details</h6>
</div>
</div>
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
<div class="control-group">
<label class="control-label">Email address</label>
<div class="controls">
test@test.com
</div>
</div>
谢谢!
答案 0 :(得分:0)
也许像这样:
from bs4 import BeautifulSoup
html = '''
<html>
<head></head>
<body>
<div class="control-group">
<label class="control-label required" for="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe">What date do you want to make a decision?</label>
<div class="controls">
<input type="text" id="client_appbundle_prospecttype_ProspectFirstContact_decision_timeframe" name="client_appbundle_prospecttype[ProspectFirstContact][decision_timeframe]" required class="input-small text-bound datepicker hasDatepicker">
</div>
</div>
<div class="tab-pane active" id="prospect_consultation">
<div class="widget row-fluid">
<div class="span12">
<div class="navbar">
<div class="navbar-inner">
<h6>Personal details</h6>
</div>
</div>
<div class="well">
<div class="control-group">
<label class="control-label">Name</label>
<div class="controls">
Sam Test-March 2018
</div>
</div>
<div class="control-group">
<label class="control-label">Address and postcode</label>
<div class="controls">
</div>
</div>
<div class="control-group">
<label class="control-label">Mobile number</label>
<div class="controls">
12345678
</div>
</div>
<div class="control-group">
<label class="control-label">Email address</label>
<div class="controls">
test@test.com
</div>
</div>
</div>
</div>
</div>
</div>
</body>
</html>
'''
soup = BeautifulSoup(html, "lxml")
items = soup.select('.controls')
print([item.text.strip() for item in items if item.text.strip()])