On my HTML page, I have a dropdown list:
<select name="somelist">
<option value="234234234239393">Some Text</option>
</select>
To get this list, I am doing:
ddl = soup.findAll('select', name="somelist")
if ddl:
    ???
Now I need help with this collection/dictionary: I want to be able to look up an entry by both 'Some Text' and 234234234239393.
Is this possible?
Answer 0 (score: 5)
Try the following to get started:
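# assumption: BeautifulSoup 4 (the bs4 package); the original answer did not show its import
from bs4 import BeautifulSoup

# named html rather than str, to avoid shadowing the built-in str type
html = r'''
<select name="somelist">
<option value="234234234239393">Some Text</option>
<option value="42">Other text</option>
</select>
'''
soup = BeautifulSoup(html, 'html.parser')
# pass name through attrs, because findAll's first positional parameter is itself called name
select_node = soup.findAll('select', attrs={'name': 'somelist'})
if select_node:
    for option in select_node[0].findAll('option'):
        print(option)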
This prints the option nodes:
<option value="234234234239393">Some Text</option>
<option value="42">Other text</option>
Now, for each option, option['value'] is the value attribute, and option.text is the text inside the tag ("Some Text").
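Since the question asks for lookups in both directions, here is a minimal sketch of building two plain dictionaries from the option nodes. It assumes BeautifulSoup 4 (the bs4 package) and the same sample markup; the names value_to_text and text_to_value are made up for illustration and are not part of any library.

```python
from bs4 import BeautifulSoup  # assumption: BeautifulSoup 4 is installed

html = '''
<select name="somelist">
<option value="234234234239393">Some Text</option>
<option value="42">Other text</option>
</select>
'''

soup = BeautifulSoup(html, 'html.parser')
select_node = soup.find('select', attrs={'name': 'somelist'})

# Hypothetical names, for illustration only.
value_to_text = {}
text_to_value = {}
if select_node is not None:
    for option in select_node.find_all('option'):
        value = option['value']              # attribute values are strings
        text = option.get_text(strip=True)   # visible text of the option
        value_to_text[value] = text
        text_to_value[text] = value

print(value_to_text['234234234239393'])
print(text_to_value['Some Text'])
```

Note that the value is stored as a string, so the lookup key is '234234234239393' rather than an integer. If two options share the same text, the later one overwrites the earlier one in text_to_value, so this only works cleanly when the texts are unique.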
Answer 1 (score: 1)
Here is one way:
ddl_list = soup.findAll('select', attrs={'name': 'somelist'})
if ddl_list:
    ddl = ddl_list[0]
    # find the option by value=234234234239393
    opt = ddl.findChild('option', attrs={'value': '234234234239393'})
    if opt:
        pass  # do something
    # this list will hold all "option" elements matching 'Some Text'
    # (loop variable named o so it cannot clobber opt above)
    opt_list = [o for o in ddl.findChildren('option') if o.string == u'Some Text']
    if opt_list:
        opt2 = opt_list[0]
        # do something
Answer 2 (score: 0)
Once again, just to show how this can be done with pyparsing:
from pyparsing import makeHTMLTags, Group, SkipTo, withAttribute, OneOrMore

html = r'''
<select name="somelist">
<option value="234234234239393">Some Text</option>
<option value="42">Other text</option>
</select>
'''
select, selectEnd = makeHTMLTags("SELECT")
option, optionEnd = makeHTMLTags("OPTION")
# one entry: the opening <option> tag, the text up to </option>, then the closing tag
optionEntry = Group(option("option") +
                    SkipTo(optionEnd)("menutext") +
                    optionEnd)
# only match a <select> whose name attribute is "somelist"
somelistSelect = (select.setParseAction(withAttribute(name="somelist")) +
                  OneOrMore(optionEntry)("options") +
                  selectEnd)
# scanString yields (tokens, start, end) tuples; take the first match
t, _, _ = next(somelistSelect.scanString(html))
for op in t.options:
    print('%s -> %s' % (op.menutext, op.option.value))
Prints:
Some Text -> 234234234239393
Other text -> 42