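Requests and BeautifulSoup: parsing values from a select tag

Time: 2018-08-03 08:02:17

Tags: python parsing beautifulsoup python-requests

I want to modify existing code that currently parses td elements from a web page's source, so that it instead checks whether the value attribute of an option equals a given number. Let me explain.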

import requests, re, collections
from bs4 import BeautifulSoup

def get_content(url):
    if not isinstance(url, str):
        print('You need to include a string')
        exit()
    else:
        headers = {
            'User-Agent': 'Mozilla/5.0 (Windows NT 6.3; rv:36.0) Gecko/20100101 Firefox/36.0',
            'From': 'user@umbc.edu'  
        }
        req  = requests.get(url,headers=headers)
        soup = BeautifulSoup(req.content, 'html.parser')
        current_month=soup.find_all('td', {'id': 'monatevent'})
        fwk_nextmonth=soup.find_all('td', {'id': 'aevent'})
        curr_month = []
        fwk_next_month = []

I now want to parse the following select/option markup so that an alert is raised when an option has value=08, for example:

<select name="month" onchange="submit()">
<option value="09" selected="">09</option>
<option value="10">10</option><option value="11">11</option>
</select>

If I use the following selector in my code, nothing is returned: current_month = soup.find_all('select',{'option':'08'})

Can anyone help? Thanks.

2 Answers:

Answer 0: (score: 1)

You can use the CSS selectors built into BeautifulSoup. The selector `option[selected]` will find the <option> tag that has the attribute selected:

data = """<select name="month" onchange="submit()">
<option value="09" selected="">09</option>
<option value="10">10</option><option value="11">11</option>
</select>"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'lxml')

print(soup.select_one('option[selected]').text)

Prints:

09

If you want to find the option with value=08, you can use the CSS selector `option[value="08"]`. The value is quoted because an unquoted CSS attribute value must be an identifier, which cannot start with a digit, so stricter selector engines may reject `option[value=08]`:

data = """<select name="month" onchange="submit()">
<option value="08">08</option>
<option value="09" selected="">09</option>
<option value="10">10</option><option value="11">11</option>
</select>"""

from bs4 import BeautifulSoup

soup = BeautifulSoup(data, 'lxml')

print(soup.select_one('option[value="08"]'))

Prints:

<option value="08">08</option>
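As for why the selector in the question returns nothing: the dictionary passed to find_all filters on attributes of the tag being searched for, and a <select> tag has no attribute named "option". A minimal sketch of the difference follows; it uses 'html.parser' only so that no extra parser needs to be installed, and the variable names are illustrative.

```python
from bs4 import BeautifulSoup

html = """<select name="month" onchange="submit()">
<option value="08">08</option>
<option value="09" selected="">09</option>
<option value="10">10</option><option value="11">11</option>
</select>"""

soup = BeautifulSoup(html, 'html.parser')

# The attrs dict is matched against the <select> tag's own attributes.
# <select> has "name" and "onchange" but no "option" attribute, so this is empty.
no_match = soup.find_all('select', {'option': '08'})

# Searching for <option> tags and filtering on their "value" attribute matches.
match = soup.find_all('option', {'value': '08'})

print(no_match)  # []
print(match)     # [<option value="08">08</option>]
```

This is the find_all equivalent of the CSS attribute selector used above.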

Answer 1: (score: 0)

I think you are trying to find the value of the selected option in an HTML string. Maybe this will help.

from bs4 import BeautifulSoup

html_str = """<select name="month" onchange="submit()">
<option value="09" selected="">09</option>
<option value="10">10</option><option value="11">11</option>
</select>"""

soup = BeautifulSoup(html_str, 'html.parser')

select = soup.find('select')
for option in select.find_all('option'):
    if option.has_attr('selected'):
        print('Value:', option.get('value'))
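The same loop can be turned into the alert the question asks for. The sketch below is one possible way to do it, not part of the original answer: the helper names option_values and month_is_listed are hypothetical, and it assumes the select is identified by name="month" as in the question's markup.

```python
from bs4 import BeautifulSoup


def option_values(html, select_name):
    """Return the value attribute of every <option> under the named <select>."""
    soup = BeautifulSoup(html, 'html.parser')
    select = soup.find('select', {'name': select_name})
    if select is None:
        # No such <select> on the page, so there are no values to report.
        return []
    return [option.get('value') for option in select.find_all('option')]


def month_is_listed(html, month, select_name='month'):
    """Return True if an <option> with the given value exists (hypothetical helper)."""
    return month in option_values(html, select_name)


html_str = """<select name="month" onchange="submit()">
<option value="09" selected="">09</option>
<option value="10">10</option><option value="11">11</option>
</select>"""

# Values are compared as strings, so '08' keeps its leading zero.
if month_is_listed(html_str, '08'):
    print('Alert: month 08 is listed')
else:
    print('Month 08 is not listed')
```

With the sample markup, which only lists 09, 10 and 11, the else branch runs. In the question's get_content function, req.content could be passed in place of html_str.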