Selecting a value from a drop-down list with beautifulsoup4

Date: 2017-11-03 02:20:08

Tags: python html web-scraping beautifulsoup

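I'm trying to work with this drop-down list in BeautifulSoup4, and I can't find a BS4 function that inserts "selected" in the right place. The list looks like this: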

 <select name="sport" id="sport" onchange="mask('Processing'); changeSports(this.value);">
  <option value="">Select Sport</option>
    <option value="MBA" >Baseball</option>
    <option value="MBB" >Men&#x27;s Basketball</option>
    <option value="MFB" >Football</option>
    <option value="MIH" >Men&#x27;s Ice Hockey</option>
    <option value="MLA" >Men&#x27;s Lacrosse</option>
    <option value="MSO" >Men&#x27;s Soccer</option>
    <option value="MTE" >Men&#x27;s Tennis</option>
    <option value="MVB" >Men&#x27;s Volleyball</option>
    <option value="WBB" >Women&#x27;s Basketball</option>
    <option value="WBW" >Women&#x27;s Bowling</option>
    <option value="WFH" >Field Hockey</option>
    <option value="WIH" >Women&#x27;s Ice Hockey</option>
    <option value="WLA" >Women&#x27;s Lacrosse</option>
    <option value="WSB" selected>Softball</option>
    <option value="WSO" >Women&#x27;s Soccer</option>
    <option value="WSV" >Women&#x27;s Beach Volleyball</option>
    <option value="WTE" >Women&#x27;s Tennis</option>
    <option value="WVB" >Women&#x27;s Volleyball</option>
</select>

I have been trying to insert "selected" into

<option value="WSB" >Softball</option>

using this Python code:

from bs4 import BeautifulSoup,NavigableString
import requests
headers = {'User-Agent': 'Mozilla/5.0'}
url = 'http://stats.ncaa.org/rankings/ranking_summary'
page = requests.get(url,headers=headers)
soup = BeautifulSoup(page.content, "html.parser")
sport = soup.find(value="WSB")
sport.insert(0,"selected")
print(sport)

But this produces the following result:

<option value="WSB">selectedSoftball</option>

I don't know HTML very well, so I'm having trouble figuring out where to look for a solution. Any suggestions would be much appreciated.

1 answer:

Answer 0: (score: 0)

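In BeautifulSoup4, a tag's attributes are stored in a dictionary-like way. To modify the selected attribute of an <option>, assign to it by key on the tag object, the same way you would set a dictionary entry. Calling insert() adds a child node instead, which is why "selected" showed up as text inside the element.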
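A minimal sketch of that idea follows. It is an illustration, not necessarily the exact snippet the answer originally contained. It assumes a trimmed, hard-coded copy of the question's markup in place of the live page (with Football pre-selected for illustration), and it assumes you also want to clear any existing selection so that only one option stays marked.

```python
from bs4 import BeautifulSoup

# Trimmed stand-in for the page from the question (assumption: the real
# page is fetched with requests; a literal string keeps this self-contained).
html = """
<select name="sport" id="sport">
  <option value="">Select Sport</option>
  <option value="MFB" selected>Football</option>
  <option value="WSB">Softball</option>
</select>
"""

soup = BeautifulSoup(html, "html.parser")

# Remove the attribute from whichever option currently carries it.
for option in soup.find_all("option"):
    if option.has_attr("selected"):
        del option["selected"]

# Attributes behave like dictionary entries on the tag, so setting the key
# adds the attribute itself rather than a text child.
sport = soup.find("option", value="WSB")
sport["selected"] = ""

print(sport)
```

Note that this only changes the parsed copy of the document in memory. It does not submit anything to the site, so the server will not return a different page because of it.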