Scraping a table from each option of a drop-down menu in Python

Time: 2019-10-02 13:53:42

Tags: python python-3.x web-scraping drop-down-menu beautifulsoup

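I am trying to scrape all the data from this website:
http://www.dartsdatabase.co.uk/PlayerStats.aspx?statKey=1&pg=7

However, I can't work out how to iterate through the 'stat' drop-down menu. Each of its options brings up a table that I need to scrape.

So far I have the following code, which lists the label and the value of every option in the drop-down: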

import requests
from bs4 import BeautifulSoup

url = 'http://www.dartsdatabase.co.uk/PlayerStats.aspx'
response = requests.get(url).text
soup = BeautifulSoup(response, "lxml")

# Every <option> inside the 'stat' drop-down
drop = soup.find('select', {'name': 'stat'}).find_all("option")

options = []  # visible label of each option
val = []      # value attribute of each option
for option in drop:
    options.append(option.text)
    val.append(option['value'])

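Any help would be greatly appreciated!

1 Answer:

Answer 0: (score: 2)

Make POST requests, changing the stat parameter each time. You can collect the appropriate values from the value attribute of the options on the page.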

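from io import StringIO

import requests
import pandas as pd
from bs4 import BeautifulSoup as bs

URL = 'http://www.dartsdatabase.co.uk/PlayerStats.aspx?statKey=1&pg=7'

# Form fields sent with every POST; 'stat' selects the drop-down option
data = {
  'nameSearch': '',
  'dateFrom': '02/10/2017',
  'dateTo': '02/10/2019',
  'organStat': 'All',
  'stat': '1',
  'tourns': 'All',
  'pg': '7'
}

def get_soup(session):
    # POST the current form data and parse the page that comes back
    r = session.post(URL, data=data)
    return bs(r.content, 'lxml')

def get_table(soup):
    # The stats table is the <table> that directly follows a <br>.
    # StringIO avoids the warning newer pandas versions give for literal HTML strings.
    return pd.read_html(StringIO(str(soup.select_one('br + table'))))[0]

with requests.Session() as s:
    soup = get_soup(s)
    print(get_table(soup))

    # Values of the remaining options; [1:] skips the first one,
    # which the request above already covered with stat='1'
    stats = [i['value'] for i in soup.select('[name="stat"] option')][1:]

    for i in stats:
        data['stat'] = i
        print(get_table(get_soup(s)))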
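
The loop above changes one shared `data` dict in place and only prints each table. If you would rather keep every result, keyed by the option value that produced it, something along these lines might work. It is only a sketch: `build_form`, `payloads` and `collect` are names made up for this example, the `dd/mm/yyyy` date format is an assumption based on the dates in the form above, and `fetch` stands for whatever function actually sends the form (nothing here touches the network).

```python
from datetime import date


def build_form(stat, date_from, date_to, page="7"):
    """Return a fresh form dict for one 'stat' option.

    Field names are copied from the form above; the dd/mm/yyyy date
    format is an assumption, not something the site documents.
    """
    return {
        "nameSearch": "",
        "dateFrom": date_from.strftime("%d/%m/%Y"),
        "dateTo": date_to.strftime("%d/%m/%Y"),
        "organStat": "All",
        "stat": stat,
        "tourns": "All",
        "pg": page,
    }


def payloads(stat_values, date_from, date_to):
    """Yield (stat value, form dict) pairs, one new dict per option."""
    for value in stat_values:
        yield value, build_form(value, date_from, date_to)


def collect(fetch, stat_values, date_from, date_to):
    """Call fetch(form) once per option and keep the results by stat value.

    `fetch` is a hypothetical callable supplied by the caller.
    """
    return {
        value: fetch(form)
        for value, form in payloads(stat_values, date_from, date_to)
    }
```

With the session from the answer, `fetch` could be something like `lambda form: get_table(bs(s.post(URL, data=form).content, 'lxml'))`, and `stat_values` the list of option values read from the page. Passing the fetch function in also makes the looping logic easy to check with a fake before hitting the real site.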