Getting doctor links by submitting a POST request with Python and BeautifulSoup

Time: 2018-12-11 14:33:09

Tags: python beautifulsoup

import requests
from bs4 import BeautifulSoup

try:
    for count in range(123401, 123405):
        ctl00_RightContetHolder_TextBox1 = count

        r = requests.post('http://karnatakamedicalcouncil.com/RenewalReport.aspx',
                          data={'ctl00_RightContetHolder_TextBox1': count, 'Search': "submit"})

        soup = BeautifulSoup(r.text, 'html.parser')

        for i in soup.find('table', {'class': 'mGrid'}):
            for links in i.find('a', class_='Viewdetails'):
                print(links)

except:
    pass

I am trying to get every link in the mGrid table, but I cannot retrieve them with Beautiful Soup. I don't understand why the anchor tags are not found. Please help.

1 answer:

Answer 0: (score: 1)

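The request is missing the required fields __VIEWSTATE and __EVENTVALIDATION. To get them, first send a GET request and extract the values of the hidden inputs with those IDs; you can then send the POST (search) request with that data included.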

import requests
from bs4 import BeautifulSoup

url = 'http://karnatakamedicalcouncil.com/RenewalReport.aspx'

# Fetch the form once to read the hidden ASP.NET fields the server expects back.
html = requests.get(url).text
soup = BeautifulSoup(html, 'html.parser')

VIEWSTATE = soup.find(id='__VIEWSTATE')['value']
EVENTVALIDATION = soup.find(id='__EVENTVALIDATION')['value']

for count in range(123401, 123405):
    # Form fields are posted under their name attribute (ctl00$...),
    # not their id (ctl00_...).
    data = {
        '__VIEWSTATE': VIEWSTATE,
        '__VIEWSTATEENCRYPTED': '',
        '__EVENTVALIDATION': EVENTVALIDATION,
        'ctl00$RightContetHolder$TextBox1': count,
        'ctl00$RightContetHolder$hdnSearch': "Search",
    }

    r = requests.post(url, data=data)
    soup = BeautifulSoup(r.text, 'html.parser')

    # Print the target of every "Viewdetails" link in the response.
    for link in soup.find_all('a', class_='Viewdetails'):
        print(link['href'])
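
As a variation on the approach above, the hidden fields can be gathered generically rather than one ID at a time, and the link extraction can guard against a response with no result table (soup.find returns None in that case, and iterating over None raises a TypeError that a bare except would silently swallow). The following is only a sketch: the two HTML snippets are made-up stand-ins for the real pages, whose markup may differ, and the helper names collect_hidden_fields and extract_detail_links are hypothetical.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for the page returned by the initial GET request.
SAMPLE_FORM = """
<form method="post" action="RenewalReport.aspx">
  <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="abc123" />
  <input type="hidden" name="__EVENTVALIDATION" id="__EVENTVALIDATION" value="xyz789" />
  <input type="text" name="ctl00$RightContetHolder$TextBox1" value="" />
</form>
"""

# Hypothetical stand-in for the page returned by the search POST.
SAMPLE_RESULT = """
<table class="mGrid">
  <tr><td><a class="Viewdetails" href="details.aspx?id=1">View</a></td></tr>
  <tr><td><a class="Viewdetails" href="details.aspx?id=2">View</a></td></tr>
</table>
"""


def collect_hidden_fields(html):
    """Return {name: value} for every hidden <input> that has a name."""
    soup = BeautifulSoup(html, 'html.parser')
    fields = {}
    for tag in soup.find_all('input', attrs={'type': 'hidden'}):
        name = tag.get('name')
        if name:
            fields[name] = tag.get('value', '')
    return fields


def extract_detail_links(html):
    """Return the href of each 'Viewdetails' link inside the mGrid table."""
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.find('table', class_='mGrid')
    if table is None:
        # No result table in this response, so there is nothing to extract.
        return []
    return [a['href'] for a in table.find_all('a', class_='Viewdetails', href=True)]


# Build a POST payload from the hidden fields plus the search value.
payload = collect_hidden_fields(SAMPLE_FORM)
payload['ctl00$RightContetHolder$TextBox1'] = 123401

print(payload)
print(extract_detail_links(SAMPLE_RESULT))
print(extract_detail_links('<p>no results</p>'))
```

With the real site, the string passed to collect_hidden_fields would be requests.get(url).text and the string passed to extract_detail_links would be the text of the POST response; the search-specific fields from the answer (such as ctl00$RightContetHolder$hdnSearch) would still need to be added to the payload.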