Error while scraping a website

Time: 2018-06-14 11:10:15

Tags: web-scraping beautifulsoup python-3.6

I am trying to scrape a website using BeautifulSoup, but when I run the scraper I get an error.

The error I get is:

505|error|500|Invalid postback or callback argument.  Event validation is enabled using <pages enableEventValidation="true"/> in configuration or <%@ Page EnableEventValidation="true" %> in a page.  For security purposes, this feature verifies that arguments to postback or callback events originate from the server control that originally rendered them.  If the data is valid and expected, use the ClientScriptManager.RegisterForEventValidation method in order to register the postback or callback data for validation.|

My code is:

from bs4 import BeautifulSoup
import requests
import csv

final_data = []
url = "https://rera.cgstate.gov.in/Default.aspx"

def writefiles(alldata, filename):
    # newline="" keeps the csv module from emitting blank rows on Windows
    with open("./" + filename, "w", newline="") as csvfile:
        writer = csv.writer(csvfile, delimiter=",")
        writer.writerow("")
        for row in alldata:
            writer.writerow(row)

def getbyGet(url, values):
    # Send values as query-string parameters; a GET request normally has no body
    res = requests.get(url, params=values)
    return res.text

def readHeaders():
    global url
    html = getbyGet(url, {})
    soup  = BeautifulSoup(html, "html.parser")
    EVENTVALIDATION = soup.select("#__EVENTVALIDATION")[0]['value']
    VIEWSTATE = soup.select("#__VIEWSTATE")[0]['value']
    #VIEWSTATEGENERATOR = soup.select("#__VIEWSTATEGENERATOR")[0]["value"]
    headers= {'Accept':'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
              'Content-Type':'application/x-www-form-urlencoded',
              'User-Agent':'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:59.0) Gecko/20100101 Firefox/59.0'}
    formfields = {"__ASYNCPOST":"true",
              "__EVENTARGUMENT":"",
              "__EVENTTARGET":"" ,
              "__EVENTVALIDATION":EVENTVALIDATION,  
              "__LASTFOCUS":"", 
              "__VIEWSTATE":VIEWSTATE,
              "ApplicantType":"",
              "Button1":"Search",
              "District_Name":"0",
              "DropDownList1":"0",
              "DropDownList2":"0",
              "DropDownList4":"0",
              "DropDownList5":"0",
              "group1":"on",
              "hdnSelectedOption":"0",
              "hdnSelectedOptionForContractor":"0",
              "Mobile":"",
              "Tehsil_Name":"0",
              "TextBox1":"",
              "TextBox2":"",
              "TextBox3":"",
              "TextBox4":"",
              "TextBox5":"",
              "TextBox6":"",
              "ToolkitScriptManager1":"appr1|Button1",
              "txt_otp":"", 
              "txt_proj_name":"",
              "txtRefNo":"",
              "txtRefNoForContractor":""}
    s = requests.session()
    res = s.post(url, data=formfields, headers=headers).text
    soup = BeautifulSoup(res, "html.parser")
    print(soup)
readHeaders()
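
What am I doing wrong? Can someone guide me? I read another post where someone got the same error, but that post has no solution either. Here is the post: EVENTVALIDATION error while scraping asp.net page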
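
One thing that may be worth checking: in the code above the GET goes through a plain requests.get call while the POST goes through a brand-new session, so cookies set by the first response are not sent back, and only two of the page's hidden fields are copied (__VIEWSTATEGENERATOR is commented out). Below is a minimal sketch, not a confirmed fix, that reuses one session and copies every hidden input from the page it just fetched. The names HiddenInputCollector, hidden_fields and search are hypothetical helpers introduced for illustration. The parsing part uses only the standard library's html.parser so it runs without extra packages; the same loop could be written with BeautifulSoup.

```python
from html.parser import HTMLParser


class HiddenInputCollector(HTMLParser):
    """Collect name/value pairs of every <input type="hidden"> in a page."""

    def __init__(self):
        super().__init__()
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attr = dict(attrs)
        # Attribute names are lowercased by HTMLParser; values are not
        if (attr.get("type") or "").lower() == "hidden" and attr.get("name"):
            self.fields[attr["name"]] = attr.get("value") or ""


def hidden_fields(html):
    """Return a dict of all hidden form fields found in the given HTML."""
    collector = HiddenInputCollector()
    collector.feed(html)
    collector.close()
    return collector.fields


def search(url, extra_fields):
    """Hypothetical helper: GET and POST through one session so cookies carry over."""
    import requests  # third-party; imported here so hidden_fields() works without it

    with requests.Session() as session:
        page = session.get(url)
        payload = hidden_fields(page.text)  # fresh __VIEWSTATE, __EVENTVALIDATION, ...
        payload.update(extra_fields)        # assumption: e.g. {"Button1": "Search"}
        return session.post(url, data=payload).text
```

A call such as search(url, {"Button1": "Search"}) would then replace the hand-built formfields dictionary. Whether this is enough for this particular site is untested. The sketch also leaves out __ASYNCPOST and ToolkitScriptManager1 on the assumption that the server then treats the request as an ordinary full postback and returns regular HTML rather than the pipe-delimited error format shown above.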

0 answers:

No answers