我如何从ASPX网站__doPostBack表单中抓取数据?

时间:2019-11-09 22:08:24

标签: c# web-scraping

我正在尝试使用c#控制台应用程序爬网/抓取网站,获取初始页面不是问题,但是当我单击具有__doPostBack操作的按钮时,我需要获取页面。

我尝试使用这些设置,但这已经从初始页面返回结果:

我更新代码

            var client1 = new RestClient("https://example.com");
            var request1 = new RestRequest(Method.POST);
            IRestResponse initialResponse = client1.Execute(request1);


            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(initialResponse.Content);

            var formData = new Dictionary<string, string>();
            formData.Add("__EVENTTARGET", "ctl00$MainContentExample");
            formData.Add("__EVENTARGUMENT", "");
            formData.Add("ctl00$MainContent$CustomHiddenField", "");
            formData.Add("__VIEWSTATEGENERATOR", "B5682C7D");
            var divViewState = doc.DocumentNode
                .SelectSingleNode("//input[@name='__VIEWSTATE']").Attributes[3].Value;
            formData.Add("__VIEWSTATE", divViewState);
            var divEventValidation = doc.DocumentNode
                .SelectSingleNode("//input[@name='__EVENTVALIDATION']").Attributes[3].Value;
            formData.Add("__EVENTVALIDATION", divEventValidation);


            var client = new RestClient("https://example.com");
            var request = new RestRequest("/methodName",Method.POST);
            request.AddHeader("cache-control", "no-cache");
            request.AddHeader("Connection", "keep-alive");

            var c = "";
            var i = 0;
            foreach (var cookie in initialResponse.Cookies)
            {
                if(i==0)
                 c += cookie.Name + "=" + cookie.Value + "; ";
                else
                    c += cookie.Name + "=" + cookie.Value;

                i++;
            }
            request.AddHeader("Cookie", c);
           request.AddHeader("Accept-Encoding", "gzip, deflate, br");
            request.AddHeader("Host", "example.com");
            request.AddHeader("Cache-Control", "max-age=0");
            request.AddHeader("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3");
            request.AddHeader("User-Agent", "Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/78.0.3904.108 Safari/537.36");
            request.AddHeader("Content-Type", "application/x-www-form-urlencoded");
            request.AddHeader("Accept-Language", "en-US,en;q=0.9");
            request.AddHeader("Content-Encoding", "utf-8");

            var json = JsonConvert.SerializeObject(formData);
            byte[] byteData = Encoding.UTF8.GetBytes(json);

            request.AddHeader("Content-Length", byteData.Length.ToString())


            request.AddParameter("undefined", byteData, ParameterType.RequestBody);

            IRestResponse response2 = client.Execute(request);