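Is it possible to write a screen scraper for a website that is protected by a form login? I do have access to the site, of course, but I don't know how to log in to it and keep my credentials from C#.
Also, any good examples of screen scrapers in C# would be very welcome.
Has this already been done?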
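Answer 0 (score: 6)
It's pretty simple. You need a custom login (HttpPost) method.
You could come up with something like this (this way you get all the needed cookies after logging in, and you only have to pass them on to the next HttpWebRequest):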
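    using System;
    using System.IO;
    using System.Net;
    using System.Text;

    // Posts URL-encoded form data and returns the response.
    // Cookies set by the server are merged into the caller's collection.
    public static HttpWebResponse HttpPost(String url, String referer, String userAgent, ref CookieCollection cookies, String postData, out WebHeaderCollection headers, WebProxy proxy)
    {
        try
        {
            HttpWebRequest http = (HttpWebRequest)WebRequest.Create(url);
            // Only override the default proxy when one is supplied.
            if (proxy != null)
            {
                http.Proxy = proxy;
            }
            http.AllowAutoRedirect = true;
            http.Method = "POST";
            http.ContentType = "application/x-www-form-urlencoded";
            http.UserAgent = userAgent;
            // Send the cookies collected so far along with this request.
            http.CookieContainer = new CookieContainer();
            http.CookieContainer.Add(cookies);
            http.Referer = referer;
            byte[] dataBytes = Encoding.UTF8.GetBytes(postData);
            http.ContentLength = dataBytes.Length;
            using (Stream postStream = http.GetRequestStream())
            {
                postStream.Write(dataBytes, 0, dataBytes.Length);
            }
            HttpWebResponse httpResponse = (HttpWebResponse)http.GetResponse();
            // These are the request headers; response headers are on httpResponse.Headers.
            headers = http.Headers;
            cookies.Add(httpResponse.Cookies);
            return httpResponse;
        }
        catch (Exception)
        {
            // Any failure (network error, error status, bad URL) is reported as null.
            headers = null;
            return null;
        }
    }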
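To illustrate the "pass them on to the next HttpWebRequest" part, here is a minimal sketch of a follow-up GET that sends the cookies collected at login. CreateGet and ReadBody are hypothetical helper names, not part of the answer above, and the sketch assumes the cookies came from a server response (so each one already has a Domain, which CookieContainer.Add requires).

```csharp
using System;
using System.IO;
using System.Net;

public static class ScraperSketch
{
    // Hypothetical helper: builds a GET request that carries the login cookies.
    // Nothing is sent over the network until GetResponse() is called.
    public static HttpWebRequest CreateGet(string url, CookieCollection cookies, string userAgent)
    {
        var http = (HttpWebRequest)WebRequest.Create(url);
        http.Method = "GET";
        http.UserAgent = userAgent;
        http.CookieContainer = new CookieContainer();
        http.CookieContainer.Add(cookies);
        return http;
    }

    // Hypothetical helper: sends the request, merges any new cookies,
    // and returns the response body as text.
    public static string ReadBody(HttpWebRequest http, CookieCollection cookies)
    {
        using (var response = (HttpWebResponse)http.GetResponse())
        using (var reader = new StreamReader(response.GetResponseStream()))
        {
            cookies.Add(response.Cookies);
            return reader.ReadToEnd();
        }
    }
}
```

Typical use would be to call HttpPost against the login URL first, dispose the returned response, and then call ReadBody(CreateGet(protectedUrl, cookies, userAgent), cookies) to fetch the page you want to scrape. Sharing a single CookieContainer across all requests is another option and may be simpler if the site sets cookies during redirects.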
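Answer 1 (score: 4)
Sure, this has been done. I've done it several times. It is (usually) called screen scraping or web scraping.
You should take a look at this question (and browse the questions under the "screen-scraping" tag). Note that scraping is not only about extracting data from web resources. It also involves submitting data to online forms, so as to mimic what a user does when submitting input (a login form, for example).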
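As a rough sketch of the "submitting data to online forms" part: the body of a form POST is a list of URL-encoded name=value pairs joined with "&". The FormEncoder class and BuildPostData method below are hypothetical names, and the field names in the usage note are assumptions; the real ones come from the name attributes of the inputs on the site's login form.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FormEncoder
{
    // Builds an application/x-www-form-urlencoded body from name/value pairs.
    // Each name and value is percent-encoded so characters such as '&' and '='
    // inside a value cannot break the pair structure.
    public static string BuildPostData(IEnumerable<KeyValuePair<string, string>> fields)
    {
        return string.Join("&", fields.Select(f =>
            Uri.EscapeDataString(f.Key) + "=" + Uri.EscapeDataString(f.Value)));
    }
}
```

For example, passing the pairs ("username", "alice") and ("password", "p@ss word&1") yields username=alice&password=p%40ss%20word%261, which can be used as the postData argument of a login POST. Many login forms also carry hidden inputs; if the server rejects the login, check whether those need to be included as well.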