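I am working on a web scraping project for a website in ASP.NET. The site requires a Captcha code, so I need to fetch the Captcha image and show it to the user before continuing.
So far the project works fine. The only problem I have found is that sometimes the Captcha response is not fully captured, so converting the response stream to an Image fails with the following error: "Parameter is not valid."
I noticed that a web browser does not have this problem: as long as the server is up, it always displays the Captcha correctly.
With HttpWebRequest, however, the behavior makes no sense to me: sometimes it gets the image and sometimes it does not. Is there any way to make sure the response stream is complete?
My code snippet is as follows: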
public Image GetCaptchaCode()
{
    Image returnVal = null;
    Uri uri = new Uri(URL_CAPTCHA);
    HttpWebResponse response = null;
    try
    {
        // Get cookies (checked for null before iterating)
        CookieCollection cookies = this.GetCookies();
        if (cookies != null)
        {
            foreach (Cookie cookie in cookies)
            {
                Console.WriteLine(cookie.Name + ": " + cookie.Value);
            }
        }

        // Build the Captcha request
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.ProtocolVersion = HttpVersion.Version11;
        request.Method = WebRequestMethods.Http.Get; // use GET for loading the Captcha
        request.CookieContainer = this._cookies; // stored cookie info
        System.Net.ServicePointManager.Expect100Continue = false;

        // Add more cookies
        if (cookies != null)
        {
            request.CookieContainer.Add(cookies);
        }

        // Handle Gzip compression
        request.Headers.Add(HttpRequestHeader.AcceptEncoding, HEADER_TYPE);
        request.AutomaticDecompression = DecompressionMethods.GZip;
        request.Referer = URL_REFERER;
        request.UserAgent = USER_AGENT;

        // Get the response
        response = (HttpWebResponse)request.GetResponse();
        // This is the call that intermittently throws "Parameter is not valid."
        returnVal = Image.FromStream(response.GetResponseStream());
    }
    catch (Exception ex)
    {
        // The error is swallowed here, so the caller only sees a null return value
        string errMsg = ex.Message;
    }
    finally
    {
        // Closing the response also closes the stream the Image was created from
        if (response != null)
        {
            response.Close();
        }
    }
    return returnVal;
}
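
One thing that might be worth trying, offered as a sketch and not as a confirmed diagnosis: Image.FromStream is documented to need its stream kept open for the lifetime of the Image, and the finally block above closes the response (and with it the network stream) before the Image is returned. Buffering the whole body into a MemoryStream first removes the dependency on the network stream and also makes a truncated download detectable. The class name StreamBuffering, the method ReadFully, and the expectedLength parameter are hypothetical names introduced for this illustration.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of the original code): copies a stream
// completely into memory so it can be decoded independently of the network.
public static class StreamBuffering
{
    // Reads 'source' until end-of-stream and returns a seekable copy
    // positioned at the start. Pass the server's Content-Length as
    // 'expectedLength', or -1 to skip the completeness check.
    public static MemoryStream ReadFully(Stream source, long expectedLength)
    {
        if (source == null) throw new ArgumentNullException("source");

        MemoryStream buffer = new MemoryStream();
        byte[] chunk = new byte[8192];
        int read;

        // Stream.Read may return fewer bytes than requested, so loop
        // until it returns 0, which signals end-of-stream.
        while ((read = source.Read(chunk, 0, chunk.Length)) > 0)
        {
            buffer.Write(chunk, 0, read);
        }

        // Assumption: the check is only meaningful when the caller passes
        // the length of the same bytes that are read here.
        if (expectedLength >= 0 && buffer.Length != expectedLength)
        {
            throw new IOException(
                "Incomplete response: expected " + expectedLength +
                " bytes but received " + buffer.Length + ".");
        }

        buffer.Position = 0;
        return buffer;
    }
}
```

Inside GetCaptchaCode, the result of StreamBuffering.ReadFully(response.GetResponseStream(), -1) could then be passed to Image.FromStream, and the decoded image copied with new Bitmap(decoded) before the MemoryStream is disposed, so that the returned image no longer depends on any open stream. Because the request enables automatic Gzip decompression, response.ContentLength may not match the number of decompressed bytes, so -1 is the safer value for expectedLength unless compression is turned off. If decoding still fails after buffering, logging the buffered length and the response's ContentType would show whether the server sometimes returns something other than an image.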