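I know there are plenty of questions and answers about using System.Net.WebRequest to check whether a remote web page is available, but I have not found any of those approaches to be 100% useful for my situation.
I need to check that a page is available and does not return a 404 or 500 error. If there are redirects, I want to follow them until I reach a working page. If the remote page requires authentication (401 Unauthorized), I want to know, because in some cases that may be acceptable.
When a request is made and the remote server returns an Internal Server Error (500), WebRequest throws an exception.
I have also found that 301 redirects throw an exception, but in my case I want to check whether the redirect leads to a valid page.
Is there another way to check whether a page exists that gives me the actual HTTP status code, and throws an exception only when fetching the headers itself fails (for example, an invalid domain name)?
Here is what I am doing now, which is not good enough: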
    protected bool URLExists(string url)
    {
        bool result = false;
        try
        {
            HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
            webRequest.Timeout = 1200;
            webRequest.Method = "GET";
            //webRequest.AllowAutoRedirect = true;
            //webRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36";
            HttpWebResponse response = null;
            try
            {
                response = (HttpWebResponse)webRequest.GetResponse();
                //result = true;
                int statusCode = (int)response.StatusCode;
                if (statusCode >= 100 && statusCode < 400) //Good requests
                {
                    return true;
                }
                else if (statusCode >= 500 && statusCode <= 510) //Server Errors
                {
                    return false;
                }
            }
            catch (WebException webException)
            {
            }
            finally
            {
                if (response != null)
                {
                    response.Close();
                }
            }
        }
        catch (Exception ex)
        {
        }
        return result;
    }
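Answer 0 (score: 5)
Consider this code. You need a 200 response status code to be sure the page is available; if a redirect occurs, follow it and check the status code returned by the final URL.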
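    using System;
    using System.Net;

    // maxRedirects is an added safeguard against redirect loops.
    public static bool URLExists(string url, int maxRedirects = 10)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.AllowAutoRedirect = false; // follow redirects by hand so each hop can be inspected
        request.Method = "HEAD";           // ask for headers only, not the body
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                if (response.StatusCode == HttpStatusCode.OK)
                    return true;

                // 301, 302, 303, 307 and 308 all name the next hop in the Location header.
                int code = (int)response.StatusCode;
                bool isRedirect = code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
                string location = response.Headers["Location"];
                if (isRedirect && location != null && maxRedirects > 0)
                {
                    // Location may be relative; resolve it against the URL just requested.
                    var next = new Uri(request.RequestUri, location);
                    return URLExists(next.AbsoluteUri, maxRedirects - 1);
                }
                return false;
            }
        }
        catch (WebException e)
        {
            // e.Response is null when no HTTP response arrived (DNS failure, timeout, ...).
            using (var httpResponse = e.Response as HttpWebResponse)
            {
                if (httpResponse != null)
                    Console.WriteLine("Error code: {0}", httpResponse.StatusCode);
                else
                    Console.WriteLine("Request failed: {0}", e.Status);
                return false;
            }
        }
    }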
Good enough? ;)
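When redirects are followed by hand, two details tend to matter: which status codes count as redirects, and how a relative `Location` header is turned into a full URL. The following is only a minimal sketch of both; `RedirectHelper`, `IsRedirect` and `ResolveLocation` are hypothetical names introduced here for illustration and do not come from the answer.

```csharp
using System;
using System.Net;

// Hypothetical helpers for illustration; not part of the answer above.
public static class RedirectHelper
{
    // True for the status codes that name the next hop in a Location header.
    public static bool IsRedirect(HttpStatusCode status)
    {
        int code = (int)status;
        return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
    }

    // Resolves a Location header value against the URL that was just requested.
    // Returns null when the header is missing or cannot be parsed.
    public static Uri ResolveLocation(Uri requestUri, string location)
    {
        if (string.IsNullOrWhiteSpace(location))
            return null;

        Uri resolved;
        return Uri.TryCreate(requestUri, location, out resolved) ? resolved : null;
    }
}
```

For example, resolving `/login` against `http://example.com/a/b` gives `http://example.com/login`, while an absolute `Location` value is used as it stands.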
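Answer 1 (score: 1)
You can add an optional parameter to accept certain status codes when necessary (for example 401 or 403). Also, your function does two things at once. I split them into Func<> delegates to illustrate, but feel free to turn them into private methods in your class or wherever fits best. Because a 401 throws an exception, you need to extract the status code from the exception.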
    using System;
    using System.Linq; // needed for acceptableCodes.Contains
    using System.Net;

    protected bool URLExists(string url, params int[] acceptableCodes)
    {
        // Step 1: fetch the HTTP status code, or -1 if no response was received.
        Func<string, int> getStatusCode = pageUrl =>
        {
            var result = -1; // Default status code; named to avoid clashing with statusCode below
            var webRequest = (HttpWebRequest)WebRequest.Create(pageUrl);
            webRequest.Timeout = 1200;
            webRequest.Method = "GET";
            HttpWebResponse response = null;
            try
            {
                response = webRequest.GetResponse() as HttpWebResponse;
            }
            catch (WebException webException)
            {
                // 4xx/5xx responses arrive here; Response is null for DNS failures and timeouts.
                response = webException.Response as HttpWebResponse;
            }
            finally
            {
                if (response != null)
                {
                    result = (int)response.StatusCode;
                    response.Close();
                }
            }
            return result;
        };
        // Step 2: decide whether that status code counts as "the page exists".
        Func<int, bool> isStatusCodeOk = code =>
        {
            if (acceptableCodes != null && acceptableCodes.Contains(code))
            {
                // Accept this code
                return true;
            }
            if (code >= 100 && code < 400) //Good requests
            {
                return true;
            }
            // Client errors, server errors and -1 (no response) all count as unavailable.
            return false;
        };
        var statusCode = getStatusCode(url);
        return isStatusCodeOk(statusCode);
    }
The following assertions pass:
    Assert.IsTrue(URLExists("http://httpstat.us/200"));
    Assert.IsTrue(URLExists("http://httpstat.us/301"));
    Assert.IsTrue(URLExists("http://httpstat.us/302"));
    Assert.IsFalse(URLExists("http://httpstat.us/400"));
    Assert.IsTrue(URLExists("http://httpstat.us/401", 401, 403));
    Assert.IsTrue(URLExists("http://httpstat.us/403", 401, 403));
    Assert.IsFalse(URLExists("http://httpstat.us/404"));
    Assert.IsFalse(URLExists("http://httpstat.us/500"));
    Assert.IsFalse(URLExists("http://httpstat.us/502"));
    Assert.IsFalse(URLExists("http://httpstat.us/503"));
    Assert.IsFalse(URLExists("http://whatever.invalid"));
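If the caller needs to tell "the server answered with an error status" apart from "no HTTP response at all" (an invalid domain, a timeout), returning a bare bool loses that information. One possible way to keep it is sketched below. `ProbeResult`, `UrlProbe`, `Probe` and `IsAvailable` are hypothetical names introduced for illustration, and the 5000 ms default timeout is an arbitrary assumption; the 100 to 399 range mirrors the one used above.

```csharp
using System;
using System.Net;

// Hypothetical types for illustration; not part of the answer above.
public sealed class ProbeResult
{
    public int? StatusCode;             // null when no HTTP response was received
    public WebExceptionStatus Failure;  // Success, ProtocolError (HTTP error status), or a transport failure
}

public static class UrlProbe
{
    // Sends a HEAD request and reports either the HTTP status code or the transport failure.
    public static ProbeResult Probe(string url, int timeoutMs = 5000)
    {
        var request = (HttpWebRequest)WebRequest.Create(url);
        request.Method = "HEAD";
        request.Timeout = timeoutMs;
        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return new ProbeResult
                {
                    StatusCode = (int)response.StatusCode,
                    Failure = WebExceptionStatus.Success
                };
            }
        }
        catch (WebException e)
        {
            // 4xx/5xx responses carry a response object; DNS failures and timeouts do not.
            using (var response = e.Response as HttpWebResponse)
            {
                return new ProbeResult
                {
                    StatusCode = response != null ? (int?)(int)response.StatusCode : null,
                    Failure = e.Status
                };
            }
        }
    }

    // Pure decision step: accepts explicitly allowed codes plus the 100-399 range.
    public static bool IsAvailable(ProbeResult result, params int[] acceptableCodes)
    {
        if (result == null || result.StatusCode == null)
            return false;

        int code = result.StatusCode.Value;
        if (acceptableCodes != null && Array.IndexOf(acceptableCodes, code) >= 0)
            return true;

        return code >= 100 && code < 400;
    }
}
```

A caller could then write `var r = UrlProbe.Probe(url);` and inspect `r.StatusCode` and `r.Failure` separately before deciding, for instance treating a 401 as acceptable with `UrlProbe.IsAvailable(r, 401)`.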