Check whether a remote/external web page is available and get its status code

Time: 2016-04-07 22:26:10

Tags: c# asp.net

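I know there are many questions and answers about using System.Net.WebRequest to check whether a remote web page is available, but I have not found any of those approaches to be 100% useful for my case.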

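I need to check that a page is available and does not return a 404 or 500 error. If there are redirects, I want to follow them until I reach a working page. If the remote page requires authentication (401 Unauthorized), I want to know about it, because in some cases that may be acceptable.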

When a request is made and the remote server returns an Internal Server Error (500), WebRequest throws an exception.

I have also found that 301 redirects throw an exception, but in my case I want to check whether the redirect target is a valid page.

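Is there another way to check whether a page exists that gives me the actual HTTP status code, and throws an exception only when the headers cannot be fetched at all (for example, an invalid domain name)?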

This is what I am doing at the moment, and it is not good enough...

protected bool URLExists(string url)
{
    bool result = false;

    try
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create(url);
        webRequest.Timeout = 1200;
        webRequest.Method = "GET";
        //webRequest.AllowAutoRedirect = true;
        //webRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.110 Safari/537.36";

        HttpWebResponse response = null;

        try
        {
            response = (HttpWebResponse)webRequest.GetResponse();
            //result = true;

            int statusCode = (int)response.StatusCode;
            if (statusCode >= 100 && statusCode < 400) //Good requests
            {
                return true;
            }
            else if (statusCode >= 500 && statusCode <= 510) //Server Errors
            {
                return false;
            }
        }
        catch (WebException webException)
        {

        }
        finally
        {
            if (response != null)
            {
                response.Close();
            }
        }
    }
    catch (Exception ex)
    {

    }

    return result;
}

2 answers:

Answer 0: (score: 5)

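Consider this code. You need a 200 response status code to be sure the web page is available; if a redirect occurs, follow it and check the response status code of the final URL.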

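public static bool URLExists(string url, int maxRedirects = 10)
{
    var request = (HttpWebRequest)WebRequest.Create(url);
    request.AllowAutoRedirect = false; // follow redirects manually so each hop is checked
    request.Method = "HEAD";
    try
    {
        using (var response = (HttpWebResponse)request.GetResponse())
        {
            if (response.StatusCode == HttpStatusCode.OK)
                return true;

            int code = (int)response.StatusCode;
            string location = response.Headers["Location"];
            // Handle any 3xx with a Location header (301, 302, 307, ...), not only 302,
            // and cap the number of hops so a redirect loop cannot recurse forever.
            if (code >= 300 && code < 400 && location != null && maxRedirects > 0)
            {
                // Location may be relative, so resolve it against the current URL.
                var target = new Uri(new Uri(url), location);
                return URLExists(target.AbsoluteUri, maxRedirects - 1);
            }

            return false;
        }
    }
    catch (WebException e)
    {
        // e.Response is null when no HTTP response was received (DNS failure, timeout).
        using (var httpResponse = e.Response as HttpWebResponse)
        {
            if (httpResponse != null)
                Console.WriteLine("Error code: {0}", httpResponse.StatusCode);
            else
                Console.WriteLine("Request failed: {0}", e.Status);
            return false;
        }
    }
}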

Good enough? ;)

Answer 1: (score: 1)

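You can add an optional parameter to accept certain status codes (for example 401 and 403) when necessary. Also, your function does two things at once; I have split them into Func<> delegates to illustrate, but feel free to create private methods in your class or wherever fits best. Because a 401 throws an exception, you need to extract the status code from the exception.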

// Requires: using System; using System.Linq; using System.Net;
protected bool URLExists(string url, params int[] acceptableCodes)
{
    Func<string, int> getStatusCode = pageUrl =>
    {
        var statusCode = -1; // Default status code
        var webRequest = (HttpWebRequest)WebRequest.Create(pageUrl);
        webRequest.Timeout = 1200;
        webRequest.Method = "GET";

        HttpWebResponse response = null;
        try
        {
            response = webRequest.GetResponse() as HttpWebResponse;
        }
        catch (WebException webException)
        {
            response = webException.Response as HttpWebResponse;
        }
        finally
        {
            if (response != null)
            {
                statusCode = (int)response.StatusCode;

                response.Close();
            }
        }

        return statusCode;
    };

    Func<int, bool> isStatusCodeOk = code =>
    {
        if (acceptableCodes != null && acceptableCodes.Contains(code))
        {
            // Accept this code
            return true;
        }

        if (code >= 100 && code < 400) //Good requests
        {
            return true;
        }

        if (code >= 500 && code <= 510) //Server Errors
        {
            return false;
        }

        // Default
        return false;
    };

    var statusCode = getStatusCode(url);

    return isStatusCodeOk(statusCode);
}

The following assertions pass:

Assert.IsTrue(URLExists("http://httpstat.us/200"));
Assert.IsTrue(URLExists("http://httpstat.us/301"));
Assert.IsTrue(URLExists("http://httpstat.us/302"));

Assert.IsFalse(URLExists("http://httpstat.us/400"));
Assert.IsTrue(URLExists("http://httpstat.us/401", 401, 403));
Assert.IsTrue(URLExists("http://httpstat.us/403", 401, 403));
Assert.IsFalse(URLExists("http://httpstat.us/404"));

Assert.IsFalse(URLExists("http://httpstat.us/500"));
Assert.IsFalse(URLExists("http://httpstat.us/502"));
Assert.IsFalse(URLExists("http://httpstat.us/503"));
Assert.IsFalse(URLExists("http://httpstat.us/504"));

Assert.IsFalse(URLExists("http://whatever.invalid"));
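
For the part of the question that asks for the actual HTTP status code, with a failure reported only when no response arrives at all, the following is a minimal sketch and not a definitive implementation. The `UrlProbe` class and its `TryGetStatusCode` and `IsAcceptable` methods are hypothetical names introduced here for illustration, the 5-second default timeout is an arbitrary assumption, and treating 2xx and 3xx as acceptable is one possible policy, not the only one.

```csharp
using System;
using System.Linq;
using System.Net;

// Hypothetical helper, not part of the original answers.
public static class UrlProbe
{
    // Returns the status code of the final response (redirects are followed
    // automatically), or null when no HTTP response was received at all
    // (malformed URL, invalid domain, timeout, refused connection).
    public static int? TryGetStatusCode(string url, int timeoutMs = 5000)
    {
        Uri uri;
        if (!Uri.TryCreate(url, UriKind.Absolute, out uri) ||
            (uri.Scheme != Uri.UriSchemeHttp && uri.Scheme != Uri.UriSchemeHttps))
        {
            return null; // not an absolute http/https URL
        }

        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.Method = "GET";
        request.Timeout = timeoutMs; // assumed value; tune for your environment

        try
        {
            using (var response = (HttpWebResponse)request.GetResponse())
            {
                return (int)response.StatusCode;
            }
        }
        catch (WebException ex)
        {
            // 4xx/5xx responses surface as a WebException that still carries
            // the response; ex.Response is null for network-level failures.
            using (var response = ex.Response as HttpWebResponse)
            {
                return response != null ? (int?)(int)response.StatusCode : null;
            }
        }
    }

    // Policy check kept separate from the network call: 2xx and 3xx count as
    // available, plus any codes the caller explicitly allows (e.g. 401).
    public static bool IsAcceptable(int? statusCode, params int[] alsoAccept)
    {
        if (statusCode == null)
            return false;

        int code = statusCode.Value;
        bool allowed = alsoAccept != null && alsoAccept.Contains(code);
        return (code >= 200 && code < 400) || allowed;
    }
}
```

A caller could then write `UrlProbe.IsAcceptable(UrlProbe.TryGetStatusCode(url), 401)` to accept pages that require authentication, while still being able to log the raw status code. Keeping the policy in a separate method makes it testable without any network access.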