Question

I can download this manually in IE:

http://scholar.google.com/scholar.ris?q=info:j8ymU9rzMsEJ:scholar.google.com/&output=citation&hl=zh-CN&as_sdt=2000&oe=GB&ct=citation&cd=0

However, with the following code:

WebClient client = new WebClient();
client.DownloadFile(address, filename);

it throws an exception: 403 Forbidden.

What is wrong? How can I make this work?

Another example URL:

http://scholar.google.com/scholar.ris?q=info:sskrpr5jlLwJ:scholar.google.com/&output=citation&hl=zh-CN&as_sdt=2000&oe=GB&ct=citation&cd=1

Answer 1

Just add one simple line before downloading:

string url = ... 
string fileName = ...

WebClient wb = new WebClient();
wb.Headers.Add("User-Agent: Other");   //that is the simple line!
wb.DownloadFile(url, fileName);

That's it.

Answer 2

A 403 can also be caused by a TLS issue. To verify, inspect the text of the WebException.Response object.

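try
{
    client.DownloadFile(address, filename);
}
catch (WebException ex)
{
    if (ex.Response != null)
    {
        // The response body often explains why the server refused the request.
        using (var dataStream = ex.Response.GetResponseStream())
        using (var reader = new StreamReader(dataStream))
        {
            var details = reader.ReadToEnd();
        }
    }
}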

If it is TLS, try adding this to your code to force TLS 1.2.

For .NET 4:

ServicePointManager.SecurityProtocol = (SecurityProtocolType)3072;

For .NET 4.5 or later:

ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
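Putting the two pieces of this answer together, a minimal sketch might look like the following. The URL and file name are placeholders that do not come from the question, and the sketch assumes .NET 4.5 or later so that SecurityProtocolType.Tls12 is available. It adds TLS 1.2 to the enabled protocols rather than replacing them, which is a design choice of this sketch, not something the answer prescribes.

```csharp
using System;
using System.IO;
using System.Net;

class DownloadWithDiagnostics
{
    static void Main()
    {
        // Placeholder values for illustration only (assumptions, not from the question).
        string url = "https://example.com/file.pdf";
        string fileName = "file.pdf";

        // Enable TLS 1.2 in addition to the protocols that are already enabled.
        ServicePointManager.SecurityProtocol |= SecurityProtocolType.Tls12;

        try
        {
            using (var client = new WebClient())
            {
                client.DownloadFile(url, fileName);
            }
            Console.WriteLine("Complete");
        }
        catch (WebException ex)
        {
            // For HTTP errors such as 403, Response carries the status and body.
            var http = ex.Response as HttpWebResponse;
            if (http != null)
            {
                Console.WriteLine("Status: {0} {1}", (int)http.StatusCode, http.StatusDescription);
                using (var stream = http.GetResponseStream())
                using (var reader = new StreamReader(stream))
                {
                    Console.WriteLine(reader.ReadToEnd());
                }
            }
            else
            {
                // No HTTP response at all (for example a failed handshake or DNS error).
                Console.WriteLine(ex.Message);
            }
        }
    }
}
```

If the printed body mentions SSL/TLS or a protocol version, the protocol setting is the likely culprit; otherwise the headers discussed in the other answers are worth checking first.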

Answer 3

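I ran into this problem when trying to download an image from a SharePoint site URL. In my case, setting the user-agent to "Other" or leaving it blank in the headers was not enough; I had to set the user-agent as follows: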

client.Headers.Add("user-agent", "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0");

The solution comes from this answer.

Answer 4

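I get a 403 in IE as well, so I guess you need to be logged in to retrieve the resource. Your browser may have cached credentials, but your application is not designed to log you in. Or are you logged in to Google in your browser? Try logging out and see whether you can still access it.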

Answer 5

You need to set the appropriate HTTP headers before calling the DownloadFile method.

WebClient webClient = new WebClient();
webClient.Headers.Add("???", "???");
webClient.Headers.Add("???", "???");
webClient.Headers.Add("???", "???");
webClient.DownloadFile(address, filename);

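Working out the correct values to put in place of these question marks can be tricky. You will need Fiddler, or another program or browser extension, to see the HTTP headers your browser sends to Google, and then essentially replicate the same request in your program.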

Answer 6

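This is what happened to me:

I was trying to download a (public) .xls file (via the DownloadFile method) that could be downloaded without trouble from every browser.

After trying and struggling with all the answers (with no luck), I finally opened the stack and noticed something odd (see screenshot).

Although the file downloaded over http in the browser, it gave a 403 error via the DownloadFile method.

In the end, I simply changed the URL from http://something to https://something, and it worked fine.

Hope this helps!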
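If the URL arrives as a string from elsewhere, the scheme switch described above can be sketched with UriBuilder. The helper name ToHttps is hypothetical, and http://something/file.xls stands in for the placeholder host used in this answer. Resetting the port to -1 matters here because UriBuilder otherwise keeps the original http port (80) after the scheme changes.

```csharp
using System;

class ForceHttps
{
    // Hypothetical helper: returns the same URI with the scheme switched to https.
    static Uri ToHttps(Uri uri)
    {
        var builder = new UriBuilder(uri)
        {
            Scheme = Uri.UriSchemeHttps,
            // -1 means "use the default port of the scheme"; an explicit
            // non-default port is kept as-is and may need adjusting by hand.
            Port = uri.IsDefaultPort ? -1 : uri.Port
        };
        return builder.Uri;
    }

    static void Main()
    {
        // Placeholder URL taken from the wording of the answer.
        Uri original = new Uri("http://something/file.xls");
        Console.WriteLine(ToHttps(original));
    }
}
```

The resulting Uri can be passed straight to the DownloadFile overload that accepts a Uri.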

Answer 7

I had the same problem when trying to download a file from an Amazon S3 URL. I blogged about it here: http://blog.cdeutsch.com/2010/11/net-webclient-403-forbidden-error.html

The final solution I used was found here: GETting a URL with an url-encoded slash

Answer 8

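The key to solving this for me was to make the request once from code and a second time in the browser, record both requests with Fiddler, and make sure the headers matched.

I ended up having to add headers for:

Accept
Accept-Encoding
Accept-Language
User-Agent
Upgrade-Insecure-Requests

I hope this helps people in the future.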
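As a rough sketch of what setting the headers listed above could look like with WebClient: every header value below is an assumed example rather than a value from this answer, and the URL and file name are placeholders. The real values should be copied from the browser request captured in Fiddler.

```csharp
using System;
using System.Net;

class BrowserLikeDownload
{
    static void Main()
    {
        // Placeholder URL and file name (assumptions for illustration).
        string url = "https://example.com/report.pdf";
        string fileName = "report.pdf";

        using (var client = new WebClient())
        {
            // Example values only; replace them with what the browser actually sends.
            client.Headers[HttpRequestHeader.Accept] =
                "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
            client.Headers[HttpRequestHeader.AcceptLanguage] = "en-US,en;q=0.9";
            client.Headers[HttpRequestHeader.UserAgent] =
                "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:25.0) Gecko/20100101 Firefox/25.0";
            client.Headers["Upgrade-Insecure-Requests"] = "1";

            // Accept-Encoding is left out on purpose in this sketch: as far as I know,
            // WebClient does not decompress gzip/deflate responses by itself, so
            // asking for them could leave a compressed file on disk.

            try
            {
                client.DownloadFile(url, fileName);
                Console.WriteLine("Complete");
            }
            catch (WebException ex)
            {
                Console.WriteLine(ex.Message);
            }
        }
    }
}
```

If the server still answers 403, comparing the two Fiddler captures header by header, as described above, should show which header is still missing or different.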

Answer 9

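I have a similar problem: when downloading files from a few specific websites, some files return a 403 error while others do not.

I have tried the User-Agent header, the Accept header, https URLs, and various other settings, still without success.

Both URLs load in a browser and need no authentication on the website (they are publicly accessible), yet one downloads and the other returns a 403.

Any help with what the cause is and how to fix it would be appreciated.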

static void Main(string[] args)
{
    WebClient webClient = new WebClient();
    webClient.Headers.Add("Accept: text/html, application/xhtml+xml, application/pdf, */*");
    webClient.Headers.Add("User-Agent: Mozilla/5.0 (compatible; MSIE 9.0; Windows NT 6.1; WOW64; Trident/5.0)");
    webClient.Headers.Add("Accept-Encoding: gzip, deflate, br");
    webClient.Headers.Add("Accept-Language: en-US,en;q=0.9");
    webClient.Headers.Add("Cache-Control: no-cache");
    webClient.Headers.Add("Upgrade-Insecure-Requests: 1");
    try
    {
        webClient.DownloadFile(new Uri("https://www.vigil.aero/wp-content/uploads/PSB-10-2013-06-14-.pdf"), "test1.pdf");
        Console.WriteLine("Complete");
    }
    catch (Exception ex)
    {
        Console.WriteLine("{0}", ex.Message);
    }
    try
    {
        webClient.DownloadFile(new Uri("https://www.vigil.aero/wp-content/uploads/PSB-9-2013-06-14.pdf"), "test2.pdf");
        Console.WriteLine("Complete");
    }
    catch (Exception ex)
    {
        Console.WriteLine("{0}", ex.Message);
    }
    Console.ReadLine();
}

WebClient 403 Forbidden
