WebClient hangs until timeout

Time: 2018-12-20 16:23:59

Tags: c# timeout webclient hang

I am trying to download a web page with WebClient, but the request hangs until WebClient's timeout is reached and then fails with an exception.

The following code does not work:

WebClient client = new WebClient();
string url = "https://www.nasdaq.com/de/symbol/aapl/dividend-history";
string page = client.DownloadString(url);

With other URLs the transfer works fine. For example,

WebClient client = new WebClient();
string url = "https://www.ariva.de/apple-aktie";
string page = client.DownloadString(url);

completes very quickly, and the page variable contains the whole HTML.

Using HttpClient or WebRequest/WebResponse gives the same result on the first URL: the call blocks until the timeout exception is thrown.

Both URLs load fine in a browser, taking roughly 2-5 seconds. Any idea what the problem is, and what solutions are available?

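I noticed that when using the WebBrowser control on a Windows Forms dialog, the first URL loads with more than 20 JavaScript errors, each of which has to be confirmed with a click. The same can be observed in a browser with the developer tools open when visiting the first URL.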

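However, WebClient does not act on what it gets back. It does not run JavaScript and does not load referenced images, CSS or other scripts, so this should not be the problem.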

Thanks!

Ralf

2 answers:

Answer 0: (score: 1)

The first site, "https://www.nasdaq.com/de/symbol/aapl/dividend-history", requires:

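The User-agent is important here. If a recent User-agent is specified in HttpWebRequest.UserAgent, the website activates the HTTP 2.0 protocol along with some security measures that are supported/understood only by recent browsers (for reference, Firefox 56 or newer).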

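A less recent browser must therefore be used as the User-agent; otherwise the website will expect (and wait for) a dynamic response. With an older User-agent, the website activates the HTTP 1.1 protocol.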

The second site, "https://www.ariva.de/apple-aktie", requires:

  • ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12
  • No server certificate validation is required
  • No specific User-agent is required

I suggest setting up the WebRequest this way (or the corresponding HttpClient setup):
(WebClient could work as well, but it would probably require a derived custom class)

private async void button1_Click(object sender, EventArgs e)
{
    button1.Enabled = false;
    Uri ResourceURI = new Uri("https://www.nasdaq.com/de/symbol/aapl/dividend-history");
    string DestinationFile = "[Some Local File]";
    await HTTPDownload(ResourceURI, DestinationFile);
    button1.Enabled = true;
}


CookieContainer CookieJar_HTTPDownload = new CookieContainer();

//The 32bit IE11 header is the User-agent used here
public async Task HTTPDownload(Uri ResourceURI, string DestinationFile)
{
    ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
    // Caution: this callback accepts any server certificate, i.e. it disables certificate validation.
    ServicePointManager.ServerCertificateValidationCallback += (s, cert, ch, sec) => { return true; };
    ServicePointManager.DefaultConnectionLimit = 50;

    HttpWebRequest httpRequest = WebRequest.CreateHttp(ResourceURI);

    try
    {
        httpRequest.CookieContainer = CookieJar_HTTPDownload;
        httpRequest.Timeout = (int)TimeSpan.FromSeconds(15).TotalMilliseconds;
        httpRequest.AllowAutoRedirect = true;
        httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
        httpRequest.ServicePoint.Expect100Continue = false;
        httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";
        httpRequest.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        httpRequest.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip, deflate;q=0.8");
        httpRequest.Headers.Add(HttpRequestHeader.CacheControl, "no-cache");

        using (HttpWebResponse httpResponse = (HttpWebResponse)await httpRequest.GetResponseAsync())
        using (Stream ResponseStream = httpResponse.GetResponseStream())
        {
            if (httpResponse.StatusCode == HttpStatusCode.OK)
            {
                try
                {
                    int buffersize = 132072;
                    using (FileStream fileStream = File.Create(DestinationFile, buffersize, FileOptions.Asynchronous))
                    {
                        int read;
                        byte[] buffer = new byte[buffersize];
                        while ((read = await ResponseStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                        {
                            await fileStream.WriteAsync(buffer, 0, read);
                        }
                    }
                }
                catch (DirectoryNotFoundException) { /* Log or throw */}
                catch (PathTooLongException) { /* Log or throw */}
                catch (IOException) { /* Log or throw */}
            }
        }
    }
    catch (WebException) { /* Log and message */} 
    catch (Exception) { /* Log and message */}
}

The payload returned by the first website (nasdaq.com) is 101,562 bytes long.
The payload returned by the second website (www.ariva.de) is 56,919 bytes long.
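
The remark above about a derived WebClient could be sketched roughly as follows. This is only an illustration, not code from the answer: the class name LegacyAgentWebClient and the 15-second timeout are assumptions, and whether these settings alone are enough for the nasdaq.com page has not been verified here. It overrides WebClient.GetWebRequest so that every request carries an older (IE11) User-agent and a cookie container.

```csharp
using System;
using System.Net;

// Hypothetical helper class (the name is an assumption, not from the answer).
public class LegacyAgentWebClient : WebClient
{
    // IE11 User-agent string in its usual format.
    private const string Ie11UserAgent =
        "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko";

    private readonly CookieContainer cookies = new CookieContainer();

    // WebClient calls this for every request it issues, so the
    // underlying HttpWebRequest can be adjusted here.
    protected override WebRequest GetWebRequest(Uri address)
    {
        WebRequest request = base.GetWebRequest(address);
        if (request is HttpWebRequest httpRequest)
        {
            httpRequest.UserAgent = Ie11UserAgent;
            httpRequest.CookieContainer = cookies;
            httpRequest.Timeout = 15000; // milliseconds, assumed value
            httpRequest.AutomaticDecompression =
                DecompressionMethods.GZip | DecompressionMethods.Deflate;
        }
        return request;
    }
}

public static class Program
{
    public static void Main()
    {
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
        using (var client = new LegacyAgentWebClient())
        {
            string page = client.DownloadString(
                "https://www.nasdaq.com/de/symbol/aapl/dividend-history");
            Console.WriteLine(page.Length);
        }
    }
}
```

Keeping the request setup in one override means the calling code from the question (DownloadString) stays unchanged apart from the client type.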

Answer 1: (score: 0)

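Obviously there is a problem with downloading that link (incorrect URL, unauthorized access, etc.), but you can use the asynchronous method to work around the blocking part: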

  WebClient client = new WebClient();
  client.DownloadStringCompleted += (s, e) =>
  {
       // Handle the downloaded string here (check e.Error first, then read e.Result)
  };
  client.DownloadStringAsync(new Uri(url));
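
The same idea can also be written with async/await. The following is a minimal sketch, assuming .NET 4.5 or later; the PageDownloader class and TryDownloadAsync method are hypothetical names. It only keeps the calling thread free while waiting; it does not by itself make a server answer that is not responding.

```csharp
using System;
using System.Net;
using System.Threading.Tasks;

// Hypothetical helper (names are assumptions, not from the answer).
public static class PageDownloader
{
    // Returns the page text, or null if the request fails with a WebException.
    public static async Task<string> TryDownloadAsync(string url)
    {
        using (var client = new WebClient())
        {
            try
            {
                // Task-based counterpart of DownloadStringAsync.
                return await client.DownloadStringTaskAsync(new Uri(url));
            }
            catch (WebException ex)
            {
                Console.WriteLine("Download failed: " + ex.Status);
                return null;
            }
        }
    }
}
```

From an async event handler this could be called as `string page = await PageDownloader.TryDownloadAsync(url);`.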