WebClient.DownloadStringAsync is too slow when downloading multiple URLs

Time: 2017-04-21 11:26:01

Tags: c# webclient webclient-download

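I am trying to download files concurrently.

- Download the string from the main link

- Extract the sub-links from it

- Download the sub-links at the same time (about 590 sub-links), like a tree.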

    static async Task Direct()
    {
        WebClient wc2 = new WebClient() { UseDefaultCredentials = true, Encoding = Encoding.GetEncoding(1251) };
        HtmlDocument hd = new HtmlDocument();
        wc2.Proxy.Credentials = CredentialCache.DefaultCredentials;
        hd.LoadHtml(wc2.DownloadString(link));
        foreach (var item in hd.DocumentNode.SelectNodes(xpath).Select(x => x.GetAttributeValue("href", "")))
        {
            Task.Run(() =>
                {
                    using (WebClient wc = new WebClient() { UseDefaultCredentials = true, Encoding = Encoding.GetEncoding(1251) })
                    {
                        wc.DownloadStringCompleted += Over;
                        wc.Proxy.Credentials = CredentialCache.DefaultCredentials;
                        wc.DownloadStringAsync(new Uri(item));
                    }
                });
        }
        Console.Title = "ready";
    }
    static Stopwatch sw = new Stopwatch();
    static void Over(object sender, DownloadStringCompletedEventArgs e)
    {
        Task.Run(()=>overTask((sender as WebClient)));
    }
    static async Task overTask(WebClient sender)
    {
        if (sw.IsRunning)
        {
            sw.Stop();
            Console.WriteLine(sw.ElapsedMilliseconds);
            sw.Reset();
        }
        sw.Start();
        sender.Dispose();
    }

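1 - I have this code. It downloads the sub-links, but not at the same time, and more slowly than I expected. With the DownloadStringAsync requests I expected all strings to arrive within 20 seconds at most, but my overall stopwatch shows at least 35 seconds.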

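2 - The gap here is about 1000 ms. I expected at most 20 ms, because the downloads should start at the same time and there is not much difference between them: same website, different pages.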
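
For comparison, here is a minimal sketch of starting every download first and then awaiting them all with `Task.WhenAll`, timing the whole batch with one stopwatch. It is not the code from the question: the names `ConcurrentDownloadSketch`, `DownloadAllAsync` and `DownloadWithWebClient` are made up for illustration, the `example.invalid` URLs are placeholders, and the downloader is passed in as a delegate so the sketch can run without network access.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Net;
using System.Text;
using System.Threading.Tasks;

static class ConcurrentDownloadSketch
{
    // Starts one download per URL, then awaits all of them together.
    // `download` is injected (an assumption of this sketch) so it can be
    // swapped for a stand-in when no network is available.
    public static async Task<string[]> DownloadAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> download)
    {
        // ToList() forces every download to start before any is awaited.
        List<Task<string>> pending = urls.Select(download).ToList();
        return await Task.WhenAll(pending); // results keep the input order
    }

    // Hypothetical real downloader: one WebClient per request, disposed only
    // after the awaited download has finished.
    // Note: on .NET Core / .NET 5+, code page 1251 is only available after
    // registering CodePagesEncodingProvider.
    public static async Task<string> DownloadWithWebClient(string url)
    {
        using (var wc = new WebClient { UseDefaultCredentials = true, Encoding = Encoding.GetEncoding(1251) })
        {
            return await wc.DownloadStringTaskAsync(new Uri(url));
        }
    }

    static void Main()
    {
        // Stand-in downloader that simulates 200 ms of latency per page.
        Func<string, Task<string>> fake = async url =>
        {
            await Task.Delay(200);
            return "body of " + url;
        };
        IEnumerable<string> urls =
            Enumerable.Range(1, 50).Select(i => "http://example.invalid/page" + i);

        Stopwatch sw = Stopwatch.StartNew();
        string[] pages = DownloadAllAsync(urls, fake).GetAwaiter().GetResult();
        sw.Stop();
        Console.WriteLine("{0} pages in {1} ms", pages.Length, sw.ElapsedMilliseconds);
    }
}
```

With the stand-in downloader the total should be close to a single delay rather than 50 of them. Passing `DownloadWithWebClient` instead would hit the real site, and the timings then depend on the server and on the connection settings.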

I made some changes to the first version of the code. There is some improvement, but it is still slower than I expected. The Result.

As @Liam suggested, I tried to do it with threads.

    static async Task Direct()
    {
        WebClient wc2 = new WebClient() { UseDefaultCredentials = true, Encoding = Encoding.GetEncoding(1251) };
        HtmlDocument hd = new HtmlDocument();
        wc2.Proxy.Credentials = CredentialCache.DefaultCredentials;
        hd.LoadHtml(wc2.DownloadString(link));
        foreach (var item in hd.DocumentNode.SelectNodes(xpath).Select(x => x.GetAttributeValue("href", "")))
        {
            /* Task.Run(() =>
                {
                    using (WebClient wc = new WebClient() { UseDefaultCredentials = true, Encoding = Encoding.GetEncoding(1251) })
                    {
                        wc.DownloadStringCompleted += Over;
                        wc.Proxy.Credentials = CredentialCache.DefaultCredentials;
                        wc.DownloadStringAsync(new Uri(item));
                    }
                });*/

            new Thread(() =>
            {
                using (WebClient wc = new WebClient() { UseDefaultCredentials = true, Encoding = Encoding.GetEncoding(1251) })
                {
                    wc.DownloadStringCompleted += Over;
                    wc.Proxy.Credentials = CredentialCache.DefaultCredentials;
                    wc.DownloadStringAsync(new Uri(item));
                }
            }).Start();
        }
        Console.Title = "ready";
    }
    static Stopwatch sw = new Stopwatch();
    static void Over(object sender, DownloadStringCompletedEventArgs e)
    {
        new Thread(() =>
        {
            if (sw.IsRunning)
            {
                sw.Stop();
                Console.WriteLine(sw.ElapsedMilliseconds);
                sw.Reset();
            }
            sw.Start();
        }).Start();
    }

The output in ms (the time between two downloads) is here
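
One thing that may distort these numbers: the single static `Stopwatch` is stopped, reset and restarted from many threads at once, and `Stopwatch` is not documented as thread-safe for that. The sketch below gives each download its own stopwatch instead. The names `PerRequestTimingSketch` and `TimeAsync` are hypothetical, and the `Task.Delay` call only stands in for a real download.

```csharp
using System;
using System.Diagnostics;
using System.Threading.Tasks;

static class PerRequestTimingSketch
{
    // Measures how long one download takes, from start to completion.
    public static async Task<TimeSpan> TimeAsync(Func<Task<string>> download)
    {
        Stopwatch sw = Stopwatch.StartNew(); // local, so no other thread touches it
        await download();
        sw.Stop();
        return sw.Elapsed;
    }

    static void Main()
    {
        // Stand-in for a real download: waits 100 ms and returns a dummy page.
        TimeSpan elapsed = TimeAsync(async () =>
        {
            await Task.Delay(100);
            return "page";
        }).GetAwaiter().GetResult();

        Console.WriteLine("Took {0} ms", elapsed.TotalMilliseconds);
    }
}
```

This measures each request's own duration rather than the gap between two completions, which makes it easier to tell slow individual requests apart from requests that were waiting in a queue.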

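Edit: Guys, the foreach loop must be starting them with only tiny differences. This question has not been answered. What I am asking is: "If the tasks start at the same time (or at least with very little difference between them), and the strings they download are about the same length (20-30 characters differ), then why is the gap between them so large? Does it depend on the website? Can we speed it up?"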
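
One possible contributing factor, offered as an assumption rather than a diagnosis: on .NET Framework a client application defaults to 2 concurrent connections per host (`ServicePointManager.DefaultConnectionLimit`), so hundreds of requests to the same site queue up however quickly they are started. A minimal sketch of raising that limit before any request is made follows. The class name `ConnectionLimitSketch`, the helper `Apply` and the value 20 are arbitrary choices for illustration.

```csharp
using System;
using System.Net;

static class ConnectionLimitSketch
{
    // Sets the per-host connection limit used by WebClient / HttpWebRequest
    // on .NET Framework. Call it once, before the first request.
    public static void Apply(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
    }

    static void Main()
    {
        Console.WriteLine("Limit before: " + ServicePointManager.DefaultConnectionLimit);
        Apply(20); // 20 is an arbitrary example value, not a recommendation
        Console.WriteLine("Limit after: " + ServicePointManager.DefaultConnectionLimit);
    }
}
```

Whether this helps also depends on the website: the server or a proxy in between may throttle parallel requests from one client, in which case raising the limit changes little.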

0 answers:

No answers