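Downloading HTML source with HttpClient async tasks

Time: 2019-05-16 18:12:41

Tags: c# httpclient

I am trying to download a page's HTML source using the code below, taken from https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/how-to-extend-the-async-walkthrough-by-using-task-whenall

I am still learning and very new to this. I tried modifying a few things here and there, but nothing worked.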

    private async void startButton_Click(object sender, RoutedEventArgs e)  
    {  
        resultsTextBox.Clear();  

        // One-step async call.  
        await SumPageSizesAsync();  

        // Two-step async call.  
        //Task sumTask = SumPageSizesAsync();  
        //await sumTask;  

        resultsTextBox.Text += "\r\nControl returned to startButton_Click.\r\n";  
    }  

    private async Task SumPageSizesAsync()  
    {  
        // Make a list of web addresses.  
        List<string> urlList = SetUpURLList();  

        // Declare an HttpClient object and increase the buffer size. The  
        // default buffer size is 65,536.  
        HttpClient client = new HttpClient() { MaxResponseContentBufferSize = 1000000 };  

        // Create a query.  
        IEnumerable<Task<int>> downloadTasksQuery =   
            from url in urlList select ProcessURLAsync(url, client);  

        // Use ToArray to execute the query and start the download tasks.  
        Task<int>[] downloadTasks = downloadTasksQuery.ToArray();  

        // You can do other work here before awaiting.  

        // Await the completion of all the running tasks.  
        int[] lengths = await Task.WhenAll(downloadTasks);  

        //// The previous line is equivalent to the following two statements.  
        //Task<int[]> whenAllTask = Task.WhenAll(downloadTasks);  
        //int[] lengths = await whenAllTask;  

        int total = lengths.Sum();  

        //var total = 0;  
        //foreach (var url in urlList)  
        //{  
        //    // GetByteArrayAsync returns a Task<T>. At completion, the task  
        //    // produces a byte array.  
        //    byte[] urlContent = await client.GetByteArrayAsync(url);  

        //    // The previous line abbreviates the following two assignment  
        //    // statements.  
        //    Task<byte[]> getContentTask = client.GetByteArrayAsync(url);  
        //    byte[] urlContent = await getContentTask;  

        //    DisplayResults(url, urlContent);  

        //    // Update the total.  
        //    total += urlContent.Length;  
        //}  

        // Display the total count for all of the web addresses.  
        resultsTextBox.Text +=  
            $"\r\n\r\nTotal bytes returned:  {total}\r\n";
    }  

    private List<string> SetUpURLList()  
    {  
        List<string> urls = new List<string>   
        {   
            "https://msdn.microsoft.com",  
            "https://msdn.microsoft.com/library/hh290136.aspx",  
            "https://msdn.microsoft.com/library/ee256749.aspx",  
            "https://msdn.microsoft.com/library/hh290138.aspx",  
            "https://msdn.microsoft.com/library/hh290140.aspx",  
            "https://msdn.microsoft.com/library/dd470362.aspx",  
            "https://msdn.microsoft.com/library/aa578028.aspx",  
            "https://msdn.microsoft.com/library/ms404677.aspx",  
            "https://msdn.microsoft.com/library/ff730837.aspx"  
        };  
        return urls;  
    }  

    // The actions from the foreach loop are moved to this async method.  
    async Task<int> ProcessURLAsync(string url, HttpClient client)  
    {  
        byte[] byteArray = await client.GetByteArrayAsync(url);  
        DisplayResults(url, byteArray);  
        return byteArray.Length;  
    }  

    private void DisplayResults(string url, byte[] content)  
    {  
        // Display the length of each web site. The string format   
        // is designed to be used with a monospaced font, such as  
        // Lucida Console or Global Monospace.  
        var bytes = content.Length;  
        // Strip off the "https://".  
        var displayURL = url.Replace("https://", "");  
        resultsTextBox.Text += $"\n{displayURL,-58} {bytes,8}";
    }

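Edit:

This code comes from the link I posted. I tested it, and it returns the byte count (length) of the content. What I want is the page source, not the byte/length information. I changed the ProcessURLAsync function and was able to get the HTML source. But is this the correct way, and the most efficient one?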

    async Task<int> ProcessURLAsync(string url, HttpClient client)
    {
        byte[] byteArray = await client.GetByteArrayAsync(url);
        //DisplayResults(url, byteArray);
        return byteArray.Length;
    }
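    async Task<string> ProcessURLAsyncS(string url, HttpClient client)
    {
        // Read the response stream once. A second ReadToEnd call on the same
        // reader would return an empty string because the stream is exhausted.
        using (var reader = new StreamReader(await client.GetStreamAsync(url)))
        {
            string html = await reader.ReadToEndAsync();
            // Assumes a DisplayResults overload that accepts a string.
            DisplayResults(url, html);
            return html;
        }
    }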

1 answer:

Answer 0 (score: 0)

A few things:

  • If you want to make a GET request and read the content as a string, just use GetStringAsync:

    string content = await client.GetStringAsync(url);
    
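  • Be careful when reading the entire content as a single string, because that can consume a lot of memory. The reason to use GetStreamAsync instead is to read from the stream in chunks and write that output somewhere else.

  • The sample code you referenced is old and makes a few mistakes:

    1. The default buffer size is 2GB, not 64KB. In most cases there is no need to set it explicitly.

    2. HttpClient is meant to be created as a single instance and shared, not created for every request (or, in this case, for every button click).

    3. The line that calls downloadTasksQuery.ToArray() is unnecessary. The tasks start as soon as the query is enumerated, which Task.WhenAll does when you await it.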
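
The points above can be pulled together in a minimal sketch. It is an illustration under stated assumptions, not the documentation's code: the names PageDownloader, DownloadAllAsync, CopyInChunksAsync and SaveToFileAsync, the fetch parameter, and the 81920-byte buffer size are all made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;

// Hypothetical helper class; every name here is illustrative.
public static class PageDownloader
{
    // One HttpClient for the lifetime of the application, shared by all requests.
    private static readonly HttpClient SharedClient = new HttpClient();

    // Downloads every URL concurrently with GetStringAsync and returns the
    // page sources in the same order as the input.
    public static Task<string[]> DownloadAllAsync(IEnumerable<string> urls)
    {
        return DownloadAllAsync(urls, url => SharedClient.GetStringAsync(url));
    }

    // Same as above, but the download function is passed in. This parameter is
    // an assumption of the sketch, added so the method can run without a network.
    public static Task<string[]> DownloadAllAsync(
        IEnumerable<string> urls, Func<string, Task<string>> fetch)
    {
        // No ToArray call: Task.WhenAll enumerates the query itself,
        // and that enumeration is what starts the tasks.
        return Task.WhenAll(urls.Select(fetch));
    }

    // Copies a stream in fixed-size chunks so the whole body is never held
    // in memory at once. Returns the total number of bytes copied.
    public static async Task<long> CopyInChunksAsync(
        Stream source, Stream destination, int bufferSize = 81920)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;
        while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
        {
            await destination.WriteAsync(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Streams one large response straight to a file instead of a string.
    public static async Task<long> SaveToFileAsync(string url, string path)
    {
        using (Stream body = await SharedClient.GetStreamAsync(url))
        using (FileStream file = File.Create(path))
        {
            return await CopyInChunksAsync(body, file);
        }
    }
}
```

From the button handler in the question this could be called as `string[] pages = await PageDownloader.DownloadAllAsync(SetUpURLList());`. The copy loop is spelled out only to show the chunking; the built-in Stream.CopyToAsync does the same job.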