Why is synchronous code inside an asynchronously awaited task much slower than asynchronous code

Time: 2018-03-16 10:59:14

Tags: c# wpf asynchronous async-await

I was bored and playing around with retrieving random articles from Wikipedia. First I wrote this code:

private async void Window_Loaded(object sender, RoutedEventArgs e)
{
    await DownloadAsync();
}

private async Task DownloadAsync()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();
    var tasks = new List<Task>();
    var result = new List<string>();
    for (int index = 0; index < 60; index++)
    {
        var task = Task.Run(async () => {
            var scheduledAt = DateTime.UtcNow.ToString("mm:ss.fff");
            using (var client = new HttpClient())
            using (var response = await client.GetAsync("https://en.wikipedia.org/wiki/Special:Random"))
            using (var content = response.Content)
            {
                var page = await content.ReadAsStringAsync();
                var receivedAt = DateTime.UtcNow.ToString("mm:ss.fff");
                var data = $"Job done at thread: {Thread.CurrentThread.ManagedThreadId}, Scheduled at: {scheduledAt}, Recieved at: {receivedAt} {page}";
                result.Add(data);
            }
        });

        tasks.Add(task);
    }

    await Task.WhenAll(tasks.ToArray());
    sw.Stop();
    Console.WriteLine($"Process took: {sw.Elapsed.Seconds} sec {sw.Elapsed.Milliseconds} ms");
    foreach (var item in result)
    {
        Debug.WriteLine(item);
    }
}

But I wanted to get rid of this async anonymous method: Task.Run(async () => ..., so I replaced the relevant part of the code with:

for (int index = 0; index < 60; index++)
{
    var task = Task.Run(() => {
        var scheduledAt = DateTime.UtcNow.ToString("mm:ss.fff");
        using (var client = new HttpClient())
        // Get this synchronously.
        using (var response = client.GetAsync("https://en.wikipedia.org/wiki/Special:Random").Result)
        using (var content = response.Content)
        {
            // Get this synchronously.
            var page = content.ReadAsStringAsync().Result;
            var receivedAt = DateTime.UtcNow.ToString("mm:ss.fff");
            var data = $"Job done at thread: {Thread.CurrentThread.ManagedThreadId}, Scheduled at: {scheduledAt}, Recieved at: {receivedAt} {page}";
            result.Add(data);
        }
    });

    tasks.Add(task);
}

I expected it to perform exactly the same, because the asynchronous code I replaced with synchronous code is wrapped in a task, so I am guaranteed that the task scheduler (the WPF task scheduler) will queue it on some free thread from the thread pool. And that is exactly what I see when I look at the returned results. I get values like:

Job done at thread: 6, Scheduled at: 53:57.534, Recieved at: 54:54.545 ...
Job done at thread: 21, Scheduled at: 54:06.742, Recieved at: 54:54.574 ...
Job done at thread: 41, Scheduled at: 54:26.742, Recieved at: 54:54.576 ...
Job done at thread: 10, Scheduled at: 53:59.018, Recieved at: 54:54.614 ...

The problem is that the first code executes in ~6 seconds, while the second (synchronous .Result) takes ~50 seconds. The difference gets smaller as I reduce the number of tasks. Can anyone explain why the tasks take so much longer, even though they execute on different threads and perform exactly the same single operation?

1 answer:

Answer 0 (score: 6)

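Because the thread pool may introduce a delay when a new thread is requested, if the total number of threads in the pool is greater than a configurable minimum. By default, that minimum is the number of cores. In your example with .Result, you queue 60 tasks, each of which holds a thread pool thread for the whole duration of its execution. This means only as many tasks as you have cores start immediately, and the rest start with a delay (the thread pool waits a certain amount of time to see whether an already busy thread becomes available, and if not, it adds a new thread).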
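To see the minimum mentioned above on your own machine, a minimal sketch like the following prints the pool limits next to the processor count. The class name PoolInfo and the helper GetMinWorkerThreads are made up for illustration; the ThreadPool and Environment calls are the standard ones. The exact default can differ between runtimes, so treat the printed values as something to observe rather than a guarantee.

```csharp
using System;
using System.Threading;

public static class PoolInfo
{
    // Hypothetical helper: returns the current minimum number of worker threads.
    public static int GetMinWorkerThreads()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int _);
        return minWorker;
    }

    public static void Main()
    {
        ThreadPool.GetMinThreads(out int minWorker, out int minIo);
        ThreadPool.GetMaxThreads(out int maxWorker, out int maxIo);

        // Below the minimum, the pool creates threads on demand without waiting.
        // Above it, new threads may only be added after a delay.
        Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
        Console.WriteLine($"Min worker threads: {minWorker}, min IO threads: {minIo}");
        Console.WriteLine($"Max worker threads: {maxWorker}, max IO threads: {maxIo}");
    }
}
```

If the minimum printed here is much smaller than 60, most of the 60 blocking tasks from the question have to wait for the pool to grow.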

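Even worse, the continuations of client.GetAsync (the code that runs inside the GetAsync function after the reply is received from the server) are also scheduled onto thread pool threads. This holds up all 60 tasks, because they cannot complete before they receive the result from GetAsync, and GetAsync needs a free thread pool thread to run its continuation. As a result, there is additional contention: you created 60 tasks, and there are also 60 continuations from GetAsync that want thread pool threads to run on (while your 60 tasks are blocked, waiting for the results of those continuations).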

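In the example with await, the thread pool thread is released for the duration of the asynchronous HTTP call. So when you call await GetAsync() and GetAsync reaches its asynchronous IO point (actually issuing the HTTP request), your thread is released back to the pool. It is now free to process other requests. This means the await example holds thread pool threads for much shorter periods, and there is (almost) no delay spent waiting for a thread pool thread to become available.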
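The difference between the two styles can be sketched without any network access. In the sketch below, FakeRequestAsync is a hypothetical stand-in for the HTTP call (it only awaits Task.Delay), and the class and method names are invented for this illustration. It is a rough comparison under those assumptions, not a benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

public static class AwaitVersusBlock
{
    // Hypothetical stand-in for an HTTP request: completes after delayMs
    // without holding a thread while it waits.
    private static async Task<int> FakeRequestAsync(int id, int delayMs)
    {
        await Task.Delay(delayMs);
        return id;
    }

    // Each task awaits, so its pool thread goes back to the pool during the delay.
    public static async Task<TimeSpan> RunAwaitingAsync(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(i => Task.Run(async () => await FakeRequestAsync(i, delayMs)));
        await Task.WhenAll(tasks);
        return sw.Elapsed;
    }

    // Each task blocks on .Result, so it keeps its pool thread for the whole delay.
    public static async Task<TimeSpan> RunBlockingAsync(int count, int delayMs)
    {
        var sw = Stopwatch.StartNew();
        var tasks = Enumerable.Range(0, count)
            .Select(i => Task.Run(() => FakeRequestAsync(i, delayMs).Result));
        await Task.WhenAll(tasks);
        return sw.Elapsed;
    }

    public static void Main()
    {
        const int count = 60;
        const int delayMs = 1000;
        TimeSpan awaiting = RunAwaitingAsync(count, delayMs).GetAwaiter().GetResult();
        TimeSpan blocking = RunBlockingAsync(count, delayMs).GetAwaiter().GetResult();
        Console.WriteLine($"await:   {awaiting.TotalMilliseconds:F0} ms");
        Console.WriteLine($".Result: {blocking.TotalMilliseconds:F0} ms");
    }
}
```

On a machine with far fewer than 60 cores, the blocking variant would be expected to take noticeably longer, for the reasons given above. The actual numbers depend on the runtime and hardware.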

You can easily confirm this by doing the following (do NOT use this in real code, it is for testing only):

ThreadPool.SetMinThreads(100, 100);

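This raises the configurable minimum number of threads in the pool mentioned above. When you raise it to a large value, all 60 tasks of the .Result example start simultaneously on 60 thread pool threads, without delay, so both of your examples complete in roughly the same time.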
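If you try this experiment, it may be convenient to remember the previous minimum and restore it afterwards. The following is a small sketch of that idea. The class MinThreadsExperiment and the method TryRaiseMinThreads are hypothetical names; ThreadPool.SetMinThreads returns false when it rejects the requested values, which the sketch passes through to the caller.

```csharp
using System;
using System.Threading;

public static class MinThreadsExperiment
{
    // Hypothetical helper: raises the pool minimum and reports the previous
    // values so the caller can restore them after the test.
    public static bool TryRaiseMinThreads(int worker, int io, out int oldWorker, out int oldIo)
    {
        ThreadPool.GetMinThreads(out oldWorker, out oldIo);
        return ThreadPool.SetMinThreads(worker, io);
    }

    public static void Main()
    {
        bool raised = TryRaiseMinThreads(100, 100, out int oldWorker, out int oldIo);
        Console.WriteLine($"Raised: {raised}, previous minimum: {oldWorker} worker / {oldIo} IO");

        // ... run the .Result experiment here ...

        // Restore the previous minimum once the experiment is finished.
        ThreadPool.SetMinThreads(oldWorker, oldIo);
    }
}
```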

Here is a sample application for observing how this works:

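using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

public class Program {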
    public static void Main(string[] args) {
        DownloadAsync().Wait();
        Console.ReadKey();
    }

    private static async Task DownloadAsync() {
        Stopwatch sw = new Stopwatch();
        sw.Start();
        var tasks = new List<Task>();
        for (int index = 0; index < 60; index++) {
            var tmp = index;
            var task = Task.Run(() => {
                ThreadPool.GetAvailableThreads(out int wt, out _);
                ThreadPool.GetMaxThreads(out int mt, out _);
                Console.WriteLine($"Started: {tmp} on thread {Thread.CurrentThread.ManagedThreadId}. Threads in pool: {mt - wt}");
                var res = DoStuff(tmp).Result;
                Console.WriteLine($"Done {res} on thread {Thread.CurrentThread.ManagedThreadId}");
            });

            tasks.Add(task);
        }

        await Task.WhenAll(tasks.ToArray());

        sw.Stop();
        Console.WriteLine($"Process took: {sw.Elapsed.Seconds} sec {sw.Elapsed.Milliseconds} ms");
    }

    public static async Task<string> DoStuff(int i) {
        await Task.Delay(1000); // web request
        Console.WriteLine($"continuation of {i} on thread {Thread.CurrentThread.ManagedThreadId}"); // continuation
        return i.ToString();
    }
}