Downloading URLs using SemaphoreSlim and ContinueWith

Asked: 2015-04-07 06:40:53

Tags: c# async-await task-parallel-library dotnet-httpclient

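I am trying to use SemaphoreSlim together with ContinueWith to limit the number of concurrent tasks I am running, but the runtime behavior is very different from what I expected.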

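The value I set for ServicePointManager.DefaultConnectionLimit works out to 288 on my machine. Since I initialized the semaphore as SemaphoreSlim(100), I expected the code to start 100 downloads right away and then start a new one each time an earlier one completes.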

var sr =
    new StreamReader(
        @"UrlList.tsv");

var urlList = new List<string>();
for (int i = 0; i < 1000; i++)
{
    string line = sr.ReadLine();
    string[] tokens = line.Split('\t');
    string url = tokens[4];
    urlList.Add(url);
}

ServicePointManager.DefaultConnectionLimit = 12*Environment.ProcessorCount;
Console.WriteLine(DateTime.Now + "\t" + ServicePointManager.DefaultConnectionLimit);

var tasks = new Task[urlList.Count];
var semaphore = new SemaphoreSlim(100);
var client = new HttpClient();
int cnt = 0;
for (int i = 0; i < urlList.Count; i++)
{
    int i1 = i;
    tasks[i] = semaphore.WaitAsync().ContinueWith(task =>
    {
        Console.WriteLine(DateTime.Now + "\t" + ++cnt);
        var t = client.GetStringAsync(urlList[i1]);
        Console.WriteLine(t.Result);
        semaphore.Release();
        return t.Result;
    });
}
Task.WhenAll(tasks).GetAwaiter().GetResult();

The output looks like this:

4/6/2015 11:36:12 PM    288
4/6/2015 11:36:12 PM    1
4/6/2015 11:36:12 PM    7
4/6/2015 11:36:12 PM    10
4/6/2015 11:36:12 PM    11
4/6/2015 11:36:12 PM    12
4/6/2015 11:36:12 PM    3
4/6/2015 11:36:12 PM    4
4/6/2015 11:36:12 PM    2
4/6/2015 11:36:12 PM    5
4/6/2015 11:36:12 PM    8
4/6/2015 11:36:12 PM    6
4/6/2015 11:36:12 PM    9
4/6/2015 11:36:12 PM    21
4/6/2015 11:36:12 PM    17
4/6/2015 11:36:12 PM    14
4/6/2015 11:36:12 PM    15
4/6/2015 11:36:12 PM    13
4/6/2015 11:36:12 PM    22
4/6/2015 11:36:12 PM    16
4/6/2015 11:36:12 PM    23
4/6/2015 11:36:12 PM    20
4/6/2015 11:36:12 PM    19
4/6/2015 11:36:12 PM    24
4/6/2015 11:36:12 PM    25
4/6/2015 11:36:12 PM    18
4/6/2015 11:36:13 PM    26
4/6/2015 11:36:14 PM    27
4/6/2015 11:36:15 PM    28

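So it seems the threads are not being spawned the way I expected, and the URL contents are never printed either. What exactly is wrong with my code?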

1 Answer:

Answer 0 (score: 3)

Try something like this:

async Task<IEnumerable<string>> DoItAsync(int threads, IEnumerable<string> urls)
{
    ServicePointManager.DefaultConnectionLimit = 12*Environment.ProcessorCount;
    Console.WriteLine("{0:HH:mm:ss.ffffff}\t{1}", DateTime.Now, ServicePointManager.DefaultConnectionLimit);

    var semaphore = new SemaphoreSlim(threads);
    var client = new HttpClient();
    var cnt = 0;
    var tasks = new List<Task<string>>();
    foreach (var url in urls)
    {
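        tasks.Add(((Func<Task<string>>)(async () =>
            {
                // Wait asynchronously for a free slot; no thread is blocked while waiting.
                await semaphore.WaitAsync();
                try
                {
                    // Continuations may run on different threads, so increment atomically.
                    var c = Interlocked.Increment(ref cnt);
                    Console.WriteLine("{0:HH:mm:ss.ffffff}\t{1}\t{2}", DateTime.Now, c, url);
                    var s = await client.GetStringAsync(url);
                    // Guard against response bodies shorter than 20 characters.
                    Console.WriteLine("{0:HH:mm:ss.ffffff}\t{1}\t{2}\t{3}", DateTime.Now, c, url, s.Substring(0, Math.Min(20, s.Length)));
                    return s;
                }
                finally
                {
                    // Release the slot even if the request throws.
                    semaphore.Release();
                }
            }))());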
    }

    return await Task.WhenAll(tasks);
}
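
A probable reading of the output in the question (about 25 lines at once, then roughly one per second) is that each ContinueWith callback blocks a thread-pool thread on t.Result, so progress is limited by how quickly the pool adds threads rather than by the semaphore. The sketch below isolates the throttling pattern from the code above so it can be run without any network access. It is only an illustration under stated assumptions: ThrottleDemo, RunThrottledAsync, and the Task.Delay stand-in for client.GetStringAsync are hypothetical names introduced here and are not part of the original code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ThrottleDemo
{
    // Hypothetical helper: runs `work` for every item with at most
    // `maxConcurrency` items in flight. Returns the results in input order
    // together with the highest concurrency that was actually observed.
    public static async Task<(string[] Results, int Peak)> RunThrottledAsync(
        IEnumerable<string> items, int maxConcurrency, Func<string, Task<string>> work)
    {
        using var semaphore = new SemaphoreSlim(maxConcurrency);
        var gate = new object();
        int inFlight = 0, peak = 0;

        async Task<string> RunOneAsync(string item)
        {
            // Waits for a slot without blocking a thread.
            await semaphore.WaitAsync();
            try
            {
                lock (gate) { inFlight++; peak = Math.Max(peak, inFlight); }
                // Await the work instead of reading .Result, so no thread is held.
                return await work(item);
            }
            finally
            {
                lock (gate) { inFlight--; }
                // Always give the slot back, even if `work` throws.
                semaphore.Release();
            }
        }

        var results = await Task.WhenAll(items.Select(item => RunOneAsync(item)));
        return (results, peak);
    }
}

static class Program
{
    static async Task Main()
    {
        // Stand-in for real URLs; Task.Delay simulates the download latency.
        var urls = Enumerable.Range(1, 20).Select(i => "item-" + i);
        var (results, peak) = await ThrottleDemo.RunThrottledAsync(
            urls, 5, async u => { await Task.Delay(50); return u.ToUpperInvariant(); });
        Console.WriteLine($"{results.Length} results, peak concurrency {peak}");
    }
}
```

Because every slot is acquired with await and released in a finally block, the observed peak can never exceed the limit passed in, and a failed item does not leak a slot. To use the same shape for real downloads, the delegate would call client.GetStringAsync(url) instead of Task.Delay.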