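Given an input text file containing URLs, I want to download the corresponding files all at once. I used the answer to this question, "UserState using WebClient and TaskAsync download from Async CTP", as a reference.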
    public void Run()
    {
        List<string> urls = File.ReadAllLines(@"c:/temp/Input/input.txt").ToList();
        int index = 0;
        Task[] tasks = new Task[urls.Count()];
        foreach (string url in urls)
        {
            WebClient wc = new WebClient();
            string path = string.Format("{0}image-{1}.jpg", @"c:/temp/Output/", index+1);
            Task downloadTask = wc.DownloadFileTaskAsync(new Uri(url), path);
            Task outputTask = downloadTask.ContinueWith(t => Output(path));
            tasks[index] = outputTask;
            index++; // advance to the next slot and file name
        }
        Console.WriteLine("Start now");
        Task.WhenAll(tasks);
        Console.WriteLine("Done");
    }
    public void Output(string path)
    {
        Console.WriteLine(path);
    }
I expected the downloads to start at "Task.WhenAll(tasks)". But it turns out the output looks like this:
    c:/temp/Output/image-2.jpg
    c:/temp/Output/image-1.jpg
    c:/temp/Output/image-4.jpg
    c:/temp/Output/image-6.jpg
    c:/temp/Output/image-3.jpg
    [many lines deleted]
    Start now
    c:/temp/Output/image-18.jpg
    c:/temp/Output/image-19.jpg
    c:/temp/Output/image-20.jpg
    c:/temp/Output/image-21.jpg
    c:/temp/Output/image-23.jpg
    [many lines deleted]
    Done
Why do the downloads start before WaitAll is called? What can I change to achieve what I want (i.e. all tasks start at the same time)?
Thanks
Answer 0 (score: 3)
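Why do the downloads start before WaitAll is called?
First of all, you are not calling Task.WaitAll, which blocks synchronously. You are calling Task.WhenAll, which returns an awaitable that should be awaited.
Now, as others have said, when you call an async method, even without using await on it, it fires the asynchronous operation, because any method conforming to the TAP returns a "hot" task.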
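A minimal sketch of the "hot task" behaviour described above. `FakeDownloadAsync` is a hypothetical stand-in for a TAP method such as `DownloadFileTaskAsync`; it only simulates work with `Task.Delay`, so nothing here touches the network.

```csharp
using System;
using System.Threading.Tasks;

public static class HotTaskDemo
{
    // Hypothetical stand-in for a TAP method: the body runs synchronously
    // up to the first await as soon as the method is called.
    public static async Task<string> FakeDownloadAsync(string name)
    {
        Console.WriteLine("started " + name);
        await Task.Delay(50);
        return name;
    }

    public static async Task DemoAsync()
    {
        // "started a" is printed by this call, before anything awaits the task.
        Task<string> task = FakeDownloadAsync("a");

        // A task returned from an async method is never in the Created state.
        Console.WriteLine(task.Status != TaskStatus.Created);

        Console.WriteLine("Start now");
        // WhenAll only produces a task that completes when its inputs complete;
        // it does not start them, and it has no effect unless it is awaited.
        string[] results = await Task.WhenAll(new[] { task });
        Console.WriteLine("Done: " + results[0]);
    }
}
```

Calling `HotTaskDemo.DemoAsync()` should print "started a" before "Start now", which mirrors the ordering seen in the question.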
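What can I change to achieve what I want (i.e. all tasks start at the same time)?
Now, if you want to defer execution until Task.WhenAll, you can use Enumerable.Select to project each element to a Task, and materialize it when you pass it to Task.WhenAll: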
    public async Task RunAsync()
    {
        IEnumerable<string> urls = File.ReadAllLines(@"c:/temp/Input/input.txt");
        var urlTasks = urls.Select((url, index) =>
        {
            WebClient wc = new WebClient();
            string path = string.Format("{0}image-{1}.jpg", @"c:/temp/Output/", index);
            var downloadTask = wc.DownloadFileTaskAsync(new Uri(url), path);
            Output(path);
            return downloadTask;
        });
        Console.WriteLine("Start now");
        await Task.WhenAll(urlTasks);
        Console.WriteLine("Done");
    }
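To see why the Select approach defers the work, here is a small sketch that counts how many operations have started before and after Task.WhenAll enumerates the sequence. `WorkAsync` and the `Started` counter are hypothetical names used only for illustration, and the work is simulated with `Task.Delay`.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class DeferredSelectDemo
{
    public static int Started;

    // Hypothetical async operation; the counter is bumped synchronously on call.
    public static async Task<int> WorkAsync(int i)
    {
        Started++;
        await Task.Delay(10);
        return i * 2;
    }

    public static async Task<int[]> RunAsync()
    {
        // Select is lazy: no WorkAsync call has happened yet.
        IEnumerable<Task<int>> lazyTasks = Enumerable.Range(1, 3).Select(i => WorkAsync(i));
        Console.WriteLine("Started before WhenAll: " + Started); // 0

        // WhenAll enumerates the sequence, which is what creates (and starts) the tasks.
        int[] results = await Task.WhenAll(lazyTasks);
        Console.WriteLine("Started after WhenAll: " + Started);
        return results;
    }
}
```

One caveat of relying on deferred execution: enumerating the same lazy sequence a second time would call the selector again and start a fresh set of operations.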
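Answer 1 (score: 0)
Why do the downloads start before WaitAll is called?
Tasks created by its public constructors are referred to as "cold" tasks, in that they begin their life cycle in the non-scheduled TaskStatus.Created state, and it's not until Start is called on these instances that they progress to being scheduled. All other tasks begin their life cycle in a "hot" state, meaning that the asynchronous operations they represent have already been initiated and their TaskStatus is an enumeration value other than Created. All tasks returned from TAP methods must be "hot".
Since DownloadFileTaskAsync is a TAP method, it returns a "hot" (i.e. already started) task.
What can I change to achieve what I want (i.e. all tasks start at the same time)?
I'd look at TPL Dataflow. Something like this (I've used HttpClient instead of WebClient, but it doesn't really matter):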
    static async Task DownloadData(IEnumerable<string> urls)
    {
        // allow several items to be processed concurrently
        var executionOptions = new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = Environment.ProcessorCount };
        // this block receives a URL and downloads the content it points to
        var downloadBlock = new TransformBlock<string, Tuple<string, string>>(async url =>
        {
            using (var client = new HttpClient())
            {
                var content = await client.GetStringAsync(url);
                return Tuple.Create(url, content);
            }
        }, executionOptions);
        // this block prints the length of the downloaded string
        var outputBlock = new ActionBlock<Tuple<string, string>>(tuple =>
        {
            Console.WriteLine($"Downloaded {(string.IsNullOrEmpty(tuple.Item2) ? 0 : tuple.Item2.Length)} bytes from {tuple.Item1}");
        }, executionOptions);
        // link downloadBlock to outputBlock: every item produced by
        // downloadBlock is forwarded to outputBlock
        using (downloadBlock.LinkTo(outputBlock))
        {
            // feed downloadBlock with the input data
            foreach (var url in urls)
            {
                await downloadBlock.SendAsync(url);
            }
            // signal that downloadBlock will receive no more input
            downloadBlock.Complete();
            // wait until all downloads have finished
            await downloadBlock.Completion;
            // signal that outputBlock will receive no more input
            outputBlock.Complete();
            // wait until all output has been printed
            await outputBlock.Completion;
        }
    }
    static void Main(string[] args)
    {
        var urls = new[]
        {
            "http://www.microsoft.com",
            "http://www.google.com",
            "http://stackoverflow.com",
            "http://www.amazon.com",
            "http://www.asp.net"
        };
        Console.WriteLine("Start now.");
        DownloadData(urls).Wait();
        Console.WriteLine("Done.");
        Console.ReadLine();
    }
Output:

    Start now.
    Downloaded 1020 bytes from http://www.microsoft.com
    Downloaded 53108 bytes from http://www.google.com
    Downloaded 244143 bytes from http://stackoverflow.com
    Downloaded 468922 bytes from http://www.amazon.com
    Downloaded 27771 bytes from http://www.asp.net
    Done.
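As a possible variation on the pipeline above, completion can be propagated through the link instead of being signalled block by block. The sketch below assumes the System.Threading.Tasks.Dataflow package is available and replaces the download with a simulated delay; `PipelineSketch`, `RunAsync`, and the squaring step are hypothetical and only illustrate the wiring.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class PipelineSketch
{
    public static async Task<List<int>> RunAsync(IEnumerable<int> inputs)
    {
        var results = new List<int>();

        // Stand-in for the download step: simulated asynchronous work per item.
        var transform = new TransformBlock<int, int>(async n =>
        {
            await Task.Delay(10);
            return n * n;
        }, new ExecutionDataflowBlockOptions { MaxDegreeOfParallelism = 4 });

        // Default options process one item at a time, so the list needs no lock.
        var collect = new ActionBlock<int>(n => results.Add(n));

        // With PropagateCompletion, completing the first block eventually
        // completes the second once all forwarded items have been handled.
        transform.LinkTo(collect, new DataflowLinkOptions { PropagateCompletion = true });

        foreach (var n in inputs)
        {
            await transform.SendAsync(n);
        }
        transform.Complete();       // no more input
        await collect.Completion;   // the whole pipeline has drained
        return results;
    }
}
```

With this wiring only the first block is completed explicitly and only the last block's Completion is awaited.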
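Answer 2 (score: -1)
What can I change to achieve what I want (i.e. all tasks start at the same time)?
To synchronize the start of the downloads, you can use the Barrier class.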
    public void Run()
    {
        List<string> urls = File.ReadAllLines(@"c:/temp/Input/input.txt").ToList();
        // the post-phase action runs once, after every participant has reached the barrier
        Barrier barrier = new Barrier(urls.Count, b => Console.WriteLine("Start now"));
        Task[] tasks = new Task[urls.Count];
        // note: each iteration blocks a thread at the barrier, so this relies on
        // all iterations running concurrently
        Parallel.For(0, urls.Count, index =>
        {
            string path = string.Format("{0}image-{1}.jpg", @"c:/temp/Output/", index + 1);
            tasks[index] = DownloadAsync(new Uri(urls[index]), path, barrier);
        });
        Task.WaitAll(tasks); // wait for completion
        Console.WriteLine("Done");
    }
    async Task DownloadAsync(Uri url, string path, Barrier barrier)
    {
        using (WebClient wc = new WebClient())
        {
            barrier.SignalAndWait();
            // DownloadFileTaskAsync returns an awaitable Task; DownloadFileAsync returns void
            await wc.DownloadFileTaskAsync(url, path);
            Output(path);
        }
    }
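For reference, a stripped-down sketch of how Barrier releases its participants. It uses one dedicated thread per participant rather than Parallel.For, so every participant is guaranteed to reach the barrier; `BarrierDemo` and `Run` are hypothetical names, and the download itself is left out.

```csharp
using System;
using System.Threading;

public static class BarrierDemo
{
    // Returns how many participants got past the barrier.
    public static int Run(int participants)
    {
        int released = 0;
        // The post-phase action runs once, after the last participant arrives.
        using (var barrier = new Barrier(participants, b => Console.WriteLine("Start now")))
        {
            var threads = new Thread[participants];
            for (int i = 0; i < participants; i++)
            {
                threads[i] = new Thread(() =>
                {
                    // Blocks until all participants have signalled.
                    barrier.SignalAndWait();
                    // A real version would start its download here.
                    Interlocked.Increment(ref released);
                });
                threads[i].Start();
            }
            foreach (var t in threads)
            {
                t.Join();
            }
        }
        return released;
    }
}
```

Because SignalAndWait blocks the calling thread, this pattern ties up one thread per download until all of them are ready, which is worth weighing against the non-blocking approaches in the other answers.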