Here is the code:
for (int x = 0; x < imagesSatelliteUrls.Count; x++)
{
if (!imagesSatelliteUrls[x].StartsWith("http://"))
{
imagesSatelliteUrls[x] = stringForSatelliteMapUrls + imagesSatelliteUrls[x];
}
using (WebClient client = new WebClient())
{
if (!imagesSatelliteUrls[x].Contains("href"))
{
client.DownloadFile(imagesSatelliteUrls[x],
UrlsDir + "SatelliteImage" + counter.ToString("D6"));
}
}
counter++;
}
It downloads the files one at a time. The list imagesSatelliteUrls contains 260 file links, ordered by group.
For example:
index[0] "Group 1"
index[1] some link ....
index[2] some link ....
.
.
.
index[34] "Group 2"
index[35] some link ....
index[36] some link ....
.
.
.
.
index[71] "Group 3"
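And so on; there are 7 groups. I want it to download the first file of each group together, meaning 7 files in parallel: the first file of groups 1, 2, 3, 4, 5, 6 and 7. Then, whenever a file in any group finishes, it should start downloading the next file from that group.
So at any given time I should see 7 files downloading, each from a different group. When a file in some group finishes downloading, it should move on to the next file in the same group and start downloading it.
How can I do this? The client.DownloadFile call I use now only downloads the files one after another.
I tried to download in parallel:
This is the code: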
Parallel.For(0, imagesSatelliteUrls.Count, /*new ParallelOptions { MaxDegreeOfParallelism = 20 },*/ x =>
{
if (!imagesSatelliteUrls[x].StartsWith("http://"))
{
imagesSatelliteUrls[x] = stringForSatelliteMapUrls + imagesSatelliteUrls[x];
}
using (WebClient client = new WebClient())
{
if (!imagesSatelliteUrls[x].Contains("href"))
{
client.DownloadFile(imagesSatelliteUrls[x],
UrlsDir + "SatelliteImage" + counter.ToString("D6"));
}
}
counter++;
}); // end of Parallel.For
The exception is:
System.Net.WebException was unhandled by user code
HResult=-2146233079
Message=An exception occurred during a WebClient request.
Source=System
StackTrace:
at System.Net.WebClient.DownloadFile(Uri address, String fileName)
at System.Net.WebClient.DownloadFile(String address, String fileName)
at WeatherMaps.ExtractImages.<>c__DisplayClass2.<.ctor>b__0(Int32 x) in d:\C-Sharp\WeatherMaps\WeatherMaps\WeatherMaps\ExtractImages.cs:line 145
at System.Threading.Tasks.Parallel.<>c__DisplayClassf`1.<ForWorker>b__c()
InnerException: System.IO.IOException
HResult=-2147024864
Message=The process cannot access the file 'd:\localpath\Urls\SatelliteImage000000' because it is being used by another process.
Source=mscorlib
StackTrace:
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access)
at System.Net.WebClient.DownloadFile(Uri address, String fileName)
InnerException:
I also tried this code:
for (int i = 0; i < 7; i++)
{
Task.Factory.StartNew(() =>
{
// Here you can easily implement your checking algo as you see fit
while (counter < imagesSatelliteUrls.Count)
{
if (!imagesSatelliteUrls[count].StartsWith("http://"))
{
imagesSatelliteUrls[count] = stringForSatelliteMapUrls + imagesSatelliteUrls[count];
}
using (WebClient client = new WebClient())
{
if (!imagesSatelliteUrls[count].Contains("href"))
{
client.DownloadFile(imagesSatelliteUrls[count], UrlsDir + "SatelliteImage" + counter.ToString("D6"));
}
}
lock (this)
{
count++;
counter++;
}
}
});
}
System.Net.WebException was unhandled by user code
HResult=-2146233079
Message=An exception occurred during a WebClient request.
Source=System
StackTrace:
at System.Net.WebClient.DownloadFile(Uri address, String fileName)
at System.Net.WebClient.DownloadFile(String address, String fileName)
at WeatherMaps.ExtractImages.<>c__DisplayClass4.<.ctor>b__2() in d:\C-Sharp\WeatherMaps\WeatherMaps\WeatherMaps\ExtractImages.cs:line 122
at System.Threading.Tasks.Task.InnerInvoke()
at System.Threading.Tasks.Task.Execute()
InnerException: System.IO.IOException
HResult=-2147024864
Message=The process cannot access the file 'd:\localpath\Urls\SatelliteImage000000' because it is being used by another process.
Source=mscorlib
StackTrace:
at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access)
at System.Net.WebClient.DownloadFile(Uri address, String fileName)
InnerException:
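Answer 0 (score: 1)
Use Parallel.For, and build each file name from the loop index x instead of the shared counter, so that no two threads write to the same file: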
//for (int x = 0; x < imagesSatelliteUrls.Count; x++)
Parallel.For(0, imagesSatelliteUrls.Count, /*new ParallelOptions { MaxDegreeOfParallelism = 20 },*/ x =>
{
if (!imagesSatelliteUrls[x].StartsWith("http://"))
{
imagesSatelliteUrls[x] = stringForSatelliteMapUrls + imagesSatelliteUrls[x];
}
using (WebClient client = new WebClient())
{
if (!imagesSatelliteUrls[x].Contains("href"))
{
client.DownloadFile(imagesSatelliteUrls[x],
UrlsDir + "SatelliteImage" + x.ToString("D6"));
}
}
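// A plain counter++ is not atomic and can lose updates when run in parallel.
// Assumes counter is an int field or local; needs using System.Threading.
Interlocked.Increment(ref counter);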
}); // end of Parallel.For
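Parallel.For hands out indices in no particular group order, so on its own it does not give the "one download per group at a time" behaviour the question asks for. The following is a minimal sketch of one way to get it, not a definitive implementation: split the flat list into groups, then run one task per group and let each task download its own files sequentially. The names GroupedDownloader, SplitIntoGroups and DownloadAll, and the group-based file name pattern, are hypothetical; the sketch also assumes that group headers start with "Group" as in the question and that Task.Run (.NET 4.5 or later) is available.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Net;
using System.Threading.Tasks;

static class GroupedDownloader
{
    // Splits the flat list into groups. An entry starting with "Group"
    // opens a new group (assumption: headers look like "Group 1").
    public static List<List<string>> SplitIntoGroups(IEnumerable<string> entries)
    {
        var groups = new List<List<string>>();
        foreach (var entry in entries)
        {
            if (entry.StartsWith("Group", StringComparison.Ordinal))
                groups.Add(new List<string>());
            else if (groups.Count > 0)
                groups[groups.Count - 1].Add(entry);
        }
        return groups;
    }

    // Starts one task per group; each task downloads its files one by one,
    // so at most one download per group is active at any time.
    public static void DownloadAll(List<List<string>> groups, string baseUrl, string targetDir)
    {
        var tasks = groups.Select((urls, g) => Task.Run(() =>
        {
            // Each task owns its WebClient; a WebClient must not be shared
            // between concurrent downloads.
            using (var client = new WebClient())
            {
                for (int i = 0; i < urls.Count; i++)
                {
                    if (urls[i].Contains("href")) continue; // same filter as the question
                    string url = urls[i].StartsWith("http://") ? urls[i] : baseUrl + urls[i];
                    // Group number plus index keeps every file name unique,
                    // so two tasks never open the same file.
                    string file = Path.Combine(targetDir,
                        string.Format("SatelliteImage_G{0}_{1:D6}", g + 1, i));
                    client.DownloadFile(url, file);
                }
            }
        })).ToArray();
        Task.WaitAll(tasks); // rethrows download failures as an AggregateException
    }
}
```

With the variables from the question, a call could look like GroupedDownloader.DownloadAll(GroupedDownloader.SplitIntoGroups(imagesSatelliteUrls), stringForSatelliteMapUrls, UrlsDir). With 7 groups this starts 7 tasks, and each one moves on to the next file in its group as soon as the previous file finishes.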
Answer 1 (score: 0)
If you add a reference to System.Net.Http.dll and use the HttpClient class, you can do this. I have put together a standalone example that shows how.
// Create a mock list of data
string someImageUrl = "..."; // some test url of an image file
string urlsDirectory = @"C:\Temp"; // some working directory
var urls = new string[7 * 20];
for (int i = 0; i < urls.Length; i += 7)
{
urls[i] = String.Format("Group {0}", (i / 7) + 1);
for (int j = 1; j < 7; j++)
{
urls[i + j] = someImageUrl;
}
}
// Download 6 files at a time.
var client = new HttpClient();
for (int i = 0; i < urls.Length; i += 7)
{
var directoryPath = Directory.CreateDirectory(Path.Combine(urlsDirectory, urls[i])).FullName;
var tasks = urls.Skip(i + 1).Take(6).Select(url =>
{
return client.GetAsync(url);
}).ToArray();
Task.WaitAll(tasks);
for (int j = 0; j < tasks.Length; j++)
{
var response = tasks[j].Result;
using (var fs = new FileStream(Path.Combine(directoryPath, String.Format("Image {0}.jpg", j + 1)), FileMode.Create)) // Create truncates an existing file; OpenOrCreate would leave stale trailing bytes
{
using (var responseStream = response.Content.ReadAsStreamAsync().Result)
{
responseStream.CopyTo(fs);
}
}
}
}
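One important thing to note is that, I believe, you lose some of WebClient's automatic file name negotiation. For what it's worth, you can see in my example that I simply named the images "Image 1.jpg", "Image 2.jpg", and so on.
Technically, when requesting a file over HTTP, you may request an image with a URL like this: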
http://somehost.com/getImage?id=5
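In this case, it is hard to tell what the file name should be. The standard HTTP way of handling this is a header named Content-Disposition, which tells the HTTP client what the file name should be.
But not every web server will give you a Content-Disposition header, so you need a fallback that tries to turn a URL like the one above into a Windows-compatible file name. You could find or write a simple function that strips every character not allowed in NTFS file names from the URL. Keep in mind, though, that in this case you won't get an extension (jpg, gif, etc.). The server may give you a Content-Type header that tells you the MIME type, such as "image/jpeg", but it is up to you to decide which extension to give the file.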
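The fallback chain described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class FileNames, its methods, the reserved-character list and the small MIME-type table are all hypothetical, and the table is deliberately incomplete. The helper takes plain strings so it does not depend on any particular HTTP client.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class FileNames
{
    // Characters Windows does not allow in file names (control characters
    // are handled separately below).
    const string Reserved = "\\/:*?\"<>|";

    // Small, deliberately incomplete MIME-type-to-extension table (assumption).
    static readonly Dictionary<string, string> Extensions =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            { "image/jpeg", ".jpg" },
            { "image/png", ".png" },
            { "image/gif", ".gif" },
        };

    // Replaces every reserved or control character with '_'.
    public static string Sanitize(string raw)
    {
        return new string(raw
            .Select(c => Reserved.IndexOf(c) >= 0 || char.IsControl(c) ? '_' : c)
            .ToArray());
    }

    // 1. Prefer the Content-Disposition file name if the server sent one.
    // 2. Otherwise derive a name from the URL path and query.
    // 3. If that name has no extension, guess one from the Content-Type.
    public static string Choose(string dispositionFileName, Uri url, string mediaType)
    {
        if (!string.IsNullOrWhiteSpace(dispositionFileName))
            return Sanitize(dispositionFileName.Trim('"')); // the value may arrive quoted

        string name = Sanitize(url.PathAndQuery.TrimStart('/'));
        if (name.Length == 0)
            name = "download"; // arbitrary placeholder for URLs without a path

        string ext;
        if (name.IndexOf('.') < 0 && mediaType != null
            && Extensions.TryGetValue(mediaType, out ext))
            name += ext;
        return name;
    }
}
```

For the URL above and a Content-Type of "image/jpeg", FileNames.Choose(null, new Uri("http://somehost.com/getImage?id=5"), "image/jpeg") yields "getImage_id=5.jpg". With HttpClient, the two header values would typically come from response.Content.Headers.ContentDisposition and response.Content.Headers.ContentType, either of which may be null.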