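I am trying to download a large file from a public URL. It seemed to work fine at first, but on roughly 1 in 10 computers the download times out. My initial attempt used WebClient.DownloadFileAsync, but because it would never complete I fell back to using WebRequest.Create and reading the response stream directly.
My first version using WebRequest.Create had the same problem as WebClient.DownloadFileAsync: the operation times out and the file is never completed.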
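My next version added retries for when the download times out. This is where it gets weird. The download does eventually finish, with 1 retry to fetch the last 7092 bytes. So the downloaded file has exactly the same size as the source, BUT it is corrupt and differs from the source file. I would have expected the corruption to be in the last 7092 bytes, but that is not the case.
Using BeyondCompare I found that 2 chunks of bytes are missing from the corrupt file, and together they add up to exactly the missing 7092 bytes! The missing bytes are at offsets 1CA49FF0 and 1E31F380, well before the point where the download timed out and restarted.
What could possibly be going on here? Any hints on how to track this problem down further?
Here is the code in question.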
public void DownloadFile(string sourceUri, string destinationPath)
{
    //roughly based on: http://stackoverflow.com/questions/2269607/how-to-programmatically-download-a-large-file-in-c-sharp
    //not using WebClient.DownloadFileAsync as it seems to stall out on large files rarely for unknown reasons.
    using (var fileStream = File.Open(destinationPath, FileMode.Create, FileAccess.Write, FileShare.Read))
    {
        long totalBytesToReceive = 0;
        long totalBytesReceived = 0;
        int attemptCount = 0;
        bool isFinished = false;
        while (!isFinished)
        {
            attemptCount += 1;
            if (attemptCount > 10)
            {
                throw new InvalidOperationException("Too many attempts to download. Aborting.");
            }
            try
            {
                var request = (HttpWebRequest)WebRequest.Create(sourceUri);
                request.Proxy = null;//http://stackoverflow.com/questions/754333/why-is-this-webrequest-code-slow/935728#935728
                _log.AddInformation("Request #{0}.", attemptCount);
                //continue downloading from last attempt.
                if (totalBytesReceived != 0)
                {
                    _log.AddInformation("Request resuming with range: {0} , {1}", totalBytesReceived, totalBytesToReceive);
                    request.AddRange(totalBytesReceived, totalBytesToReceive);
                }
                using (var response = request.GetResponse())
                {
                    _log.AddInformation("Received response. ContentLength={0} , ContentType={1}", response.ContentLength, response.ContentType);
                    if (totalBytesToReceive == 0)
                    {
                        totalBytesToReceive = response.ContentLength;
                    }
                    using (var responseStream = response.GetResponseStream())
                    {
                        _log.AddInformation("Beginning read of response stream.");
                        var buffer = new byte[4096];
                        int bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                        while (bytesRead > 0)
                        {
                            fileStream.Write(buffer, 0, bytesRead);
                            totalBytesReceived += bytesRead;
                            bytesRead = responseStream.Read(buffer, 0, buffer.Length);
                        }
                        _log.AddInformation("Finished read of response stream.");
                    }
                }
                _log.AddInformation("Finished downloading file.");
                isFinished = true;
            }
            catch (Exception ex)
            {
                _log.AddInformation("Response raised exception ({0}). {1}", ex.GetType(), ex.Message);
            }
        }
    }
}
Here is the log output from a corrupt download:
Request #1.
Received response. ContentLength=939302925 , ContentType=application/zip
Beginning read of response stream.
Response raised exception (System.Net.WebException). The operation has timed out.
Request #2.
Request resuming with range: 939295833 , 939302925
Received response. ContentLength=7092 , ContentType=application/zip
Beginning read of response stream.
Finished read of response stream.
Finished downloading file.
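A side note on the numbers in this log, independent of the missing chunks: HTTP byte ranges are inclusive at both ends, so a range of 939295833 to 939302925 asks for one byte past the end of a 939302925-byte file. The server here appears to have clamped it, since it reported ContentLength=7092. A resumed request should also come back as 206 Partial Content; a server that ignores the Range header would answer 200 with the whole file, which would be appended to the partial data. The arithmetic can be sketched as follows, where RangeMath, ResumeRange and ExpectedLength are hypothetical helper names introduced only for illustration. This does not explain the corruption itself.

```csharp
using System;

static class RangeMath
{
    // HTTP byte ranges are inclusive at both ends, so the last valid
    // offset in a file of totalLength bytes is totalLength - 1.
    public static (long First, long Last) ResumeRange(long bytesReceived, long totalLength)
    {
        if (bytesReceived < 0 || bytesReceived >= totalLength)
            throw new ArgumentOutOfRangeException(nameof(bytesReceived));
        return (bytesReceived, totalLength - 1);
    }

    // Number of bytes a server should send for an inclusive range.
    public static long ExpectedLength(long first, long last)
    {
        return last - first + 1;
    }
}
```

With the values from the log, ResumeRange(939295833, 939302925) yields the range 939295833 to 939302924, and ExpectedLength for that range is 7092, matching the reported ContentLength.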
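Answer 0 (score: 1)
This is the method I usually use, and so far it has not failed me for the same kind of load you need. Try changing your code to use mine and see if it helps.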
if (!Directory.Exists(localFolder))
{
    Directory.CreateDirectory(localFolder);
}
try
{
    // Note: Path.Combine is meant for file-system paths; on Windows it only
    // yields a valid URL here when uri already ends with '/'.
    HttpWebRequest httpRequest = (HttpWebRequest)WebRequest.Create(Path.Combine(uri, filename));
    httpRequest.Method = "GET";
    // if the URI doesn't exist, exception gets thrown here...
    using (HttpWebResponse httpResponse = (HttpWebResponse)httpRequest.GetResponse())
    {
        using (Stream responseStream = httpResponse.GetResponseStream())
        {
            using (FileStream localFileStream =
                new FileStream(Path.Combine(localFolder, filename), FileMode.Create))
            {
                var buffer = new byte[4096];
                long totalBytesRead = 0;
                int bytesRead;
                // Read returns 0 once the end of the response stream is reached.
                while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    totalBytesRead += bytesRead;
                    localFileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
}
catch (Exception)
{
    // Rethrown unchanged; add logging or cleanup here if needed.
    throw;
}
Answer 1 (score: 0)
You should change the timeout settings. There seem to be two possible timeout issues:
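Which two timeouts the answer has in mind is not stated. As an assumption, HttpWebRequest does expose two separate timeout properties: Timeout, which applies to GetResponse and GetRequestStream and defaults to 100,000 ms, and ReadWriteTimeout, which applies to each read or write on the request and response streams and defaults to 300,000 ms. In the question's log the exception is raised after "Beginning read of response stream.", which would point at the stream timeout, though that is an inference. A minimal sketch that raises both, with TimeoutSetup as a hypothetical helper name:

```csharp
using System;
using System.Net;

static class TimeoutSetup
{
    // Hypothetical helper: creates a request with both timeouts set to the same value.
    public static HttpWebRequest Create(string uri, TimeSpan timeout)
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        int milliseconds = checked((int)timeout.TotalMilliseconds);

        // Applies to GetResponse and GetRequestStream.
        request.Timeout = milliseconds;

        // Applies to each Read or Write on the request and response streams.
        request.ReadWriteTimeout = milliseconds;

        return request;
    }
}
```

For example, TimeoutSetup.Create(sourceUri, TimeSpan.FromMinutes(10)) would replace the WebRequest.Create call in the question. Raising the timeouts only hides a stalled connection for longer, so the retry logic is still worth keeping.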
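Answer 2 (score: 0)
The way you read the file through the buffer looks odd to me. Maybe the problem is that you do
while(bytesRead > 0)
If, for some reason, the stream returns no bytes at some point while the download is still unfinished, the code exits the loop and never comes back. You should get the Content-Length and increment a variable totalBytesReceived by bytesRead. Finally, change the loop to
while(totalBytesReceived < ContentLength)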
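One caveat when applying this: in .NET, Stream.Read returns 0 only when the end of the stream has been reached, so a loop conditioned solely on totalBytesReceived < ContentLength would spin forever if the stream ended early. A minimal sketch of a length-driven loop that treats an early end as an error instead; StreamCopy and CopyExactly are hypothetical names, and expectedLength is assumed to come from the response's ContentLength:

```csharp
using System;
using System.IO;

static class StreamCopy
{
    // Copies exactly expectedLength bytes from source to destination.
    // Throws if the source ends early instead of stopping silently.
    public static long CopyExactly(Stream source, Stream destination, long expectedLength)
    {
        var buffer = new byte[4096];
        long total = 0;
        while (total < expectedLength)
        {
            // Never ask for more than the remaining byte count.
            int toRead = (int)Math.Min(buffer.Length, expectedLength - total);
            int read = source.Read(buffer, 0, toRead);
            if (read == 0)
            {
                throw new EndOfStreamException(
                    $"Stream ended after {total} of {expectedLength} bytes.");
            }
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Throwing here lets an outer retry loop, like the one in the question, decide whether to resume from the last byte received.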
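Answer 3 (score: 0)
Allocating a buffer size larger than the expected file size.
byte[] byteBuffer = new byte[65536];
That way, if the file is 1 GiB in size, you allocate a 1 GiB buffer and then try to fill the whole buffer in one call. That fill may return fewer bytes, but you have still allocated the entire buffer. Note that the maximum length of a single array in .NET is a 32-bit number, which means that even if you recompile your program for 64-bit and actually have enough memory available, a single array still cannot be larger than that.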
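However the buffer advice above is read, the copy itself only needs a small fixed-size buffer that is reused for every read, so memory use stays constant no matter how large the file is. A minimal sketch, assuming .NET 4 or later where Stream.CopyTo(Stream, int) is available; ChunkedCopy and its 64 KiB default are hypothetical choices for illustration:

```csharp
using System.IO;

static class ChunkedCopy
{
    // Copies source to destination through one reusable buffer of
    // bufferSize bytes; memory use does not grow with the stream length.
    public static void Copy(Stream source, Stream destination, int bufferSize = 65536)
    {
        source.CopyTo(destination, bufferSize);
    }
}
```

In the question's code this would correspond to ChunkedCopy.Copy(responseStream, fileStream), at the cost of losing the running totalBytesReceived count that the retry logic depends on.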