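I am trying to automatically download a list of text files from a website. The process for downloading a text file is as follows:
I want to download the file using HttpWebRequest.
My code is as follows: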
string sTmpCookieString = GetGlobalCookies(webBrowser1.Url.AbsoluteUri);
HttpWebRequest fstRequest = (HttpWebRequest)WebRequest.Create(URL);
fstRequest.Method = "GET";
fstRequest.CookieContainer = new System.Net.CookieContainer();
fstRequest.CookieContainer.SetCookies(webBrowser1.Document.Url, sTmpCookieString);
HttpWebResponse fstResponse = (HttpWebResponse)fstRequest.GetResponse();
StreamReader sr = new StreamReader(fstResponse.GetResponseStream());
string sPageData = sr.ReadToEnd();
sr.Close();
string sViewState = ExtractInputHidden(sPageData, "__VIEWSTATE");
string sEventValidation = this.ExtractInputHidden(sPageData, "__EVENTVALIDATION");
string sUrl = URL;
HttpWebRequest hwrRequest = (HttpWebRequest)WebRequest.Create(sUrl);
hwrRequest.Method = "POST";
hwrRequest.CookieContainer = new System.Net.CookieContainer();
string sPostData = "__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=" + sViewState + "&__EVENTVALIDATION=" + sEventValidation + "&Name=test" + "&Button1=Button";
byte[] bByteArray = Encoding.UTF8.GetBytes(sPostData);
hwrRequest.ContentType = "text/plain";
hwrRequest.CookieContainer.SetCookies(webBrowser1.Document.Url, sTmpCookieString);
hwrRequest.ContentLength = bByteArray.Length;
Stream sDataStream = hwrRequest.GetRequestStream();
sDataStream.Write(bByteArray, 0, bByteArray.Length);
sDataStream.Close();
using (WebResponse response = hwrRequest.GetResponse())
{
    using (sDataStream = response.GetResponseStream())
    {
        StreamReader reader = new StreamReader(sDataStream);
        {
            string sResponseFromServer = reader.ReadToEnd();
            FileStream fs = File.Open(path, FileMode.OpenOrCreate, FileAccess.Write);
            Byte[] info = new System.Text.UTF8Encoding(true).GetBytes(sResponseFromServer);
            fs.Write(info, 0, info.Length);
            fs.Close();
        }
    }
}
...but instead of the file I keep getting HTML back, like this:
<!DOCTYPE html>
<html>
<head>
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252" />
<meta content="Microsoft Visual Studio 7.0" name="GENERATOR" />
...
I suspect my question may be unclear in places. If anyone points out where, I will try to explain in more detail.
Any help would be appreciated.
Answer 0 (score: 1)
You are trying to read the entire response at once:
string sResponseFromServer = reader.ReadToEnd();
Instead, consider using something like this:
using (sDataStream = response.GetResponseStream())
{
    FileStream fs = File.Open(path, FileMode.OpenOrCreate, FileAccess.Write);
    sDataStream.CopyTo(fs, 10000);
    fs.Close();
}
The second parameter is the buffer size in bytes; you can set it to any reasonable value.
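To try the CopyTo approach without a live server, here is a minimal sketch that uses a MemoryStream as a stand-in for the response stream. The helper name SaveStreamToFile, the class name and the temp-file name are made up for this illustration. It also assumes you want FileMode.Create rather than OpenOrCreate: OpenOrCreate does not truncate an existing file, so a shorter download written over a longer old file would leave stale bytes at the end.

```csharp
using System;
using System.IO;
using System.Text;

static class CopyToSketch
{
    // Hypothetical helper: streams `source` into a file, bufferSize bytes at a time,
    // without ever holding the whole payload in a string.
    public static void SaveStreamToFile(Stream source, string path, int bufferSize)
    {
        // FileMode.Create truncates any existing file (assumption: overwrite is wanted).
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(fs, bufferSize);
        }
    }

    static void Main()
    {
        // Stand-in for response.GetResponseStream(); no network involved.
        byte[] payload = Encoding.UTF8.GetBytes("line 1\nline 2\n");
        string path = Path.Combine(Path.GetTempPath(), "copyto-sketch.txt");

        using (var source = new MemoryStream(payload))
        {
            SaveStreamToFile(source, path, 10000);
        }

        // The file on disk should have exactly as many bytes as the payload.
        Console.WriteLine(new FileInfo(path).Length == payload.Length);
    }
}
```

In the real code, the MemoryStream would be replaced by the stream returned from response.GetResponseStream().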
Answer 1 (score: 1)
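Here is a WebRequest download that uses the asynchronous version, WebRequest.GetResponseAsync(), starting from your using (WebResponse response = hwrRequest.GetResponse()) { }; block.
The rest of your code is mostly fine.
Adjust the buffer size used for downloading and storing the file as needed (here, 132072 bytes).
Don't make it smaller without a reason.
The destination file is created with File.Create(), which defaults to create-new-or-overwrite and FileShare.None.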
using (HttpWebResponse httpResponse = (HttpWebResponse)await httpRequest.GetResponseAsync())
using (Stream ResponseStream = httpResponse.GetResponseStream())
{
    if (httpResponse.StatusCode == HttpStatusCode.OK)
    {
        try
        {
            int buffersize = 132072;
            using (FileStream fileStream = File.Create(["YourFileName"], buffersize, FileOptions.Asynchronous))
            {
                int read;
                byte[] buffer = new byte[buffersize];
                // Read a chunk, write exactly the bytes that were read, repeat until the stream ends.
                while ((read = await ResponseStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await fileStream.WriteAsync(buffer, 0, read);
                }
            };
        }
        catch (DirectoryNotFoundException dnf_ex)
        {
            throw; //Log, store & notify. Your usual handling.
        }
        catch (PathTooLongException ptl_ex)
        {
            throw; //Same
        }
        catch (IOException io_ex)
        {
            throw; //Same
        }
    }
};
return ["YourFileName"];
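If you want to check the read/write loop in isolation, the following is a rough, self-contained sketch of the same pattern with a MemoryStream standing in for the response stream. The method name CopyToFileAsync, the class name and the temp-file name are invented for this example, and the async Main assumes C# 7.1 or later.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

static class AsyncCopySketch
{
    // Hypothetical helper: same loop as above, returning the number of bytes written.
    public static async Task<long> CopyToFileAsync(Stream source, string path, int bufferSize)
    {
        long total = 0;
        // File.Create overwrites an existing file and opens it with FileShare.None.
        using (FileStream fileStream = File.Create(path, bufferSize, FileOptions.Asynchronous))
        {
            byte[] buffer = new byte[bufferSize];
            int read;
            // ReadAsync returns 0 at end of stream, which ends the loop.
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the bytes actually read, not the whole buffer.
                await fileStream.WriteAsync(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    static async Task Main()
    {
        // Stand-in payload larger than one buffer, so the loop runs several times.
        byte[] payload = new byte[300000];
        new Random(1).NextBytes(payload);
        string path = Path.Combine(Path.GetTempPath(), "async-copy-sketch.bin");

        using (var source = new MemoryStream(payload))
        {
            long written = await CopyToFileAsync(source, path, 132072);
            Console.WriteLine(written == payload.Length);
        }
    }
}
```

Returning the byte count is just a convenience for verifying the copy; the answer's version returns the file name instead.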