How do I download a (huge) text file using HttpWebRequest with credentials?

Asked: 2018-07-31 13:25:06

Tags: c# winforms download httpwebrequest

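I am trying to automate downloading a list of text files from a website. The manual process for downloading one text file is:

  1. Click the file name to open a popup window.
  2. The content is shown in the popup window. (I can download it as a string, but not with a StreamWriter, because I get an out-of-memory exception.)
  3. Right-click -> Save as.

I want to download this file using HttpWebRequest.

My code is as follows: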

string sTmpCookieString = GetGlobalCookies(webBrowser1.Url.AbsoluteUri);
HttpWebRequest fstRequest = (HttpWebRequest)WebRequest.Create(URL);
fstRequest.Method = "GET";
fstRequest.CookieContainer = new System.Net.CookieContainer();
fstRequest.CookieContainer.SetCookies(webBrowser1.Document.Url, sTmpCookieString);
HttpWebResponse fstResponse = (HttpWebResponse)fstRequest.GetResponse();
StreamReader sr = new StreamReader(fstResponse.GetResponseStream());
string sPageData = sr.ReadToEnd();
sr.Close();

string sViewState = ExtractInputHidden(sPageData, "__VIEWSTATE");
string sEventValidation = this.ExtractInputHidden(sPageData, "__EVENTVALIDATION");

string sUrl = URL;
HttpWebRequest hwrRequest = (HttpWebRequest)WebRequest.Create(sUrl);
hwrRequest.Method = "POST";
hwrRequest.CookieContainer = new System.Net.CookieContainer();

string sPostData = "__EVENTTARGET=&__EVENTARGUMENT=&__VIEWSTATE=" + sViewState + "&__EVENTVALIDATION=" + sEventValidation + "&Name=test" + "&Button1=Button";

byte[] bByteArray = Encoding.UTF8.GetBytes(sPostData);
hwrRequest.ContentType = "text/plain";
hwrRequest.CookieContainer.SetCookies(webBrowser1.Document.Url, sTmpCookieString);
hwrRequest.ContentLength = bByteArray.Length;

Stream sDataStream = hwrRequest.GetRequestStream();
sDataStream.Write(bByteArray, 0, bByteArray.Length);
sDataStream.Close();
using (WebResponse response = hwrRequest.GetResponse())
{
    using (sDataStream = response.GetResponseStream())
    {
        StreamReader reader = new StreamReader(sDataStream);
        {
            string sResponseFromServer = reader.ReadToEnd();
            FileStream fs = File.Open(path, FileMode.OpenOrCreate, FileAccess.Write);
            Byte[] info = new System.Text.UTF8Encoding(true).GetBytes(sResponseFromServer);
            fs.Write(info, 0, info.Length);
            fs.Close();
        }
    }
}

...and I keep getting HTML back, like this:

<!DOCTYPE html>
<html>

<head>    

<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />

<meta http-equiv="Content-Type" content="text/html; charset=windows-1252" />
<meta content="Microsoft Visual Studio 7.0" name="GENERATOR" />
...

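I think my question may be unclear in places. If someone points out where, I will try to explain in more detail.

Any help would be appreciated.

2 answers:

Answer 0: (score: 1)

You are trying to read the entire response into memory at once: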

string sResponseFromServer = reader.ReadToEnd();

Instead, consider copying the response stream straight to the file, something like this:

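using (sDataStream = response.GetResponseStream())
// FileMode.Create truncates an existing file, so no stale bytes are left behind.
using (FileStream fs = File.Open(path, FileMode.Create, FileAccess.Write))
{
    // Copy in chunks instead of buffering the whole response in memory.
    sDataStream.CopyTo(fs, 10000);
}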

The second argument is the buffer size in bytes; you can set it to any reasonable value.
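To try the chunked copy in isolation, a minimal sketch might look like the following. The class name `StreamCopyDemo` and the method `SaveToFile` are hypothetical, and any readable `Stream` (for example a `MemoryStream`) stands in for the response stream here; only `Stream.CopyTo` and `FileStream` are real framework APIs.

```csharp
using System;
using System.IO;

public static class StreamCopyDemo
{
    // Hypothetical helper (not from the original post): copies any readable
    // stream to a file in fixed-size chunks, so only one buffer's worth of
    // data is held in memory at a time.
    public static long SaveToFile(Stream source, string path, int bufferSize = 10000)
    {
        // FileMode.Create truncates an existing file before writing.
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            source.CopyTo(fs, bufferSize);
            return fs.Length; // number of bytes now in the file
        }
    }
}
```

With the question's variables, this would be called as `StreamCopyDemo.SaveToFile(response.GetResponseStream(), path);`. Memory use stays roughly constant regardless of how large the download is.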

Answer 1: (score: 1)

Download using the asynchronous version of WebRequest, WebRequest.GetResponseAsync().

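This replaces the part starting at your using (WebResponse response = hwrRequest.GetResponse()) { }; block.
The rest of your code is mostly fine.

Adjust the buffer size used to download/store the file as needed (here, 132072 bytes).
Do not shrink it without a reason.

Create the destination file with File.Create(), which by default creates a new file or overwrites an existing one, and opens it with FileShare.None.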

using (HttpWebResponse httpResponse = (HttpWebResponse)await httpRequest.GetResponseAsync())
using (Stream ResponseStream = httpResponse.GetResponseStream())
{
    if (httpResponse.StatusCode == HttpStatusCode.OK)
    {
        try
        {
            int buffersize = 132072;
            using (FileStream fileStream = File.Create(["YourFileName"], buffersize, FileOptions.Asynchronous))
            {
                int read;
                byte[] buffer = new byte[buffersize];
                while ((read = await ResponseStream.ReadAsync(buffer, 0, buffer.Length)) > 0)
                {
                    await fileStream.WriteAsync(buffer, 0, read);
                }
            };
        }

        catch (DirectoryNotFoundException dnf_ex)
        {
            throw;  //Log, store&notify. Your usual handling.
        }
        catch (PathTooLongException ptl_ex)
        {
            throw;  //Same
        }
        catch (IOException io_ex)
        {
            throw;  //Same
        }
    }
};
return ["YourFileName"];
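The read/write loop above can also be exercised without a network connection. The following is a minimal sketch under stated assumptions: the class `AsyncCopyDemo` and the method `CopyToFileAsync` are hypothetical names, and the source is any readable `Stream` rather than an actual `HttpWebResponse` stream. The buffer size default simply mirrors the value used in the answer.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class AsyncCopyDemo
{
    // Hypothetical helper (not from the original answer): the same
    // ReadAsync/WriteAsync loop, taking any readable stream as its source.
    public static async Task<long> CopyToFileAsync(Stream source, string path, int bufferSize = 132072)
    {
        long total = 0;
        // File.Create overwrites an existing file and opens it with FileShare.None.
        using (FileStream fileStream = File.Create(path, bufferSize, FileOptions.Asynchronous))
        {
            byte[] buffer = new byte[bufferSize];
            int read;
            // ReadAsync returns 0 at end of stream, which ends the loop.
            while ((read = await source.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                // Write only the bytes actually read in this pass.
                await fileStream.WriteAsync(buffer, 0, read);
                total += read;
            }
        }
        return total; // total bytes written
    }
}
```

Inside an async method, this could be used as `long bytes = await AsyncCopyDemo.CopyToFileAsync(httpResponse.GetResponseStream(), path);`, where `path` is whatever destination file name you choose.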