POST using C#

Time: 2016-05-20 05:55:27

Tags: c# http post web-scraping httpwebrequest

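I am trying to access data on a website using some C# code. In a browser, the user goes to http://www.hkexnews.hk/sdw/search/search_sdw.asp, enters a date (17 May 2016) and a stock code ("00001"), and is taken to a table containing the relevant data.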

I used a Firefox add-on (LiveHTTPHeaders) to see that this process produces the following POST request:

http://www.hkexnews.hk/sdw/search/search_sdw.asp

POST /sdw/search/search_sdw.asp HTTP/1.1
Host: www.hkexnews.hk
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Referer: http://www.hkexnews.hk/sdw/search/search_sdw.asp
Cookie: ASPSESSIONIDCASQQTQQ=FOLELNMCLAOEPAAEMIICCPAN; TS0161f2e5=017038eb4956ab204b9c8ba8c23ae9307c7879ba5fccd4664528a1407b7fab58353e89b27052a3d4d4df3b4b47034b5a31085f2c11
Connection: keep-alive
Content-Type: application/x-www-form-urlencoded
Content-Length: 341
txt_today_d=20&txt_today_m=5&txt_today_y=2016&current_page=1&stock_market=HKEX&IsExist_Slt_Stock_Id=False&IsExist_Slt_Part_Id=False&rdo_SelectSortBy=Shareholding&sessionToken=1458.568&sel_ShareholdingDate_d=17&sel_ShareholdingDate_m=05&sel_ShareholdingDate_y=2016&txt_stock_code=00001&txt_stock_name=&txt_ParticipantID=&txt_Participant_name=

HTTP/1.1 200 OK
Cache-Control: private
Pragma: No-Cache
Content-Length: 300802
Content-Type: text/html
Expires: 0
X-Powered-By: ASP.NET
Date: Fri, 20 May 2016 04:58:57 GMT

My code:

string PostData = "txt_today_d=20&txt_today_m=5&txt_today_y=2016&current_page=1&stock_market=HKEX&IsExist_Slt_Stock_Id=False&IsExist_Slt_Part_Id=False&rdo_SelectSortBy=Shareholding&sessionToken=1458.568&sel_ShareholdingDate_d=17&sel_ShareholdingDate_m=05&sel_ShareholdingDate_y=2016&txt_stock_code=00001&txt_stock_name=&txt_ParticipantID=&txt_Participant_name=";

byte[] ByteArray = Encoding.UTF8.GetBytes(PostData);
HttpWebRequest Request = (HttpWebRequest)WebRequest.Create("http://www.hkexnews.hk/sdw/search/search_sdw.asp");
Uri Target = new Uri("http://www.hkexnews.hk/sdw/search/search_sdw.asp");
Request.Method = "POST";
Request.Host = "www.hkexnews.hk";
Request.UserAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0";
Request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
WebHeaderCollection Headers = Request.Headers;
Headers.Add("Accept-Language", "en-US,en;q=0.5");
Headers.Add("Accept-Encoding", "gzip, deflate");
Request.Referer = "http://www.hkexnews.hk/sdw/search/search_sdw.asp";
Request.CookieContainer = new CookieContainer();
Request.CookieContainer.Add(new Cookie("ASPSESSIONIDCASQQTQQ", "FOLELNMCLAOEPAAEMIICCPAN") { Domain = Target.Host });
Request.CookieContainer.Add(new Cookie("TS0161f2e5", "017038eb49a575136721aebc3f3ef14ec9c810d2e25d5e909be6fe49639ec38eec624237bf9759851675d091f3a4ed0ebc3d8bb3d2") { Domain = Target.Host });
Request.KeepAlive = true;
Request.ContentType = "Content-Type: application/x-www-form-urlencoded";
Request.ContentLength = ByteArray.Length;

Stream dataStream = Request.GetRequestStream();
// Write the data to the request stream.
dataStream.Write(ByteArray, 0, ByteArray.Length);
// Close the Stream object.
dataStream.Close();
// Get the response.
WebResponse response = Request.GetResponse();
// Display the status.
Console.WriteLine(((HttpWebResponse)response).StatusDescription);
// Get the stream containing content returned by the server.
dataStream = response.GetResponseStream();
// Open the stream using a StreamReader for easy access.
StreamReader reader = new StreamReader(dataStream);
// Read the content.
string responseFromServer = reader.ReadToEnd();
// Display the content.
Console.WriteLine(responseFromServer);
// Clean up the streams.
reader.Close();
dataStream.Close();
response.Close();

This code just seems to spit out the HTML of the original search screen, rather than taking me to the results.

Edit: I have also tried doing the same thing with the HttpClient class, as follows:

using (var client = new HttpClient())
{
client.BaseAddress = new Uri("http://www.hkexnews.hk/sdw/search/search_sdw.asp");
var content = new FormUrlEncodedContent(new[]
{
new KeyValuePair<string, string>("txt_today_d", "20"),
new KeyValuePair<string, string>("txt_today_m", "5"),
new KeyValuePair<string, string>("txt_today_y", "2016"),
new KeyValuePair<string, string>("current_page", "1"),
new KeyValuePair<string, string>("stock_market", "HKEX"),
new KeyValuePair<string, string>("IsExist_Slt_Stock_Id", "False"),
new KeyValuePair<string, string>("IsExist_Slt_Part_Id", "False"),
new KeyValuePair<string, string>("rdo_SelectSortBy", "Shareholding"),
new KeyValuePair<string, string>("sessionToken", "1458.568"),
new KeyValuePair<string, string>("sel_ShareholdingDate_d", "17"),
new KeyValuePair<string, string>("sel_ShareholdingDate_m", "05"),
new KeyValuePair<string, string>("sel_ShareholdingDate_y", "2016"),
new KeyValuePair<string, string>("txt_stock_code", "00001"),
new KeyValuePair<string, string>("txt_stock_name", ""),
new KeyValuePair<string, string>("txt_ParticipantID", ""),
new KeyValuePair<string, string>("txt_Participant_name", "")
});
var result = client.PostAsync("/sdw/search/search_sdw.asp", content).Result;
string resultContent = result.Content.ReadAsStringAsync().Result;
Console.WriteLine(resultContent);
}

This produces the same result (the HTML of the original form).

1 answer:

Answer 0 (score: 0)

Because there is an error in your request (possibly an incorrect cookie), your request was redirected back to the original page:

HTTP/1.1 302 Object moved
Cache-Control: private
Pragma: No-Cache
Content-Length: 135
Content-Type: text/html
Expires: 0
Location: search_sdw.asp
X-Powered-By: ASP.NET
Date: Fri, 20 May 2016 07:58:52 GMT

<head><title>Object moved</title></head>
<body><h1>Object Moved</h1>This object may be found <a HREF="search_sdw.asp">here</a>.</body>
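
One way to act on this is to stop replaying the cookies and token captured from the browser, since those belong to a session that has probably expired. The sketch below instead loads the form first so the server can issue a fresh session cookie, then posts with the same cookie container. It also turns off automatic redirects, so a 302 like the one above shows up as an error instead of silently returning the search form. Several things here are assumptions, not confirmed by the site: that `sessionToken` is a hidden `<input>` in the form HTML, that the `txt_today_*` fields carry the current date, and that a fresh cookie is enough for the server to accept the POST. `SdwSearch`, `ExtractInputValue` and `SearchAsync` are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Text.RegularExpressions;
using System.Threading.Tasks;

static class SdwSearch
{
    const string Url = "http://www.hkexnews.hk/sdw/search/search_sdw.asp";

    // Hypothetical helper: returns the value attribute of <input name="fieldName" ...>,
    // or null if no such input is found. Assumes name appears before value in the tag.
    public static string ExtractInputValue(string html, string fieldName)
    {
        string pattern = "<input[^>]*name=[\"']?" + Regex.Escape(fieldName)
                       + "\\b[\"']?[^>]*value=[\"']?([^\"'\\s>]*)";
        Match m = Regex.Match(html, pattern, RegexOptions.IgnoreCase);
        return m.Success ? m.Groups[1].Value : null;
    }

    public static async Task<string> SearchAsync(string day, string month, string year, string stockCode)
    {
        var handler = new HttpClientHandler
        {
            CookieContainer = new CookieContainer(), // stores cookies issued by the GET
            AllowAutoRedirect = false,               // surface a 302 instead of following it
            AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate
        };
        using (var client = new HttpClient(handler))
        {
            // Step 1: load the form so the server issues a fresh session cookie.
            string formHtml = await client.GetStringAsync(Url);

            // Assumption: the token is a hidden input; fall back to empty if it is not.
            string token = ExtractInputValue(formHtml, "sessionToken") ?? "";

            DateTime today = DateTime.Today; // assumption: txt_today_* is the current date
            var content = new FormUrlEncodedContent(new Dictionary<string, string>
            {
                { "txt_today_d", today.Day.ToString() },
                { "txt_today_m", today.Month.ToString() },
                { "txt_today_y", today.Year.ToString() },
                { "current_page", "1" },
                { "stock_market", "HKEX" },
                { "IsExist_Slt_Stock_Id", "False" },
                { "IsExist_Slt_Part_Id", "False" },
                { "rdo_SelectSortBy", "Shareholding" },
                { "sessionToken", token },
                { "sel_ShareholdingDate_d", day },
                { "sel_ShareholdingDate_m", month },
                { "sel_ShareholdingDate_y", year },
                { "txt_stock_code", stockCode },
                { "txt_stock_name", "" },
                { "txt_ParticipantID", "" },
                { "txt_Participant_name", "" }
            });

            // Step 2: post the form with the same cookies and a matching Referer.
            client.DefaultRequestHeaders.Referrer = new Uri(Url);
            HttpResponseMessage response = await client.PostAsync(Url, content);

            int status = (int)response.StatusCode;
            if (status >= 300 && status < 400)
                throw new InvalidOperationException(
                    "Request rejected, redirected to: " + response.Headers.Location);

            return await response.Content.ReadAsStringAsync();
        }
    }
}
```

A call such as `string html = SdwSearch.SearchAsync("17", "05", "2016", "00001").Result;` would then either return the response body or throw with the redirect target. Three details in the `HttpWebRequest` version of the question are also worth checking. The `ContentType` property expects only the media type (`application/x-www-form-urlencoded`), without the `Content-Type:` prefix. Sending `Accept-Encoding: gzip, deflate` without setting `AutomaticDecompression` can yield a compressed body that prints as garbage. The `TS0161f2e5` cookie value in the code differs from the one in the captured request.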