I did this successfully about four to six months ago, but the site has changed since then. I can get the search results I need with HttpWebRequest; the problem is downloading the CSV file.
The download does not work. I use WebClient and pass along exactly the same cookies, but it still fails.
When I do this, the file contains the following:
    ..... <meta http-equiv="refresh" content="0; url='http://www.google.com/trends#content=1&geo=US-AL&q=snooker&cmpt=q&hl=en-AU'"> location.replace("http://www.google.com/trends#content\x3d1\x26geo\x3dUS-AL\x26q\x3dsnooker\x26cmpt\x3dq\x26hl\x3den-AU")
The code that downloads the file is as follows:
    public void downloadsheet(string url, string path)
    {
        try
        {
            using (WebClient client = new WebClient())
            {
                // Rebuild a Cookie header from the cookies visible to the WebBrowser control.
                string tmpCookieString = string.Empty;
                string[] array = webBrowser1.Document.Cookie.Split(';');
                for (int i = 0; i < array.Length; i++)
                {
                    string cookie = array[i];
                    string name = cookie.Split('=')[0];
                    string value = cookie.Substring(name.Length + 1);
                    //client.Headers.Add(name, value);
                    if (i < array.Length - 1)
                    {
                        tmpCookieString = tmpCookieString + name + "=" + value + ";";
                    }
                    else
                    {
                        tmpCookieString = tmpCookieString + name + "=" + value;
                    }
                }
                client.Headers.Add(HttpRequestHeader.Cookie, tmpCookieString);
                client.Headers.Add("Accept", "text/html, application/xhtml+xml, */*");
                client.Headers.Add("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)");
                client.Headers.Add("Accept-Language", "en-US");
                using (FileStream file = File.Create(path))
                {
                    byte[] bytes = client.DownloadData(url);
                    file.Write(bytes, 0, bytes.Length);
                }
            }
        }
        catch (Exception exp_DE)
        {
        }
    }
The URL used is:
http://www.google.com/trends/trendsReport?hl=en-AU&q=snooker&geo=US-AL&cmpt=q&content=1&export=2
Any help is much appreciated.
More information:
If I navigate to the link above with the WebBrowser control, it opens a dialog box.
Answer 0 (score: 1)
The problem is that, for security reasons, WebBrowser.Document.Cookie does not include HttpOnly cookies (namely SID and HSID).
Here is the solution:
    // Required using directives (place them at the top of the file):
    // using System; using System.IO; using System.Net;
    // using System.Runtime.InteropServices; using System.Text;

    [DllImport("wininet.dll", CharSet = CharSet.Auto, SetLastError = true)]
    static extern bool InternetGetCookieEx(string pchURL, string pchCookieName, StringBuilder pchCookieData, ref uint pcchCookieData, int dwFlags, IntPtr lpReserved);

    // Flag that asks WinINet to include HttpOnly cookies in the result.
    const int INTERNET_COOKIE_HTTPONLY = 0x00002000;

    // Returns the cookie string for the given URL, including HttpOnly cookies,
    // or null if the call fails or no cookies are found.
    private static string GetGlobalCookies(string uri)
    {
        uint datasize = 2048;
        StringBuilder cookieData = new StringBuilder((int)datasize);
        if (InternetGetCookieEx(uri, null, cookieData, ref datasize, INTERNET_COOKIE_HTTPONLY, IntPtr.Zero)
            && cookieData.Length > 0)
        {
            return cookieData.ToString();
        }
        else
        {
            return null;
        }
    }

    public void downloadsheet(string url, string path)
    {
        try
        {
            using (WebClient client = new WebClient())
            {
                string tmpCookieString = GetGlobalCookies(webBrowser1.Url.AbsoluteUri);
                // Only send a Cookie header when cookies were actually retrieved.
                if (!string.IsNullOrEmpty(tmpCookieString))
                {
                    client.Headers.Add(HttpRequestHeader.Cookie, tmpCookieString);
                }
                client.Headers.Add("Accept", "text/html, application/xhtml+xml, */*");
                client.Headers.Add("User-Agent", "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2)");
                client.Headers.Add("Accept-Language", "en-US");
                using (FileStream file = File.Create(path))
                {
                    byte[] bytes = client.DownloadData(url);
                    file.Write(bytes, 0, bytes.Length);
                }
            }
        }
        catch (Exception)
        {
            // Errors are silently ignored here; log or rethrow them in real code.
        }
    }
Of course, you need to be logged in to your account before calling InternetGetCookieEx.
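
The symptom described in the question (an HTML redirect page saved where a CSV file was expected) can be detected before anything is written to disk. The following is only a minimal sketch: the class DownloadCheck and the method LooksLikeHtml are hypothetical names, not part of any library, and the check assumes the response is UTF-8 or ASCII text whose first 512 bytes are enough to reveal a leading HTML tag.

```csharp
using System;
using System.Text;

public static class DownloadCheck
{
    // Hypothetical helper: returns true when the payload appears to be an HTML
    // page (for example a meta-refresh redirect) rather than CSV data.
    public static bool LooksLikeHtml(byte[] data)
    {
        if (data == null || data.Length == 0)
        {
            return false;
        }

        // Assumption: inspecting the first 512 bytes is enough to see a leading tag.
        int count = Math.Min(data.Length, 512);
        string head = Encoding.UTF8.GetString(data, 0, count)
            .TrimStart('\uFEFF', ' ', '\t', '\r', '\n');

        // CSV data does not normally start with '<'; an HTML page does.
        return head.StartsWith("<", StringComparison.Ordinal)
            && (head.IndexOf("<meta", StringComparison.OrdinalIgnoreCase) >= 0
                || head.IndexOf("<html", StringComparison.OrdinalIgnoreCase) >= 0
                || head.IndexOf("<script", StringComparison.OrdinalIgnoreCase) >= 0
                || head.IndexOf("<!doctype", StringComparison.OrdinalIgnoreCase) >= 0);
    }
}
```

In downloadsheet, calling DownloadCheck.LooksLikeHtml(bytes) right after client.DownloadData(url) would make it possible to skip writing the file and report that the session cookies were probably missing or expired.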