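I am trying to download a CSV file from this website. In any browser the download takes only about 2 seconds.
http://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=AMEX&render=download
I tried both HttpWebRequest and WebClient, but nasdaq.com does not seem to let the data through with either approach. I also tried going through Fiddler and nothing came back. I can only download this data with a browser.
I have tried changing the headers, the proxy, the security protocol, redirects, some cookies and many other settings, but I am still stuck on this problem.
If anyone has an idea how to make this work, please let me know. If you have a solution, please reply to this post. Thanks.
The code below is C# on .NET Framework 4.5+.
It can download from other websites, but not from nasdaq.com.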
using System;
using System.IO;
using System.Net;
using System.Text;

class Program
{
    static void Main(string[] args)
    {
        try
        {
            string testUrl = "https://www.nasdaq.com/screening/companies-by-name.aspx?letter=0&exchange=AMEX&render=download";
            HttpWebRequestTestDownload(testUrl);
        }
        catch (Exception ex)
        {
            Console.WriteLine(ex.Message);
        }
    }

    public static void HttpWebRequestTestDownload(string address)
    {
        // Example adapted from
        // https://msdn.microsoft.com/en-us/library/system.net.httpwebrequest.getresponse(v=vs.110).aspx

        // Configure TLS before the request is created.
        ServicePointManager.Expect100Continue = true;
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;
        // Accepts every server certificate; acceptable for debugging only.
        ServicePointManager.ServerCertificateValidationCallback = delegate { return true; };

        HttpWebRequest wReq = (HttpWebRequest)WebRequest.Create(address);
        wReq.KeepAlive = false;

        // I also tried the settings below and it still does not work
        //wReq.AllowAutoRedirect = true;
        //wReq.KeepAlive = false;
        //wReq.Timeout = 10 * 60 * 1000; // 10 minutes
        ////Accept header
        //wReq.Accept = "application/csv,application/json,text/csv,text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        ////Request format text/html. Will improve this if necessary. Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
        ////http://www.useragentstring.com/
        //wReq.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36";
        //wReq.ProtocolVersion = HttpVersion.Version11;
        //// wReq.Headers.Add("Accept-Language", "en_eg");
        //wReq.ServicePoint.Expect100Continue = false;
        ////Fixing invalid SSL problem
        //System.Net.ServicePointManager.ServerCertificateValidationCallback = delegate { return true; };
        ////Fixing "The underlying connection was closed: An unexpected error occurred on a send" for Framework 4.5 or higher
        //ServicePointManager.SecurityProtocol = SecurityProtocolType.Ssl3 | SecurityProtocolType.Tls | SecurityProtocolType.Tls11 | SecurityProtocolType.Tls12;
        //wReq.Headers.Add("Accept-Encoding", "gzip, deflate"); // Accept encoding

        // Set some reasonable limits on resources used by this request.
        wReq.MaximumAutomaticRedirections = 4;
        wReq.MaximumResponseHeadersLength = 4; // in kilobytes

        // Set credentials to use for this request.
        wReq.Credentials = CredentialCache.DefaultCredentials;

        // The using statements dispose the response, stream and reader even if reading throws.
        using (HttpWebResponse response = (HttpWebResponse)wReq.GetResponse())
        using (Stream receiveStream = response.GetResponseStream())
        using (StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8))
        {
            Console.WriteLine("Content length is {0}", response.ContentLength);
            Console.WriteLine("Content type is {0}", response.ContentType);
            Console.WriteLine("Response stream received.");
            Console.WriteLine(readStream.ReadToEnd());
        }
    }

    public static void WebClientTestDownload(string address)
    {
        using (WebClient client = new WebClient())
        {
            string reply = client.DownloadString(address);
        }
    }
}
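Answer 0 (score: 1)
I was able to solve the problem. Thanks, everyone, for the hints: I used Fiddler to capture the browser's network traffic and then sent the same headers from my code. It worked once the request carried all the headers this site requires.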
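The answer does not list the headers that were copied, so the following is only a minimal sketch of the approach, not the poster's actual code. The class and method names (`BrowserLikeDownload`, `CreateBrowserLikeRequest`, `DownloadText`) are hypothetical, and the header values are assumptions modeled on a typical Chrome request; replace them with the values shown in your own Fiddler capture.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;

static class BrowserLikeDownload
{
    // Hypothetical helper: builds a GET request that carries browser-style headers.
    // Every header value below is an assumption; copy the real ones from a Fiddler capture.
    public static HttpWebRequest CreateBrowserLikeRequest(string address)
    {
        // Use TLS 1.2 for HTTPS endpoints; set before the request is created.
        ServicePointManager.SecurityProtocol = SecurityProtocolType.Tls12;

        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(address);
        request.Method = "GET";

        // User-Agent and Accept are restricted headers and must be set through properties.
        request.UserAgent = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3325.162 Safari/537.36";
        request.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
        request.Headers.Add("Accept-Language", "en-US,en;q=0.9");

        // Advertises gzip/deflate support and transparently decompresses the response body.
        request.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;

        // Keeps any cookies the server sets during redirects, as a browser would.
        request.CookieContainer = new CookieContainer();
        return request;
    }

    // Sends the request and returns the response body as text.
    public static string DownloadText(string address)
    {
        HttpWebRequest request = CreateBrowserLikeRequest(address);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (StreamReader reader = new StreamReader(body, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling `BrowserLikeDownload.DownloadText(testUrl)` from `Main` would then return the CSV text if the server accepts these headers. If it still refuses, compare the outgoing request with the browser's request in Fiddler header by header and add whatever is missing.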