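Downloading the company list from NASDAQ with a browser works fine.
URL: https://www.nasdaq.com/screening/companies-by-industry.aspx?exchange=NYSE&render=download
But when I try to do the same thing from PHP, the request fails after a while with the error "failed to open HTTP stream".
I found another question on the same topic: HttpWebRequest Unable to download data from nasdaq.com but able from browsers
I am not sure how to implement this in PHP, though. I tried file_get_contents and curl with different sets of headers, but could not get it to work. I tested the same headers as in the similar question, as well as the headers my browser sends when the download succeeds.
Can anyone give me an example of how to make this work in PHP? Or are the headers not the problem at all?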
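Answer 0 (score: 0)
You can make the request with curl, either through shell_exec or through the regular PHP curl library. I am comfortable with shell_exec, which is why the code is shown that way.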
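$url = 'https://www.nasdaq.com/screening/companies-by-industry.aspx?exchange=NYSE&render=download';
// escapeshellarg() quotes the URL so the shell does not interpret "&" or "?".
// With -O the data is written to a file in the current working directory, so $output is normally empty.
$output = shell_exec('curl -J -O -L -k -s ' . escapeshellarg($url));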
The curl options used are described below; for more details on them, run man curl:
-J, --remote-header-name
(HTTP) This option tells the -O, --remote-name option to use the server-specified Content-Disposition filename instead of extracting a filename from the URL.
There's no attempt to decode %-sequences (yet) in the provided file name, so this option may provide you with rather unexpected file names.
-k, --insecure
(SSL) This option explicitly allows curl to perform "insecure" SSL connections and transfers. All SSL connections are attempted to be made secure by using the CA certificate bundle installed by default. This makes all connections considered "insecure" fail unless -k, --insecure is used.
See this online resource for further details: http://curl.haxx.se/docs/sslcerts.html
-O, --remote-name
Write output to a local file named like the remote file we get. (Only the file part of the remote file is used, the path is cut off.)
The remote file name to use for saving is extracted from the given URL, nothing else.
Consequentially, the file will be saved in the current working directory. If you want the file saved in a different directory, make sure you change current working directory before you invoke curl with the -O, --remote-name flag!
There is no URL decoding done on the file name. If it has %20 or other URL encoded parts of the name, they will end up as-is as file name.
You may use this option as many times as the number of URLs you have.
-L, --location
(HTTP/HTTPS) If the server reports that the requested page has moved to a different location (indicated with a Location: header and a 3XX response code), this option will make
-s, --silent
Silent or quiet mode. Don't show progress meter or error messages. Makes Curl mute. It will still output the data you ask for, potentially even to the terminal/stdout unless you redirect it.
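
The answer also mentions the regular PHP curl library as an alternative to shell_exec. Below is a minimal sketch of that route, not a tested fix for nasdaq.com. The function names buildDownloadOptions and downloadTo, the header values, the 30-second timeout and the target file name are all assumptions made for illustration; whether the server accepts these particular headers is not confirmed by the thread. Unlike the -k flag above, the sketch leaves certificate verification switched on.

```php
<?php
// Hypothetical helper (not from the original answer): collects the cURL
// options that roughly correspond to `curl -L -s` plus browser-like headers.
function buildDownloadOptions(string $userAgent): array
{
    return [
        CURLOPT_RETURNTRANSFER => true, // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true, // follow 3XX redirects, like -L
        CURLOPT_MAXREDIRS      => 5,    // assumed limit, to avoid redirect loops
        CURLOPT_TIMEOUT        => 30,   // assumed timeout in seconds
        CURLOPT_USERAGENT      => $userAgent,
        CURLOPT_HTTPHEADER     => [     // assumed header values
            'Accept: text/csv,*/*;q=0.8',
            'Accept-Language: en-US,en;q=0.5',
        ],
    ];
}

// Hypothetical helper: fetches $url and writes the body to $target.
// Returns true only if the transfer succeeded with a status below 400
// and the file was written.
function downloadTo(string $url, string $target, array $options): bool
{
    $ch = curl_init($url);
    curl_setopt_array($ch, $options);

    $body   = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_HTTP_CODE);

    if ($body === false || $status >= 400) {
        // curl_error() is empty for HTTP-level failures, so log the status too.
        error_log("Download failed (HTTP $status): " . curl_error($ch));
        return false;
    }

    return file_put_contents($target, $body) !== false;
}

// Example usage (commented out so that loading this file makes no request):
// $url = 'https://www.nasdaq.com/screening/companies-by-industry.aspx?exchange=NYSE&render=download';
// $ok  = downloadTo($url, __DIR__ . '/companylist.csv',
//                   buildDownloadOptions('Mozilla/5.0 (X11; Linux x86_64)'));
```

Keeping the options in a separate function makes it easy to try different header sets, which is what the question was experimenting with, without touching the transfer code.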