I have been using the following code to fetch the HTML content of a web page:
$url = "http://mysmallwebpage.com/";
$html = file_get_contents($url);
However, file_get_contents cannot open URLs written in the following form, even though the same URLs open fine when I type them into a web browser's address bar:
www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school
blog.al.com/birmingham-news-commentary/2012/11/naked_art_gallery_gives_back_t.html
mysmallwebpage.com/
Can you tell me what to use in PHP to open the URLs above?
The following does not work:
$url = "www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
Answer 0 (score: 4)
Unless the URL has the http:// prefix, PHP will not know that you want a web page and will try the file system instead:
$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
if (FALSE === $html) {
    throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}
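The prefix rule can be sketched as a small helper that adds http:// only when the string has no scheme at all. The function name ensure_http_scheme is hypothetical (it is not part of PHP), and the regular expression is an assumption based on the usual scheme syntax (a letter followed by letters, digits, "+", "-" or "."):

```php
<?php
/**
 * Hypothetical helper, not a PHP built-in: prepend "http://" when the
 * given string does not start with a scheme such as "http://" or "https://".
 */
function ensure_http_scheme(string $url): string
{
    // A scheme is a letter followed by letters, digits, "+", "-" or ".", then "://".
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $url) === 1) {
        return $url; // already has a scheme, leave it untouched
    }
    return 'http://' . $url;
}

// The scheme-less form from the question gets the prefix; a complete URL is unchanged.
echo ensure_http_scheme('www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school'), "\n";
echo ensure_http_scheme('http://mysmallwebpage.com/'), "\n";
```

The result can then be passed to file_get_contents() as in the snippet above. Defaulting to http:// is a design choice for this sketch; a site that only serves https:// may need that scheme instead.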
During development, enable all errors, warnings and notices. PHP will usually tell you about problems like this one and where they occur.
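A minimal sketch of such a development setup, assuming the settings are changed at runtime rather than in php.ini. With them in place, the scheme-less call from the question should report a warning that PHP could not open a local file of that name (the exact wording depends on the PHP version and operating system):

```php
<?php
// Development-only settings: report every error level and print the messages.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// No scheme, so PHP treats the string as a relative file path.
// Assumption: no local file with this name exists on the machine.
$html = file_get_contents('www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school');

if ($html === false) {
    echo "file_get_contents() returned false\n";
}
```

On a production server display_errors is usually turned off and the messages are written to a log instead.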
Full example (Demo):
<?php
/**
* PHP file_get_contents can't get html code
* @link http://stackoverflow.com/q/16118385/367456
*/
header('Content-Type: text/plain');
$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
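if (FALSE === $html) {
    throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}
// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc = new DOMDocument();
@$doc->loadHTML($html); // @ silences warnings about malformed real-world markup
$xml = simplexml_import_dom($doc, 'SimpleXMLIterator');
$tree = new RecursiveTreeIterator($xml);
foreach($tree as $element => $line) {
    printf("%s <%s>\n", $tree->getPrefix(), $element);
}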
Output:
|- <head>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <title>
| |- <meta>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <script>
| |- <script>
| |- <script>
| |- <script>
| |- <script>
| |- <script>
| |- <link>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| \- <link>
\- <body>
|- <div>
|- <noscript>
| \- <div>
| \- <p>
| \- <a>
|- <script>
|- <div>
| |- <div>
| | |- <ul>
| | | |- <li>
| | | | \- <a>
| | | |- <li>
| | | | |- <a>
| | | | \- <span>
| | | |- <li>
| | | | |- <a>
| | | | \- <div>
| | | | |- <div>
| | | | | \- <div>
| | | | | \- <ul>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | \- <li>
| | | | | \- <a>
| | | | \- <span>
| | | |- <li>
| | | | |- <a>
| | | | \- <div>
| | | | |- <div>
| | | | | \- <div>
| | | | | \- <ul>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | \- <li>
| | | | | \- <a>
| | | | \- <span>
| | | |- <li>
| | | | \- <a>
| | | |- <li>
| | | | \- <a>
| | | \- <li>
| | | \- <a>
| | \- <ul>
| | \- <li>
| | \- <a>
| \- <div>
| |- <h1>
| | \- <a>
| |- <ul>
| | |- <li>
| | | \- <a>
| | \- <li>
| | \- <a>
| |- <input>
| |- <input>
| |- <input>
| |- <input>
| |- <input>
| |- <form>
| | |- <div>
| | | |- <input>
| | | |- <button>
| | | \- <div>
| | | \- <ul>
| | |- <input>
| | \- <input>
| \- <div>
| \- <a>
| \- <span>
|- <div>
| \- <div>
| |- <a>
| | \- <span>
| |- <ul>
| | |- <li>
| | | \- <a>
| | \- <li>
| | \- <a>
| \- <div>
| |- <div>
| | |- <div>
| | | \- <p>
| | |- <div>
| | \- <form>
| | |- <div>
| | | \- <h2>
| | |- <div>
| | | |- <div>
| | | | |- <a>
| | | | | \- <span>
| | | | \- <div>
| | | |- <input>
| | | |- <input>
| | | |- <input>
| | | |- <input>
| | | \- <p>
| | |- <div>
| | | \- <span>
| | |- <div>
| | | |- <label>
| | | |- <span>
| | | |- <span>
| | | \- <input>
| | |- <div>
| | | |- <label>
| | | |- <span>
| | | \- <input>
| | |- <div>
| | | |- <input>
| | | |- <label>
| | | |- <input>
| | | |- <label>
| | | |- <input>
| | | \- <label>
| | |- <hr>
| | |- <div>
| | | |- <label>
| | | |- <span>
| | | | |- <span>
| | | | \- <span>
| | | |- <span>
| | | \- <input>
| | |- <div>
| | | |- <label>
| | | |- <span>
| | | \- <input>
| | |- <div>
| | | |- <label>
| | | |- <span>
| | | \- <input>
| | |- <div>
| | | |- <label>
| | | |- <span>
| | | |- <input>
| | | |- <div>
| | | | |- <span>
| | | | \- <span>
| | | \- <div>
| | | |- <span>
| | | \- <span>
| | |- <hr>
| | |- <p>
| | | |- <a>
| | | \- <a>
| | |- <p>
| | | |- <input>
| | | |- <input>
| | | |- <input>
| | | |- <span>
| | | | \- <span>
| | | | \- <input>
| | | \- <span>
| | \- <div>
| | |- <input>
| | \- <label>
| \- <div>
| |- <div>
| | \- <p>
| \- <form>
| |- <div>
| | |- <label>
| | |- <input>
| | \- <span>
| |- <div>
| | |- <label>
| | |- <input>
| | \- <span>
| |- <div>
| | |- <input>
| | \- <label>
| |- <p>
| | |- <input>
| | |- <input>
| | |- <input>
| | |- <span>
| | | \- <span>
| | | \- <input>
| | \- <span>
| |- <hr>
| |- <p>
| | \- <a>
| \- <p>
| \- <a>
|- <div>
| \- <div>
| \- <div>
|- <script>
|- <script>
|- <div>
| \- <div>
| \- <p>
| |- <strong>
| |- <strong>
| \- <strong>
|- <div>
| |- <ul>
| | |- <li>
| | | |- <a>
| | | \- <span>
| | \- <li>
| | \- <a>
| \- <div>
| \- <div>
| |- <div>
| | |- <div>
| | | |- <div>
| | | | |- <input>
| | | | |- <div>
| | | | | \- <a>
| | | | | \- <span>
| | | | \- <div>
| | | | |- <h3>
| | | | \- <p>
| | | |- <div>
| | | | |- <h1>
| | | | \- <p>
| | | | \- <a>
| | | |- <div>
| | | | |- <div>
| | | | | \- <a>
| | | | | \- <img>
| | | | \- <div>
| | | | \- <a>
| | | |- <div>
| | | | |- <a>
| | | | | \- <img>
| | | | |- <a>
| | | | | \- <img>
| | | | |- <a>
| | | | | \- <img>
| | | | |- <a>
| | | | | \- <img>
| | | | \- <a>
| | | | \- <img>
| | | |- <div>
| | | | \- <div>
| | | | |- <br>
| | | | \- <p>
| | | | |- <br>
| | | | \- <a>
| | | |- <div>
| | | | |- <h3>
| | | | | \- <span>
| | | | |- <h4>
| | | | \- <table>
| | | | |- <tr>
| | | | | |- <th>
| | | | | |- <th>
| | | | | \- <th>
| | | | |- <tr>
| | | | | |- <td>
| | | | | |- <td>
| | | | | | |- <span>
| | | | | | |- <span>
| | | | | | \- <span>
| | | | | \- <td>
| | | | | |- <span>
| | | | | |- <span>
| | | | | \- <span>
| | | | \- <tr>
| | | | |- <td>
| | | | |- <td>
| | | | | |- <span>
| | | | | |- <span>
| | | | | \- <span>
| | | | \- <td>
| | | | |- <span>
| | | | |- <span>
| | | | \- <span>
| | | |- <div>
| | | | |- <div>
| | | | | |- <h3>
| | | | | \- <ul>
| | | | | |- <div>
| | | | | | |- <p>
| | | | | | | \- <span>
| | | | | | \- <span>
| | | | | |- <li>
| | | | | | \- <span>
| | | | | \- <li>
| | | | | \- <span>
| | | | \- <div>
| | | | |- <h3>
| | | | \- <ul>
| | | | |- <li>
| | | | | \- <a>
| | | | \- <li>
| | | | \- <a>
| | | |- <div>
| | | | |- <ul>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | | \- <strong>
| | | | | | |- <span>
| | | | | | \- <span>
| | | | | \- <li>
| | | | | |- <div>
| | | | | | \- <a>
| | | | | \- <div>
| | | | \- <div>
| | | | |- <div>
| | | | | |- <div>
| | | | | | |- <h3>
| | | | | | |- <p>
| | | | | | |- <h3>
| | | | | | \- <p>
| | | | | \- <div>
| | | | | |- <h3>
| | | | | \- <p>
| | | | |- <div>
| | | | | \- <div>
| | | | | |- <h3>
| | | | | \- <div>
| | | | | |- <a>
| | | | | |- <a>
| | | | | \- <a>
| | | | \- <div>
| | | | \- <ul>
| | | | |- <li>
| | | | \- <li>
| | | |- <div>
| | | | |- <h3>
| | | | \- <div>
| | | | \- <ul>
| | | \- <script>
| | |- <div>
| | | |- <div>
| | | | |- <div>
| | | | | |- <div>
| | | | | |- <div>
| | | | | | |- <span>
| | | | | | |- <span>
| | | | | | |- <a>
| | | | | | | \- <span>
| | | | | | \- <span>
| | | | | \- <div>
| | | | \- <div>
| | | | |- <div>
| | | | |- <div>
| | | | | \- <div>
| | | | | \- <div>
| | | | \- <div>
| | | | |- <div>
| | | | | \- <form>
| | | | | |- <input>
| | | | | |- <input>
| | | | | |- <input>
| | | | | |- <input>
| | | | | |- <input>
| | | | | \- <span>
| | | | | \- <span>
| | | | | \- <input>
| | | | \- <div>
| | | | \- <div>
| | | | |- <input>
| | | | |- <input>
| | | | |- <input>
| | | | \- <a>
| | | | \- <span>
| | | |- <div>
| | | | \- <p>
| | | | |- <span>
| | | | \- <span>
| | | | \- <span>
| | | |- <script>
| | | |- <div>
| | | | |- <div>
| | | | | |- <h2>
| | | | | |- <ul>
| | | | | | |- <li>
| | | | | | | \- <span>
| | | | | | | \- <a>
| | | | | | \- <li>
| | | | | |- <div>
| | | | | | \- <div>
| | | | | | |- <span>
| | | | | | |- <div>
| | | | | | | \- <a>
| | | | | | | \- <img>
| | | | | | |- <div>
| | | | | | | \- <a>
| | | | | | | \- <img>
| | | | | | \- <div>
| | | | | | \- <a>
| | | | | | \- <span>
| | | | | \- <ul>
| | | | | |- <li>
| | | | | | |- <input>
| | | | | | \- <a>
| | | | | | \- <span>
| | | | | \- <li>
| | | | | \- <a>
| | | | | \- <span>
| | | | |- <div>
| | | | | |- <h2>
| | | | | \- <div>
| | | | | |- <div>
| | | | | | \- <a>
| | | | | | \- <img>
| | | | | \- <ul>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | \- <li>
| | | | | \- <a>
| | | | | \- <span>
| | | | \- <div>
| | | | |- <h2>
| | | | |- <ul>
| | | | | |- <li>
| | | | | | |- <input>
| | | | | | \- <a>
| | | | | | |- <span>
| | | | | | \- <span>
| | | | | |- <li>
| | | | | | \- <a>
| | | | | | \- <span>
| | | | | \- <li>
| | | | | \- <a>
| | | | | |- <span>
| | | | | \- <span>
| | | | |- <hr>
| | | | |- <ul>
| | | | | |- <li>
| | | | | | |- <script>
| | | | | | \- <a>
| | | | | | |- <span>
| | | | | | \- <span>
| | | | | |- <li>
| | | | | | |- <script>
| | | | | | \- <a>
| | | | | \- <li>
| | | | | |- <like>
| | | | | \- <script>
| | | | |- <hr>
| | | | \- <div>
| | | | \- <ul>
| | | | |- <li>
| | | | \- <li>
| | | | \- <a>
| | | \- <div>
| | | |- <div>
| | | | |- <div>
| | | | |- <div>
| | | | | |- <span>
| | | | | |- <span>
| | | | | |- <a>
| | | | | | \- <span>
| | | | | \- <span>
| | | | \- <div>
| | | \- <div>
| | | |- <div>
| | | |- <div>
| | | | \- <div>
| | | | \- <div>
| | | \- <div>
| | | |- <div>
| | | | \- <form>
| | | | |- <input>
| | | | |- <input>
| | | | |- <input>
| | | | |- <input>
| | | | |- <input>
| | | | \- <span>
| | | | \- <span>
| | | | \- <input>
| | | \- <div>
| | | \- <div>
| | | |- <input>
| | | |- <input>
| | | |- <input>
| | | \- <a>
| | | \- <span>
| | |- <script>
| | |- <div>
| | | \- <div>
| | | |- <span>
| | | \- <a>
| | | \- <span>
| | \- <script>
| |- <div>
| \- <div>
| |- <div>
| | |- <div>
| | | |- <h2>
| | | \- <a>
| | | \- <span>
| | \- <form>
| | |- <div>
| | | |- <div>
| | | |- <div>
| | | |- <input>
| | | |- <div>
| | | |- <div>
| | | | |- <label>
| | | | \- <select>
| | | | |- <option>
| | | | |- <optgroup>
| | | | \- <optgroup>
| | | \- <div>
| | | |- <label>
| | | \- <input>
| | \- <div>
| | |- <button>
| | \- <div>
| | \- <span>
| \- <div>
| \- <div>
| |- <a>
| | \- <span>
| |- <h2>
| \- <a>
|- <div>
|- <div>
| |- <p>
| |- <ul>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | |- <li>
| | | \- <a>
| | \- <li>
| | \- <a>
| |- <p>
| | |- <a>
| | | \- <span>
| | |- <a>
| | | \- <span>
| | |- <a>
| | \- <span>
| \- <script>
|- <script>
|- <script>
|- <div>
| \- <div>
| |- <div>
| | \- <h2>
| \- <div>
| \- <div>
| \- <a>
| \- <span>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <noscript>
| \- <iframe>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
|- <script>
\- <script>
Answer 1 (score: 2)
Put http:// in front of
"www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school"
Answer 2 (score: 1)
$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
It works with this URL.
You need to use http in the URL.
Answer 3 (score: 0)
Try this, it works fine:
$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
echo "<pre>";
print_r($html);
exit;
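When this output is viewed in a browser, the fetched markup is rendered rather than shown as source. If the goal is to inspect the raw HTML, one option is to escape it first. The sketch below uses a short stand-in string for $html (an assumption, so that it runs without network access):

```php
<?php
// Stand-in for the result of file_get_contents($url); an assumption for this sketch.
$html = '<html><body><p>Hello & welcome</p></body></html>';

// Convert <, >, & and quotes to entities so the browser prints the tags as text.
$escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

echo "<pre>", $escaped, "</pre>\n";
```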