PHP file_get_contents can't get HTML code

Date: 2013-04-20 08:40:56

Tags: php html

I have been using the following code to fetch the HTML content of a web page:

$url = "http://mysmallwebpage.com/";
$html = file_get_contents($url);

However, file_get_contents fails to open URLs written in the following form, even though they open fine when I type them into a web browser's address bar:

www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school
blog.al.com/birmingham-news-commentary/2012/11/naked_art_gallery_gives_back_t.html
mysmallwebpage.com/

Can you tell me what to use in PHP to open the URLs above?

The following does not work:

$url = "www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);

4 answers:

Answer 0: (score: 4)

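Without the http:// prefix, PHP does not know that you want a web page, so it treats the string as a path and tries the file system instead. Add the prefix: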

$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
if (FALSE === $html) {
    throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}
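
If URLs arrive from user input or a list where the prefix may be missing, one option is to normalize them before fetching. The following is only a minimal sketch: the helper name ensureHttpScheme is hypothetical and not part of the original answer, and it assumes plain http:// is an acceptable default.

```php
<?php
// Hypothetical helper (an illustration, not from the original answer):
// prepend "http://" when the string has no scheme, so PHP uses its HTTP
// wrapper instead of looking for a local file with that name.
function ensureHttpScheme(string $url): string
{
    // Anything that already starts with "scheme://" is left untouched.
    if (preg_match('#^[a-z][a-z0-9+.-]*://#i', $url) === 1) {
        return $url;
    }
    // Also covers protocol-relative input such as "//example.com/".
    return 'http://' . ltrim($url, '/');
}

$url = ensureHttpScheme('www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school');
echo $url, "\n";
```

The result can then be passed to file_get_contents as in the snippet above; URLs that already carry http:// or https:// pass through unchanged.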

During development, enable all errors, warnings and notices. PHP will usually tell you about problems like this one and where they occur.
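
As a rough sketch of what that looks like in practice, the snippet below turns on full error reporting and then triggers the failure from the question with a URL that has no scheme. The host name www.example.invalid is a placeholder chosen for illustration, and the error handler is only there so the warning text can be captured and printed; it assumes no local file exists at that relative path.

```php
<?php
// Report everything while developing.
error_reporting(E_ALL);
ini_set('display_errors', '1');

// Collect warnings so they can be inspected; a plain script would simply
// see them printed because display_errors is on.
$warnings = [];
set_error_handler(function (int $errno, string $errstr) use (&$warnings): bool {
    $warnings[] = $errstr;
    return true; // mark the error as handled
});

// No scheme: PHP treats this as a relative file path, not a URL.
$result = file_get_contents('www.example.invalid/no-such-page');
restore_error_handler();

var_dump($result);          // false on failure
echo $warnings[0] ?? '', "\n"; // the warning names the path PHP tried to open
```

The warning text shows that PHP attempted to open a stream for a local file, which points directly at the missing prefix.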

Full example (Demo):

<?php
/**
 * PHP file_get_contents can't get html code
 * @link http://stackoverflow.com/q/16118385/367456
 */
header('Content-Type: text/plain');
$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);
if (FALSE === $html) {
    throw new Exception(sprintf('Failed to open HTTP URL "%s".', $url));
}

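// loadHTML() is an instance method; calling it statically fails on PHP 8.
$doc = new DOMDocument();
@$doc->loadHTML($html); // suppress warnings about malformed real-world HTML
$xml = simplexml_import_dom($doc, 'SimpleXMLIterator');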
$tree = new RecursiveTreeIterator($xml);

foreach($tree as $element => $line) {
    printf("%s <%s>\n", $tree->getPrefix(), $element);
}

Output:

|- <head>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <title>
| |- <meta>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <link>
| |- <script>
| |- <script>
| |- <script>
| |- <script>
| |- <script>
| |- <script>
| |- <link>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| |- <meta>
| \- <link>
\- <body>
  |- <div>
  |- <noscript>
  | \- <div>
  |   \- <p>
  |     \- <a>
  |- <script>
  |- <div>
  | |- <div>
  | | |- <ul>
  | | | |- <li>
  | | | | \- <a>
  | | | |- <li>
  | | | | |- <a>
  | | | | \- <span>
  | | | |- <li>
  | | | | |- <a>
  | | | | \- <div>
  | | | |   |- <div>
  | | | |   | \- <div>
  | | | |   |   \- <ul>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     \- <li>
  | | | |   |       \- <a>
  | | | |   \- <span>
  | | | |- <li>
  | | | | |- <a>
  | | | | \- <div>
  | | | |   |- <div>
  | | | |   | \- <div>
  | | | |   |   \- <ul>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     |- <li>
  | | | |   |     | \- <a>
  | | | |   |     \- <li>
  | | | |   |       \- <a>
  | | | |   \- <span>
  | | | |- <li>
  | | | | \- <a>
  | | | |- <li>
  | | | | \- <a>
  | | | \- <li>
  | | |   \- <a>
  | | \- <ul>
  | |   \- <li>
  | |     \- <a>
  | \- <div>
  |   |- <h1>
  |   | \- <a>
  |   |- <ul>
  |   | |- <li>
  |   | | \- <a>
  |   | \- <li>
  |   |   \- <a>
  |   |- <input>
  |   |- <input>
  |   |- <input>
  |   |- <input>
  |   |- <input>
  |   |- <form>
  |   | |- <div>
  |   | | |- <input>
  |   | | |- <button>
  |   | | \- <div>
  |   | |   \- <ul>
  |   | |- <input>
  |   | \- <input>
  |   \- <div>
  |     \- <a>
  |       \- <span>
  |- <div>
  | \- <div>
  |   |- <a>
  |   | \- <span>
  |   |- <ul>
  |   | |- <li>
  |   | | \- <a>
  |   | \- <li>
  |   |   \- <a>
  |   \- <div>
  |     |- <div>
  |     | |- <div>
  |     | | \- <p>
  |     | |- <div>
  |     | \- <form>
  |     |   |- <div>
  |     |   | \- <h2>
  |     |   |- <div>
  |     |   | |- <div>
  |     |   | | |- <a>
  |     |   | | | \- <span>
  |     |   | | \- <div>
  |     |   | |- <input>
  |     |   | |- <input>
  |     |   | |- <input>
  |     |   | |- <input>
  |     |   | \- <p>
  |     |   |- <div>
  |     |   | \- <span>
  |     |   |- <div>
  |     |   | |- <label>
  |     |   | |- <span>
  |     |   | |- <span>
  |     |   | \- <input>
  |     |   |- <div>
  |     |   | |- <label>
  |     |   | |- <span>
  |     |   | \- <input>
  |     |   |- <div>
  |     |   | |- <input>
  |     |   | |- <label>
  |     |   | |- <input>
  |     |   | |- <label>
  |     |   | |- <input>
  |     |   | \- <label>
  |     |   |- <hr>
  |     |   |- <div>
  |     |   | |- <label>
  |     |   | |- <span>
  |     |   | | |- <span>
  |     |   | | \- <span>
  |     |   | |- <span>
  |     |   | \- <input>
  |     |   |- <div>
  |     |   | |- <label>
  |     |   | |- <span>
  |     |   | \- <input>
  |     |   |- <div>
  |     |   | |- <label>
  |     |   | |- <span>
  |     |   | \- <input>
  |     |   |- <div>
  |     |   | |- <label>
  |     |   | |- <span>
  |     |   | |- <input>
  |     |   | |- <div>
  |     |   | | |- <span>
  |     |   | | \- <span>
  |     |   | \- <div>
  |     |   |   |- <span>
  |     |   |   \- <span>
  |     |   |- <hr>
  |     |   |- <p>
  |     |   | |- <a>
  |     |   | \- <a>
  |     |   |- <p>
  |     |   | |- <input>
  |     |   | |- <input>
  |     |   | |- <input>
  |     |   | |- <span>
  |     |   | | \- <span>
  |     |   | |   \- <input>
  |     |   | \- <span>
  |     |   \- <div>
  |     |     |- <input>
  |     |     \- <label>
  |     \- <div>
  |       |- <div>
  |       | \- <p>
  |       \- <form>
  |         |- <div>
  |         | |- <label>
  |         | |- <input>
  |         | \- <span>
  |         |- <div>
  |         | |- <label>
  |         | |- <input>
  |         | \- <span>
  |         |- <div>
  |         | |- <input>
  |         | \- <label>
  |         |- <p>
  |         | |- <input>
  |         | |- <input>
  |         | |- <input>
  |         | |- <span>
  |         | | \- <span>
  |         | |   \- <input>
  |         | \- <span>
  |         |- <hr>
  |         |- <p>
  |         | \- <a>
  |         \- <p>
  |           \- <a>
  |- <div>
  | \- <div>
  |   \- <div>
  |- <script>
  |- <script>
  |- <div>
  | \- <div>
  |   \- <p>
  |     |- <strong>
  |     |- <strong>
  |     \- <strong>
  |- <div>
  | |- <ul>
  | | |- <li>
  | | | |- <a>
  | | | \- <span>
  | | \- <li>
  | |   \- <a>
  | \- <div>
  |   \- <div>
  |     |- <div>
  |     | |- <div>
  |     | | |- <div>
  |     | | | |- <input>
  |     | | | |- <div>
  |     | | | | \- <a>
  |     | | | |   \- <span>
  |     | | | \- <div>
  |     | | |   |- <h3>
  |     | | |   \- <p>
  |     | | |- <div>
  |     | | | |- <h1>
  |     | | | \- <p>
  |     | | |   \- <a>
  |     | | |- <div>
  |     | | | |- <div>
  |     | | | | \- <a>
  |     | | | |   \- <img>
  |     | | | \- <div>
  |     | | |   \- <a>
  |     | | |- <div>
  |     | | | |- <a>
  |     | | | | \- <img>
  |     | | | |- <a>
  |     | | | | \- <img>
  |     | | | |- <a>
  |     | | | | \- <img>
  |     | | | |- <a>
  |     | | | | \- <img>
  |     | | | \- <a>
  |     | | |   \- <img>
  |     | | |- <div>
  |     | | | \- <div>
  |     | | |   |- <br>
  |     | | |   \- <p>
  |     | | |     |- <br>
  |     | | |     \- <a>
  |     | | |- <div>
  |     | | | |- <h3>
  |     | | | | \- <span>
  |     | | | |- <h4>
  |     | | | \- <table>
  |     | | |   |- <tr>
  |     | | |   | |- <th>
  |     | | |   | |- <th>
  |     | | |   | \- <th>
  |     | | |   |- <tr>
  |     | | |   | |- <td>
  |     | | |   | |- <td>
  |     | | |   | | |- <span>
  |     | | |   | | |- <span>
  |     | | |   | | \- <span>
  |     | | |   | \- <td>
  |     | | |   |   |- <span>
  |     | | |   |   |- <span>
  |     | | |   |   \- <span>
  |     | | |   \- <tr>
  |     | | |     |- <td>
  |     | | |     |- <td>
  |     | | |     | |- <span>
  |     | | |     | |- <span>
  |     | | |     | \- <span>
  |     | | |     \- <td>
  |     | | |       |- <span>
  |     | | |       |- <span>
  |     | | |       \- <span>
  |     | | |- <div>
  |     | | | |- <div>
  |     | | | | |- <h3>
  |     | | | | \- <ul>
  |     | | | |   |- <div>
  |     | | | |   | |- <p>
  |     | | | |   | | \- <span>
  |     | | | |   | \- <span>
  |     | | | |   |- <li>
  |     | | | |   | \- <span>
  |     | | | |   \- <li>
  |     | | | |     \- <span>
  |     | | | \- <div>
  |     | | |   |- <h3>
  |     | | |   \- <ul>
  |     | | |     |- <li>
  |     | | |     | \- <a>
  |     | | |     \- <li>
  |     | | |       \- <a>
  |     | | |- <div>
  |     | | | |- <ul>
  |     | | | | |- <li>
  |     | | | | | \- <a>
  |     | | | | |   \- <strong>
  |     | | | | |     |- <span>
  |     | | | | |     \- <span>
  |     | | | | \- <li>
  |     | | | |   |- <div>
  |     | | | |   | \- <a>
  |     | | | |   \- <div>
  |     | | | \- <div>
  |     | | |   |- <div>
  |     | | |   | |- <div>
  |     | | |   | | |- <h3>
  |     | | |   | | |- <p>
  |     | | |   | | |- <h3>
  |     | | |   | | \- <p>
  |     | | |   | \- <div>
  |     | | |   |   |- <h3>
  |     | | |   |   \- <p>
  |     | | |   |- <div>
  |     | | |   | \- <div>
  |     | | |   |   |- <h3>
  |     | | |   |   \- <div>
  |     | | |   |     |- <a>
  |     | | |   |     |- <a>
  |     | | |   |     \- <a>
  |     | | |   \- <div>
  |     | | |     \- <ul>
  |     | | |       |- <li>
  |     | | |       \- <li>
  |     | | |- <div>
  |     | | | |- <h3>
  |     | | | \- <div>
  |     | | |   \- <ul>
  |     | | \- <script>
  |     | |- <div>
  |     | | |- <div>
  |     | | | |- <div>
  |     | | | | |- <div>
  |     | | | | |- <div>
  |     | | | | | |- <span>
  |     | | | | | |- <span>
  |     | | | | | |- <a>
  |     | | | | | | \- <span>
  |     | | | | | \- <span>
  |     | | | | \- <div>
  |     | | | \- <div>
  |     | | |   |- <div>
  |     | | |   |- <div>
  |     | | |   | \- <div>
  |     | | |   |   \- <div>
  |     | | |   \- <div>
  |     | | |     |- <div>
  |     | | |     | \- <form>
  |     | | |     |   |- <input>
  |     | | |     |   |- <input>
  |     | | |     |   |- <input>
  |     | | |     |   |- <input>
  |     | | |     |   |- <input>
  |     | | |     |   \- <span>
  |     | | |     |     \- <span>
  |     | | |     |       \- <input>
  |     | | |     \- <div>
  |     | | |       \- <div>
  |     | | |         |- <input>
  |     | | |         |- <input>
  |     | | |         |- <input>
  |     | | |         \- <a>
  |     | | |           \- <span>
  |     | | |- <div>
  |     | | | \- <p>
  |     | | |   |- <span>
  |     | | |   \- <span>
  |     | | |     \- <span>
  |     | | |- <script>
  |     | | |- <div>
  |     | | | |- <div>
  |     | | | | |- <h2>
  |     | | | | |- <ul>
  |     | | | | | |- <li>
  |     | | | | | | \- <span>
  |     | | | | | |   \- <a>
  |     | | | | | \- <li>
  |     | | | | |- <div>
  |     | | | | | \- <div>
  |     | | | | |   |- <span>
  |     | | | | |   |- <div>
  |     | | | | |   | \- <a>
  |     | | | | |   |   \- <img>
  |     | | | | |   |- <div>
  |     | | | | |   | \- <a>
  |     | | | | |   |   \- <img>
  |     | | | | |   \- <div>
  |     | | | | |     \- <a>
  |     | | | | |       \- <span>
  |     | | | | \- <ul>
  |     | | | |   |- <li>
  |     | | | |   | |- <input>
  |     | | | |   | \- <a>
  |     | | | |   |   \- <span>
  |     | | | |   \- <li>
  |     | | | |     \- <a>
  |     | | | |       \- <span>
  |     | | | |- <div>
  |     | | | | |- <h2>
  |     | | | | \- <div>
  |     | | | |   |- <div>
  |     | | | |   | \- <a>
  |     | | | |   |   \- <img>
  |     | | | |   \- <ul>
  |     | | | |     |- <li>
  |     | | | |     | \- <a>
  |     | | | |     |- <li>
  |     | | | |     |- <li>
  |     | | | |     | \- <a>
  |     | | | |     |- <li>
  |     | | | |     | \- <a>
  |     | | | |     |- <li>
  |     | | | |     | \- <a>
  |     | | | |     \- <li>
  |     | | | |       \- <a>
  |     | | | |         \- <span>
  |     | | | \- <div>
  |     | | |   |- <h2>
  |     | | |   |- <ul>
  |     | | |   | |- <li>
  |     | | |   | | |- <input>
  |     | | |   | | \- <a>
  |     | | |   | |   |- <span>
  |     | | |   | |   \- <span>
  |     | | |   | |- <li>
  |     | | |   | | \- <a>
  |     | | |   | |   \- <span>
  |     | | |   | \- <li>
  |     | | |   |   \- <a>
  |     | | |   |     |- <span>
  |     | | |   |     \- <span>
  |     | | |   |- <hr>
  |     | | |   |- <ul>
  |     | | |   | |- <li>
  |     | | |   | | |- <script>
  |     | | |   | | \- <a>
  |     | | |   | |   |- <span>
  |     | | |   | |   \- <span>
  |     | | |   | |- <li>
  |     | | |   | | |- <script>
  |     | | |   | | \- <a>
  |     | | |   | \- <li>
  |     | | |   |   |- <like>
  |     | | |   |   \- <script>
  |     | | |   |- <hr>
  |     | | |   \- <div>
  |     | | |     \- <ul>
  |     | | |       |- <li>
  |     | | |       \- <li>
  |     | | |         \- <a>
  |     | | \- <div>
  |     | |   |- <div>
  |     | |   | |- <div>
  |     | |   | |- <div>
  |     | |   | | |- <span>
  |     | |   | | |- <span>
  |     | |   | | |- <a>
  |     | |   | | | \- <span>
  |     | |   | | \- <span>
  |     | |   | \- <div>
  |     | |   \- <div>
  |     | |     |- <div>
  |     | |     |- <div>
  |     | |     | \- <div>
  |     | |     |   \- <div>
  |     | |     \- <div>
  |     | |       |- <div>
  |     | |       | \- <form>
  |     | |       |   |- <input>
  |     | |       |   |- <input>
  |     | |       |   |- <input>
  |     | |       |   |- <input>
  |     | |       |   |- <input>
  |     | |       |   \- <span>
  |     | |       |     \- <span>
  |     | |       |       \- <input>
  |     | |       \- <div>
  |     | |         \- <div>
  |     | |           |- <input>
  |     | |           |- <input>
  |     | |           |- <input>
  |     | |           \- <a>
  |     | |             \- <span>
  |     | |- <script>
  |     | |- <div>
  |     | | \- <div>
  |     | |   |- <span>
  |     | |   \- <a>
  |     | |     \- <span>
  |     | \- <script>
  |     |- <div>
  |     \- <div>
  |       |- <div>
  |       | |- <div>
  |       | | |- <h2>
  |       | | \- <a>
  |       | |   \- <span>
  |       | \- <form>
  |       |   |- <div>
  |       |   | |- <div>
  |       |   | |- <div>
  |       |   | |- <input>
  |       |   | |- <div>
  |       |   | |- <div>
  |       |   | | |- <label>
  |       |   | | \- <select>
  |       |   | |   |- <option>
  |       |   | |   |- <optgroup>
  |       |   | |   \- <optgroup>
  |       |   | \- <div>
  |       |   |   |- <label>
  |       |   |   \- <input>
  |       |   \- <div>
  |       |     |- <button>
  |       |     \- <div>
  |       |       \- <span>
  |       \- <div>
  |         \- <div>
  |           |- <a>
  |           | \- <span>
  |           |- <h2>
  |           \- <a>
  |- <div>
  |- <div>
  | |- <p>
  | |- <ul>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | |- <li>
  | | | \- <a>
  | | \- <li>
  | |   \- <a>
  | |- <p>
  | | |- <a>
  | | | \- <span>
  | | |- <a>
  | | | \- <span>
  | | |- <a>
  | | \- <span>
  | \- <script>
  |- <script>
  |- <script>
  |- <div>
  | \- <div>
  |   |- <div>
  |   | \- <h2>
  |   \- <div>
  |     \- <div>
  |       \- <a>
  |         \- <span>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <noscript>
  | \- <iframe>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  |- <script>
  \- <script>
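
The same tree dump can be tried without any network access. The sketch below applies the technique from the full example to a small inline HTML string; the markup is made up for illustration, and the lines are collected in an array (an addition for easier inspection) before being printed.

```php
<?php
// Made-up sample markup standing in for a fetched page.
$html = '<html><head><title>t</title></head><body><div><p>hi</p></div></body></html>';

$doc = new DOMDocument();
$doc->loadHTML($html);

// Wrap the DOM in a SimpleXMLIterator so it can be walked recursively.
$xml  = simplexml_import_dom($doc, 'SimpleXMLIterator');
$tree = new RecursiveTreeIterator($xml);

$lines = [];
foreach ($tree as $element => $line) {
    // $element is the tag name; getPrefix() supplies the tree-drawing part.
    $lines[] = sprintf("%s <%s>", $tree->getPrefix(), $element);
}

echo implode("\n", $lines), "\n";
```

This prints one line per element below the root, parents before their children, in the same style as the output above.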

Answer 1: (score: 2)

Put http:// in front of "www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school".

Answer 2: (score: 1)

  

$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);

It works with this URL.

You need to include http:// in the URL.

Answer 3: (score: 0)

Try this, it works fine:

$url = "http://www.etsy.com/listing/118415624/not-my-small-diary-17-true-high-school";
$html = file_get_contents($url);


echo "<pre>";
print_r($html);
exit;
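
One caveat when viewing the result in a browser: printing the raw string inside <pre> makes the browser render the fetched markup rather than show its source. A possible variation, sketched here with a stand-in string instead of a real download, is to escape the HTML first.

```php
<?php
// Stand-in for the string returned by file_get_contents($url); an assumption
// so that the sketch runs without network access.
$html = '<html><body><h1>Hello & welcome</h1></body></html>';

// Escape markup so the browser displays it as text instead of rendering it.
$escaped = htmlspecialchars($html, ENT_QUOTES, 'UTF-8');

echo "<pre>", $escaped, "</pre>\n";
```

On the command line the escaping is unnecessary, and a plain echo of $html is enough.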