Question

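I'm using simplehtmldom to get the titles of some links, and I was wondering whether I can limit the size of the downloaded content. Instead of downloading the whole page, I'd like to fetch just the first 20 lines or so of code to get the title.

Right now I'm using this: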

  $html = file_get_html($row['current_url']);

  $e = $html->find('title', 0);
  $title = $e->innertext;
  echo $e->innertext . '<br><br>';

Thanks

Answer 1

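Unless I'm missing something, that's not how file_get_html works. It retrieves the contents of the page.

In other words, it has to read the whole page before the next step can search it for what you want.

Now, if you were to use: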

$section = file_get_contents('http://www.the-URL.com/', false, null, 0, 444);

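You could isolate roughly the first 20 lines of HTML. The last argument (444) is the maximum number of bytes to read, so this only works as long as the pages you fetch are always laid out the same way from <!DOCTYPE html> to </head><body>, or at least up to <title></title>.

The same fixed length keeps working on other pages only as long as their <head> is about the same size.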
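Since a fixed byte count depends on every page having a similarly sized head, one alternative is to read the response in small chunks and stop as soon as the closing </title> tag shows up. The following is a minimal sketch under that assumption, not a definitive implementation; read_until_title and its parameters are hypothetical names, not part of simple_html_dom.

```php
<?php
// Hypothetical helper (not part of simple_html_dom): read from an open
// stream in small chunks and stop once the closing </title> tag has been
// seen or $maxBytes bytes have been read, whichever comes first.
function read_until_title($stream, int $maxBytes = 8192, int $chunkSize = 512): string
{
    $buffer = '';
    while (!feof($stream) && strlen($buffer) < $maxBytes) {
        // Never ask for more than the remaining byte budget.
        $chunk = fread($stream, min($chunkSize, $maxBytes - strlen($buffer)));
        if ($chunk === false || $chunk === '') {
            break; // read error or nothing more to read
        }
        $buffer .= $chunk;
        // Search the whole buffer so a tag split across two chunks is still found.
        if (stripos($buffer, '</title>') !== false) {
            break;
        }
    }
    return $buffer;
}
```

To use it, open the URL with fopen($url, 'r'), pass the handle to read_until_title(), close the handle, and hand the returned string to str_get_html() as before. The $maxBytes cap keeps a page without a title from being downloaded in full.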

Then use:

$html = str_get_html($section);

Then use your 'find' from there:

$html->find('title', 0);

Edit

include('simple_html_dom.php');

$the_url = 'http://www.the-URL.com/';

// Read the first 444 bytes of the page
$section = file_get_contents($the_url, false, null, 0, 444);
$html = str_get_html($section);

if (!$e = $html->find('title', 0)) {
    // No <title> found yet: read the first 888 bytes instead.
    // (A non-zero offset is not supported for remote files, and the
    // last argument is a length, not an end position.)
    $section = file_get_contents($the_url, false, null, 0, 888);
    $html = str_get_html($section);
    $e = $html->find('title', 0);
}

// $e is still null if neither read contained a <title>
$title = $e ? $e->innertext : '';
echo $title . '<br><br>';
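Another way to limit the download is to ask the server itself for only the first bytes by sending an HTTP Range header. This is a minimal sketch, assuming the server honours Range requests (not every server does, and one that ignores the header sends the full page); first_bytes_context and fetch_first_bytes are hypothetical helper names, not part of simple_html_dom.

```php
<?php
// Hypothetical helper: build an HTTP stream context that asks the server
// for only the first $bytes bytes of the response body.
function first_bytes_context(int $bytes)
{
    return stream_context_create([
        'http' => [
            'method'  => 'GET',
            // Range is inclusive, so bytes=0-1023 requests 1024 bytes.
            'header'  => 'Range: bytes=0-' . ($bytes - 1) . "\r\n",
            'timeout' => 10,
        ],
    ]);
}

// Hypothetical wrapper: the length argument is still passed as a cap,
// because a server may ignore Range and start sending the whole page.
// Returns the fetched string, or false on failure.
function fetch_first_bytes(string $url, int $bytes = 1024)
{
    return file_get_contents($url, false, first_bytes_context($bytes), 0, $bytes);
}
```

The result of fetch_first_bytes($the_url) can be passed to str_get_html() in the same way as $section above, after checking that it is not false.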

Simplehtmldom - limit content size of get_html?
