Get only the next 6 characters after a string, and loop the code to get every match

Time: 2016-05-13 22:27:07

Tags: php

I found this link useful:

Get only the next 6 characters after a word

But how can I use "foreach" to make it loop over the whole page?
$url = 'http://www.example.com/news/page'; 
$needle = '<div class="title"><a  href="http://www.example.com/news/'; 
$contents = file_get_contents($url);
$str = substr($contents, strpos($contents, $needle) + strlen($needle), 6);

This code returns 123456, the first ID found ...

How can I make it loop over the whole page?

1 Answer:

Answer 0 (score: 0)

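I would advise against doing this with strpos, because the HTML you are looking for may vary slightly in the file: an extra space here and there, a line break, extra attributes, and so on.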
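To illustrate the point, here is a minimal sketch of how a literal strpos search breaks on a harmless markup change. The two HTML snippets are made-up samples, not taken from the real page:

```php
<?php
// Needle with exactly one space between "<a" and "href".
$needle = '<div class="title"><a href="http://www.example.com/news/';

// Hypothetical markup that matches the needle character for character.
$exact = '<div class="title"><a href="http://www.example.com/news/123456/story">A</a></div>';

// Same link, but with an extra space after the div tag and inside the a tag.
$variant = '<div class="title"> <a  href="http://www.example.com/news/123456/story">A</a></div>';

var_dump(strpos($exact, $needle));   // int(0)
var_dump(strpos($variant, $needle)); // bool(false)
```

Both snippets mean the same thing to a browser, but only the first one is found by the string search.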

The way to parse HTML is with the DOM. In PHP, this is done with the DOMDocument class:

$url = 'http://www.example.com/news/page'; 
$doc = new DOMDocument();
// ignore errors due to malformed HTML at URL
libxml_use_internal_errors(true);
$doc->loadHTMLFile($url);
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//div[@class='title']/a[starts-with(@href, 'http://www.example.com/news/')]");
$href = "";
foreach ($elements as $a) {
    $href = $a->getAttribute('href');
    break; // only the first has our interest
}
// At this point $href is the full href content.
// Split the URL in parts by "/", and get the part that interests us
$num = explode("/", $href)[4]; // adapt "4" to get the right part

echo $num; // 123456
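
Since the question asks for every ID on the page rather than only the first, the same XPath query can be looped without the break. This is only a sketch: the inline HTML is a hypothetical stand-in for the fetched page, and it assumes the ID is always the fifth "/"-separated part of the href, as in the code above.

```php
<?php
// Hypothetical sample markup standing in for the real page.
$html = <<<'HTML'
<html><body>
<div class="title"><a href="http://www.example.com/news/123456/first-story">First</a></div>
<div class="title"><a  href="http://www.example.com/news/234567/second-story">Second</a></div>
<div class="other"><a href="http://www.example.com/about">About</a></div>
</body></html>
HTML;

// Ignore warnings caused by malformed HTML.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
$elements = $xpath->query("//div[@class='title']/a[starts-with(@href, 'http://www.example.com/news/')]");

$ids = [];
foreach ($elements as $a) {
    // Split the URL by "/" and keep the part right after "news".
    $parts = explode('/', $a->getAttribute('href'));
    if (isset($parts[4])) {
        $ids[] = $parts[4];
    }
}

print_r($ids); // $ids is ['123456', '234567']
```

To run this against the live page, replace the loadHTML($html) call with loadHTMLFile($url) as in the code above.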