Question

I have this in my HTML content and I want to extract some text from it:

<p>
    <strong>Text I want to extract</strong>
    <br />Text I want to extract including "<br>" <br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract<br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract ...

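As you can see, there is always a strong tag followed by some description.

Does anyone know how to achieve this with preg_match or preg_match_all, or would it be better to use DomCrawler here?

Best, Christian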

Answer 1

Try this:

$str = '<p>
    <strong>Text I want to extract</strong>
    <br />Text I want to extract including <br> <br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract<br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract ...';
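$tname = 'strong';
// [^>]* keeps the opening tag from running past its own ">",
// (.*?) is lazy so each match stops at the first closing tag,
// and the s modifier lets the content span several lines.
$pattern = "/<$tname\b[^>]*>(.*?)<\/$tname>/s";
preg_match_all($pattern, $str, $matches);
print_r($matches[1]); // the text inside each <strong> element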

Answer 2

$string = '<p>    <strong>Text I want to extract</strong>
<br />Text I want to extract including "<br>" <br /><br /> 
 <strong>Text I want to extract</strong>    
<br />Text I want to extract<br /><br />    
<strong>Text I want to extract</strong>    
<br />Text I want to extract ...';

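// Capture everything between a closing </strong> and the next opening <strong>.
// The description after the last </strong> has no following <strong>, so it is not matched.
$pattern = "#</strong\b[^>]*>(.*?)<\s*?strong\b[^>]*>#s";
preg_match_all($pattern, $string, $matches);
print_r($matches);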
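
If the strong text and its description are needed together as pairs, one pattern can capture both at once. This is only a minimal sketch, assuming the markup always has the shape shown in the question; the names $html and $pairs and the clean-up step are my own additions, not part of the answer above.

```php
<?php
// Sample input copied from the question (the real markup is assumed to look the same).
$html = '<p>
    <strong>Text I want to extract</strong>
    <br />Text I want to extract including "<br>" <br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract<br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract ...';

// Group 1: text inside <strong>.
// Group 2: everything up to the next <strong>, a closing </p>, or the end of the input.
$pattern = '#<strong\b[^>]*>(.*?)</strong>(.*?)(?=<strong\b|</p>|\z)#is';

preg_match_all($pattern, $html, $matches, PREG_SET_ORDER);

$pairs = [];
foreach ($matches as $m) {
    // Strip the <br> tags and whitespace that surround the description,
    // but leave anything in the middle (such as the quoted "<br>") alone.
    $description = preg_replace('#^(?:\s|<br\s*/?>)+|(?:\s|<br\s*/?>)+$#i', '', $m[2]);
    $pairs[] = ['title' => trim($m[1]), 'description' => $description];
}

print_r($pairs);
```

PREG_SET_ORDER groups the result per match, so each entry of $matches already holds one title and one description. The lookahead does not consume the next <strong>, which is why the last description is picked up as well.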

Answer 3

Try using DOM to get the string inside the strong tags; the same can be done for other tags:

<?php
$str='<p>
    <strong>Text I want to extract</strong>
    <br />Text I want to extract including "<br>" <br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract<br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract ...
    </p>
';

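$dom = new DOMDocument();

// Parse the fragment; loadHTML wraps it in <html><body> automatically
$dom->loadHTML($str);

// Print the text content of every <strong> element, one per line
$strongTags = $dom->getElementsByTagName('strong');
foreach ($strongTags as $strongTag) {
    echo $strongTag->nodeValue, PHP_EOL;
}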

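
The DOM approach can also return the description that follows each strong tag by walking its sibling nodes. A rough sketch, assuming the descriptions are plain text nodes separated by br elements; $pairs and the whitespace clean-up are my own additions. Note that a literal <br> inside a description is parsed as an element, so it does not appear in the extracted text.

```php
<?php
$html = '<p>
    <strong>Text I want to extract</strong>
    <br />Text I want to extract including "<br>" <br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract<br /><br />
    <strong>Text I want to extract</strong>
    <br />Text I want to extract ...
    </p>
';

// Keep parser warnings about loose markup out of the output
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);

$pairs = [];
foreach ($dom->getElementsByTagName('strong') as $strong) {
    $description = '';
    // Walk the siblings after this <strong> until the next <strong> starts
    for ($node = $strong->nextSibling; $node !== null; $node = $node->nextSibling) {
        if ($node->nodeName === 'strong') {
            break;
        }
        // Collect text nodes only; <br> elements are skipped
        if ($node->nodeType === XML_TEXT_NODE) {
            $description .= $node->nodeValue;
        }
    }
    $pairs[] = [
        'title'       => trim($strong->nodeValue),
        // Collapse runs of whitespace left over from the indentation
        'description' => trim(preg_replace('/\s+/', ' ', $description)),
    ];
}

print_r($pairs);
```

Compared with a regular expression, this does not depend on how the tags are written (attributes, spacing, self-closing br), at the cost of losing any markup inside the description.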

preg_match: strong tag followed by description

3 answers: