Question

How can I get the strings between the li tags in PHP? I have tried a lot of PHP code, but none of it works.

<li class="release">
    <strong>Release info:</strong>
    <div>
        How.to.Train.Your.Dragon.2.2014.All.BluRay.Persian
    </div>
    <div>
        How.to.Train.Your.Dragon.2.2014.1080p.BRRip.x264.DTS-JYK
    </div>
    <div>
        How.to.Train.Your.Dragon.2.2014.720p.BluRay.x264-SPARKS
    </div>
</li>

Answer 1

You can try this:

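// The /s modifier lets "." match newlines, so the pattern can span multi-line markup.
$myPattern = "/<li class=\"release\">(.*?)<\/li>/s";
// Placeholder input; substitute the real HTML for the "*".
$myText = '<li class="release">*</li>';
// preg_match() returns 1 on a match, so only read $match[1] in that case.
if (preg_match($myPattern, $myText, $match)) {
    echo $match[1];
}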
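
As a rough sketch of how the pattern above could be applied to the markup from the question: the first match grabs everything inside the li element, and a second pattern (my own addition, not part of the answer) pulls out the text of each div. The variable names `$html` and `$releases` are assumptions made for illustration, and this only works while the markup stays as simple as the sample.

```php
<?php
// Sample markup copied from the question; $html is an assumed variable name.
$html = <<<'HTML'
<li class="release">
    <strong>Release info:</strong>
    <div>
        How.to.Train.Your.Dragon.2.2014.All.BluRay.Persian
    </div>
    <div>
        How.to.Train.Your.Dragon.2.2014.1080p.BRRip.x264.DTS-JYK
    </div>
    <div>
        How.to.Train.Your.Dragon.2.2014.720p.BluRay.x264-SPARKS
    </div>
</li>
HTML;

$releases = array();
// Step 1: capture everything between the opening and closing li tags.
if (preg_match('/<li class="release">(.*?)<\/li>/s', $html, $match)) {
    // Step 2 (illustrative addition): capture the trimmed text of each div.
    if (preg_match_all('/<div>\s*(.*?)\s*<\/div>/s', $match[1], $divs)) {
        $releases = $divs[1];
    }
}

print_r($releases);
```

With the sample markup, `$releases` should hold the three release names without the "Release info:" label. Nested or differently attributed tags would break both patterns.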

Answer 2

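You don't need a regular expression. It seems to be a common mistake to use regular expressions to parse HTML code (I took the URL from T.J. Crowder's comment).

Use a tool that parses HTML instead, for example the DOM library.

Here is a solution that gets all the strings (I assume these are the values of the text nodes):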

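// $html is assumed to hold the markup from the question.
$doc = new DOMDocument();
// Collect libxml parse warnings internally instead of emitting them.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
$xpath = new DOMXPath($doc);
// Select every text node nested anywhere inside an li element.
$nodes = $xpath->query('//li//text()');
$strings = array();
foreach ($nodes as $node) {
    $string = trim($node->nodeValue);
    // Skip whitespace-only nodes left over from the markup's indentation.
    if ($string !== '') {
        $strings[] = $string;
    }
}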

print_r($strings);

Output:

Array
(
    [0] => Release info:
    [1] => How.to.Train.Your.Dragon.2.2014.All.BluRay.Persian
    [2] => How.to.Train.Your.Dragon.2.2014.1080p.BRRip.x264.DTS-JYK
    [3] => How.to.Train.Your.Dragon.2.2014.720p.BluRay.x264-SPARKS
)
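
If only the release names are wanted, without the "Release info:" label, the XPath query can be narrowed. The following is a minimal sketch under the assumption that the names always sit in div elements inside the li with class "release". The query string and the `$releases` variable are my own illustration, not part of the answer above.

```php
<?php
// Sample markup copied from the question; $html is an assumed variable name.
$html = <<<'HTML'
<li class="release">
    <strong>Release info:</strong>
    <div>
        How.to.Train.Your.Dragon.2.2014.All.BluRay.Persian
    </div>
    <div>
        How.to.Train.Your.Dragon.2.2014.1080p.BRRip.x264.DTS-JYK
    </div>
    <div>
        How.to.Train.Your.Dragon.2.2014.720p.BluRay.x264-SPARKS
    </div>
</li>
HTML;

$doc = new DOMDocument();
// Keep parse warnings for the HTML fragment out of the output.
libxml_use_internal_errors(true);
$doc->loadHTML($html);
libxml_clear_errors();

$xpath = new DOMXPath($doc);
$releases = array();
// Select only div elements nested inside the li whose class is exactly "release".
foreach ($xpath->query('//li[@class="release"]//div') as $div) {
    // textContent concatenates the element's text; trim removes the indentation.
    $releases[] = trim($div->textContent);
}

print_r($releases);
```

Note that `@class="release"` compares the whole attribute value, so an element with several classes would need a different predicate.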

Regular expression needed
