Question

I'm trying to match this (specifically the name):

<tr>
    <th class="name">Name:</th>
    <td>John Smith</td>
</tr>

like this:

preg_match('/<th class="name">Name:<\/th>.+?<td>(.+)<\/td>/s', $a, $b);

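However, while it does match the name, it doesn't stop at the end of the name. It carries on for another 150 or so characters. Why is that? I only want to match the name.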

Answer 1

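Make the last quantifier non-greedy. A greedy (.+) runs on to the last </td> in the input, whereas (.+?) stops at the first one:

preg_match('/<th class="name">Name:<\/th>.+?<td>(.+?)<\/td>/s', $a, $b);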
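
To make the difference concrete, here is a minimal sketch comparing the two patterns. The two-row `$a` string is an assumed sample input, not the asker's actual page; the second row only exists so that there is a later `</td>` for the greedy version to reach.

```php
<?php
// Assumed sample input: two rows, so a later </td> exists further down.
$a = <<<HTML
<tr>
    <th class="name">Name:</th>
    <td>John Smith</td>
</tr>
<tr>
    <th class="name">Age:</th>
    <td>42</td>
</tr>
HTML;

// Greedy (.+): with the /s flag "." also matches newlines, so the group
// expands to the end of the input and backtracks to the LAST </td>.
preg_match('/<th class="name">Name:<\/th>.+?<td>(.+)<\/td>/s', $a, $greedy);

// Non-greedy (.+?): the group stops at the FIRST </td>.
preg_match('/<th class="name">Name:<\/th>.+?<td>(.+?)<\/td>/s', $a, $lazy);

echo strlen($greedy[1]), "\n"; // far longer than the name itself
echo $lazy[1], "\n";           // John Smith
```

On the asker's real page the greedy capture would similarly run on to the final `</td>` in `$a`, which would account for the extra characters.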

Answer 2

Don't parse HTML with regular expressions; it's easy with DOMDocument:

<?php 
$html = <<<HTML
<tr>
    <th class="name">Name:</th>
    <td>John Smith</td>
</tr>
<tr>
    <th class="name">Somthing:</th>
    <td>Foobar</td>
</tr>
HTML;

$dom = new DOMDocument();
@$dom->loadHTML($html);

$ret = array();
foreach($dom->getElementsByTagName('tr') as $tr) {
    $ret[trim($tr->getElementsByTagName('th')->item(0)->nodeValue,':')] = $tr->getElementsByTagName('td')->item(0)->nodeValue;
}

print_r($ret);
/*
Array
(
    [Name] => John Smith
    [Somthing] => Foobar
)
*/
?>
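
If only the name is needed rather than every row, one possible variation is to query for it with DOMXPath. This is a sketch under assumptions: the `<table>` wrapper and the XPath expression are illustrative choices, not part of the original answer.

```php
<?php
// Assumed sample markup, wrapped in <table> for illustration.
$html = '<table><tr><th class="name">Name:</th><td>John Smith</td></tr>'
      . '<tr><th class="name">Something:</th><td>Foobar</td></tr></table>';

$dom = new DOMDocument();
@$dom->loadHTML($html); // @ suppresses warnings on imperfect markup

$xpath = new DOMXPath($dom);
// Find the <th> whose text is "Name:" and take the first <td> sibling after it.
$nodes = $xpath->query('//th[normalize-space(.)="Name:"]/following-sibling::td[1]');

// $name is null when no such cell exists.
$name = $nodes->length > 0 ? trim($nodes->item(0)->nodeValue) : null;
echo $name, "\n"; // John Smith
```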

Answer 3

preg_match('/<th class="name">Name:<\/th>\s*<td>(.+?)<\/td>/s', $line, $matches);

This allows only whitespace between </th> and <td>, and uses a non-greedy match for the name.
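
A small sketch of why restricting the gap to whitespace can matter. The input here is an assumed edge case (a "Name:" row with no `<td>`), chosen only to show how the two patterns diverge.

```php
<?php
// Assumed input: the "Name:" row has no <td>, but the next row does.
$line = '<tr><th class="name">Name:</th></tr>'
      . '<tr><th class="name">Age:</th><td>42</td></tr>';

// .+? may consume any characters, so it walks into the next row and captures "42".
$loose = preg_match('/<th class="name">Name:<\/th>.+?<td>(.+?)<\/td>/s', $line, $m1);

// \s* allows only whitespace before <td>, so this pattern does not match at all.
$strict = preg_match('/<th class="name">Name:<\/th>\s*<td>(.+?)<\/td>/s', $line, $m2);

var_dump($loose, $m1[1] ?? null); // int(1), string(2) "42"
var_dump($strict);                // int(0)
```

A failed match is usually easier to detect than a wrong value silently captured from another row.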

Answer 4

preg_match('/<th class="name">Name:<\/th>.+?<td>(?P<name>.*?)<\/td>/s', $str, $match);

echo $match['name'];

Answer 5

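Here is your match:

preg_match('!<tr>\s*<th[^>]*>Name:</th>\s*<td>([^<]*)</td>\s*</tr>!s', $a, $b);

It should work as long as the name cell contains only text, because [^<]* cannot run past the next tag.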
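
The following sketch illustrates how a negated character class such as `[^<]*` bounds the capture, and where it falls short. Both sample strings are assumptions made up for the demonstration.

```php
<?php
$pattern = '!<tr>\s*<th[^>]*>Name:</th>\s*<td>([^<]*)</td>\s*</tr>!s';

// Plain-text cell: [^<]* is greedy, but it cannot cross the "<" of </td>.
$plain = "<tr>\n    <th class=\"name\">Name:</th>\n    <td>John Smith</td>\n</tr>";
preg_match($pattern, $plain, $b);
echo $b[1], "\n"; // John Smith

// Assumed variation: a nested tag inside the cell makes the pattern fail to match.
$nested = '<tr><th class="name">Name:</th><td><b>John</b> Smith</td></tr>';
$found = preg_match($pattern, $nested);
var_dump($found); // int(0)
```

The trade-off is that the pattern never over-matches, but it returns no match at all once the cell contains markup.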

My regex doesn't know when to stop
