Question

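Note: if you can use one in your project, use a DOM parser; fall back to regular expressions only in edge cases.

I need to get an array containing the content of each option. Here is my HTML: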

<option value="2" selected>none</option>
<option value="1">fronttext</option>
<option value="15">fronttext,frontpicture</option>

I need to get:

["none", "fronttext", "fronttext,frontpicture"]

I am using this regex:

<option.*>(.*)<\/option>

But when I use it in PHP:

preg_match_all('/<option.*>(.*)<\/option>/', $string, $matches);

it only matches the last result ("fronttext,frontpicture").

What am I doing wrong?

Answer 1

You could use the following regex:

<option.*?>\K.*?(?=<\/option>)

The code would be:

preg_match_all('~<option.*?>\K.*?(?=<\/option>)~', $string, $matches);

Example:

<?php
$mystring = <<<'EOT'
<option value="2" selected>none</option>
<option value="1">fronttext</option>
<option value="15">fronttext,frontpicture</option>
EOT;
preg_match_all('~<option.*?>\K.*?(?=<\/option>)~', $mystring, $matches);
print_r($matches);
?>

Output:

Array
(
    [0] => Array
        (
            [0] => none
            [1] => fronttext
            [2] => fronttext,frontpicture
        )

)
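
For readers unfamiliar with \K: it drops everything matched so far from the reported match, which is why only the option text lands in $matches[0]. A roughly equivalent version with a lazy capture group might look like the sketch below. The names $html and $texts are illustrative and not part of the answer, and the markup is assumed to be on a single line.

```php
<?php
// Same sample markup as in the question, joined on a single line.
$html = '<option value="2" selected>none</option>'
      . '<option value="1">fronttext</option>'
      . '<option value="15">fronttext,frontpicture</option>';

// Lazy quantifiers (.*?) stop at the first ">" and the first "</option>" that allow a match.
preg_match_all('~<option.*?>(.*?)</option>~', $html, $matches);

// Group 1 holds the text between the opening and closing tags.
$texts = $matches[1];
print_r($texts);
```

With a capture group the option text is read from $matches[1] rather than $matches[0].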

Answer 2

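This is because .* in your regex is greedy by nature: it consumes as much as it can, so a single match stretches across all the options and only the last one ends up in the capture group.

Try this regex instead: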

preg_match_all('~<option[^>]*>([^<]*)</option>~', $string, $matches);
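
To see the difference, the two patterns can be run side by side. This is a small sketch that assumes the markup arrives on one line (without the s modifier, . does not cross newlines, so the greedy pattern only collapses when the options share a line). The variable names $greedy and $fixed are illustrative.

```php
<?php
// The question's markup joined on one line.
$string = '<option value="2" selected>none</option>'
        . '<option value="1">fronttext</option>'
        . '<option value="15">fronttext,frontpicture</option>';

// Greedy: the first .* runs to the last ">" that still lets the rest of the pattern match.
preg_match_all('/<option.*>(.*)<\/option>/', $string, $greedy);

// Negated character classes cannot cross a tag boundary, so each option matches on its own.
preg_match_all('~<option[^>]*>([^<]*)</option>~', $string, $fixed);

print_r($greedy[1]); // a single element: the last option's text
print_r($fixed[1]);  // one element per option
```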

Note, however, that I would recommend using a DOM parser rather than a regex to parse HTML/XML text.
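
A minimal sketch of the DOM approach, assuming PHP's bundled DOM extension is available. Wrapping the options in a select element and the variable names are assumptions made for this illustration, not something taken from the question.

```php
<?php
// Options wrapped in a <select> so the fragment is valid on its own (an assumption).
$html = '<select>'
      . '<option value="2" selected>none</option>'
      . '<option value="1">fronttext</option>'
      . '<option value="15">fronttext,frontpicture</option>'
      . '</select>';

$doc = new DOMDocument();
// LIBXML_NOERROR and LIBXML_NOWARNING keep libxml quiet about fragment-style input.
$doc->loadHTML($html, LIBXML_NOERROR | LIBXML_NOWARNING);

// Collect the text content of every <option> element.
$texts = [];
foreach ($doc->getElementsByTagName('option') as $option) {
    $texts[] = $option->textContent;
}
print_r($texts);
```

Unlike the regex versions, this does not depend on attribute order, line breaks, or whitespace inside the tags.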

Regex to match the content of HTML options
