Question

I'm using file_get_contents() and I get an HTML string like this:

$html = "
<select>
   <option>I need this part 1/ I don't need this 1 </option>
   <option>I need this part 2/ I don't need this 2 </option>
   <option>I need this part 3/ I don't need this 3 </option>
   ...
   <option>I need this part 50/ I don't need this 50 </option>
</select>";

I want to get rid of every / I don't need this [n].

Any idea how to do that?

Answer 1

Code: (Demo)

$html = "
<select>
   <option>I need this part 1/ I don't need this 1 </option>
   <option>I need this part 2/ I don't need this 2 </option>
   <option>I need this part 3/ I don't need this 3 </option>
   ...
   <option>I need this part 50/ I don't need this 50 </option>
</select>";

echo $html=preg_replace('~/.*<~','<',$html);

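Use ~ as the pattern delimiter so that you don't have to escape the slash in the regex.
.* can and should be greedy: the dot will not cross onto a new line unless you tell it to (with the s flag at the end of the pattern), and this protects your HTML text from being mangled if any of your unwanted substrings contain <.
Don't use a capture group, because it will slow down your pattern and you aren't using any capture references in the replacement string.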
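
To illustrate the greedy-dot point, here is a minimal sketch that compares the greedy pattern with a lazy variant and with an s-flag variant. The $sample string and the variable names are hypothetical and not part of the original answer; the second option deliberately contains a < in the part to be dropped.

```php
<?php
// Hypothetical sample: the unwanted part of the second option contains a "<".
$sample = "<option>keep 1/ drop 1 </option>\n"
        . "<option>keep 2/ drop a < b </option>";

// Greedy, no s flag: .* runs to the end of the current line, then backtracks
// to the last "<" on that line, which is the one that opens </option>.
$greedy = preg_replace('~/.*<~', '<', $sample);

// Lazy: .*? stops at the first "<", leaving debris from the unwanted part.
$lazy = preg_replace('~/.*?<~', '<', $sample);

// Greedy with the s flag: the dot now crosses newlines, so a single match
// swallows everything from the first "/" to the last "<" in the string.
$dotall = preg_replace('~/.*<~s', '<', $sample);

echo $greedy, "\n---\n", $lazy, "\n---\n", $dotall, "\n";
```

Only the first variant leaves both options clean. This sketch assumes one option per line and no / inside the text you want to keep, as in the question's sample.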

Output:

<select>
   <option>I need this part 1</option>
   <option>I need this part 2</option>
   <option>I need this part 3</option>
   ...
   <option>I need this part 50</option>
</select>

Finally, if your unwanted substrings do not contain <, then the following pattern and replacement text will far outperform my method above:

Pattern: ~/[^<]+</~ Replacement: </ Regex Demo
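
A minimal sketch of the negated character class approach, assuming the pattern ~/[^<]+</~ with the replacement </. The $sample string is hypothetical; its second option contains a < in the unwanted part to show the limitation stated above.

```php
<?php
// Hypothetical sample; the second option has a "<" inside the unwanted part.
$sample = "<option>keep 1/ drop 1 </option>\n"
        . "<option>keep 2/ drop a < b </option>";

// [^<]+ consumes everything up to the next "<" in one pass, and the pattern
// then requires that "<" to begin a closing tag ("</").
$result = preg_replace('~/[^<]+</~', '</', $sample);

echo $result, "\n";
```

The first option is cleaned, while the second is left untouched because its unwanted part contains a <. The speed advantage presumably comes from the negated class stopping at the first < instead of running to the end of the line and backtracking, though that is worth measuring on your own data.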
