Input

w1 <strong>w2</strong> w3 w4 <strong><strong>w5 <strong>w6</strong></strong></strong> w7

Output

w1 <strong>w2</strong> w3 w4 <strong>w5 w6</strong> w7

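My goal is to remove all duplicate inner child tags while keeping the parent tag and its content. Both versions render the same way in HTML (so I am only trying to make the markup cleaner):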

w1 w2 w3 w4 w5 w6 w7

Thank you very much for your help.

Answer 1

Using @HamZa's solution:

function remove_duplicate_child_tag($input, $tag = 'strong'){

    $pattern = "~
<{$tag}\s*>                         # Match the opening tag
(                                   # Open group 1
    (?:                             # Non-capturing group
        (?:(?!</?{$tag}\s*>).)      # Match any character that does not start an opening or closing tag
        |                           # Or
        (?R)                        # Recurse into the whole pattern (a nested element)
    )*                              # Repeat the non-capturing group zero or more times
)                                   # Close group 1
</{$tag}\s*>                        # Match the closing tag
~xs";

    // use ($tag) passes the tag name into the callback; a default parameter would always be 'strong'
    $output = preg_replace_callback($pattern, function($m) use ($tag){
        return "<{$tag}>" . preg_replace("~</?{$tag}\s*>~", '', $m[1]) . "</{$tag}>";
    }, $input);

    return $output;
}
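For comparison, the same result can be reached without a recursive pattern by visiting each tag in order and tracking the nesting depth. This is only a minimal sketch and not part of the answer above: the function name `strip_nested_tag` is hypothetical, and it assumes tags without attributes (`<strong>`, not `<strong class="x">`).

```php
<?php
// Hypothetical helper (not from the answer): keep only the outermost
// <tag>...</tag> pair and drop same-named tags nested inside it.
function strip_nested_tag(string $input, string $tag = 'strong'): string
{
    $depth  = 0;                       // number of currently open <tag> elements
    $quoted = preg_quote($tag, '~');   // escape the tag name for the pattern

    return preg_replace_callback(
        "~<(/?){$quoted}\\s*>~",
        function (array $m) use (&$depth, $tag): string {
            if ($m[1] === '') {
                // Opening tag: keep it only at the outermost level
                $depth++;
                return $depth === 1 ? "<{$tag}>" : '';
            }
            if ($depth === 0) {
                // Stray closing tag with no matching open: leave it untouched
                return $m[0];
            }
            // Closing tag: keep it only when it closes the outermost element
            $depth--;
            return $depth === 0 ? "</{$tag}>" : '';
        },
        $input
    );
}

$input = 'w1 <strong>w2</strong> w3 w4 <strong><strong>w5 <strong>w6</strong></strong></strong> w7';
echo strip_nested_tag($input), PHP_EOL;
// prints: w1 <strong>w2</strong> w3 w4 <strong>w5 w6</strong> w7
```

The counter approach makes a single pass over the tags, at the cost of keeping state in the callback. For markup with attributes or other irregularities, a DOM parser is likely a safer choice than either regex.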

Answer 2

Try this:

$str = 'w1 <strong>w2</strong> w3 w4 <strong><strong>w5 <strong>w6</strong></strong></strong> w7';
$filtered_array = preg_replace('/(\<[a-zA-Z\/]+\>)+/', '$1', $str);

This matches each run of directly adjacent HTML tags and replaces it with a single tag (the last one in the run).
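Running the snippet on the sample input shows what the pattern covers. It is a quick check rather than a fix, and the variable name `$collapsed` is mine. Runs of adjacent tags are collapsed, but the `<strong>` before `w6` is not adjacent to another tag, so it survives while its closing tag is merged into the run that follows:

```php
<?php
$str = 'w1 <strong>w2</strong> w3 w4 <strong><strong>w5 <strong>w6</strong></strong></strong> w7';

// Group 1 holds the last tag of each run, so every run collapses to that tag
$collapsed = preg_replace('/(\<[a-zA-Z\/]+\>)+/', '$1', $str);

echo $collapsed, PHP_EOL;
// prints: w1 <strong>w2</strong> w3 w4 <strong>w5 <strong>w6</strong> w7
```

The result differs from the desired output and leaves one `<strong>` unclosed, so this pattern only seems suitable when the duplicate tags are always directly adjacent.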

PHP regular expression to remove duplicate child tags while keeping the parent and its content
