How to remove all bbcode quote blocks by a specific user from text?

Time: 2017-08-16 13:51:47

Tags: php regex substring filtering bbcode

I'm looking to remove quotes made with BBCode in PHP, for example:

[quote=testuser]
[quote=anotheruser]a sdasdsa dfv rdfgrgre gzdf vrdg[/quote]
sdfsd fdsf dsf sdf[/quote]
the rest of the post text

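I'm planning a blocking system, so users don't have to see content they don't want. So say "testuser" is blocked: they don't want that entire quoted section, including the second quote nested inside the main quote.

So the post would be left with only: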

  

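the rest of the post text

I'd like to know the best way to do this. I was hoping for a regex, but it seems to be more complicated than I thought. I have this attempt: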

/\[quote\=testuser\](.*)\[\/quote\]/is

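But that then captures every closing quote tag, all the way to the last one in the post.

Is there an alternative approach that is quick, or a proper fix for my regex?

To sum up: remove the blocked user's initial quote and everything inside that quote, but nothing else outside of it.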

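2 answers:

Answer 0: (score: 1)

A simple single regex like the one in the question won't do what you need, because it cannot track how deeply the quotes are nested. I would suggest you scan through the text until you find [quote=testuser]. On finding it, set a boolean to start filtering and set a counter to 1. Increment the counter for each [quote=...] tag you encounter while the boolean is true. Decrement the counter for each [/quote] tag you encounter. When the counter reaches 0, change the filtering boolean back to false.

Here's some pseudocode. You may need to modify it a bit for your application, but I think it shows the general algorithm to use.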

filtering = false
counter = 0
for each line:
    if not filtering and line contains "[quote=testuser]"
        filtering = true
        counter = 0
    if filtering
        # assumes at most one opening and one closing tag per line
        if line contains "[quote="
            counter += 1
        if line contains "[/quote]"
            counter -= 1
        if counter == 0
            filtering = false
    else
        print line
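
The counter idea above can also be applied per tag instead of per line, which avoids the one-tag-per-line assumption. The following is only a minimal sketch of that variation: the function name `strip_quotes_by` is hypothetical, the case-insensitive tag matching is an assumption carried over from the `i` flag in the question's regex, and an unclosed blocked quote is simply dropped to the end of the text.

```php
<?php
// Hypothetical helper (name and behaviour are illustrative assumptions).
function strip_quotes_by(string $text, string $user): string
{
    // Find every opening/closing quote tag together with its byte offset.
    preg_match_all('~\[quote(?:=[^\]]*)?\]|\[/quote\]~i', $text, $m, PREG_OFFSET_CAPTURE);

    $out = '';
    $pos = 0;    // start of the next chunk of text to keep
    $depth = 0;  // nesting depth inside a blocked quote (0 = not filtering)
    $target = strtolower("[quote={$user}]");

    foreach ($m[0] as [$tag, $offset]) {
        if ($depth === 0) {
            // Not filtering: only the blocked user's opening tag matters.
            if (strtolower($tag) === $target) {
                $out .= substr($text, $pos, $offset - $pos); // keep text before the block
                $depth = 1;
            }
        } elseif (strcasecmp($tag, '[/quote]') === 0) {
            // Filtering: a closing tag lowers the depth; at 0 the block is over.
            if (--$depth === 0) {
                $pos = $offset + strlen($tag); // resume right after the closing tag
            }
        } else {
            $depth++; // filtering: a nested opening tag
        }
    }

    // If a blocked quote was never closed, everything after its opening tag is dropped.
    return $depth === 0 ? $out . substr($text, $pos) : $out;
}

$post = "[quote=testuser]\n[quote=anotheruser]a sdasdsa[/quote]\nsdfsd fdsf[/quote]\nthe rest of the post text";
echo trim(strip_quotes_by($post, 'testuser')); // prints "the rest of the post text"
```

Quotes by other users that sit outside a blocked block are never touched, because nothing is counted while the depth is 0.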

Answer 1: (score: 1)

As far as I can tell, this is not a simple process. Here are my steps...

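  1. Use preg_split() to divide the input string three ways: opening quote tags, closing quote tags, and everything else. I split on the opening and closing tags, but use PREG_SPLIT_DELIM_CAPTURE to keep them in the output array in their original position/order. PREG_SPLIT_NO_EMPTY is used so that there are no useless iterations in the foreach loop.
  2. Loop over the generated array and search for the username to be omitted.
  3. When the targeted user's quote is found, store that element's index as $start and set $opens to 1.
  4. Whenever a new opening quote tag is found, $opens is incremented.
  5. Whenever a new closing quote tag is found, $opens is decremented.
  6. As soon as $opens reaches 0, $start and the current (end) index are fed to range() to generate an array filled with the numbers between the two points.
  7. array_flip(), of course, moves the values to keys.
  8. array_diff_key() removes that range of points from the array generated by preg_split().
  9. If all goes well, implode() glues the substrings back together, retaining only the desired components.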
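
As a small illustration of the splitting and key-removal steps, here is a sketch on a made-up input. The input string and the variable names `$parts` and `$kept` are assumptions for the example; the tag pattern is the one used in the function that follows.

```php
<?php
// Illustrative input (an assumption for this example).
$bbcode = 'a[quote=x]b[/quote]c';

// Split on quote tags, but keep the tags as their own elements and skip empty strings.
$parts = preg_split('~(\[/?quote[^]]*\])~', $bbcode, -1, PREG_SPLIT_DELIM_CAPTURE | PREG_SPLIT_NO_EMPTY);
// $parts is ['a', '[quote=x]', 'b', '[/quote]', 'c']

// range(1, 3) lists the indices of the quote block; array_flip() turns them into keys,
// and array_diff_key() drops the elements of $parts that have those keys.
$kept = array_diff_key($parts, array_flip(range(1, 3)));
// $kept is [0 => 'a', 4 => 'c']

echo implode($kept); // prints "ac"
```

Removing by key rather than by value matters here: the same tag text can occur many times in a post, but each index occurs only once.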
Code: (Demo)

    /*
    This function DOES NOT validate the $bbcode string to contain a balanced number of opening & closing tags.
    This function DOES check that there are enough closing tags to conclude a targeted opening tag.
    */
    function omit_user_quotes($bbcode,$user){
        $substrings=preg_split('~(\[/?quote[^]]*\])~',$bbcode,-1,PREG_SPLIT_DELIM_CAPTURE|PREG_SPLIT_NO_EMPTY);  // -1 means no limit; tags are kept as their own elements
        $opens=0;  // necessary declaration to avoid Notice when no quote tags in $bbcode string
        foreach($substrings as $index=>$substring){
            if(!isset($start) && $substring=="[quote={$user}]"){  // found targeted user's first opening quote
                $start=$index;  // disqualify the first if statement and start searching for end tag
                $opens=1;  // $opens counts how many end tags are required to conclude quote block
            }elseif(isset($start)){
                if(strpos($substring,'[quote=')!==false){  // if a nested opening quote tag is found
                    ++$opens;  // increment and continue looking for closing quote tags
                }elseif(strpos($substring,'[/quote]')!==false){  // if a closing quote tag is found
                    --$opens;  // decrement and check for quote tag conclusion or error
                    if(!$opens){  // if $opens is zero ($opens can never be less than zero)
                        $substrings=array_diff_key($substrings,array_flip(range($start,$index)));  // slice away unwanted elements from input array
                        unset($start);  // re-qualify the first if statement to allow the process to repeat
                    }
                }
            }
        }
        if($opens){  // if $opens is positive
            return 'Error due to opening/closing tag imbalance (too few end tags)';
        }else{
            return trim(implode($substrings));  // trims the whitespaces on either side of $bbcode string as feature
        }    
    }
    
    /* Single unwanted quote with nested innocent quote: */
    /*$bbcode='[quote=testuser]
    [quote=anotheruser]a sdasdsa dfv rdfgrgre gzdf vrdg[/quote]
    sdfsd fdsf dsf sdf[/quote]
    the rest of the test'; */
    /* output: the rest of the test */
    
    /* Complex battery of unwanted, wanted, and nested quotes: */
    $bbcode='[quote=mickmackusa]Keep this[/quote]
    [quote=testuser]Don\'t keep this because 
        [quote=mickmackusa]said don\'t do it[/quote]
        ... like that\'s a good reason
        [quote=NaughtySquid] It\'s tricky business, no?[/quote]
        [quote=nester][quote=nesty][quote=nested][/quote][/quote][/quote]
    [/quote]
    Let\'s remove a second set of quotes
    [quote=testuser]Another quote block[/quote]
    [quote=mickmackusa]Let\'s do a third quote inside of my quote...
    [quote=testuser]Another quote block[/quote]
    [/quote]
    This should be good, but
    What if [quote=testuser]quotes himself [quote=testuser] inside of his own[/quote] quote[/quote]?';
    /* output: [quote=mickmackusa]Keep this[/quote]
    
    Let's remove a second set of quotes
    
    [quote=mickmackusa]Let's do a third quote inside of my quote...
    
    [/quote]
    This should be good, but
    What if ? */
    
    /* No quotes: */
    //$bbcode='This has no bbcode quote tags in it.';
    /* output: This has no bbcode quote tags in it. */
    
    /* Too few end quote tags by innocent user:
    (No flag is raised because the targeted user has not quoted any text) */
    //$bbcode='This [quote=mickmackusa] has no end tag.';
    /* output: This [quote=mickmackusa] has no end tag. */
    
    /* Too few end quote tags by unwanted user: */
    //$bbcode='This [quote=testuser] has no end tag.';
    /* output: Error due to opening/closing tag imbalance (too few end tags) */
    
    /* Too many end quote tags by unwanted user: 
    (No flag is raised because the function does not validate the bbcode text as fully balanced) */
    //$bbcode='This [quote=testuser] has too many end[/quote] tags.[/quote]';
    /* output: This  tags.[/quote] */
    
    $user='testuser';
    
    echo omit_user_quotes($bbcode,$user);  // omit a single user's quote blocks
    
    /* Or if you want to omit quote blocks from multiple users, you can use a loop:
    $users=['mickmackusa','NaughtySquid'];
    foreach($users as $user){
        $bbcode=omit_user_quotes($bbcode,$user);
    }
    echo $bbcode;
    */
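
Since the question asked specifically about a regex: PCRE, the engine behind PHP's preg_* functions, supports recursive subpatterns, so balanced nested quotes can in principle be matched by one pattern. The following is only a sketch and is not taken from either answer: the function name `omit_user_quotes_regex` and the pattern are assumptions, a blocked quote without a matching end tag is left untouched, and it has not been benchmarked on long posts.

```php
<?php
// Hypothetical helper; the name and pattern are illustrative, not from the answers above.
function omit_user_quotes_regex(string $bbcode, string $user): string
{
    $pattern = '~\[quote=' . preg_quote($user, '~') . '\]'
        // Group 1 is the body of a quote: runs of plain text, a "[" that does not
        // start a quote tag, or a complete nested quote whose body recurses via (?1).
        . '((?:[^\[]++|\[(?!/?quote\b)|\[quote\b[^\]]*+\](?1)\[/quote\])*+)'
        . '\[/quote\]~i';

    // preg_replace() returns null on a PCRE error; fall back to the input in that case.
    return preg_replace($pattern, '', $bbcode) ?? $bbcode;
}

$post = "[quote=testuser]\n[quote=anotheruser]nested[/quote]\nmore[/quote]\nthe rest of the post text";
echo trim(omit_user_quotes_regex($post, 'testuser'));
```

The loop-based function above remains easier to debug and can report an imbalance; the pattern is mainly useful when a single preg_replace() call is preferred.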