RegEx to remove methods from code

Date: 2010-02-07 18:48:40

Tags: regex

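Using a regular expression, I am trying to remove all methods/functions from the following code, leaving only the "global scope". However, I can't get it to match the entire inner contents of a method.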

<?php
$mother = new Mother();
class Hello
{
    public function FunctionName($value="username",)
    {

    }
    public function ododeqwdo($value='')
    {
        # code...
    }
    public function ofdoeqdoq($value='')
    {
    if(isset($mother)) {
        echo $lol;
    }
    if(lol(9)) {
       echo 'lol';
    }
    }
}
function user()
{
    if(isset($mother)) {
        echo $lol;
    }
    if(lol(9)) {
       echo 'lol';
    }
}
    $mother->global();
function asodaosdo() {

}

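My current regex is: (?:(public|protected|private|static)\s+)?function\s+\w+\(.*?\)\s+{.*?} However, it does not select methods that contain brackets inside them, such as function user().

I would appreciate it if someone could point me in the right direction.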
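
To make the failure concrete, here is a minimal sketch that runs the pattern above against the function user() snippet. The `/` delimiters and the `s` modifier (so that `.` spans newlines) are assumptions, since the question does not show how the pattern is invoked.

```php
<?php
// Pattern from the question; the delimiters and the `s` modifier are assumed.
$pattern = '/(?:(public|protected|private|static)\s+)?function\s+\w+\(.*?\)\s+{.*?}/s';

$code = <<<'SRC'
function user()
{
    if(isset($mother)) {
        echo $lol;
    }
    if(lol(9)) {
       echo 'lol';
    }
}
SRC;

preg_match($pattern, $code, $m);
$matched = $m[0];

// The lazy `.*?}` stops at the first `}` it meets, which closes the
// first inner `if` block rather than the function body.
echo $matched, "\n";
```

The match ends right after the first `if` block, so the second `if` and the function's real closing bracket are left behind.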

2 answers:

Answer 0: (score: 6)

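This cannot be done properly with a regular expression. You need to write a parser that correctly handles comments, string literals, and nested brackets.

A regular expression cannot cope with cases like these: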

class Hello
{
  function foo()
  {
    echo '} <- that is not the closing bracket!';
    // and this: } bracket isn't the closing bracket either!
    /*
    } and that one isn't as well...
    */
  }
}
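
As a rough illustration of why a tokenizer copes where a regex does not, the sketch below feeds a similar snippet to PHP's built-in token_get_all() and counts only the closing brackets that arrive as standalone tokens. The variable names ($code, $realClosers, $rawClosers) are made up for this example.

```php
<?php
$code = <<<'SRC'
<?php
function foo()
{
    echo '} <- that is not the closing bracket!';
    // and this: } bracket isn't the closing bracket either!
    /*
    } and that one isn't as well...
    */
}
SRC;

// Every `}` character in the raw text, including those in strings and comments.
$rawClosers = substr_count($code, '}');

// token_get_all() returns single-character tokens as plain strings and
// everything else (string literals, comments, keywords) as arrays.
$realClosers = 0;
foreach (token_get_all($code) as $token) {
    if ($token === '}') {
        $realClosers++;
    }
}

echo "raw: $rawClosers, real: $realClosers\n";
```

The brackets inside the string literal and the two comments are swallowed into their own tokens, so only the bracket that actually closes the function is counted.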

Edit

Here is a small demo of how to use the tokenizer functions mentioned by XUE Can:

$source = <<<BLOCK
<?php

\$mother = new Mother("this function isNotAFunction(\$x=0) {} foo bar");

class Hello
{
    \$foo = 666;

    public function FunctionName(\$value="username",)
    {

    }
    private \$bar;
    private function ododeqwdo(\$value='')
    {
        # code...
    }
    protected function ofdoeqdoq    (\$value='')
    {
        if(isset(\$mother)) {
            echo \$lol . 'function() {';
        }
        if(lol(9)) {
           echo 'lol';
        }
    }
}

function user()
{
    if(isset(\$mother)) {
        echo \$lol;
    }
    /* comment inside */
    if(lol(9)) {
       echo 'lol';
    }
}
/* comment to preserve function noFunction(){} */
\$mother->global();

function asodaosdo() {

}

?>
BLOCK;

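// Compatibility shim: T_ML_COMMENT exists only in PHP 4 and T_DOC_COMMENT
// only in PHP 5 and later. Neither constant is used in the loop below.
if (!defined('T_ML_COMMENT')) {
    define('T_ML_COMMENT', T_COMMENT);
} else {
    define('T_DOC_COMMENT', T_ML_COMMENT);
}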

// Tokenize the source
$tokens = token_get_all($source);

// Some flags and counters
$tFunction = false;
$functionBracketBalance = 0;
$buffer = '';

// Iterate over all tokens
foreach ($tokens as $token) {
    // Single-character tokens.
    if(is_string($token)) {
        if(!$tFunction) {
            echo $token;
        }
        if($tFunction && $token == '{') {
            // Increase the bracket-counter (not the class-brackets: `$tFunction` must be true!)
            $functionBracketBalance++;
        }
        if($tFunction && $token == '}') {
            // Decrease the bracket-counter (not the class-brackets: `$tFunction` must be true!)
            $functionBracketBalance--;
            if($functionBracketBalance == 0) {
                // If it's the closing bracket of the function, reset `$tFunction`
                $tFunction = false;
            }
        }
    } 
    // Tokens consisting of (possibly) more than one character.
    else {
        list($id, $text) = $token;
        switch ($id) {
            case T_PUBLIC:
            case T_PROTECTED:
            case T_PRIVATE: 
                // Don't immediately echo 'public', 'protected' or 'private'
                // before we know if it's part of a variable or method.
                $buffer = "$text ";
                break; 
            case T_WHITESPACE:
                // Only display spaces if we're outside a function.
                if(!$tFunction) echo $text;
                break;
            case T_FUNCTION:
                // If we encounter the keyword 'function', flip the `tFunction` flag to 
                // true and reset the `buffer` 
                $tFunction = true;
                $buffer = '';
                break;
            default:
                // Echo all other tokens if we're not in a function and prepend a possible 
                // 'public', 'protected' or 'private' previously put in the `buffer`.
                if(!$tFunction) {
                    echo "$buffer$text";
                    $buffer = '';
                }
        }
    }
}

This prints:

<?php

$mother = new Mother("this function isNotAFunction($x=0) {} foo bar");

class Hello
{
    $foo = 666;


     private $bar;


}


/* comment to preserve function noFunction(){} */
$mother->global();



?>

That is the original source, just without the functions.

Answer 1: (score: 3)

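I believe that using PHP's built-in Tokenizer functions, or Zend_CodeGenerator from Zend Framework, is a safer approach. They will also make your code easier to read.

That is simply because, if you want to parse source code with a regexp, you have to maintain your own set of tokens, whereas a built-in solution already exists.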
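
For a quick look at that built-in token set, here is a minimal sketch that prints the token names for a one-line script, using token_get_all() and token_name(). The sample source string and the $names variable are invented for illustration.

```php
<?php
$code = '<?php function user() { echo "hi"; }';

$names = [];
foreach (token_get_all($code) as $token) {
    // Array tokens carry a numeric id that token_name() turns into a
    // readable constant name; single-character tokens are plain strings.
    $names[] = is_array($token) ? token_name($token[0]) : $token;
}

echo implode(' ', $names), "\n";
```

Keywords such as function and echo come back as T_FUNCTION and T_ECHO, and the quoted string is a single T_CONSTANT_ENCAPSED_STRING token, so there is no token list of your own to maintain.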