Parsing PHP Doc Comments into a Data Structure

Date: 2011-01-15 21:55:05

Tags: php parsing phpdoc

I'm using the Reflection API in PHP to pull a DocComment (PHPDoc) string from a method:

$r = new ReflectionMethod($object);
$comment = $r->getDocComment();

This returns a string that looks something like this (depending on how well the method was documented):

/**
* Does this great things
*
* @param string $thing
* @return Some_Great_Thing
*/

Are there any built-in methods or functions that can parse a PHP Doc Comment string into a data structure?

$object = some_magic_function_or_method($comment_string);

echo 'Returns a: ', $object->return;

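Failing that, which part of the PHPDoc source code should I look at?

Failing that, or in addition to it, is there third-party code that is considered "better" at this than the PHPDoc code?

I realize that parsing these strings isn't rocket science, or even computer science, but I'd prefer a well-tested library/routine/method that was built to deal with the janky, semi-incorrect PHP Doc comments that may exist in the wild.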

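12 answers:

Answer 0: (score: 21)

I'm surprised this hasn't been mentioned yet: what about using Zend_Reflection from Zend Framework? It may come in handy, especially if you work with software built on Zend Framework, such as Magento.

See the Zend Framework Manual for code examples and the API Documentation for the available methods.

There are different ways to do this:

  • Pass a file name to Zend_Reflection_File.
  • Pass an object to Zend_Reflection_Class.
  • Pass an object and a method name to Zend_Reflection_Method.
  • If you really only have the comment string at hand, you could even put together the code for a small dummy class, save it to a temporary file, and pass that file to Zend_Reflection_File.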
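The last option can be sketched roughly as follows. This is only an illustration of the dummy-class step: the function name, the generated class name, and the method name `dummy` are hypothetical, and PHP's native ReflectionMethod stands in for Zend_Reflection_File so that the snippet runs without Zend Framework installed.

```php
<?php
/**
 * Wrap a bare doc comment in a throwaway class, write that class to a
 * temporary file, load it, and return a reflection of the dummy method.
 * All names here are placeholders chosen for this sketch.
 */
function loadDocCommentViaDummyClass(string $comment): ReflectionMethod
{
    // A unique class name avoids "cannot redeclare class" on repeated calls.
    $className = 'DummyDocHolder_' . md5(uniqid('', true));
    $code = "<?php\nclass {$className}\n{\n{$comment}\npublic function dummy() {}\n}\n";

    // Save the generated class to a temporary file and load it.
    $file = tempnam(sys_get_temp_dir(), 'doc');
    file_put_contents($file, $code);
    require $file; // a file-based reflection tool would be given $file here
    unlink($file);

    return new ReflectionMethod($className, 'dummy');
}

$comment = <<<'DOC'
/**
 * Does this great things
 *
 * @return Some_Great_Thing
 */
DOC;

$method = loadDocCommentViaDummyClass($comment);
echo $method->getDocComment();
```

Once the comment is attached to a real (if throwaway) method, any reflection-based docblock tool can be pointed at it.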

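Let's go for the simple case and assume you have an existing class that you want to inspect.

The code would look like this (untested, please forgive me):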

$method = new Zend_Reflection_Method($class, 'yourMethod');
$docBlock = $method->getDocBlock(); // variable names are case-sensitive, so this must match $docBlock below

if ($docBlock->hasTag('return')) {
    $tagReturn = $docBlock->getTag('return'); // $tagReturn is an instance of Zend_Reflection_Docblock_Tag_Return
    echo "Returns a: " . $tagReturn->getType() . "<br>";
    echo "Comment for return type: " . $tagReturn->getDescription();
}

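Answer 1: (score: 16)

You can use the "DocBlockParser" class from Fabien Potencier's open-source project Sami ("Yet Another PHP API Documentation Generator").
First, get Sami from GitHub. Here is an example of how to use it: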

<?php

require_once 'Sami/Parser/DocBlockParser.php';
require_once 'Sami/Parser/Node/DocBlockNode.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Node\DocBlockNode;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $doc = $dbp->parse($comment);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

Here is the output of the test page:

** getDesc:
This is the short description.

This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description
** getTags:
Array
(
    [param] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => foo
                    [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
                )

            [1] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => bool
                                    [1] => 
                                )

                            [1] => Array
                                (
                                    [0] => int
                                    [1] => 
                                )

                        )

                    [1] => bar
                    [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
                )

        )

    [return] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => Array
                                (
                                    [0] => string
                                    [1] => 
                                )

                        )

                    [1] => de-html_entitied string (no entities at all)
                )

        )

)

** getTag('param'):
Array
(
    [0] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => string
                            [1] => 
                        )

                )

            [1] => foo
            [2] => sometimes a boolean, sometimes a string (or, could have just used "mixed")
        )

    [1] => Array
        (
            [0] => Array
                (
                    [0] => Array
                        (
                            [0] => bool
                            [1] => 
                        )

                    [1] => Array
                        (
                            [0] => int
                            [1] => 
                        )

                )

            [1] => bar
            [2] => sometimes a boolean, sometimes an int (again, could have just used "mixed")
        )

)

** getErrors:
Array
(
)

** getOtherTags:
Array
(
)

** getShortDesc:
This is the short description.
** getLongDesc:
This is the 1st line of the long description 
This is the 2nd line of the long description 
This is the 3rd line of the long description

Answer 2: (score: 6)

You can use DocBlox (http://github.com/mvriel/docblox) to generate an XML data structure for you. Install DocBlox using PEAR and then run the command:

docblox parse -d [FOLDER] -t [TARGET_LOCATION]

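This generates a file called structure.xml, which contains all the metadata about your source code, including the parsed docblocks.

Alternatively, you can use the DocBlox_Reflection_DocBlock* classes directly to parse a piece of DocBlock text.

To do this, make sure autoloading is enabled (or include all the DocBlox_Reflection_DocBlock* files) and execute the following: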

$parsed = new DocBlox_Reflection_DocBlock($docblock);

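After this, you can use the getters to extract the information you want.

Note: you do not need to remove the asterisks; the Reflection class takes care of that.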

Answer 3: (score: 5)

Check out

http://pecl.php.net/package/docblock

I think the docblock_tokenize() function will get you part of the way there.

Answer 4: (score: 5)

I suggest addendum. It's pretty cool and is used in many PHP 5 frameworks...

http://code.google.com/p/addendum/

Check the tests for examples:

http://code.google.com/p/addendum/source/browse/trunk#trunk%2Fannotations%2Ftests

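Answer 5: (score: 4)

You can always look at the phpDoc source. The code is under the LGPL, so if you decide to copy it, you will need to license your software under the same license and add the proper notices.

EDIT: Unless, as @Samuel Herzog noted, you use it as a library.

Thanks to @Samuel Herzog for the clarification.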

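Answer 6: (score: 3)

From your description, I can only guess at what you are trying to do (PHP code documentation). Since you don't say why you want to do this, I can only speculate.

Maybe you should try another approach. To document PHP code (if that is what you are trying to do), I would use doxygen. Judging by your code comments, they are already formatted for doxygen.

With Graphviz, doxygen also renders nice class diagrams and call trees.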

Answer 7: (score: 2)

If you are trying to read the @ tags and their values, then using preg_match would be the best solution.
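A minimal sketch of that approach, assuming each tag fits on one line and that everything before the first tag is the description. The function name `parseDocComment` and the shape of the returned array are choices made for this illustration, not part of any library. Multi-line tag descriptions and inline tags are not handled.

```php
<?php
/**
 * Parse a doc comment into a description and a map of tag name => values.
 * Hypothetical helper: only single-line tags are recognised.
 */
function parseDocComment(string $comment): array
{
    $result = ['description' => '', 'tags' => []];
    $descriptionLines = [];

    // Split on any newline style and handle each line separately.
    foreach (preg_split('/\R/', $comment) as $line) {
        // Strip the opening "/**" and the closing "*/" ...
        $line = preg_replace('#^\s*/\*\*+|\*+/\s*$#', '', $line);
        // ... then the leading "*" that decorates each line.
        $line = trim(preg_replace('#^\s*\*\s?#', '', $line));

        if (preg_match('/^@([\w\-]+)\s*(.*)$/', $line, $m)) {
            // A tag line such as "@param string $thing".
            $result['tags'][$m[1]][] = $m[2];
        } elseif ($line !== '' && $result['tags'] === []) {
            // Free text before the first tag belongs to the description.
            $descriptionLines[] = $line;
        }
    }

    $result['description'] = implode("\n", $descriptionLines);
    return $result;
}

$comment = <<<'DOC'
/**
 * Does this great things
 *
 * @param string $thing
 * @return Some_Great_Thing
 */
DOC;

$doc = parseDocComment($comment);
echo 'Returns a: ', $doc['tags']['return'][0], "\n"; // Returns a: Some_Great_Thing
```

Each tag maps to a list because tags such as @param can appear more than once. For the janky comments mentioned in the question, one of the tested libraries from the other answers is likely to be more robust than a hand-rolled regex.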

Answer 8: (score: 2)

I suggest you take a look at http://code.google.com/p/php-annotations/

The code is fairly easy to understand and to modify if needed.

Answer 9: (score: 1)

As one of the answers above points out, you can use phpDocumentor. If you use Composer, just add "phpdocumentor/reflection-docblock": "~2.0" to your "require" block.

See this for an example: https://github.com/abdulla16/decoupled-app/blob/master/composer.json

For usage examples, see: https://github.com/abdulla16/decoupled-app/blob/master/Container/Container.php

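Answer 10: (score: 1)

An updated version of user1419445's code. The DocBlockParser::parse() method has changed and now needs a second, context parameter. It also seems to be slightly coupled with phpDocumentor, so for the sake of simplicity I'll assume you have installed Sami via Composer. The code below works with Sami v4.0.16.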

<?php

require_once 'vendor/autoload.php';

class TestClass {
    /**
     * This is the short description.
     *  
     * This is the 1st line of the long description 
     * This is the 2nd line of the long description 
     * This is the 3rd line of the long description   
     *  
     * @param bool|string $foo sometimes a boolean, sometimes a string (or, could have just used "mixed")
     * @param bool|int $bar sometimes a boolean, sometimes an int (again, could have just used "mixed") 
     * @return string de-html_entitied string (no entities at all)
     */
    public function another_test($foo, $bar) {
        return strtr($foo,array_flip(get_html_translation_table(HTML_ENTITIES)));
    }
}

use Sami\Parser\DocBlockParser;
use Sami\Parser\Filter\PublicFilter;
use Sami\Parser\ParserContext;

try {
    $method = new ReflectionMethod('TestClass', 'another_test');
    $comment = $method->getDocComment();
    if ($comment !== FALSE) {
        $dbp = new DocBlockParser();
        $filter = new PublicFilter;
        $context = new ParserContext($filter, $dbp, NULL);
        $doc = $dbp->parse($comment, $context);
        echo "\n** getDesc:\n";
        print_r($doc->getDesc());
        echo "\n** getTags:\n";
        print_r($doc->getTags());
        echo "\n** getTag('param'):\n";
        print_r($doc->getTag('param'));
        echo "\n** getErrors:\n";
        print_r($doc->getErrors());
        echo "\n** getOtherTags:\n";
        print_r($doc->getOtherTags());
        echo "\n** getShortDesc:\n";
        print_r($doc->getShortDesc());
        echo "\n** getLongDesc:\n";
        print_r($doc->getLongDesc());
    }
} catch (Exception $e) {
    print_r($e);
}

?>

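Answer 11: (score: 1)

Have a look at the Php Comment Manager package. It allows you to parse method DocBlock comments. It uses the PHP Reflection API to fetch the DocBlock comments of methods.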