Question

I want to extract all Python functions/methods, together with their signatures, from a Python project. I tried:

$ grep -r ^def *

But this doesn't show the complete signature when the parameters span multiple lines. Any suggestions?

Answer 1

You can tokenize the file and use the tokens to print the function definitions:

import token
from tokenize import generate_tokens

def find_definitions(filename):
    with open(filename) as f:
        gen = generate_tokens(f.readline)
        for tok in gen:
            if tok[0] == token.NAME and tok[1] == 'def':
                # function definition, read until next colon.
                definition, last_line = [tok[-1]], tok[3][0]
                while not (tok[0] == token.OP and tok[1] == ':'):
                    if last_line != tok[3][0]:
                        # more than one line, append, track line number
                        definition.append(tok[-1])
                        last_line = tok[3][0]
                    tok = next(gen)
                if last_line != tok[3][0]:
                    definition.append(tok[-1])
                yield ''.join(definition)

This works no matter how many lines the function definition spans.

Demo:

>>> import textwrap
>>> gen = find_definitions(textwrap.__file__.rstrip('c'))
>>> for definition in gen:
...     print(definition.rstrip())
...
    def __init__(self,
                 width=70,
                 initial_indent="",
                 subsequent_indent="",
                 expand_tabs=True,
                 replace_whitespace=True,
                 fix_sentence_endings=False,
                 break_long_words=True,
                 drop_whitespace=True,
                 break_on_hyphens=True):
    def _munge_whitespace(self, text):
    def _split(self, text):
    def _fix_sentence_endings(self, chunks):
    def _handle_long_word(self, reversed_chunks, cur_line, cur_len, width):
    def _wrap_chunks(self, chunks):
    def wrap(self, text):
    def fill(self, text):
def wrap(text, width=70, **kwargs):
def fill(text, width=70, **kwargs):
def dedent(text):

The demo above uses the textwrap module to show how multi-line definitions are handled.
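
To see why the generator compares row numbers, it can help to look at the token stream for a definition that spans two lines. The following is only an illustrative sketch for Python 3; `SOURCE` and `rows_until_colon` are made-up names, not part of the answer's code.

```python
import io
import token
import tokenize

# Hypothetical two-line definition used only for this illustration.
SOURCE = "def fill(text,\n         width=70):\n    pass\n"


def rows_until_colon(source):
    """Return (row, token_string) pairs from 'def' up to the first ':'."""
    pairs = []
    recording = False
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == token.NAME and tok.string == "def":
            recording = True
        if not recording:
            continue
        if tok.type == tokenize.NL:
            # Line breaks inside parentheses are NL tokens; skip them.
            continue
        pairs.append((tok.start[0], tok.string))
        if tok.type == token.OP and tok.string == ":":
            break
    return pairs


for row, text in rows_until_colon(SOURCE):
    print(row, text)
```

Each token carries the row it starts on, so a change in row number signals that another physical line belongs to the same definition and should be appended.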

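If you need to support Python 3 code with annotations, you have to be a little smarter and track opening and closing parentheses; a colon inside parentheses doesn't count. On the other hand, in Python 3 tokenize.tokenize() produces named tuples, which makes the function below easier to read: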

import token
from tokenize import tokenize

def find_definitions(filename):
    with open(filename, 'rb') as f:
        gen = tokenize(f.readline)
        for tok in gen:
            if tok.type == token.NAME and tok.string == 'def':
                # function definition, read until next colon outside
                # parentheses.
                definition, last_line = [tok.line], tok.end[0]
                parens = 0
                while tok.exact_type != token.COLON or parens > 0:
                    if last_line != tok.end[0]:
                        definition.append(tok.line)
                        last_line = tok.end[0]
                    if tok.exact_type == token.LPAR:
                        parens += 1
                    elif tok.exact_type == token.RPAR:
                        parens -= 1
                    tok = next(gen)
                if last_line != tok.end[0]:
                    definition.append(tok.line)
                yield ''.join(definition)

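In Python 3, you are better off opening the source file in binary mode and letting the tokenizer figure out the correct encoding. The Python 3 version above can also tokenize Python 2 code without problems.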
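
As a small illustration of the encoding point, the standard library exposes the same detection step as `tokenize.detect_encoding()`. This is a sketch only; `sniff_encoding` is a hypothetical helper name, and the byte strings are made-up samples.

```python
import io
import tokenize


def sniff_encoding(raw_bytes):
    """Return the encoding the tokenizer would pick for these source bytes."""
    encoding, _consumed_lines = tokenize.detect_encoding(
        io.BytesIO(raw_bytes).readline
    )
    return encoding


# A source file with an explicit coding cookie on its first line.
print(sniff_encoding(b"# -*- coding: latin-1 -*-\nname = 'caf\xe9'\n"))

# No cookie and no BOM: the default source encoding is used.
print(sniff_encoding(b"x = 1\n"))
```

If you want a text-mode file object instead, `tokenize.open(filename)` applies the same detection and opens the file with the detected encoding.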

Answer 2

In my opinion, this is not a job for regular expressions, unless you accept that you will probably miss many edge cases.

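Instead, I suggest you use inspect and funcsigs (funcsigs is a backport of the changes made to the inspect module in Python 3; it includes the signature-parsing functionality).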

Here is the file we want to parse (inspect_me.py):

import sys


def my_func(a, b=None):
    print a, b


def another_func(c):
    """
    doc comment
    """
    return c + 1

And here is the code that parses it for us:

import inspect
from funcsigs import signature

import inspect_me


if __name__ == "__main__":
    # get all the "members" of our module:
    members = inspect.getmembers(inspect_me)
    for k, v in members:
        # we're only interested in functions for now (classes, vars, etc... may come later in a very similar fashion):
        if inspect.isfunction(v):
            # the name of our function:
            print k

            # the function signature as a string
            sig = signature(v)
            print str(sig)

            # let's strip out the doc string too:
            if inspect.getdoc(v):
                print "FOUND DOC COMMENT: %s" % (inspect.getdoc(v))

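inspect is the way to do introspection in Python. token and ast could both do the job, but they are lower-level and more complex than what you actually need.

Output of running the above: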

another_func
(c)
FOUND DOC COMMENT: doc comment
my_func
(a, b=None)
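
On Python 3.3 and later, `inspect.signature` is in the standard library, so the same idea works without funcsigs. Below is a minimal sketch under that assumption; `module_signatures` is a hypothetical helper, and textwrap merely stands in for the module you want to inspect. Note that this approach imports (and therefore executes) the module, and that this sketch only looks at plain functions in the module namespace, not at methods inside classes.

```python
import inspect
import textwrap


def module_signatures(module):
    """Map each plain function in the module's namespace to its signature."""
    result = {}
    for name, func in inspect.getmembers(module, inspect.isfunction):
        result[name] = str(inspect.signature(func))
    return result


for name, sig in module_signatures(textwrap).items():
    print(name + sig)
```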

Answer 3

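You can use the ast module to parse the source. It lets you see exactly the same code structure that the interpreter sees. You only need to walk the tree and dump every function definition you find.

bash/grep is not enough if you want to handle edge cases such as multi-line declarations.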
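
A rough sketch of what the ast approach might look like, assuming Python 3.9 or later (for `ast.unparse`). `list_signatures` and `SAMPLE` are names invented for this illustration. The signatures are regenerated from the tree, so the original line breaks and spacing are not preserved.

```python
import ast

# Hypothetical sample source with a multi-line definition and a method.
SAMPLE = '''
def my_func(a,
            b=None):
    pass

class C:
    def method(self, x):
        pass
'''


def list_signatures(source):
    """Return sorted (line_number, signature) pairs for every def in source."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            keyword = "async def" if isinstance(node, ast.AsyncFunctionDef) else "def"
            # ast.unparse turns the arguments node back into source text.
            sig = f"{keyword} {node.name}({ast.unparse(node.args)})"
            if node.returns is not None:
                sig += f" -> {ast.unparse(node.returns)}"
            found.append((node.lineno, sig))
    # ast.walk does not promise source order, so sort by line number.
    return sorted(found)


for lineno, sig in list_signatures(SAMPLE):
    print(lineno, sig)
```

Unlike the inspect approach, this only parses the files and never imports or runs them, and it also picks up methods and nested functions.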

How do I print the signatures of all functions/methods in a Python project?
