How can I capture classes and methods from a Python file?
I don't care about attrs or args.
class MyClass_1(...):
    ...
    def method1_of_first_class(self):
        ...
    def method2_of_first_class(self):
        ...
    def method3_of_first_class(self):
        ...
class MyClass_2(...):
    ...
    def method1_of_second_class(self):
        ...
    def method2_of_second_class(self):
        ...
    def method3_of_second_class(self):
        ...
What I've tried so far:
class ([\w_]+?)\(.*?\):.*?(?:def ([\w_]+?)\(self.*?\):.*?)+?
Options: dot matches newline
Capturing the class:
Match the characters “class ” literally «class »
Match the regular expression below and capture its match into backreference number 1 «([\w_]+?)»
Match a single character present in the list below «[\w_]+?»
Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
A word character (letters, digits, etc.) «\w»
The character “_” «_»
Match the character “(” literally «\(»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “)” literally «\)»
Match the character “:” literally «:»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Capturing the methods:
Match the regular expression below «(?:def ([\w_]+?)\(self.*?\):.*?)+?»
Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
Match the characters “def ” literally «def »
Match the regular expression below and capture its match into backreference number 2 «([\w_]+?)»
Match a single character present in the list below «[\w_]+?»
Between one and unlimited times, as few times as possible, expanding as needed (lazy) «+?»
A word character (letters, digits, etc.) «\w»
The character “_” «_»
Match the character “(” literally «\(»
Match the characters “self” literally «self»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
Match the character “)” literally «\)»
Match the character “:” literally «:»
Match any single character «.*?»
Between zero and unlimited times, as few times as possible, expanding as needed (lazy) «*?»
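But it only captures the class name and the first method. I think this is because backreference number 2 cannot hold more than one capture, even though it is inside (?:myregex)+?
Current output: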
'MyClass_1':'method1_of_first_class',
'MyClass_2':'method1_of_second_class'
Desired output:
'MyClass_1':['method1_of_first_class','method2_of_first_class',...],
'MyClass_2':['method1_of_second_class','method2_of_second_class',...]
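The behaviour suspected above can be reproduced in a small sketch with Python's re module. It assumes a simplified one-line input (not the original file) and shows that a capture group inside a repeated group keeps only one value: the last iteration when the repetition is greedy, the first when it is lazy and nothing forces further iterations. Collecting every name needs a separate findall (or finditer) pass.

```python
import re

# Simplified, hypothetical input: three method headers on one line.
text = "def a(self): def b(self): def c(self):"

# Greedy repetition: group 1 is overwritten on each iteration,
# so only the last method name survives.
greedy = re.match(r"(?:def (\w+)\(self\): ?)+", text)

# Lazy repetition: the engine stops after the first iteration,
# so group 1 holds the first method name.
lazy = re.match(r"(?:def (\w+)\(self\): ?)+?", text)

# One match per method header returns every name.
names = re.findall(r"def (\w+)\(", text)

print(greedy.group(1), lazy.group(1), names)
```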
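Answer 0 (score: 2)
Because a class can contain another class or function, and a function can contain another function or class, grabbing only the class and function declarations with a regex loses the hierarchy information.
In particular, pydoc.py in the Python installation (available since version 2.1) is a prime example of such a case.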
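A minimal sketch of that loss of hierarchy, using a made-up source string (SOURCE and its names are illustrative, not taken from pydoc.py). The regex returns a flat list in which a nested helper function and a nested class look no different from top-level declarations:

```python
import re

# Hypothetical input with a nested function and a nested class.
SOURCE = '''
class Outer:
    def method(self):
        def helper():
            pass
    class Inner:
        def inner_method(self):
            pass
'''

# Match "class Name" or "def name" at the start of any line,
# ignoring leading indentation (and therefore the nesting level).
flat = re.findall(r"^[ \t]*(class|def)\s+(\w+)", SOURCE, flags=re.M)

for kind, name in flat:
    print(kind, name)
```

Nothing in the result says that helper belongs to method rather than to Outer, or that Inner sits inside Outer.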
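Parsing Python code in Python is easy, since Python ships built-in parsers in the parser module and (since version 2.6) the ast module. Note that the parser module was removed in Python 3.10, so ast is the one to use on current versions.
Here is sample code that parses Python code in Python using the ast module (version 2.6 and later):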
from ast import *
import sys

# Read the file named on the command line.
with open(sys.argv[1]) as fi:
    source = fi.read()

parse_tree = parse(source)


class Node:
    """A (kind, name) label together with its nested class/function nodes."""

    def __init__(self, node, children):
        self.node = node
        self.children = children

    def __repr__(self):
        return "{{{}: {}}}".format(self.node, self.children)


class ClassVisitor(NodeVisitor):
    def visit_ClassDef(self, node):
        # print(node, node.name)
        r = self.generic_visit(node)
        return Node(("class", node.name), r)

    def visit_FunctionDef(self, node):
        # print(node, node.name)
        r = self.generic_visit(node)
        return Node(("function", node.name), r)

    def generic_visit(self, node):
        """Called if no explicit visitor function exists for a node."""
        node_list = []

        def add_child(nl, children):
            if children is None:
                pass
            # Disable the 2 lines below if you need more scoping information
            elif type(children) is list:
                nl += children
            else:
                nl.append(children)

        for field, value in iter_fields(node):
            if isinstance(value, list):
                for item in value:
                    if isinstance(item, AST):
                        add_child(node_list, self.visit(item))
            elif isinstance(value, AST):
                add_child(node_list, self.visit(value))

        return node_list if node_list else None


print(ClassVisitor().visit(parse_tree))
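The code has been tested in Python 2.7 and Python 3.2.
Since the default implementation of generic_visit doesn't return anything, I copied the source code of generic_visit and modified it to pass the return value back to the caller.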
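If only the mapping from the question is needed (class name to list of method names), a shorter variant is possible. This is a sketch under stated assumptions rather than part of the answer's code: methods_by_class and SAMPLE are hypothetical names, only top-level classes and their direct methods are reported, and it targets Python 3.5+ because it also checks ast.AsyncFunctionDef.

```python
import ast


def methods_by_class(source):
    """Map each top-level class name to the names of its direct methods."""
    tree = ast.parse(source)
    result = {}
    for node in tree.body:  # top-level statements only
        if isinstance(node, ast.ClassDef):
            result[node.name] = [
                item.name
                for item in node.body  # direct children of the class
                if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef))
            ]
    return result


# Hypothetical sample input shaped like the question's file.
SAMPLE = '''
class MyClass_1(object):
    def method1_of_first_class(self):
        pass
    def method2_of_first_class(self):
        pass

class MyClass_2(object):
    def method1_of_second_class(self):
        pass
'''

print(methods_by_class(SAMPLE))
```

Because it walks node.body directly instead of recursing, functions nested inside methods are left out, which fits the flat output the question asks for.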
Answer 1 (score: 0)
You can start with this regex:
/class\s(\w+)|def\s(\w+)/gm
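This will match all class and method names. To get them into the structure you mentioned in the comments, you will probably need to use the implementation language.
Edit: here's a PHP implementation example: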
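// Assumes $match_array was filled by preg_match_all() using the regex above.
$output = array();
foreach ($match_array[0] as $key => $value) {
    // A "class ..." match starts a new group.
    if (substr($value, 0, 5) === 'class') {
        $output[$value] = array();
        $parent_key = $value;
        continue;
    }
    // Anything else is a "def ..." match belonging to the latest class.
    $output[$parent_key][] = $value;
}
// print_r($output);
foreach ($output as $parent => $values) {
    echo '[' . $parent . ', [' . implode(',', $values) . ']]' . PHP_EOL;
}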
Sample output:
[class MyClass_1, [def method1_of_first_class,def method2_of_first_class,def method3_of_first_class]]
[class MyClass_2, [def method1_of_second_class,def method2_of_second_class,def method3_of_second_class]]
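The same grouping loop can be sketched in Python, the language of the file being scanned. This is an illustration, not part of the answer: group_methods and SAMPLE are hypothetical names, and the pattern is a slightly adapted form of the regex above (anchored to line starts, with \s+ instead of \s). Like the PHP version, it attaches every def to the most recent class, so nested or module-level functions are not told apart.

```python
import re

# Either group 1 (class name) or group 2 (function name) is filled per match.
PATTERN = re.compile(r"^[ \t]*(?:class\s+(\w+)|def\s+(\w+))", re.M)


def group_methods(source):
    """Group each 'def' name under the most recently seen 'class' name."""
    result = {}
    current = None
    for class_name, def_name in PATTERN.findall(source):
        if class_name:
            current = class_name
            result[current] = []
        elif current is not None:  # skip defs that appear before any class
            result[current].append(def_name)
    return result


# Hypothetical sample input shaped like the question's file.
SAMPLE = '''
class MyClass_1(object):
    def method1_of_first_class(self):
        pass
    def method2_of_first_class(self):
        pass
class MyClass_2(object):
    def method1_of_second_class(self):
        pass
'''

print(group_methods(SAMPLE))
```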