Identifying Python identifiers

Time: 2017-03-14 08:17:10

Tags: python

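I am reading a Python file from within a Python program, and I want to get a list of all the identifiers, literals, separators, and terminators in the file being read. Using identifiers as an example: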

one_var = "something"
two_var = "something else"
other_var = "something different"

Assuming the variables above are in the file being read, the result should be:

list_of_identifiers = [one_var, two_var, other_var]

The same goes for literals, terminators, and separators. Thanks.

I have already written code that collects all the operators and keywords:

import keyword, operator

list_of_operators = []
list_of_keywords = []
more_operators = ['+', '-', '/', '*', '%', '**', '//', '==', '!=', '>', '<', '>=', '<=', '=', '+=', '-=', '*=', '/=', '%=', '**=', '//=', '&', '|', '^', '~', '<<', '>>', 'in', 'not in', 'is', 'is not', 'not', 'or', 'and']
with open('file.py') as data_source:
    for each_line in data_source:
        # split() with no argument splits on any whitespace and drops the trailing newline
        for each_word in each_line.split():
            if each_word in keyword.kwlist:
                list_of_keywords.append(each_word)
            elif each_word in operator.__all__ or each_word in more_operators:
                list_of_operators.append(each_word)

# Report once the file has been read and closed
print("Operators found:\n", list_of_operators)
print("Keywords found:\n", list_of_keywords)

1 answer:

Answer 0 (score: 1)

import ast

with open('file.py') as data_source:
    ast_root = ast.parse(data_source.read())

identifiers = set()

for node in ast.walk(ast_root):
    if isinstance(node, ast.Name):
        identifiers.add(node.id)

print(identifiers)
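
The question also asks about literals and separators, which the ast approach above does not list directly. One possible alternative, offered only as a rough sketch, is the standard library's tokenize module, which yields every token together with its type. The function name classify_tokens, the category names, and the sample string are made up for illustration and are not part of the original answer. Note that the tokenizer reports operators and delimiters under the single type OP, so they are grouped together here.

```python
import io
import keyword
import token
import tokenize


def classify_tokens(source):
    """Group the tokens of Python source text into coarse categories."""
    found = {"identifiers": [], "keywords": [], "literals": [], "operators": []}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == token.NAME:
            # The tokenizer reports keywords as NAME tokens, so split them out.
            kind = "keywords" if keyword.iskeyword(tok.string) else "identifiers"
            found[kind].append(tok.string)
        elif tok.type in (token.NUMBER, token.STRING):
            found["literals"].append(tok.string)
        elif tok.type == token.OP:
            # OP covers both operators and delimiters such as ( ) [ ] , :
            found["operators"].append(tok.string)
    return found


# Hypothetical sample input standing in for the contents of file.py
sample = 'one_var = "something"\nif one_var:\n    two_var = 1 + 2\n'
result = classify_tokens(sample)
print(result["identifiers"])
print(result["keywords"])
print(result["literals"])
print(result["operators"])
```

To run this on a real file, the text could be read first, for example with tokenize.open('file.py'), which opens the file using the encoding the source declares. Unlike splitting on spaces, this also picks up operators that have no surrounding whitespace, and it keeps duplicates, so wrap a list in set() if only unique names are wanted.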