Question

I have a script that parses out the first uppercase word on each line of this file:

IMPORT fs

IF fs.exists("fs.pyra") THEN
    PRINT "fs.pyra Exists!" 
END

The script looks like this:

file = open(sys.argv[1], "r")
file = file.read().split("\n")

while '' in file:
    findIt = file.index('')
    file.pop(findIt)

for line in file:
    func = ""
    index = 0
    while line[index] == " ":
        index = index + 1
    while not line[index] == " " or "=" and line[index].isupper():
        func = func + line[index]
        index = index + 1
    print func

All of the modules used are already imported. I pass the path of the file to parse as an argument, and I get this output:

IMPORT
IF
PRINT
Traceback (most recent call last):
  File "src/source.py", line 20, in <module>
    while not line[index] == " " or "=" and line[index].isupper():
IndexError: string index out of range

This means it parses successfully up to the last item in the list, and then fails to parse that one at all. How can I fix this?

Answer 1

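You don't need to step an index past the spaces yourself: line.strip() removes leading and trailing whitespace.

You can split() the line on whitespace to get the words.

Then you can iterate over those strings and use isupper() to check each whole word rather than individual characters.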
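A minimal sketch of the split()/isupper() approach described above. The function name first_upper_word is made up for illustration, and the sample lines are copied from the question's input file. Note that isupper() only looks at cased characters, so a token such as "FOO" with its quotes still attached would also count as uppercase.

```python
def first_upper_word(line):
    """Return the first all-uppercase word on the line, or None if there is none."""
    # split() with no arguments also discards leading and trailing whitespace,
    # so a separate strip() call is not needed here.
    for word in line.split():
        if word.isupper():
            return word
    return None


# Sample lines copied from the question's input file.
lines = [
    'IMPORT fs',
    '',
    'IF fs.exists("fs.pyra") THEN',
    '    PRINT "fs.pyra Exists!" ',
    'END',
]

for line in lines:
    word = first_upper_word(line)
    if word is not None:
        print(word)
```

Empty lines simply produce no words, so there is no need to remove them from the list first.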

Alternatively, run the whole file through a pattern matcher for [A-Z]+.
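A rough sketch of the pattern-matching route, using Python's re module. The \b word boundaries are my own addition to the [A-Z]+ pattern so that the capital "E" in "Exists" is not picked up as a word. The name first_upper_words and the inline sample text are hypothetical. Unlike a real tokenizer, this would also match uppercase words inside quoted strings.

```python
import re

# [A-Z]+ from the answer, wrapped in word boundaries (an assumption)
# so that only whole uppercase words match.
UPPER_WORD = re.compile(r"\b[A-Z]+\b")


def first_upper_words(text):
    """Return the first whole uppercase word of each line that has one."""
    found = []
    for line in text.splitlines():
        match = UPPER_WORD.search(line)
        if match:
            found.append(match.group())
    return found


# Sample text copied from the question's input file.
source = (
    'IMPORT fs\n'
    '\n'
    'IF fs.exists("fs.pyra") THEN\n'
    '    PRINT "fs.pyra Exists!" \n'
    'END\n'
)

print(first_upper_words(source))
```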

Anyway, about your error...

while not line[index] == " " or "="

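The or "=" part never compares anything: "=" is a non-empty string, so it is always truthy. Python reads the condition as (not line[index] == " ") or ("=" and line[index].isupper()), which means the loop only stops when it hits a space. END has no space after it, so the index runs off the end of the line.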
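Here is one possible way to keep the character-by-character loop but make it safe, assuming the goal is just to collect the leading run of uppercase letters. The function name first_word_fixed is hypothetical. Both loops check the index against len(line) before indexing, and since a space or "=" is not uppercase, testing isupper() alone is enough to stop on them. The first few lines only show how Python groups the original condition.

```python
def first_word_fixed(line):
    """Collect the leading uppercase characters, stopping safely at end of line."""
    index = 0
    # Skip leading spaces, checking the bound before indexing.
    while index < len(line) and line[index] == " ":
        index += 1
    func = ""
    # Stop at the end of the line or at the first non-uppercase character.
    while index < len(line) and line[index].isupper():
        func += line[index]
        index += 1
    return func


ch = "x"
# The original condition, and the grouping Python actually applies to it.
original = not ch == " " or "=" and ch.isupper()
explicit = (not ch == " ") or ("=" and ch.isupper())
print(original == explicit)

print(first_word_fixed("END"))
print(first_word_fixed('    PRINT "fs.pyra Exists!" '))
```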

Answer 2

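If the file you are trying to process is compatible with Python's built-in tokenizer, you can use that, so that it also copes with content inside quotes, and then take the first uppercase name token found on each line, for example: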

import sys
from itertools import groupby
from tokenize import generate_tokens, NAME

with open(sys.argv[1]) as fin:
    # Tokenize and group by each line
    grouped = groupby(generate_tokens(fin.readline), lambda L: L[4])
    # Go over the lines
    for k, g in grouped:
        try:
            # Get the first capitalised name
            print(next(t[1] for t in g if t[0] == NAME and t[1].isupper()))
        except StopIteration:
            # Couldn't find one - so no panic - move on
            pass

This gives you:

IMPORT
IF
PRINT
END
