Recursive decompression of a string

Time: 2020-10-31 19:38:41

Tags: python recursion compression introduction

I am trying to decompress a string using recursion. For example, the input:

3[b3[a]]

should output:

baaabaaabaaa

but I get:

baaaabaaaabaaaabbaaaabaaaabaaaaa

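I have the following code, but it is clearly off. The first function, find_end, works as intended. I am brand new to recursion, so I would appreciate any help understanding or tracing where the extra letters come from, as well as any general tips that help me understand this really cool methodology.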

def find_end(original, start, level):
    if original[start] != "[":
        message = "ERROR in find_error, must start with [:", original[start:]
        raise ValueError(message)
    indent = level * "  "
    index = start + 1
    count = 1
    while count != 0 and index < len(original):
        if original[index] == "[":
            count += 1
        elif original[index] == "]":
            count -= 1
        index += 1
    if count != 0:
        message = "ERROR in find_error, mismatched brackets:", original[start:]
        raise ValueError(message)
    return index - 1


def decompress(original, level):
    # set the result to an empty string
    result = ""
    # for any character in the string we have not looked at yet
    for i in range(len(original)):
        # if the character at the current index is a digit
        if original[i].isnumeric():
            # the character of the current index is the number of repetitions needed
            repititions = int(original[i])
            # start = the next index containing the '[' character
            x = 0
            while x < (len(original)):
                if original[x].isnumeric():
                    start = x + 1
                    x = len(original)
                else:
                    x += 1
            # last = the index of the matching ']'
            last = find_end(original, start, level)
            # calculate a substring using original[start + 1:last]
            sub_original = original[start + 1 : last]
            # RECURSIVELY call decompress with the substring
            # sub = decompress(original, level + 1)
            # concatenate the result of the recursive call times the number of repetitions needed to the result
            result += decompress(sub_original, level + 1) * repititions
            # set the current index to the index of the matching ']'
            i = last
        # else
        else:
            # concatenate the letter at the current index to the result
            if original[i] != "[" and original[i] != "]":
                result += original[i]
    # return the result
    return result


def main():
    passed = True
    ORIGINAL = 0
    EXPECTED = 1
    # The test cases
    provided = [
        ("3[b]", "bbb"),
        ("3[b3[a]]", "baaabaaabaaa"),
        ("3[b2[ca]]", "bcacabcacabcaca"),
        ("5[a3[b]1[ab]]", "abbbababbbababbbababbbababbbab"),
    ]
    # Run the provided tests cases
    for t in provided:
        actual = decompress(t[ORIGINAL], 0)
        if actual != t[EXPECTED]:
            print("Error decompressing:", t[ORIGINAL])
            print("   Expected:", t[EXPECTED])
            print("   Actual:  ", actual)
            print()
            passed = False
    # print that all the tests passed
    if passed:
        print("All tests passed")


if __name__ == '__main__':
    main()

1 Answer:

Answer 0 (score: 1)

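From what I can gather from your code, it probably gives the wrong result because of the way you look for the last matching closing bracket at a given level (I am not 100% sure; it is a lot of code). However, I can suggest a cleaner approach that uses a stack (almost like a DFS, without the complications):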

def decomp(s):
    stack = []
    for i in s:
        if i.isalnum():
            stack.append(i)
        elif i == "]":
            temp = stack.pop()
            count = stack.pop()
            if count.isnumeric():
                stack.append(int(count)*temp)
            else:
                stack.append(count+temp)
    for i in range(len(stack)-2, -1, -1):
        if stack[i].isnumeric():
            stack[i] = int(stack[i])*stack[i+1]
        else:
            stack[i] += stack[i+1]
    return stack[0]

print(decomp("3[b]"))          # bbb
print(decomp("3[b3[a]]"))      # baaabaaabaaa
print(decomp("3[b2[ca]]"))     # bcacabcacabcaca
print(decomp("5[a3[b]1[ab]]")) # abbbababbbababbbababbbababbbab
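
The loop above merges only the top two stack entries on each ] and relies on the final pass to glue the rest together. Traced by hand, that seems to assume single-digit counts and no plain letters after a closing bracket (2[ab]c, for example, comes out as abcabc instead of ababc). A minimal sketch of a stricter variant is shown below. It pushes a new frame on [ and folds it into its parent on ], and it assumes the input is well formed. The name decomp_frames is made up for illustration and is not part of the answer:

```python
def decomp_frames(s):
    # Hypothetical variant of decomp, not the code from the answer above.
    # Each frame is [repeat_count, text_decoded_so_far]; frame 0 is the top level.
    stack = [[1, ""]]
    digits = ""
    for ch in s:
        if ch.isdigit():
            digits += ch                     # counts may span several digits
        elif ch == "[":
            stack.append([int(digits), ""])  # assumes a count precedes every "["
            digits = ""
        elif ch == "]":
            count, text = stack.pop()        # the group is evaluated only now
            stack[-1][1] += count * text
        else:
            stack[-1][1] += ch
    return stack[0][1]


print(decomp_frames("3[b3[a]]"))  # baaabaaabaaa
print(decomp_frames("2[ab]c"))    # ababc
```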

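This is based on a simple observation: rather than evaluating a substring as soon as you read a [, evaluate it once you encounter the matching ]. That way, you can also build up the result after the individual pieces have been evaluated. (This is similar to how prefix/postfix expressions are evaluated in programming.)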
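
For comparison, the recursive code in the question evaluates a group as soon as it reads the count and the [. Tracing it on 3[b3[a]] suggests the extra letters come from two places. First, assigning i = last inside a for loop over range(...) has no lasting effect, because the loop overwrites i on the next iteration, so the characters inside the brackets are processed a second time. Second, the inner while loop always searches from index 0, so it finds the first digit in the string rather than the one at i. Below is a minimal sketch of a repaired recursive version, assuming every count is followed by a [. The names decompress_fixed and matching_bracket are hypothetical, with matching_bracket standing in for the question's find_end:

```python
def matching_bracket(text, start):
    # Hypothetical stand-in for the question's find_end: returns the index
    # of the "]" that closes the "[" at text[start].
    depth = 0
    for index in range(start, len(text)):
        if text[index] == "[":
            depth += 1
        elif text[index] == "]":
            depth -= 1
            if depth == 0:
                return index
    raise ValueError("mismatched brackets: " + text[start:])


def decompress_fixed(original):
    result = ""
    i = 0
    # A while loop lets us move i past a whole bracketed group.
    while i < len(original):
        if original[i].isdigit():
            # Read the full count starting at i (not at index 0).
            start = i
            while original[start].isdigit():
                start += 1
            repetitions = int(original[i:start])
            last = matching_bracket(original, start)
            # Recurse on the text between the brackets, then repeat it.
            result += decompress_fixed(original[start + 1:last]) * repetitions
            i = last + 1  # skip everything up to and including the "]"
        else:
            result += original[i]
            i += 1
    return result


print(decompress_fixed("3[b3[a]]"))  # baaabaaabaaa
```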
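(You can also add error checking to the stack version if you need it. Checking that the string is semantically correct in one pass and evaluating it in a second pass is much easier than doing both in one go.)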
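
One way to read that suggestion is as a separate validation pass that runs before decoding. The sketch below is only an illustration: the name check_compressed and the exact rules it enforces (every [ is preceded by a count, every count is followed by a [, and brackets balance) are assumptions rather than something the answer specifies:

```python
def check_compressed(s):
    # Hypothetical validator; raises ValueError on the first problem found.
    depth = 0
    prev = ""
    for pos, ch in enumerate(s):
        # A count must be continued by another digit or followed by "[".
        if prev.isdigit() and not (ch.isdigit() or ch == "["):
            raise ValueError(f"count before index {pos} is not followed by '['")
        if ch == "[":
            if not prev.isdigit():
                raise ValueError(f"'[' at index {pos} has no repeat count")
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                raise ValueError(f"unmatched ']' at index {pos}")
        prev = ch
    if depth != 0:
        raise ValueError("unclosed '['")
    if prev.isdigit():
        raise ValueError("string ends with a dangling count")
    return True
```

A caller could run check_compressed(s) first and only hand the string to the decoder if no ValueError is raised.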
