Question

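So today I've been working on a function that removes any quoted strings from a chunk of data and replaces them with format fields ({0}, {1}, and so on). I ran into a problem because the output came out completely scrambled, with {1} landing in a seemingly random place. I later worked out the cause: replacing a slice in a list changes the list's length, so the spans from the earlier re matches no longer line up (it only works on the first iteration). The collection of the strings works perfectly, as expected, so this is certainly not a problem with re. I have read about mutable sequences and a few other things, but could not find anything that covers this.
I think what I need is something like str.replace that takes a slice instead of a substring. Here is my code: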

import re

def rm_strings_from_data(data):
    regex = re.compile(r'"(.*?)"')
    s = regex.finditer(data)
    list_data = list(data)
    val = 0
    strings = []

    for i in s:
        string = i.group()
        start, end = i.span()
        strings.append(string)
        list_data[start:end] = '{%d}' % val
        val += 1

    print(strings, ''.join(list_data), sep='\n\n')

if __name__ == '__main__':
    rm_strings_from_data('[hi="hello!" thing="a thing!" other="other thing"]')

I get:

['"hello!"', '"a thing!"', '"other thing"']

[hi={0} thing="a th{1}r="other thing{2}

I want the output to be:

['"hello!"', '"a thing!"', '"other thing"']

[hi={0} thing={1} other={2}]

Any help would be appreciated. Thanks for your time :)
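
The offset drift described in the question can be sidestepped by applying the slice replacements from right to left, so that each edit only changes text after the spans still waiting to be processed. The following is a minimal sketch under that assumption, not a definitive fix; the function name rm_strings_reversed is hypothetical and it returns its results instead of printing them.

```python
import re

def rm_strings_reversed(data):
    """Replace quoted strings with {0}, {1}, ... by editing right to left."""
    # Materialise the matches first so they can be visited in reverse.
    matches = list(re.finditer(r'"(.*?)"', data))
    strings = [m.group() for m in matches]
    list_data = list(data)
    # Editing the tail first leaves the spans of earlier matches valid,
    # because a slice assignment only shifts the elements after it.
    for index in range(len(matches) - 1, -1, -1):
        start, end = matches[index].span()
        list_data[start:end] = '{%d}' % index
    return strings, ''.join(list_data)

if __name__ == '__main__':
    strings, text = rm_strings_reversed(
        '[hi="hello!" thing="a thing!" other="other thing"]')
    print(strings, text, sep='\n\n')
```

The placeholder numbers still follow left-to-right order because each index comes from the match's position in the list, not from the order in which the edits are applied.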

Answer 1

Why not use regex capture groups to match both parts of each key=value pair, like this: (\w+?)=(".*?")
Then assembling the list the way you need it becomes very easy.

Sample code:

import re

def rm_strings_from_data(data):
    regex = re.compile(r'(\w+?)=(".*?")')
    matches = regex.finditer(data)
    strings = []
    list_data = []
    for match_num, match in enumerate(matches):
        # enumerate starts at 0, so the placeholders run {0}, {1}, ...
        strings.append(match.group(2))
        list_data.append('%s={%d}' % (match.group(1), match_num))

    # Join with single spaces so no stray space precedes the closing bracket
    print(strings, '[' + ' '.join(list_data) + ']', sep='\n\n')

if __name__ == '__main__':
    rm_strings_from_data('[hi="hello!" thing="a thing!" other="other thing"]')
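
If the data is not guaranteed to consist only of key="value" pairs, another option is to let re.sub do the splicing with a replacement callback, which keeps everything outside the quoted strings untouched. This is a rough sketch of that alternative, not the approach from the sample code above; the names rm_strings_with_sub and to_placeholder are made up for illustration.

```python
import re

def rm_strings_with_sub(data):
    """Swap each quoted string for a numbered format field using re.sub."""
    strings = []

    def to_placeholder(match):
        # re.sub calls this once per match, left to right.
        strings.append(match.group())
        return '{%d}' % (len(strings) - 1)

    replaced = re.sub(r'"(.*?)"', to_placeholder, data)
    return strings, replaced

if __name__ == '__main__':
    original = '[hi="hello!" thing="a thing!" other="other thing"]'
    strings, replaced = rm_strings_with_sub(original)
    print(strings, replaced, sep='\n\n')
    # Formatting the result with the collected strings restores the input,
    # as long as the data contains no other curly braces.
    print(replaced.format(*strings) == original)
```

Because re.sub builds a new string instead of editing a list in place, the match offsets never go stale.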

Replacing Python string slices with strings of a different size while keeping the structure
