Question

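I want to search a string for a pattern, then search within each match for invalid characters and either remove them or replace them with valid ones.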

I have some sample dictionaries, e.g. sample_dict = {"randomId":"123y" uhnb\n g", "desc": ["sample description"]}

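In this case I want to find a value in the dictionary, say "123y", and remove the invalid characters in it, such as (", \t, \n). What I tried was to store all the dictionaries in a file, then read the file and match a pattern to get the dictionary values. That gives me a list of matches, and I could also compile those patterns, but I am not sure how to perform the replacement in the original dictionary so that my final output is: {"randomId":"123y uhnb g", "desc": ["sample description"]}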

pattern = re.findall("\":\"(.+?)\"", sample_dict)

Expected result:

{"randomId":"123y uhnb g", "desc": ["sample description"]}

Actual result:

['123y" uhnb\n g']

Answer 1

You can use re.sub to strip the non-alphanumeric characters from the values, as follows:

import re

dct = {"randomId": "123y uhnb\n g", "desc": ["sample description"]}

for key, value in dct.items():
    # If the value is a string, substitute directly
    if isinstance(value, str):
        dct[key] = re.sub(r"[^a-zA-Z0-9 ]", "", value)
    # If the value is a list, substitute in every item of the list
    elif isinstance(value, list):
        dct[key] = [re.sub(r"[^a-zA-Z0-9 ]", "", str(item)) for item in value]
    # Values of any other type are left unchanged

print(dct)
# {'randomId': '123y uhnb g', 'desc': ['sample description']}
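
The loop above assumes the data is already a valid Python dict. If the text read from the file still contains the stray quote, it cannot be parsed first, so one option is to clean it as a raw string. The sketch below passes a function to re.sub so that only the matched value is rewritten and the rest of the text is kept. The names raw, VALUE_RE, and clean_match are made up for illustration, and the pattern assumes that a value ends at the first quote followed by a comma or a closing brace, which will not hold for every input.

```python
import json
import re

# Hypothetical raw text as read from the file; the stray quote after 123y
# and the newline are the "invalid" characters.
raw = '{"randomId":"123y" uhnb\n g", "desc": ["sample description"]}'

# Group 1: the opening `":"`; group 2: the value; group 3: the closing quote,
# assumed to be the first quote followed by a comma or a closing brace.
VALUE_RE = re.compile(r'(":")(.*?)("(?=\s*[,}]))', re.DOTALL)

def clean_match(match):
    # Drop literal \t / \n escape sequences and every character that is not
    # a letter, digit, or space, then put the match back together.
    cleaned = re.sub(r'\\[tn]|[^A-Za-z0-9 ]', '', match.group(2))
    return match.group(1) + cleaned + match.group(3)

fixed = VALUE_RE.sub(clean_match, raw)
print(fixed)
# {"randomId":"123y uhnb g", "desc": ["sample description"]}

# The cleaned text is now valid JSON and can be loaded into a dict.
print(json.loads(fixed))
```

Because the replacement is a function, re.sub calls it once per match and splices the returned string back into the original text, which is the step re.findall cannot do. A value that legitimately contains a quote followed by a comma would be cut short by this pattern, so it is worth checking against real data.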
