Replace part of a string in Python using a regular expression

Time: 2017-06-16 07:04:19

Tags: python regex

What I want:
Original string: (#1 AND #12) OR #10
Converted to: (something AND another_something) OR something_another

In other words, each #number should be replaced by its own unique string.

What I did:

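import re

# Sample data taken from the answers below: (number, replacement) pairs
filters_array = [(1, "something"), (10, "something_another"), (12, "another_something")]
filter_string = "(#1 AND #12) OR #10"
for fltr in filters_array:
    index = fltr[0]       # numbers come from here
    replace_by = fltr[1]  # this string replaces the original one
    # Bug: '#1' also matches the start of '#12' and '#10'
    filter_string = re.sub(r'#' + str(index), replace_by, filter_string)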

Output:

(something AND something2) OR something0

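Problem: instead of replacing only #1, it also replaces part of #12 and #11, because #12 also contains #1. I tried using count=1 in re.sub(), but that does not work because my string can also be '(#12 AND #1)'.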
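The behaviour described above can be reproduced in isolation. This is only a minimal sketch: the variable names are made up for illustration, and the input uses the '(#12 AND #1)' ordering mentioned in the question.

```python
import re

s = "(#12 AND #1) OR #10"

# Unanchored pattern: '#1' also matches the first two characters of '#12' and '#10'.
unanchored = re.sub(r'#1', 'something', s)

# count=1 only limits how many replacements are made; the leftmost match is
# replaced, and here that is the '#1' inside '#12'.
first_only = re.sub(r'#1', 'something', s, count=1)

print(unanchored)  # (something2 AND something) OR something0
print(first_only)  # (something2 AND #1) OR #10
```

So count does not help here: the pattern itself has to say where the number ends.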

2 Answers:

Answer 0: (score: 3)

Use the word boundary anchor \b to force the number to match exactly:

import re

filter_string = "(#1 AND #12) OR #10"
filters_array = [(1, "something"), (10, "something_another"), (12, "another_something")]
for num, s in filters_array:
    # \b after the number prevents '#1' from matching inside '#12'
    filter_string = re.sub(r'#' + str(num) + r'\b', s, filter_string)

print(filter_string)

Output:

(something AND another_something) OR something_another

http://www.regular-expressions.info/wordboundaries.html
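One caveat with \b: it only matches between a word character and a non-word character, and the underscore counts as a word character. If a number could be followed directly by a letter or underscore, a negative lookahead for a digit may be closer to what is wanted. The following is a sketch on a made-up input string, not part of the original answer:

```python
import re

text = "#1 #12 #1_x"  # hypothetical input, '#1_x' added to show the difference

# \b needs a word/non-word transition, so '#1' followed by '_' is not matched.
with_boundary = re.sub(r'#1\b', 'something', text)

# (?!\d) only requires that the next character is not a digit.
with_lookahead = re.sub(r'#1(?!\d)', 'something', text)

print(with_boundary)   # something #12 #1_x
print(with_lookahead)  # something #12 something_x
```

For the strings in the question both patterns give the same result, since every number is followed by a space, a parenthesis, or the end of the string.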

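Answer 1: (score: 1)

You can convert the list of tuples to a dictionary and use re.sub with a pattern that captures the numeric part, then use a lambda expression in the replacement argument to look up the right value by key: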

import re
filter_string = "(#1 AND #12) OR #10"
filters_array = [(1,"something"),(10,"something_another"),(12,"another_something")]
dt = dict(filters_array)
filter_string = re.sub(r'#([0-9]+)', lambda x: dt[int(x.group(1))] if int(x.group(1)) in dt else x.group(), filter_string)
print(filter_string)
# => (something AND another_something) OR something_another

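The #([0-9]+) pattern matches #, then matches and captures one or more digits into Group 1. Inside the lambda, the numeric value is used as the key to look up the replacement. If the key does not exist, the # plus the number is put back into the result unchanged.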

See the Python demo.
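The lookup with a fallback can also be written with dict.get, which avoids converting the group to int twice. This is an equivalent sketch rather than the answer's own code; the #99 in the input is an assumed unknown number, added to show the fallback:

```python
import re

dt = {1: "something", 10: "something_another", 12: "another_something"}

# dict.get returns the whole match (m.group()) when the number is not a key.
result = re.sub(
    r'#([0-9]+)',
    lambda m: dt.get(int(m.group(1)), m.group()),
    "(#1 AND #12) OR #99",
)

print(result)  # (something AND another_something) OR #99
```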

If you need to process the match further, you may want to pass a callback function as the replacement argument instead of a lambda:

import re

filters_array = [(1,"something"),(10,"something_another"),(12,"another_something")]
dt = dict(filters_array)

def repl(m):
    return dt[int(m.group(1))] if int(m.group(1)) in dt else m.group()

filter_string = re.sub(r'#([0-9]+)', repl, "(#1 AND #12) OR #10")
print(filter_string)

See another Python demo.
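One practical difference between the two answers: the loop runs re.sub once per number, so text inserted by an earlier replacement can be replaced again by a later one, while the dictionary approach scans the string once. A sketch with deliberately contrived replacement values (assumed here, not from the question) shows the effect:

```python
import re

# Contrived data: the replacement for #1 itself looks like a tag.
pairs = [(1, "#2"), (2, "final")]
text = "#1 AND #2"

# Loop approach: the '#2' produced in the first pass is replaced in the second.
looped = text
for num, s in pairs:
    looped = re.sub(r'#' + str(num) + r'\b', s, looped)

# Single-pass approach: each tag in the original string is replaced exactly once.
table = dict(pairs)
single = re.sub(r'#([0-9]+)', lambda m: table.get(int(m.group(1)), m.group()), text)

print(looped)  # final AND final
print(single)  # #2 AND final
```

If the replacement strings never contain #number, as in the question, both approaches produce the same output.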