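I have a list of character sequences that I want to find in a string, collapsing each run of repeated occurrences into a single occurrence.
But I'm running into two problems: when I loop over them, re.sub does not collapse the repeated occurrences, and when the string contains a smiley such as :) it replaces ':' with ':)', which I don't want.
Here is the code I tried.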
end_of_line_chars = [".",";","!",":)",":-)","=)",":]",":-(",":(",":[","=(",":P",":-P",":-p",":p","=P"]
for i in end_of_line_chars:
    pattern = "[" + i + "]" + "+"
    str = re.sub(pattern,i,str)
It works if I use a single character, as shown below.
str = re.sub("[.]+",".",str)
But looping over the list of characters goes wrong. How can I solve these two problems? Thanks for your help.
Answer 0: (score: 1)
re.escape(str) does the escaping for you. Separated by |, you can match alternatives. With (?:…), you can group without capturing. So:
import re
# Only in Python 2 (on Python 3, map and filter are already lazy):
try:
    from itertools import imap as map, ifilter as filter
except ImportError:
    pass
# Escape all elements, e.g. ':-)' becomes r'\:\-\)'
# (Python 3.7+ leaves ':' unescaped, giving r':\-\)'):
esc = map(re.escape, end_of_line_chars)
# Wrap each element in a capturing group, so you know which element was found,
# and in a non-capturing group with repeats and optional trailing whitespace:
esc = map(r'(?:({})\s*)+'.format, esc)
# Compile an expression that finds any of these elements:
esc = re.compile('|'.join(esc))
# The function to turn a match of repeats into a single item:
def replace_with_one(match):
    # match.groups() holds the captures, of which only the one that matched is truthy,
    # e.g. (None, None, None, None, ':-)', None, None, None, None, None, None, None, None, None, None, None)
    return next(filter(bool, match.groups()))
# This is how you use it:
esc.sub(replace_with_one, '.... :-) :-) :-) :-( .....')
# Returns: '.:-):-(.'
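As a variation on the same idea, and not the method used in the code above, a backreference can express "the same token again" directly, so one capturing group is enough. This is only a minimal sketch under assumptions: the helper name collapse_repeats is made up for illustration, tokens are sorted longest-first as a precaution in case one token is a prefix of another, and, unlike the code above, whitespace after the last occurrence is left in place.

```python
import re

end_of_line_chars = [".", ";", "!", ":)", ":-)", "=)", ":]", ":-(", ":(",
                     ":[", "=(", ":P", ":-P", ":-p", ":p", "=P"]

def collapse_repeats(text, tokens=end_of_line_chars):
    # Hypothetical helper, not part of the re module.
    # Longest tokens first, so a longer token is tried before any shorter prefix.
    ordered = sorted(tokens, key=len, reverse=True)
    alternatives = "|".join(re.escape(t) for t in ordered)
    # Group 1 captures one token; \1 then requires that same token again,
    # optionally separated by whitespace, one or more times.
    pattern = r"({})(?:\s*\1)+".format(alternatives)
    return re.sub(pattern, r"\1", text)

print(collapse_repeats(".... :-) :-) :-) :-( ....."))
# Prints: . :-) :-( .
```

A token that occurs only once is not matched at all here, so the text around it stays untouched.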
Answer 1: (score: 0)
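If what you want to replace is not a single character, a character class won't work. Instead, use a non-capturing group (and use re.escape, so the literals are not interpreted as regex special characters):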
end_of_line_chars = [".",";","!",":)",":-)","=)",":]",":-(",":(",":[","=(",":P",":-P",":-p",":p","=P"]
for i in end_of_line_chars:
    pattern = r"(?:{})+".format(re.escape(i))
    str = re.sub(pattern,i,str)
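To see why the character class from the question misbehaves, here is a small sketch comparing the two patterns. "[:)]+" matches any run made of ':' or ')' characters in any mix, and "[:-)]+" is not even a valid pattern, because ':-)' inside brackets is read as a reversed character range. The function names and the sample text are made up for illustration.

```python
import re

def collapse_with_class(token, text):
    # The approach from the question: token characters inside [...].
    return re.sub("[" + token + "]+", token, text)

def collapse_with_group(token, text):
    # Non-capturing group around the escaped token, repeated as a whole.
    return re.sub(r"(?:{})+".format(re.escape(token)), token, text)

sample = "wait:: what :):):) ok))"
print(collapse_with_class(":)", sample))  # wait:) what :) ok:)
print(collapse_with_group(":)", sample))  # wait:: what :) ok))

try:
    collapse_with_class(":-)", sample)
except re.error as exc:
    # ':' sorts after ')', so ':-)' is an invalid range inside a class.
    print("invalid pattern:", exc)
```

The group version only touches runs of the exact sequence ':)', which is the behaviour the question asks for.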