Replacing a Python string with the re module

Date: 2018-05-21 14:25:14

Tags: python regex

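I'm looking for a way to replace a string that can contain any kind of character, subject to two rules:

  • An occurrence must not be replaced if it sits between opening and closing single or double quotes.
  • An occurrence must not be immediately preceded or followed by an alphanumeric character or an underscore.

For example: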

myStringToBeReplaced = "any type* of (h@ra[ters –"

mySourceString = """
a = any type* of (h@ra[ters –*2
print "The value of any type* of (h@ra[ters – is: ",any type* of (h@ra[ters –, " and it's like this !"
b  = any type* of (h@ra[ters –With alpha numeric close to it
"""

myReplacementString = "HELLO"

theResultShouldBe ="""
a = HELLO*2
print "The value of any type* of (h@ra[ters – is: ", HELLO, " and it's like this !"
b  = any type* of (h@ra[ters –With alpha numeric close to it
"""

Many thanks

JD

My first attempt so far, using a simpler string:

#!/usr/bin/env python
# -*- coding: latin1 -*-
import re

myStringToBeReplaced = "anytypeof(h@ra[ters"

mySourceString = """
a = anytypeof(h@ra[ters*2
print "anytypeof(h@ra[ters is:", anytypeof(h@ra[ters , " and it's like this !"
b  = anytypeof(h@ra[tersWith alpha numeric close to it
"""

myReplacementString = "HELLO"

myescape = re.escape(myStringToBeReplaced)

pattern = "(?<!\"|')" + myescape + "(?!\"|')" 

result = re.sub(pattern, myReplacementString, mySourceString)

print(result)

This gives:

a = HELLO*2
print "anytypeof(h@ra[ters is:", HELLO , " and it's like this !"
b  = HELLOWith alpha numeric close to it

1 answer:

Answer 0: (score: 2)

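To solve this, match single- or double-quoted strings first and capture them in a group, then match the search string everywhere else using the unambiguous word boundaries (?<!\w) / (?!\w). You can't use \b here, because the search term may start or end with a non-word character: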

import re

myStringToBeReplaced = "anytypeof(h@ra[ters"

mySourceString = """
a = anytypeof(h@ra[ters*2
print "anytypeof(h@ra[ters is:", anytypeof(h@ra[ters , " and it's like this !"
b  = anytypeof(h@ra[tersWith alpha numeric close to it
"""

def myReplacementString(m):
    if m.group(1):
        return m.group(1)
    else:
        return "HELLO"

myescape = re.escape(myStringToBeReplaced)
pattern = r'''('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*")|(?<!\w){}(?!\w)'''.format(myescape)
result = re.sub(pattern, myReplacementString, mySourceString)
print(result)

See the Python demo.

Details

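  • ('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*") - capturing group 1, either of the two:
    • '[^'\\]*(?:\\.[^'\\]*)*' - a single-quoted C-style string literal (backslash escapes allowed)
    • |
    • "[^"\\]*(?:\\.[^"\\]*)*" - a double-quoted string literal
  • | - or
  • (?<!\w) - no word character is allowed immediately before the search term
  • {} - placeholder for the escaped search term
  • (?!\w) - no word character is allowed immediately after the search term.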
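
The difference between \b and the lookaround pair can be checked on the first line of the original example, whose search term ends with a non-word character. This is only a small sketch; the variable names with_b and with_lookarounds are made up for illustration and are not part of the answer above.

```python
import re

# Search term from the question; it ends with "–", a non-word character.
term = "any type* of (h@ra[ters –"
escaped = re.escape(term)
line = "a = any type* of (h@ra[ters –*2"

# \b requires a word/non-word transition. Between "–" and "*" there is
# none (both are non-word characters), so the trailing \b never matches.
with_b = re.findall(r"\b" + escaped + r"\b", line)

# The lookarounds only require that no word character is adjacent.
with_lookarounds = re.findall(r"(?<!\w)" + escaped + r"(?!\w)", line)

print(with_b)            # []
print(with_lookarounds)  # ['any type* of (h@ra[ters –']
```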

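Note that myReplacementString is now a function, passed to re.sub as its second argument. re.sub calls it with the match object for each match. The function inspects that object: if group 1 matched (a quoted string), it returns the group's value unchanged; otherwise it returns the new string, which replaces the whole match.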
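
For reuse, the same approach could be wrapped in a function. The following is a minimal sketch under the assumptions of the answer above; the names QUOTED, replace_outside_quotes and callback are hypothetical and chosen here for illustration only.

```python
import re

# Either kind of quoted literal, backslash escapes allowed inside (group 1).
QUOTED = r'''('[^'\\]*(?:\\.[^'\\]*)*'|"[^"\\]*(?:\\.[^"\\]*)*")'''


def replace_outside_quotes(source, term, replacement):
    """Hypothetical helper: replace `term` unless it is inside a quoted
    string or directly adjacent to a word character."""
    pattern = re.compile(QUOTED + r"|(?<!\w)" + re.escape(term) + r"(?!\w)")

    def callback(m):
        # Group 1 is set only when a quoted literal matched: keep it as is.
        if m.group(1) is not None:
            return m.group(1)
        # Otherwise the search term matched: substitute the replacement.
        return replacement

    return pattern.sub(callback, source)


source = """
a = anytypeof(h@ra[ters*2
print "anytypeof(h@ra[ters is:", anytypeof(h@ra[ters , " and it's like this !"
b  = anytypeof(h@ra[tersWith alpha numeric close to it
"""

result = replace_outside_quotes(source, "anytypeof(h@ra[ters", "HELLO")
print(result)
```

Because the replacement is returned from a function, re.sub inserts it literally, so backslashes in the replacement text would not be interpreted as group references. Note that the apostrophe in "it's" causes no trouble here only because the surrounding double-quoted string is consumed first.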