Question

Thank you very much for your time.

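I am successfully capturing some text between delimiter tags that I designate as *code*. I have several of them, i.e. *code* code here #1 *code* and then *code* code here #2 *code*. I am struggling to get the code that the regex captures between the *code* tags into my class for formatting. It always shows up as "code #1" over and over again.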

The input text is:

*image1* 
Some More Text here

That's a title pic and there are 2 more enable pictures per page. 
*code* CENTER CODES HERE *code*  Those can be a bit larger. And then     there is more 
code to show *code* MORE CENTER CODE *code*

Paragraph Test

This is how I capture the text and then iterate over it:

    replace = CodeboxReplace()
    codeboxRE = re.compile(r'\*code\*(.*?)\*code\*')
    found = codeboxRE.findall(thisText)
    for item in found:
        thisText = codeboxRE.sub(replace(item), thisText)

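OK, and then the class CodeboxReplace() looks like this. {CODEHERE} is the tag I use as a placeholder, to be replaced with the actual code matched between the code delimiters: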

    class CodeboxReplace(object):
        def __init__(self):
            self.counter = 0

        def __call__(self, match):
            self.counter += 1
            .......some not relevant code here................
            codeHereRE = re.compile('{CODEHERE}')
            found = codeHereRE.findall(myCode)
            for item in found:
                myCode = codeHereRE.sub(match, myCode)
            return myCode

So basically, I want the code snippet captured between the delimiters to replace {CODEHERE}. But every match always uses only the first capture from the regex.

Help! Thanks!

If you want to see what the rendering looks like: http://www.americantechnocracy.com/getArticle

Best regards, Tom

Answer 1

The sub method of a regular expression object replaces all non-overlapping occurrences of the pattern. So, the first time this executes:

    myCode = codeHereRE.sub(match, myCode)

it replaces all occurrences of '{CODEHERE}'. If you want to replace only one occurrence, use the count parameter of sub:

    myCode = codeHereRE.sub(match, myCode, count=1)
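
To make the effect of count concrete, here is a minimal sketch on a shortened input. The names TEMPLATE, render and the sample text are hypothetical stand-ins, not the asker's real code, and the placeholder is filled with str.replace purely to keep the example short. It applies count=1 to the loop over the *code* blocks, and also shows passing a function to sub, which is a different technique from the count parameter described above but may be a simpler way to get one replacement per match.

```python
import re

# Hypothetical template; the real markup around {CODEHERE} is not shown in the question.
TEMPLATE = "<pre>{CODEHERE}</pre>"
codebox_re = re.compile(r"\*code\*(.*?)\*code\*")

# Shortened, made-up input with two *code* blocks.
text = "a *code* ONE *code* b *code* TWO *code* c"


def render(snippet):
    """Fill the placeholder with one captured snippet (assumed formatting step)."""
    return TEMPLATE.replace("{CODEHERE}", snippet.strip())


# Without count: the first sub call replaces EVERY block with snippet #1,
# so nothing is left for the second iteration to match.
all_at_once = text
for snippet in codebox_re.findall(text):
    all_at_once = codebox_re.sub(render(snippet), all_at_once)

# With count=1: each call replaces only the first block that is still unreplaced.
one_by_one = text
for snippet in codebox_re.findall(text):
    one_by_one = codebox_re.sub(render(snippet), one_by_one, count=1)

# Alternative: give sub a function; it is called once per match with a match object.
with_callable = codebox_re.sub(lambda m: render(m.group(1)), text)

print(all_at_once)
print(one_by_one)
print(with_callable)
```

One caveat with the string form: when the replacement is a string, sub processes backslash escapes in it, so captured code containing backslashes may need escaping first. A function's return value is inserted as-is.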

Python: Pass Multiple Captured REGEX Matches to Function
