Question

Here is what I want to do (in pseudocode):

search for [[phrase//<img src="example.jpg" />//description (if applicable)]]
replace with:
<a>phrase
<div>Description<br><img src="example.jpg"></div>
</a>

For example, I want to convert this:

[[transvaginal pudendal nerve block//<img src="3bc9e18a9fa82a1bd4e0c8c580909389.jpg" />//image of transvaginal pudendal nerve block]]

into this:

<a>transvaginal pudendal nerve block
<div>image of transvaginal pudendal nerve block<br><img src="3bc9e18a9fa82a1bd4e0c8c580909389.jpg" /></div>
</a>

Here is my code so far:

import re

answer_string = open("answer.txt", "r").read()
pattern = re.compile(r"\[\[.*\]\]")

for raw_material in re.findall(pattern, answer_string):
    copy_material = raw_material
    copy_material = copy_material.replace("[[", "")
    copy_material = copy_material.replace("]]", "")
    copy_material = copy_material.split("//")

    if len(copy_material) >= 3:
        raw_material = "<a>" + copy_material[0] + "<div>" + copy_material[2] + "<br>" + copy_material[1] + "</div></a>"
    else:
        raw_material = "<a>" + copy_material[0] + "<div>" + copy_material[1] + "</div></a>"

with open('new_answer.txt','w') as f:
  f.write(answer_string)
  f.close()

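I thought that by assigning to raw_material I could change the phrase in place, but apparently not. I'm a bit confused about how to use a regular expression to find something, manipulate it, and then replace the original phrase.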

Answer 1

You can use re.sub to replace the matches, and if the replacement string may differ from case to case, you can have it call a small function. For example:

import re

def replace_string(matchobj):
    # Called by re.sub once per match; the return value replaces the match.
    if len(matchobj.groups()) == 5:
        if matchobj.group(5):
            # Description present: put it before the image.
            return "<a>"+matchobj.group(2)+"\n<div>"+matchobj.group(5)+"<br>"+matchobj.group(3)+"</div></a>"
        else:
            # No description: group 5 is None.
            return "<a>"+matchobj.group(2)+"\n<div><br>"+matchobj.group(3)+"</div></a>"
    else:
        return ""

pattern = re.compile(r"\[\[((.*?)//)(.*?)(//(.*?))*\]\]")

print(re.sub(pattern, replace_string, answer_string))

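Now, this is quick and dirty, but the idea is that re.sub finds and replaces every match. I changed the pattern to add parentheses, which make Python capture the matched pieces into the match object's groups(). With the parentheses I added, there are 5 capture groups. I believe there will be 5 capture groups every time the expression matches, but any group that did not take part in the match will be None.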
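To make the groups() behaviour concrete, here is a small sketch that runs the same pattern against one link with a description and one without. The two sample strings are made up for illustration; groups() returns one entry per capture group in the pattern, and the entries for groups that did not take part in the match are None.

```python
import re

# Same pattern as in the answer above.
pattern = re.compile(r"\[\[((.*?)//)(.*?)(//(.*?))*\]\]")

# Hypothetical sample inputs, with and without a description.
with_desc = '[[phrase//<img src="example.jpg" />//a description]]'
without_desc = '[[phrase//<img src="example.jpg" />]]'

groups_with = pattern.search(with_desc).groups()
groups_without = pattern.search(without_desc).groups()

# Both tuples have five entries; only the contents differ.
print(len(groups_with), groups_with)
print(len(groups_without), groups_without)
```

In the second case the last two entries are None, which is what the `if matchobj.group(5):` test in replace_string relies on.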

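When a replacement happens, re.sub calls the replace_string function, and the code decides what to return based on whether the 5th group is None. That is the case when the //description part is omitted. I'm not sure whether the check for 5 groups is needed, but I wanted to be safe.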
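As a possible variation on the same idea, the sketch below uses only three capture groups and makes the description optional with a non-capturing group. The names LINK_PATTERN and to_anchor and the sample string are illustrative assumptions, not part of the code above. The output layout follows the one asked for in the question, and when there is no description it drops the <br>, as the asker's own else branch does.

```python
import re

# Three groups: phrase, image tag, and an optional description.
# (?:...) is non-capturing, so the "//" separator itself is not a group.
LINK_PATTERN = re.compile(r"\[\[(.*?)//(.*?)(?://(.*?))?\]\]")

def to_anchor(match):
    phrase, image, description = match.groups()
    # description is None when the "//description" part is omitted.
    body = image if description is None else description + "<br>" + image
    return "<a>" + phrase + "\n<div>" + body + "</div>\n</a>"

# Hypothetical sample input in the format from the question.
sample = '[[transvaginal pudendal nerve block//<img src="x.jpg" />//image of the block]]'
converted = LINK_PATTERN.sub(to_anchor, sample)
print(converted)
```

To apply this to the files from the question, read answer.txt into a string, pass it through LINK_PATTERN.sub(to_anchor, ...), and write the returned string to new_answer.txt. re.sub returns a new string rather than modifying the original, which is why rebinding raw_material inside the loop left answer_string unchanged.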

In any case, I think this should at least point you in a useful direction.
