Question

我想从垃圾文本（基本上是html）中找到所谓的短代码，并用某些内容替换它们。

[[img src="picture.jpg" alt="Description" caption="Caption here..."]]

我知道我可以使用.find，但是我如何告诉python从[[img开始并停在"]]？问题是文本内部可能有很多，即使是在行......

Answer 1

喜欢@Sam建议

import re
t = 'foobar[[img src="picture.jpg" alt="Description" caption="Caption here..."]]helloworld'
p = r'(\[\[img.*?\"\]\])'
foo = re.findall(p,t)

t = t.replace(foo[0],'bar')
print t

查找以... endin开头的文本（短代码样式）

1 个答案: