Suppose I have the following string:
mystr = """
<p>Some text and another text. </p> ![image_file_1][image_desc_1] some other text.
<p>some text</p>
![image_file_2][image_desc_2] and image: ![image_file_3][image_desc_3]
test case 1: ![dont_match_1]
test case 2: [dont_match_2][dont_match_3]
finally: ![image_file_4][image_desc_4]
"""
I can get the image_file_X values with the following code:
import re
re.findall(r'(?<=!\[)[^]]+(?=\]\[.*?\])', mystr)
I would also like to capture the image_desc_X values, but the following attempt does not work:
re.findall(r'(?!\[.*?\]\[)[^]]+(?=\])', mystr)
Any suggestions? Ideally, a single command would return both image_file and image_desc.
Answer 0 (score: 2)
Use the following approach:
result = re.findall(r'!\[([^]]+)\]\[([^]]+)\]', mystr)
print(result)
Output:
[('image_file_1', 'image_desc_1'), ('image_file_2', 'image_desc_2'), ('image_file_3', 'image_desc_3'), ('image_file_4', 'image_desc_4')]
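As for why the attempt in the question fails: `(?!...)` is a negative lookahead, not a lookbehind, and Python's `re` module only accepts fixed-width lookbehinds, so a variable-length `![...][` prefix cannot be expressed that way. If only the descriptions are needed, one possible sketch is to match the whole construct and capture just the second bracket. The variable name `descs` is illustrative and not part of the original question.

```python
import re

mystr = """
<p>Some text and another text. </p> ![image_file_1][image_desc_1] some other text.
<p>some text</p>
![image_file_2][image_desc_2] and image: ![image_file_3][image_desc_3]
test case 1: ![dont_match_1]
test case 2: [dont_match_2][dont_match_3]
finally: ![image_file_4][image_desc_4]
"""

# Match the full ![file][desc] construct, but capture only the description.
# With exactly one capturing group, re.findall returns that group's text
# for each match instead of the whole match.
descs = re.findall(r'!\[[^]]+\]\[([^]]+)\]', mystr)
print(descs)
# ['image_desc_1', 'image_desc_2', 'image_desc_3', 'image_desc_4']
```

Both test cases are skipped because the pattern requires a leading `!` and two adjacent bracket pairs.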
Answer 1 (score: 1)
I think you can use:
for match in re.finditer(r"!\[(.*?)\]\[(.*?)\]", mystr):
    print(match.group(1))
    print(match.group(2))
Output:
image_file_1
image_desc_1
image_file_2
image_desc_2
image_file_3
image_desc_3
image_file_4
image_desc_4
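On the question's data, the lazy `(.*?)` groups and the negated class `[^]]+` give the same results, but they can differ on other input. The sketch below compares the two on a made-up string (`sample` is hypothetical, not taken from the question) in which a lone `![a]` is followed by an unrelated `[b][c]` on the same line.

```python
import re

# Hypothetical input: a single-bracket image tag followed by a bracket pair.
sample = "![a] and [b][c]"

# Lazy groups may expand across a closing bracket to find a "][" further on.
lazy = re.findall(r'!\[(.*?)\]\[(.*?)\]', sample)

# The negated class cannot cross a "]", so the two brackets must be adjacent.
strict = re.findall(r'!\[([^]]+)\]\[([^]]+)\]', sample)

print(lazy)    # [('a] and [b', 'c')]
print(strict)  # []
```

If such input is possible, the negated-class form is probably the safer choice; note also that `[^]]+` requires at least one character, whereas `.*?` accepts empty brackets.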