Question

I have the following string s:

s = "this is a test <#1> that can be a very good test (#2) to look at [#3] test [#4], but also computer <#4> and test"

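As you can see, this is an ordinary sentence that contains brackets of the form <...>, (...), or [...], each enclosing a substring.

I want to extract the substring inside a pair of brackets only when the brackets come directly after the word test or computer. In other words, I want the following output: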

[["test", "#1"], ["test", "#2"], ["test", "#4"], ["computer", "#4"]]

Here is what I have so far: I can use a regular expression to find the angle brackets, for example

import re
re.findall(re.compile("<.*?>"), s)

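But now I need to generalize this to [..], (..), and <..>, and match only when the brackets come after the word test or computer. Is this possible with a regular expression?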

Answer 1

Try the following pattern:

(test|computer)\s[\[\(<](.*?)[\]\)>]

So the code would be:

import re
# s is the string defined in the question
pattern = r'(test|computer)\s[\[\(<](.*?)[\]\)>]'
print(re.findall(pattern, s))

Output:

[('test', '#1'), ('test', '#2'), ('test', '#4'), ('computer', '#4')]

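Note that this assumes exactly one whitespace character between the keyword (test or computer) and the opening bracket. To allow several, change \s to \s+ in the pattern.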
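
As a minimal sketch of that variation, the code below uses \s+ and also converts the tuples returned by re.findall into lists, since the question asked for nested lists. The helper name extract_after_keywords and the sample string spaced are made up for illustration and are not part of the original question or answer.

```python
import re

# \s+ allows one or more whitespace characters between the keyword and the bracket.
PATTERN = re.compile(r'(test|computer)\s+[\[\(<](.*?)[\]\)>]')

def extract_after_keywords(text):
    """Return [keyword, bracket_content] pairs (hypothetical helper name)."""
    # findall yields one tuple per match, with one item per capture group;
    # convert each tuple to a list to match the format requested in the question.
    return [list(match) for match in PATTERN.findall(text)]

s = "this is a test <#1> that can be a very good test (#2) to look at [#3] test [#4], but also computer <#4> and test"
spaced = "a test   <#1> and a computer\t(#2)"  # illustrative input, not from the question

print(extract_after_keywords(s))
print(extract_after_keywords(spaced))
```

Neither version anchors the keyword at a word boundary, so a word such as "contest" followed by a bracket would also match on its trailing "test"; prefixing the group with \b is one way to guard against that if it matters for your data.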