Question

I have a long text. Here is part of it:

C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]
). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B].

I need to find all substrings like these:

[03_SNYuLOOO IC "Story Group".]
[04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow, 
ul. Krasnobogatyrskaya, 2, is built.
2, floor 3. com. 11. Office B]

I tried using

re.findall(r'^\[\d{2}_[\s\S]+\]$', text)

but it returns an empty list. What am I doing wrong?

Answer 1

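The ^ and $ anchors require the entire string to match the pattern (by default they anchor to the start and end of the whole string, not of each line). In addition, [\s\S]+ greedily matches one or more of any character, swallowing every [ and ] on its way to the end of the string, so the final \] ends up matching the rightmost ] in the string.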
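To make both failure modes concrete, here is a small sketch. The sample string is a shortened copy of the text in the question, and the variable names are illustrative assumptions, not part of the original post.

```python
import re

# Shortened sample modelled on the text in the question (an assumption for
# illustration; the real text is much longer).
sample = (
    'C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]\n'
    "). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,\n"
    "2, floor 3. com. 11. Office B].\n"
)

# Pattern from the question: the string does not begin with "[", so "^"
# can never succeed and findall returns nothing.
anchored = re.findall(r'^\[\d{2}_[\s\S]+\]$', sample)

# Without the anchors, the greedy [\s\S]+ still runs to the last "]",
# merging both bracketed items into a single match.
greedy = re.findall(r'\[\d{2}_[\s\S]+\]', sample)

print(anchored)
print(len(greedy))
```

So removing the anchors alone is not enough; the quantified part also has to stop at the first closing bracket.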

You can use the following regular expression:

\[\d{2}_[^]]+]


Details of the pattern r'\[\d{2}_[^]]+]':

\[ - a literal [
\d{2} - two digits
_ - an underscore
[^]]+ - one or more characters other than ]
] - a literal ].

See the Python demo:
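The original demo code did not survive in this copy. What follows is a minimal sketch of what such a demo might look like, using a shortened copy of the sample from the question; the variable names are illustrative assumptions.

```python
import re

# Shortened sample modelled on the text in the question (assumed for
# illustration only).
text = (
    'C: state name of the Company in Russian: [03_SNYuLOOO IC "Story Group".]\n'
    "). - [04_MNMestablishment of the Company: 107S64, Russian Federation, Moscow,\n"
    "2, floor 3. com. 11. Office B].\n"
)

# "[", two digits, "_", then one or more characters that are not "]",
# then the closing "]". A negated character class also matches newlines,
# so multi-line items are captured without any flags.
pattern = re.compile(r'\[\d{2}_[^]]+]')
matches = pattern.findall(text)

for m in matches:
    print(m)
    print('---')
```

Because [^]]+ cannot cross a closing bracket, each match ends at the first ] after its opening [NN_ prefix, and no anchors or re.DOTALL flag are needed.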

Regex: find all substrings in a text
