Regex to capture the text between two points in a string

Time: 2018-09-05 09:44:00

Tags: regex python-3.x

I am trying to write a regular expression that captures references to UK legislation in a piece of text, for example:

The law on this point is contained in section 4 of the Sausages and Beans Act 2005 and the Police Brutality and Cuddly Cats Act 2017.

The goal is to capture:

Sausages and Beans Act 2005
Police Brutality and Cuddly Cats Act 2017

To do this, I am using the following regular expression (I have a demo here):

(?<= of the\s)(.*)((\d{4}))(?!=\d{4}[\s])

The expression works fine as long as there is no more than one match on the same line. For example,

Section 24 of the Sausages and Beans Act 2015 is very interesting.

will match

Sausages and Beans Act 2015

but

Section 24 of the Sausages and Beans Act 2015 is very interesting and so is section 1 of the Police Brutality and Cuddly Cats Act 2017. 

will match

Sausages and Beans Act 2015 is very interesting and so is section 1 of the Police Brutality and Cuddly Cats Act 2017

How should the expression be modified so that it returns two separate matches even when both occur on the same line?

1 answer:

Answer 0: (score: 1)

Use .*? in place of .* in the first capture group. This matches the smallest possible string (non-greedy) rather than the largest possible one (greedy), so the match stops at the first four-digit year instead of running on to the last one in the line.
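
A minimal Python sketch of the difference between the greedy .* and the non-greedy .*?, using the pattern and the two-reference sentence from the question. The names TEXT, GREEDY, LAZY and find_acts are hypothetical, chosen only for this illustration; the only change to the pattern is the added ?.

```python
import re

# The sentence from the question that contains two references on one line.
TEXT = (
    "Section 24 of the Sausages and Beans Act 2015 is very interesting "
    "and so is section 1 of the Police Brutality and Cuddly Cats Act 2017."
)

# The question's pattern, unchanged (greedy .*).
GREEDY = r"(?<= of the\s)(.*)((\d{4}))(?!=\d{4}[\s])"
# The same pattern with only .* changed to .*? (non-greedy).
LAZY = r"(?<= of the\s)(.*?)((\d{4}))(?!=\d{4}[\s])"


def find_acts(pattern, text):
    """Return the full text of every match (hypothetical helper name)."""
    # finditer + group(0) is used because the pattern contains capture groups,
    # so re.findall would return tuples of groups rather than whole matches.
    return [m.group(0) for m in re.finditer(pattern, text)]


greedy_matches = find_acts(GREEDY, TEXT)
lazy_matches = find_acts(LAZY, TEXT)

print(greedy_matches)  # one long match running from the first title to "2017"
print(lazy_matches)    # two separate titles
```

Note that the lookbehind still requires the title to be preceded by " of the ", so a second title introduced by "and the" (as in the first example sentence of the question) would not be picked up by this pattern; handling that case would need a different anchor.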