Question

I need to use a regular expression on the string below to capture the IDs whose category is b.

"id":"1","string of variable length","category":"a";"id":"2","string of variable length","category":"b";"id":"3","string of variable length","category":"a";"id":"4","string of variable length","category":"b";"id":"5","string of variable length","category":"a"

In this case I should capture 2 and 4, and nothing else. I tried the pattern "id":"(\d+?)",".*?","category":"b", but it failed.

Answer 1

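If you know which characters are legal in the string, you can use something like:

"[a-zA-Z0-9|\s]*"(?=,"category":"b";)

which pulls out the quoted string immediately before ,"category":"b";. To get the ID, you can use something like:

(?<="id":")\d+(?=","[a-zA-Z0-9|\s]*","category":"b";)

Both patterns require a trailing ; after the category, so a category-b record at the very end of the string would not be matched.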
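As a rough check, the two patterns from this answer can be run against the sample string from the question. This is only a sketch: the variable names (data, text_pattern, id_pattern) are made up for illustration, and the ID pattern uses \d+ so that multi-digit IDs would also match.

```python
import re

# Sample string from the question, split across lines for readability.
data = (
    '"id":"1","string of variable length","category":"a";'
    '"id":"2","string of variable length","category":"b";'
    '"id":"3","string of variable length","category":"a";'
    '"id":"4","string of variable length","category":"b";'
    '"id":"5","string of variable length","category":"a"'
)

# Quoted text field that is directly followed by ,"category":"b";
text_pattern = r'"[a-zA-Z0-9|\s]*"(?=,"category":"b";)'

# Digits preceded by "id":" and followed by a text field and category b.
id_pattern = r'(?<="id":")\d+(?=","[a-zA-Z0-9|\s]*","category":"b";)'

texts = re.findall(text_pattern, data)
ids = re.findall(id_pattern, data)

print(texts)  # ['"string of variable length"', '"string of variable length"']
print(ids)    # ['2', '4']
```

Inside a character class the | is a literal pipe, not alternation, so the class above allows letters, digits, pipes and whitespace.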

Answer 2

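The regular expression (?<="id":")\d+(?="[^;]*"category":"b") does the job. Because records are separated by ;, the [^;]* in the lookahead cannot run past the end of the current record: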

import re
print(re.findall(r'(?<="id":")\d+(?="[^;]*"category":"b")', '"id":"1","string of variable length","category":"a";"id":"2","string of variable length","category":"b";"id":"3","string of variable length","category":"a";"id":"4","string of variable length","category":"b";"id":"5","string of variable length","category":"a"'))

This outputs:

['2', '4']
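To see why the pattern in the question fails, it may help to run it next to a bounded variant. The sketch below assumes the free-text field never contains a ; character. The names attempt, bounded, attempt_ids and bounded_ids are illustrative only, and the bounded pattern is a capture-group alternative rather than the pattern given above.

```python
import re

# Sample string from the question, split across lines for readability.
data = (
    '"id":"1","string of variable length","category":"a";'
    '"id":"2","string of variable length","category":"b";'
    '"id":"3","string of variable length","category":"a";'
    '"id":"4","string of variable length","category":"b";'
    '"id":"5","string of variable length","category":"a"'
)

# Pattern from the question: the lazy .*? may still cross the ; separator,
# so a match that starts at id 1 stretches to the category b of record 2.
attempt = r'"id":"(\d+?)",".*?","category":"b"'
attempt_ids = re.findall(attempt, data)
print(attempt_ids)  # ['1', '3']

# Bounded variant: [^;]* cannot leave the current record.
bounded = r'"id":"(\d+)","[^;]*","category":"b"'
bounded_ids = re.findall(bounded, data)
print(bounded_ids)  # ['2', '4']
```

Lazy quantifiers only make a match as short as possible from a given starting point. They do not stop the engine from starting at an earlier "id" and scanning forward until a category b appears.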
