Filtering with RegEx

Time: 2018-07-30 00:40:56

Tags: regex python-3.x

I need to use a regular expression on the string below to capture the IDs whose category is b.

"id":"1","string of variable length","category":"a";"id":"2","string of variable length","category":"b";"id":"3","string of variable length","category":"a";"id":"4","string of variable length","category":"b";"id":"5","string of variable length","category":"a"

In this case I should capture 2 and 4, and nothing else. I tried the pattern "id":"(\d+?)",".*?","category":"b", but it failed.

2 answers:

Answer 0: (score: 1)

If you know which characters are legal in the string, you can use something like the following:

"[a-zA-Z0-9|\s]*"(?=,"category":"b";)

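This pulls out the quoted string that immediately precedes ,"category":"b"; (the lookahead requires the trailing semicolon). To get the ID instead, you can use something like the following: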

(?<="id":")\d+(?=","[a-zA-Z0-9|\s]*","category":"b";)
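
As a quick check, here is a minimal sketch that runs both patterns against the sample string from the question using Python's re module. The variable names are illustrative, and the ID pattern is written with \d+ so that multi-digit IDs would also match. Note that the | inside the character class is a literal pipe character, not alternation.

```python
import re

# Sample string from the question, split across lines for readability.
text = (
    '"id":"1","string of variable length","category":"a";'
    '"id":"2","string of variable length","category":"b";'
    '"id":"3","string of variable length","category":"a";'
    '"id":"4","string of variable length","category":"b";'
    '"id":"5","string of variable length","category":"a"'
)

# Quoted string that sits immediately before ,"category":"b";
string_pattern = r'"[a-zA-Z0-9|\s]*"(?=,"category":"b";)'
strings_before_b = re.findall(string_pattern, text)

# ID of each record whose category is b.
id_pattern = r'(?<="id":")\d+(?=","[a-zA-Z0-9|\s]*","category":"b";)'
ids_with_b = re.findall(id_pattern, text)

print(strings_before_b)
print(ids_with_b)  # ['2', '4']
```

Because both lookaheads end with a semicolon, a final record with category b and no trailing ; would not be matched. Dropping the ; from the lookahead is one way to cover that case.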

Answer 1: (score: 1)

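The regex (?<="id":")\d+(?="[^;]*"category":"b") will do it. Because [^;]* cannot cross a semicolon, each ID is checked only against the category in its own record: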

import re
print(re.findall(r'(?<="id":")\d+(?="[^;]*"category":"b")', '"id":"1","string of variable length","category":"a";"id":"2","string of variable length","category":"b";"id":"3","string of variable length","category":"a";"id":"4","string of variable length","category":"b";"id":"5","string of variable length","category":"a"'))

This outputs:

['2', '4']
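
For comparison, here is a sketch of why the pattern from the question fails, and how a capture-group version can be repaired along the same lines. This is an alternative, not the answer's own pattern, and the variable names are illustrative. In the question's pattern the lazy .*? may still run past a semicolon, so a match that starts at a category-a record extends into the next record until it finds "category":"b". The sketch assumes the variable-length strings never contain a semicolon.

```python
import re

# Sample string from the question, split across lines for readability.
text = (
    '"id":"1","string of variable length","category":"a";'
    '"id":"2","string of variable length","category":"b";'
    '"id":"3","string of variable length","category":"a";'
    '"id":"4","string of variable length","category":"b";'
    '"id":"5","string of variable length","category":"a"'
)

# Pattern from the question: .*? can cross ';' into the following record.
question_pattern = r'"id":"(\d+?)",".*?","category":"b"'
wrong_ids = re.findall(question_pattern, text)

# Same idea, but [^;]*? keeps the whole match inside a single record.
bounded_pattern = r'"id":"(\d+)","[^;]*?","category":"b"'
right_ids = re.findall(bounded_pattern, text)

print(wrong_ids)  # ['1', '3']
print(right_ids)  # ['2', '4']
```

With a capture group, re.findall returns only the captured IDs, so no lookbehind or lookahead is needed.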