I need to use a regular expression on the string below to capture the IDs whose category is b.
"id":"1","string of variable length","category":"a";"id":"2","string of variable length","category":"b";"id":"3","string of variable length","category":"a";"id":"4","string of variable length","category":"b";"id":"5","string of variable length","category":"a"
In this case I should capture 2 and 4, and nothing else.
I tried the pattern "id":"(\d+?)",".*?","category":"b" but it failed.
Answer 0 (score: 1)
If you know which characters are legal in the variable-length string, you can use something like
"[a-zA-Z0-9|\s]*"(?=,"category":"b";)
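which pulls out the quoted string that precedes ,"category":"b"; to get the ID instead, you can use something like the following (with \d+ rather than \d, so that IDs longer than one digit also match):
(?<="id":")\d+(?=","[a-zA-Z0-9|\s]*","category":"b";)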
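As a rough check, both patterns from this answer can be run against the sample string from the question with Python's re.findall. This is only a sketch: the variable names are made up for illustration, and the ID pattern uses \d+ so that multi-digit IDs would also match. Note that the trailing ; in the lookahead means a category-b record at the very end of the string, with no ; after it, would be missed.

```python
import re

# Rebuild the sample string from the question (records joined by ';').
records = [("1", "a"), ("2", "b"), ("3", "a"), ("4", "b"), ("5", "a")]
s = ";".join(
    f'"id":"{i}","string of variable length","category":"{c}"'
    for i, c in records
)

# Quoted string that sits directly before ,"category":"b";
string_pattern = r'"[a-zA-Z0-9|\s]*"(?=,"category":"b";)'

# Digits preceded by "id":" and followed by a category-b record tail.
id_pattern = r'(?<="id":")\d+(?=","[a-zA-Z0-9|\s]*","category":"b";)'

strings = re.findall(string_pattern, s)
ids = re.findall(id_pattern, s)

print(strings)  # ['"string of variable length"', '"string of variable length"']
print(ids)      # ['2', '4']
```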
Answer 1 (score: 1)
The regex (?<="id":")\d+(?="[^;]*"category":"b") will do the job:
import re
print(re.findall(r'(?<="id":")\d+(?="[^;]*"category":"b")', '"id":"1","string of variable length","category":"a";"id":"2","string of variable length","category":"b";"id":"3","string of variable length","category":"a";"id":"4","string of variable length","category":"b";"id":"5","string of variable length","category":"a"'))
This outputs:
['2', '4']
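For comparison, here is a small sketch of why the pattern from the question misfires and how the same [^;]* idea fixes it with a capture group instead of lookarounds. The lazy .*? is free to run across the ; separator, so a match that starts at an a-record keeps expanding until it reaches the next "category":"b". The names lazy and bounded are hypothetical, and the bounded version assumes the variable-length string never contains a ; itself.

```python
import re

# Rebuild the sample string from the question (records joined by ';').
records = [("1", "a"), ("2", "b"), ("3", "a"), ("4", "b"), ("5", "a")]
s = ";".join(
    f'"id":"{i}","string of variable length","category":"{c}"'
    for i, c in records
)

# Pattern from the question: .*? can cross ';' into the next record.
lazy = r'"id":"(\d+?)",".*?","category":"b"'

# Same shape, but [^;]* cannot leave the current record.
bounded = r'"id":"(\d+)","[^;]*","category":"b"'

print(re.findall(lazy, s))     # ['1', '3']
print(re.findall(bounded, s))  # ['2', '4']
```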