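I'm having trouble finding the right regular expression for this task, so please excuse my beginner skills. What I want is to get the id value only from lines that contain "available":true, not "available":false. I can already get the ids of all lines with re.findall('"id":(\d{13})', line, re.DOTALL) (the 13 matches exactly 13 digits, because the code contains other ids with fewer than 13 digits that I don't need).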
{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
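So the final result must be ['1651572973431','1351572943231']
Thanks a lot for your help.
Answer 0 (score: 3)
This may not be a good answer; it depends on what you have. It looks like you have a list of strings and want the id from some of them. If that is the case, parsing the JSON will be cleaner and easier to read than writing a byzantine regular expression. For example: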
import json
# lines is a list of strings:
lines = ['{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
'{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
'{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
'{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}',
]
# parse it and you can use regular python to get what you want:
[line['id'] for line in map(json.loads, lines) if line['available']]
Result
[1351572943231, 1651572973431]
If what you posted is one long string, you can wrap it in [] and parse it as an array, with the same result:
import json
line = r'{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}, {"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}, {"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""}'
lines = json.loads('[' + line + ']')
[line['id'] for line in lines if line['available']]
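The sample in the question has a comma after every object, including the last one, and a JSON array does not allow a trailing comma. A minimal sketch of the same idea for that case, assuming the text arrives exactly as posted; the names `raw_text` and `available_ids` are hypothetical and only serve the illustration:

```python
import json

# Assumed input: the text as posted in the question, one object per line,
# each followed by a comma (including the last one).
raw_text = '''{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
'''

def available_ids(text):
    """Return the ids (as strings) of all objects whose "available" is true."""
    # Remove surrounding whitespace, then the final trailing comma,
    # so that wrapping in [] yields a valid JSON array.
    body = text.strip().rstrip(',')
    records = json.loads('[' + body + ']')
    # str() because the question asks for the ids as strings.
    return [str(record['id']) for record in records if record['available']]

print(available_ids(raw_text))
```

The ids come back in document order. If the real input is not a clean run of JSON objects, this will raise json.JSONDecodeError, which is usually a more useful signal than a regex silently matching nothing.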
Answer 1 (score: 2)
This does what you need:
(?<="id":)\d{13}(?=(?:,"[^"]*":[^,]*?)*?,"available":true)
https://regex101.com/r/FseimH/1
Expanded
(?<= "id": )
\d{13}
(?=
(?: ," [^"]* ": [^,]*? )*?
,"available":true
)
Explanation
(?<= "id": ) # Lookbehind assertion for id
\d{13} # Consume 13 digit id
(?= # Lookahead assertion
(?: # Optional sequence
, # comma
" [^"]* " # quoted string
: # colon
[^,]*? # optional non-comma's
)*? # End sequence, do 0 to many times -
,"available":true # until we find available = true
)
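For reference, the pattern above can be used from Python roughly as follows. This is a sketch, not part of the original answer: the names `ID_IF_AVAILABLE` and `text` are made up for the example, and it assumes the keys appear without spaces around the colons and commas, as in the sample data. The fixed-width lookbehind is accepted by Python's re module, and re.VERBOSE allows the commented layout.

```python
import re

# Same pattern as above, written with re.VERBOSE so whitespace and
# comments inside the pattern are ignored.
ID_IF_AVAILABLE = re.compile(r'''
    (?<="id":)                  # must be preceded by the "id" key
    \d{13}                      # the 13-digit id itself
    (?=                         # lookahead: nothing here is consumed
        (?: ,"[^"]*": [^,]*? )*?    # zero or more ,"key":value pairs
        ,"available":true           # until "available" is true
    )
''', re.VERBOSE)

# Sample input assumed to match the lines shown in the question.
text = '''{"id":1351572979731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572329731,"parent_pid":21741,"available":false,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1351572943231,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
{"id":1651572973431,"parent_pid":21741,"available":true,"lou":"678","feature":true,"pub":true,"require":null,"option4":""},
'''

ids = ID_IF_AVAILABLE.findall(text)
print(ids)
```

Because the pattern has no capturing group, findall returns the matched ids themselves as strings. The lookahead cannot run from one object into the next here, since every repeated pair must start with `,"` and the objects are separated by `},` followed by a newline and `{`.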
Answer 2 (score: 1)