I am trying to capture the 24-character string that follows the _id field in the source below:
[{"actors":"Natalie Portman, Hugo Weaving, Stephen Rea","year":2006,"description":"","title":"V for Vendetta","image":"http:\/\/content8.flixster.com\/movie\/11\/16\/67\/11166734_det.jpg","rating":3.65,"_id":"4eb04794f5f8077d1d000000","links":{"rottentomatoes":"http:\/\/www.rottentomatoes.com\/m\/v_for_vendetta\/","imdb":"http:\/\/www.imdb.com\/title\/tt0434409\/","shortUrl":"http:\/\/www.canistream.it\/search\/movie\/4eb04794f5f8077d1d000000\/v-for-vendetta"}},{"actors":"Guy Madison, Monica Randall, Mariano Vidal Molina","year":1966,"description":"","title":"I Cinque della vendetta (Five for Revenge)(The Five Giants from Texas)(No Drums No Trumpets)","image":"http:\/\/images.rottentomatoescdn.com\/images\/redesign\/poster_default.gif","rating":-0.05,"_id":"4e663229f5f8071702000002","links":{"imdb":"http:\/\/www.imdb.com\/title\/tt0060238\/","rottentomatoes":"http:\/\/www.rottentomatoes.com\/m\/i-cinque-della-vendetta-five-for-revengethe-five-giants-from-texasno-drums-no-trumpets\/","shortUrl":"http:\/\/www.canistream.it\/search\/movie\/4e663229f5f8071702000002\/i-cinque-della-vendetta-five-for-revenge-the-five-giants-from-texas-no-drums-no-trumpets-"}}]
I tried using a lookbehind like the one below, but had no luck.
^(?<=_id":")[a-z0-9]{24}
I am using this as part of a Python script, in case that makes a difference.
Answer 0 (score: 1)
If the data above is an already-parsed JSON object stored in a variable, say data, then
data[0]['_id']
gives you what you want.
If it is a string, use Python's json module to parse it, then access the data as above, i.e.
import json
data_j = json.loads(data)
data_j[0]['_id']
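To get the _id of every record rather than just the first, the same approach extends to a loop. This is only a minimal sketch: the variable name raw and the shortened sample string are assumptions made for illustration, not the full data from the question.

```python
import json

# Shortened, hypothetical sample shaped like the data in the question.
raw = (
    '[{"title":"V for Vendetta","_id":"4eb04794f5f8077d1d000000"},'
    '{"title":"I Cinque della vendetta","_id":"4e663229f5f8071702000002"}]'
)

records = json.loads(raw)                    # parses into a list of dicts
ids = [record["_id"] for record in records]  # one _id per record
print(ids)
```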
Answer 1 (score: 1)
This is a list containing a dictionary. If it is called D:
>>> D[0]['_id']
'4eb04794f5f8077d1d000000'
Answer 2 (score: 1)
As the other two answers say, if you have the original data structure, use those approaches. But if all else fails, this might work:
pat = '_id":"'
i = s.find(pat)
if i >= 0:
    i += len(pat)
    value = s[i:i+24]
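If a regular expression is still preferred, the pattern from the question can likely be repaired by dropping the leading ^. That anchor only matches at the start of the string (or of a line), where nothing can precede it, so the lookbehind can never succeed. The following is a sketch, assuming s holds the raw text; the sample string here is a shortened, hypothetical stand-in for the real data.

```python
import re

# Shortened, hypothetical sample shaped like the data in the question.
s = (
    '[{"rating":3.65,"_id":"4eb04794f5f8077d1d000000","links":{}},'
    '{"rating":-0.05,"_id":"4e663229f5f8071702000002","links":{}}]'
)

# No leading ^: the _id field does not sit at the start of the string.
# The lookbehind has a fixed width, as Python's re module requires.
id_pattern = re.compile(r'(?<="_id":")[a-z0-9]{24}')

ids = id_pattern.findall(s)  # every 24-character value that follows "_id":"
print(ids)
```

Unlike the find version above, findall returns every match rather than only the first one.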