Question

I'm trying to capture the 24-character string that follows the _id field in the source below:

[{"actors":"Natalie Portman, Hugo Weaving, Stephen Rea","year":2006,"description":"","title":"V for Vendetta","image":"http:\/\/content8.flixster.com\/movie\/11\/16\/67\/11166734_det.jpg","rating":3.65,"_id":"4eb04794f5f8077d1d000000","links":{"rottentomatoes":"http:\/\/www.rottentomatoes.com\/m\/v_for_vendetta\/","imdb":"http:\/\/www.imdb.com\/title\/tt0434409\/","shortUrl":"http:\/\/www.canistream.it\/search\/movie\/4eb04794f5f8077d1d000000\/v-for-vendetta"}},{"actors":"Guy Madison, Monica Randall, Mariano Vidal Molina","year":1966,"description":"","title":"I Cinque della vendetta (Five for Revenge)(The Five Giants from Texas)(No Drums No Trumpets)","image":"http:\/\/images.rottentomatoescdn.com\/images\/redesign\/poster_default.gif","rating":-0.05,"_id":"4e663229f5f8071702000002","links":{"imdb":"http:\/\/www.imdb.com\/title\/tt0060238\/","rottentomatoes":"http:\/\/www.rottentomatoes.com\/m\/i-cinque-della-vendetta-five-for-revengethe-five-giants-from-texasno-drums-no-trumpets\/","shortUrl":"http:\/\/www.canistream.it\/search\/movie\/4e663229f5f8071702000002\/i-cinque-della-vendetta-five-for-revenge-the-five-giants-from-texas-no-drums-no-trumpets-"}}]

I tried a lookbehind like the one below, but had no luck.

^(?<=_id":")[a-z0-9]{24}

I'm using this as part of a Python script, if that makes a difference.

Answer 1

If the data above is already parsed and stored in a variable, say data, then

data[0]['_id']

gives you what you want.

If it is a string, use Python's json module to load it as JSON first and then access the data the same way, i.e.

import json
data_j = json.loads(data)
data_j[0]['_id']
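
To collect the _id of every entry rather than only the first, the same approach extends to a loop over the parsed list. A minimal sketch, assuming the JSON text is held in a string; raw is a shortened stand-in for the data in the question, and the names raw, movies and ids are illustrative only:

```python
import json

# Shortened stand-in for the JSON text in the question (most fields omitted).
raw = (
    '[{"title":"V for Vendetta","_id":"4eb04794f5f8077d1d000000"},'
    '{"title":"I Cinque della vendetta","_id":"4e663229f5f8071702000002"}]'
)

movies = json.loads(raw)                  # list of dicts
ids = [movie["_id"] for movie in movies]  # one _id per entry
print(ids)
# ['4eb04794f5f8077d1d000000', '4e663229f5f8071702000002']
```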

Answer 2

This is a list of dictionaries; if it is called D

>>> D[0]['_id']
'4eb04794f5f8077d1d000000'

Answer 3

As the other two answers say, if you have the original data structure, use those approaches. But if all else fails, this might work:

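pat = '_id":"'
i = s.find(pat)  # index of the marker, or -1 if it is absent
value = None     # stays None when the marker is not found
if i >= 0:
    i += len(pat)      # skip past the marker itself
    value = s[i:i+24]  # the 24 characters that follow it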
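
For completeness, the lookbehind from the question can also be made to work. It fails as written because ^ anchors the match at the start of the string, where nothing can precede it, so the lookbehind never succeeds. A sketch with the anchor dropped, using a shortened stand-in for the data (the names s, id_pattern and ids are illustrative, not from the question):

```python
import re

# Shortened stand-in for the source text in the question.
s = (
    '[{"rating":3.65,"_id":"4eb04794f5f8077d1d000000","links":{}},'
    '{"rating":-0.05,"_id":"4e663229f5f8071702000002","links":{}}]'
)

# Same lookbehind as in the question, minus the leading ^.
id_pattern = re.compile(r'(?<=_id":")[a-z0-9]{24}')

ids = id_pattern.findall(s)  # every 24-character run preceded by _id":"
print(ids)
# ['4eb04794f5f8077d1d000000', '4e663229f5f8071702000002']
```

Parsing with json is still the more robust choice; the regex is mainly useful when the text cannot be parsed as JSON.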

How do I find a string after a specific piece of text?
