Extracting specific words from a string in Python

Time: 2014-07-10 17:56:55

Tags: python string python-2.7

I want to extract every word that appears before "indices" (i.e. ForeverTrophyless, NoPainNoGame, Prize) and put all of them in a list. How can I do this?

foo = '[{"text":"ForeverTrophyless","indices":[0,18]},{"text":"ForeverTrophyless","indices":[19,37]},{"text":"Prize","indices":[38,56]},{"text":"ForeverTrophyless","indices":[57,75]},{"text":"NoPainNoGame","indices":[76,94]},{"text":"ForeverTrophyless","indices":[95,113]},{"text":"ForeverTrophyless","indices":[114,132]}]'

Python 2.7

PyCharm, Ubuntu 14.04

2 answers:

Answer 0: (score: 3)

You can use ast.literal_eval to convert that string into a list of dictionaries.

foo = '[{"text":"ForeverTrophyless","indices":[0,18]},{"text":"ForeverTrophyless","indices":[19,37]},{"text":"Prize","indices":[38,56]},{"text":"ForeverTrophyless","indices":[57,75]},{"text":"NoPainNoGame","indices":[76,94]},{"text":"ForeverTrophyless","indices":[95,113]},{"text":"ForeverTrophyless","indices":[114,132]}]'

import ast
l = ast.literal_eval(foo)

l is now:

[{'indices': [0, 18], 'text': 'ForeverTrophyless'},
 {'indices': [19, 37], 'text': 'ForeverTrophyless'},
 {'indices': [38, 56], 'text': 'Prize'},
 {'indices': [57, 75], 'text': 'ForeverTrophyless'},
 {'indices': [76, 94], 'text': 'NoPainNoGame'},
 {'indices': [95, 113], 'text': 'ForeverTrophyless'},
 {'indices': [114, 132], 'text': 'ForeverTrophyless'}]

Then use a list comprehension:

[i['text'] for i in l]

Result:

['ForeverTrophyless', 'ForeverTrophyless', 'Prize', 'ForeverTrophyless', 'NoPainNoGame', 'ForeverTrophyless', 'ForeverTrophyless']
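
For reference, here is a minimal self-contained sketch of the ast.literal_eval approach described above. It uses a shortened version of foo, and the "verified" field in the second string is a hypothetical addition, used only to illustrate a caveat: literal_eval parses Python literals, so it works here because this particular string happens to be valid Python as well, but it rejects JSON-only literals such as true, false and null.

```python
import ast

# Shortened sample of the string from the question.
foo = '[{"text":"Prize","indices":[38,56]},{"text":"NoPainNoGame","indices":[76,94]}]'

# literal_eval parses the string as a Python literal (a list of dicts here).
entries = ast.literal_eval(foo)
words = [entry['text'] for entry in entries]

# JSON-only literals such as true/false/null are not Python literals,
# so literal_eval raises ValueError for them.
# The "verified" key is hypothetical and not part of the original data.
try:
    ast.literal_eval('[{"text":"Prize","verified":true}]')
    rejected = False
except ValueError:
    rejected = True
```

If the data may contain such values, a JSON parser is likely the safer choice.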

Answer 1: (score: 2)

foo appears to be valid serialized JSON (an array of objects). You can parse it with json.loads and then retrieve all the text fields in a list comprehension:

In [8]: from json import loads

In [9]: [x['text'] for x in loads(foo)]
Out[9]: 
['ForeverTrophyless',
 'ForeverTrophyless',
 'Prize',
 'ForeverTrophyless',
 'NoPainNoGame',
 'ForeverTrophyless',
 'ForeverTrophyless']
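
The same idea can be written as a standalone script rather than an IPython session. This is only a sketch under a few assumptions: foo is shortened, and the helper names extract_texts and unique_in_order are hypothetical. The second helper is included because the question names each word only once, so it may be that duplicates are not wanted; that reading is a guess.

```python
import json

# Shortened sample of the string from the question.
foo = ('[{"text":"ForeverTrophyless","indices":[0,18]},'
       '{"text":"ForeverTrophyless","indices":[19,37]},'
       '{"text":"Prize","indices":[38,56]},'
       '{"text":"NoPainNoGame","indices":[76,94]}]')


def extract_texts(serialized):
    """Parse a JSON array of objects and return every "text" value, in order."""
    return [item['text'] for item in json.loads(serialized)]


def unique_in_order(words):
    """Drop repeated words while keeping the order of first appearance."""
    seen = set()
    result = []
    for word in words:
        if word not in seen:
            seen.add(word)
            result.append(word)
    return result


all_words = extract_texts(foo)
distinct_words = unique_in_order(all_words)
```

On Python 2.7, json.loads returns unicode strings, so the printed values carry a u prefix; the contents are the same.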