I want to get all the values of "Lemma" in this JSON:
{'sentences':
[{'indexeddependencies': [], 'words':
[
['Cinnamomum', {'CharacterOffsetBegin': '0', 'CharacterOffsetEnd': '10', 'Lemma': 'Cinnamomum', 'PartOfSpeech': 'NNP', 'NamedEntityTag': 'O'}],
['.', {'CharacterOffsetBegin': '14', 'CharacterOffsetEnd': '15', 'Lemma': '.', 'PartOfSpeech': '.', 'NamedEntityTag': 'O'}]
], 'parsetree': [], 'text': 'Cinnamomum.', 'dependencies': []
},
{'indexeddependencies': [], 'words':
[
['specific', {'CharacterOffsetBegin': '16', 'CharacterOffsetEnd': '24', 'Lemma': 'specific', 'PartOfSpeech': 'JJ', 'NamedEntityTag': 'O'}],
['immunoglobulin', {'CharacterOffsetBegin': '25', 'CharacterOffsetEnd': '39', 'Lemma': 'immunoglobulin', 'PartOfSpeech': 'NN', 'NamedEntityTag': 'O'}],
['measurement', {'CharacterOffsetBegin': '51', 'CharacterOffsetEnd': '62', 'Lemma': 'measurement', 'PartOfSpeech': 'NN', 'NamedEntityTag': 'O'}]
], 'parsetree': [], 'text': 'specific immunoglobulin measurement', 'dependencies': []
}]
}
How can I get each value using Python? There are five Lemma keys, but I can't get all of them.
I tried this, but it doesn't work:
for i in range(len(words)):  # in this case the range of i would be 5
    lemma = result["sentences"][0]["words"][i][1]["Lemma"]
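Answer 0 (score: 1)
I'm not sure why you have this data structure. Assuming you can't change or reshape it to better suit your queries and use case, and that the Lemma key is always present: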
>>> [word[1]['Lemma']
...  for sentence in data['sentences']
...  for word in sentence['words']]
['Cinnamomum', '.', 'specific', 'immunoglobulin', 'measurement']
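If the Lemma key might be missing from some entries, a defensive variant can skip those words instead of raising KeyError. This is only a minimal sketch: the helper name `get_lemmas` and the abbreviated `sample` data are hypothetical, and it assumes each word is a `[token, attributes]` pair as in the question.

```python
def get_lemmas(data):
    """Collect every 'Lemma' value, skipping words that lack the key."""
    return [
        attrs["Lemma"]
        for sentence in data.get("sentences", [])
        for _token, attrs in sentence.get("words", [])
        if "Lemma" in attrs
    ]


# Abbreviated, made-up sample in the same shape as the question's data.
# The last word deliberately has no 'Lemma' key.
sample = {
    "sentences": [
        {"words": [["Cinnamomum", {"Lemma": "Cinnamomum"}], [".", {"Lemma": "."}]]},
        {"words": [["specific", {"Lemma": "specific"}], ["odd", {"PartOfSpeech": "JJ"}]]},
    ]
}

print(get_lemmas(sample))
```

Unlike the question's attempt, this walks every sentence rather than only `["sentences"][0]`.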
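Answer 1 (score: 0)
This simple code traverses everything and finds all the Lemma values (by the way, I think your JSON should use " rather than ' as the string quote character):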
import json

with open('lemma.json') as f:
    data = json.load(f)

def traverse(node):
    # Recursively walk nested lists and dicts, printing every 'Lemma' value.
    if isinstance(node, list):
        for item in node:
            traverse(item)
    elif isinstance(node, dict):
        for key, value in node.items():
            if key == 'Lemma':
                print(key, value)
                continue
            traverse(value)

traverse(data)
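If the values are needed as a list rather than printed, the same recursive idea can be written as a generator. This is a sketch, not part of the original answer; `iter_values` and the tiny `nested` sample are hypothetical names used for illustration.

```python
def iter_values(node, target="Lemma"):
    """Yield every value stored under `target` anywhere in nested lists/dicts."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == target:
                yield value
            else:
                yield from iter_values(value, target)
    elif isinstance(node, list):
        for item in node:
            yield from iter_values(item, target)
    # Strings, numbers and None are leaves: nothing to yield.


# Made-up sample in the same shape as the question's data.
nested = {"sentences": [{"words": [["a", {"Lemma": "a"}], ["bs", {"Lemma": "b"}]]}]}
print(list(iter_values(nested)))
```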
Answer 2 (score: 0)
You can use the JSON encoder and decoder library.
If you use that library, write:
import json
json.loads(result)
Anyway, I tried putting your JSON into a validator and got an error.
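The validator error is most likely caused by the single quotes, since JSON only allows double-quoted strings. The sketch below reproduces that with the standard `json` module; the two short strings are made-up stand-ins for the full data.

```python
import json

single_quoted = "{'Lemma': 'Cinnamomum'}"   # Python-style quoting, not valid JSON
double_quoted = '{"Lemma": "Cinnamomum"}'   # valid JSON

try:
    json.loads(single_quoted)
    parsed_single = True
except json.JSONDecodeError as err:
    parsed_single = False
    print("single-quoted text rejected:", err)

# The double-quoted version parses without trouble.
print(json.loads(double_quoted)["Lemma"])
```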
Answer 3 (score: 0)
Change the single quotes to double quotes:
sed -i "s/'/\"/g" sample.json
Then load the file as a JSON object and parse it with the json module:
import json

with open('sample.json', encoding='utf-8') as data_file:
    data = json.load(data_file)

# Print the Lemma of every word in every sentence.
for sentence in data['sentences']:
    for word in sentence['words']:
        print(word[1]['Lemma'])
Result:
Cinnamomum
.
specific
immunoglobulin
measurement
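One caveat with replacing every single quote: an apostrophe inside a string value would be rewritten too. If the text is really a Python literal (as the single-quoted sample in the question appears to be), an alternative is to parse it with `ast.literal_eval` and re-serialize it with `json.dumps`. This is a sketch under that assumption; `python_literal_to_json` and the `raw` string are hypothetical.

```python
import ast
import json


def python_literal_to_json(text):
    """Parse a Python-literal dict/list and return it as a valid JSON string."""
    obj = ast.literal_eval(text)  # evaluates literals only, never arbitrary code
    return json.dumps(obj)


# Made-up input containing an apostrophe inside a string value.
raw = "{'text': \"it's fine\", 'Lemma': 'be'}"
converted = python_literal_to_json(raw)

print(converted)
print(json.loads(converted)["text"])
```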