I have this JSON file (this is only a small part of it):
[
{
"History bleed": {
"sentences": [
{
"words": [
[
"History",
{
"PartOfSpeech": "NN",
"CharacterOffsetEnd": "7",
"Lemma": "history",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "0"
}
],
[
"bleed",
{
"PartOfSpeech": "VB",
"CharacterOffsetEnd": "39",
"Lemma": "bleed",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "34"
}
]
],
"indexeddependencies": [],
"parsetree": [],
"text": "History of lower gastrointestinal bleed",
"dependencies": []
}
]
}
},
{
"Antigen of Bordetella": {
"sentences": [
{
"words": [
[
"Antigen",
{
"PartOfSpeech": "NN",
"CharacterOffsetEnd": "7",
"Lemma": "antigen",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "0"
}
],
[
"of",
{
"PartOfSpeech": "IN",
"CharacterOffsetEnd": "10",
"Lemma": "of",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "8"
}
],
[
"Bordetella",
{
"PartOfSpeech": "NN",
"CharacterOffsetEnd": "21",
"Lemma": "bordetellum",
"NamedEntityTag": "PERSON",
"CharacterOffsetBegin": "11"
}
]
],
"indexeddependencies": [],
"parsetree": [],
"text": "Antigen of Bordetella",
"dependencies": []
}
]
}
},
{
"Anti-Histoplasma": {
"sentences": [
{
"words": [
[
"Anti-Histoplasma",
{
"PartOfSpeech": "JJ",
"CharacterOffsetEnd": "16",
"Lemma": "anti-histoplasma",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "0"
}
]
],
"indexeddependencies": [],
"parsetree": [],
"text": "Anti-Histoplasma capsulatum IgG",
"dependencies": []
}
]
}
}
]
I would like to get this:
{
"sentences": [
{
"words": [
[
"Antigen",
{
"PartOfSpeech": "NN",
"CharacterOffsetEnd": "7",
"Lemma": "antigen",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "0"
}
],
[
"of",
{
"PartOfSpeech": "IN",
"CharacterOffsetEnd": "10",
"Lemma": "of",
"NamedEntityTag": "O",
"CharacterOffsetBegin": "8"
}
],
[
"Bordetella",
{
"PartOfSpeech": "NN",
"CharacterOffsetEnd": "21",
"Lemma": "bordetellum",
"NamedEntityTag": "PERSON",
"CharacterOffsetBegin": "11"
}
]
],
"indexeddependencies": [],
"parsetree": [],
"text": "Antigen of Bordetella",
"dependencies": []
}
]
}
To get it, I wrote this:
import json

with open(pathOfTheJsonFIle) as f:
    data = json.load(f)
print(data['Antigen of Bordetella'])
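But I get this error: list indices must be integers, not str
The file is very large (it has more than 10,000 items), so I would like to find the item "Antigen of Bordetella" by its name rather than by position (that is, without writing data[2]).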
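Answer 0: (score: 0)
That is because the top level of the JSON file is a list, not a dictionary, so it can only be indexed by integers. Each element of that list is a dictionary with a single key.
Try: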
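for i in data:
    # Each list element is a dictionary; check whether it holds the wanted key
    if 'Antigen of Bordetella' in i:
        print(i['Antigen of Bordetella'])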
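If you need many lookups on a file with more than 10,000 items, scanning the list each time may get slow. A minimal sketch of one alternative, assuming every list element is a dictionary with a single top-level key (as in the sample) and that the keys are unique: merge the list into one dictionary once, then look items up by name. The names `build_index` and `sample` are hypothetical, and `sample` is only a trimmed stand-in for the real file.

```python
import json

# Trimmed stand-in for the real file: a list of single-key dictionaries.
sample = json.loads("""
[
  {"History bleed": {"sentences": [{"text": "History of lower gastrointestinal bleed"}]}},
  {"Antigen of Bordetella": {"sentences": [{"text": "Antigen of Bordetella"}]}}
]
""")


def build_index(items):
    """Merge a list of dictionaries into one dictionary keyed by item name."""
    index = {}
    for item in items:
        # If the same key occurs twice, the later item overwrites the earlier one.
        index.update(item)
    return index


index = build_index(sample)
print(index["Antigen of Bordetella"]["sentences"][0]["text"])
```

Building the index costs one pass over the list; after that each lookup is a plain dictionary access.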
Answer 1: (score: 0)
Using itertools you can do it like this:
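from itertools import ifilter  # ifilter exists only in Python 2
...
searchkey = "Antigen of Bordetella"
# Take the first element that contains the key, then read the value stored under it
search_data = next(ifilter(lambda x: searchkey in x, data))[searchkey]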
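On Python 3, where `itertools.ifilter` is not available, the same idea can be sketched with the built-in `next` and a generator expression. The `find_item` helper and the sample `data` below are hypothetical, and returning `None` when the key is missing is an assumption about the behaviour you want in that case.

```python
def find_item(items, key, default=None):
    """Return the value stored under `key` in the first dictionary that has it."""
    # next() stops at the first match, so the rest of the list is not scanned.
    return next((item[key] for item in items if key in item), default)


# Trimmed stand-in for the parsed JSON list.
data = [
    {"History bleed": {"sentences": []}},
    {"Antigen of Bordetella": {"sentences": [{"text": "Antigen of Bordetella"}]}},
]

result = find_item(data, "Antigen of Bordetella")
missing = find_item(data, "No such item")
print(result)
print(missing)
```

Passing a default to `next` avoids the `StopIteration` that would otherwise be raised when no element contains the key.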