Getting an item by key from JSON in Python

Time: 2016-04-18 10:31:09

Tags: python json

I have this JSON file (this is only a small part of it):

[
   {
      "History bleed": {
         "sentences": [
            {
               "words": [
                  [
                     "History",
                     {
                        "PartOfSpeech": "NN",
                        "CharacterOffsetEnd": "7",
                        "Lemma": "history",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "0"
                     }
                  ],
                  [
                     "bleed",
                     {
                        "PartOfSpeech": "VB",
                        "CharacterOffsetEnd": "39",
                        "Lemma": "bleed",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "34"
                     }
                  ]
               ],
               "indexeddependencies": [],
               "parsetree": [],
               "text": "History of lower gastrointestinal bleed",
               "dependencies": []
            }
         ]
      }
   },
   {
      "Antigen of Bordetella": {
         "sentences": [
            {
               "words": [
                  [
                     "Antigen",
                     {
                        "PartOfSpeech": "NN",
                        "CharacterOffsetEnd": "7",
                        "Lemma": "antigen",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "0"
                     }
                  ],
                  [
                     "of",
                     {
                        "PartOfSpeech": "IN",
                        "CharacterOffsetEnd": "10",
                        "Lemma": "of",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "8"
                     }
                  ],
                  [
                     "Bordetella",
                     {
                        "PartOfSpeech": "NN",
                        "CharacterOffsetEnd": "21",
                        "Lemma": "bordetellum",
                        "NamedEntityTag": "PERSON",
                        "CharacterOffsetBegin": "11"
                     }
                  ]
               ],
               "indexeddependencies": [],
               "parsetree": [],
               "text": "Antigen of Bordetella",
               "dependencies": []
            }
         ]
      }
   },
   {
      "Anti-Histoplasma": {
         "sentences": [
            {
               "words": [
                  [
                     "Anti-Histoplasma",
                     {
                        "PartOfSpeech": "JJ",
                        "CharacterOffsetEnd": "16",
                        "Lemma": "anti-histoplasma",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "0"
                     }
                  ]
               ],
               "indexeddependencies": [],
               "parsetree": [],
               "text": "Anti-Histoplasma capsulatum IgG",
               "dependencies": []
            }
         ]
      }
   }
]

I would like to get this:

{
         "sentences": [
            {
               "words": [
                  [
                     "Antigen",
                     {
                        "PartOfSpeech": "NN",
                        "CharacterOffsetEnd": "7",
                        "Lemma": "antigen",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "0"
                     }
                  ],
                  [
                     "of",
                     {
                        "PartOfSpeech": "IN",
                        "CharacterOffsetEnd": "10",
                        "Lemma": "of",
                        "NamedEntityTag": "O",
                        "CharacterOffsetBegin": "8"
                     }
                  ],
                  [
                     "Bordetella",
                     {
                        "PartOfSpeech": "NN",
                        "CharacterOffsetEnd": "21",
                        "Lemma": "bordetellum",
                        "NamedEntityTag": "PERSON",
                        "CharacterOffsetBegin": "11"
                     }
                  ]
               ],
               "indexeddependencies": [],
               "parsetree": [],
               "text": "Antigen of Bordetella",
               "dependencies": []
            }
         ]
      }

To get it, I wrote this:

    import json

    with open(pathOfTheJsonFIle) as f:
        data = json.load(f)
    print(data['Antigen of Bordetella'])

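But I get this error: list indices must be integers, not str

The file is very large (more than 10,000 items), so I would like to find the item "Antigen of Bordetella" by its name rather than by its position (for example, without writing data[2]).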

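2 answers:

Answer 0: (score: 0)

That is because the top level of the JSON file is a list, not a dictionary. Each element of that list is a dictionary with a single key, so you have to search the list for the element that contains the key you want.

Try: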

for i in data:
    # each element is a one-key dict; pick the one holding the wanted key
    if 'Antigen of Bordetella' in i:
        print(i['Antigen of Bordetella'])
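
Since the question mentions more than 10,000 items, repeated lookups may be cheaper if the list is merged into a single dictionary once. The following is only a sketch: it assumes every element is a one-key dictionary and that keys are unique (a later duplicate would overwrite an earlier one). The name `build_index` and the shortened inline sample are made up for illustration.

```python
import json


def build_index(items):
    """Merge a list of one-key dicts into a single dict keyed by those keys."""
    index = {}
    for item in items:
        index.update(item)  # a later duplicate key overwrites an earlier one
    return index


# Shortened stand-in for the real file; the real data would come from json.load(f).
sample = json.loads('''
[
  {"History bleed": {"sentences": [{"text": "History of lower gastrointestinal bleed"}]}},
  {"Antigen of Bordetella": {"sentences": [{"text": "Antigen of Bordetella"}]}}
]
''')

index = build_index(sample)
print(index["Antigen of Bordetella"]["sentences"][0]["text"])
```

Building the index walks the list once; after that each lookup is an ordinary dictionary access, which is usually what is wanted when many different keys have to be found.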

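Answer 1: (score: 0)

With itertools.ifilter you can do it like this (ifilter exists only in Python 2; on Python 3 the built-in filter is already lazy and can stand in for it):

try:
    from itertools import ifilter  # Python 2
except ImportError:
    ifilter = filter  # Python 3: the built-in filter is already lazy
...
searchkey = "Antigen of Bordetella"
# next() takes the first matching dict; it raises StopIteration if none matches
search_data = next(ifilter(lambda x: searchkey in x, data))[searchkey]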
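
The same first-match search can also be written with a generator expression and the built-in `next()`, whose second argument is returned when nothing matches, so no exception is raised for a missing key. This is a sketch under assumptions: `find_by_key` is a hypothetical helper name, and the small `data` list stands in for the parsed file.

```python
def find_by_key(items, key, default=None):
    """Return the value stored under `key` in the first dict that has it."""
    return next((item[key] for item in items if key in item), default)


# Shortened stand-in for the list loaded from the JSON file.
data = [
    {"History bleed": {"sentences": []}},
    {"Antigen of Bordetella": {"sentences": [{"text": "Antigen of Bordetella"}]}},
]

found = find_by_key(data, "Antigen of Bordetella")
missing = find_by_key(data, "No such key")
print(found)
print(missing)
```

The search stops at the first element that contains the key, so on average it avoids scanning the whole list, but each call is still a linear scan.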