Flattening JSON into a single row with Pandas and json_normalize

Posted: 2019-01-18 21:59:22

Tags: python json dataframe

I have a complex, deeply nested JSON document:

    {
      "EligibilityCriteriaTerms": {
        "InclusionCriteria": {
          "Inclusion": [
            {
              "@disease": "Vaccine",
              "@id": "11111",
              "Criterion": "Subjects",
              "SubCriteria": {
                "SubCriterion": [
                  {
                    "@disease": "disease",
                    "@id": "8888",
                    "$": "Subjects with Prior DTP Vaccination"
                  },
                  {
                    "@disease": "DTP Vaccine",
                    "@id": "8888",
                    "$": "Subjects with Prior Other Vaccination"
                  }
                ]
              }
            },
            {
              "@disease": "Vaccine",
              "@id": "88888",
              "Criterion": "Subjects with Normal/Adequate  Status",
              "SubCriteria": {
                "SubCriterion": {
                  "@disease": " Vaccine",
                  "@id": "777777",
                  "$": "Subjects"
                }
              }
            }
          ]
        },
        "ExclusionCriteria": {
          "Exclusion": [
            {
              "@disease": " Vaccine",
              "@id": "666666",
              "Criterion": "Subjects for other indication"
            },
            {
              "@disease": " Vaccine",
              "@id": "55555",
              "Criterion": "SubjectsIngredients"
            },
            {
              "@disease": "Vaccine",
              "@id": "8334",
              "Criterion": "Subjects",
              "SubCriteria": {
                "SubCriterion": [
                  {
                    "@disease": "DTP Vaccine",
                    "@id": "333",
                    "$": " immune dysfunction"
                  },
                  {
                    "@disease": "DTP Vaccine",
                    "@id": "444",
                    "$": "Subjects disease/disorder"
                  }
                ]
              }
            }
          ]
        }
      }
    }

I am trying to flatten the entire JSON object. When I use the following code:

import pandas as pd
from pandas.io.json import json_normalize  # pre-1.0 import path; newer pandas exposes pd.json_normalize

pd_1 = json_normalize(json_content_eligibility_citeria_terms['EligibilityCriteriaTerms'], record_path=[['ExclusionCriteria', 'Exclusion']], errors='ignore')
pd_1 = pd_1.fillna('')
test2 = pd.DataFrame.from_records(pd_1.SubCriteria)
# The next line raises ValueError: Length mismatch: Expected axis has 1 elements, new values have 3 elements
test2.columns = ['@disease', '@id', '$']
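
The length mismatch can be reproduced in isolation. This is a minimal sketch, assuming each non-empty SubCriteria cell holds a dict whose only key is SubCriterion, as in the JSON above (the variable names `sub_criteria`, `frame` and `error` are made up for illustration). In that case from_records builds a single column named after that key, so assigning three column names fails:

```python
import pandas as pd

# One SubCriteria value shaped like the question's JSON (assumption: the
# DataFrame column holds dicts of this form).
sub_criteria = [
    {"SubCriterion": [
        {"@disease": "DTP Vaccine", "@id": "333", "$": " immune dysfunction"},
        {"@disease": "DTP Vaccine", "@id": "444", "$": "Subjects disease/disorder"},
    ]}
]

frame = pd.DataFrame.from_records(sub_criteria)
print(list(frame.columns))  # the only top-level key becomes the only column

try:
    frame.columns = ["@disease", "@id", "$"]  # three names for one column
    error = None
except ValueError as exc:
    error = exc
    print(error)
```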

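Is there a way to extract the values into the test2 DataFrame, concatenate it with pd_1, and show the result in a single row?

I have seen several approaches that use json_normalize() with its meta argument, but I could not get meta to work with this structure. Am I missing a setting? Is this the right approach?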
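
One possible direction, offered as a sketch rather than a definitive answer: the structure is awkward for meta because SubCriteria is sometimes missing and SubCriterion is sometimes a single dict instead of a list. The example below first brings every entry into the same shape and then calls pd.json_normalize (the pandas 1.0+ spelling) with record_path and meta. The helpers `as_list` and `normalize_criteria` and the `Sub.` prefix are hypothetical names chosen here, not part of pandas, and the result is one row per sub-criterion rather than one single row for the whole document.

```python
import pandas as pd

# Three entries copied from the question's JSON, one per shape of "SubCriteria":
# missing, a single dict, and a list of dicts.
criteria = [
    {"@disease": " Vaccine", "@id": "666666",
     "Criterion": "Subjects for other indication"},
    {"@disease": "Vaccine", "@id": "88888",
     "Criterion": "Subjects with Normal/Adequate  Status",
     "SubCriteria": {"SubCriterion": {
         "@disease": " Vaccine", "@id": "777777", "$": "Subjects"}}},
    {"@disease": "Vaccine", "@id": "8334", "Criterion": "Subjects",
     "SubCriteria": {"SubCriterion": [
         {"@disease": "DTP Vaccine", "@id": "333", "$": " immune dysfunction"},
         {"@disease": "DTP Vaccine", "@id": "444", "$": "Subjects disease/disorder"},
     ]}},
]


def as_list(value):
    """Return value as a list: None -> [], a lone dict -> [dict], a list unchanged."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def normalize_criteria(items):
    """Hypothetical helper (not a pandas API): one row per sub-criterion."""
    prepared = []
    for item in items:
        subs = as_list(item.get("SubCriteria", {}).get("SubCriterion"))
        prepared.append({
            "@disease": item.get("@disease"),
            "@id": item.get("@id"),
            "Criterion": item.get("Criterion"),
            # An empty placeholder keeps criteria that have no sub-criteria.
            "SubCriterion": subs or [{}],
        })
    return pd.json_normalize(
        prepared,
        record_path="SubCriterion",
        meta=["@disease", "@id", "Criterion"],
        record_prefix="Sub.",  # keeps parent and child "@id" columns apart
    )


flat = normalize_criteria(criteria)
print(flat)
```

The same helper could be applied to the Inclusion and Exclusion lists separately and the two frames combined with pd.concat. The parent columns are repeated on every sub-criterion row, and criteria without sub-criteria keep empty `Sub.` columns.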

0 answers:

No answers yet