Question

My data is a deeply nested JSON file. Each line is an independent JSON object that has been fully dumped (serialized as a string).

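What I really want is the "gene_info" from my data, so I think I have to convert each object to a dictionary before I can access "gene_info". I have tried json.load and json.dump, but the result is still of type str. dict() and eval do not work either. Thank you very much for your help!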

import json
with open(r"C:\Users\Caroline\Desktop\test.json") as f:
    dic = json.load(f)
print(type(dic))

Also, here is the nested string data:

{\"target\": {\"tractability\": {\"smallmolecule\": {\"top_category\": \"Clinical_Precedence\", \"small_molecule_genome_member\": true, \"buckets\": [1, 4, 5, 7, 8],
\"high_quality_compounds\": 3141, \"ensemble\": 0.79167199, \"categories\": {\"clinical_precedence\": 1.0, \"predicted_tractable\": 1.0, \"discovery_precedence\": 1.0}
}, \"antibody\": {\"top_category\": \"Predicted Tractable - Medium to low confidence\", \"buckets\": [8], \"categories\": {\"predicted_tractable_med_low_confidence\":
0.25, \"clinical_precedence\": 0.0, \"predicted_tractable_high_confidence\": 0.0}}}, \"gene_info\": {\"symbol\": \"PIK3CA\", \"name\": \"phosphatidylinositol-4,5-bisph
osphate 3-kinase catalytic subunit alpha\"}, \"id\": \"ENSG00000121879\"}, \"association_score\": {\"datatypes\": {\"literature\": 0.3247602795788519, \"rna_expression
\": 0.0, \"genetic_association\": 1.0, \"somatic_mutation\": 1.0, \"known_drug\": 1.0, \"animal_model\": 0.0, \"affected_pathway\": 1.0}, \"overall\": 1.0, \"datasourc
es\": {\"progeny\": 0.724214062011764, \"sysbio\": 0.0, \"expression_atlas\": 0.0, \"europepmc\": 0.3247602795788519, \"intogen\": 0.663641617866191, \"phewas_catalog\
": 0.0, \"uniprot_literature\": 1, \"phenodigm\": 0.0, \"eva\": 0.9967753031181473, \"gene2phenotype\": 1.0, \"gwas_catalog\": 0.0, \"slapenrich\": 0.8174919500924461,
 \"genomics_england\": 1, \"postgap\": 0.0, \"uniprot\": 1, \"chembl\": 1, \"cancer_gene_census\": 0.9062532321909299, \"reactome\": 1, \"uniprot_somatic\": 0.86829246
38231565, \"eva_somatic\": 0.9083243889916071, \"crispr\": 0.983372489229025}}, \"disease\": {\"efo_info\": {\"therapeutic_area\": {\"labels\": [\"neoplasm\"], \"codes
\": [\"EFO_0000616\"]}, \"path\": [[\"EFO_0000616\"]], \"label\": \"neoplasm\"}, \"id\": \"EFO_0000616\"}, \"is_direct\": true, \"evidence_count\": {\"datatypes\": {\"
literature\": 6053.0, \"rna_expression\": 0.0, \"genetic_association\": 54.0, \"somatic_mutation\": 999.0, \"known_drug\": 295.0, \"animal_model\": 0.0, \"affected_pat
hway\": 534.0}, \"total\": 7935.0, \"datasources\": {\"progeny\": 11.0, \"sysbio\": 0.0, \"expression_atlas\": 0.0, \"europepmc\": 6053.0, \"intogen\": 19.0, \"phewas_
catalog\": 0.0, \"uniprot_literature\": 5.0, \"phenodigm\": 0.0, \"eva\": 17.0, \"gene2phenotype\": 1.0, \"gwas_catalog\": 0.0, \"slapenrich\": 480.0, \"genomics_engla
nd\": 2.0, \"postgap\": 0.0, \"uniprot\": 29.0, \"chembl\": 295.0, \"cancer_gene_census\": 321.0, \"reactome\": 36.0, \"uniprot_somatic\": 21.0, \"eva_somatic\": 638.0
, \"crispr\": 7.0}}, \"id\": \"ENSG00000121879-EFO_0000616\"}

Answer 1

json.load(file_object) takes a file object as its argument (to parse a str, use json.loads instead):

import json

with open(r"C:\Users\Caroline\Desktop\test.json") as f:
    dic = json.load(f)
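
If json.load still returns a str, one possible explanation is that the file holds a JSON string whose content is itself JSON, that is, the object was dumped twice. The escaped quotes (\") in the question's data point in that direction, but this is an assumption about the file, not something the question confirms. A minimal sketch under that assumption; decode_nested and max_depth are hypothetical names introduced only for illustration:

```python
import json


def decode_nested(text, max_depth=5):
    """Apply json.loads repeatedly while the result is still a str.

    Hypothetical helper: stops as soon as a non-str value (dict, list, ...)
    comes back, or when the string is no longer valid JSON.
    """
    value = text
    for _ in range(max_depth):
        if not isinstance(value, str):
            break
        try:
            value = json.loads(value)
        except json.JSONDecodeError:
            break  # a plain string that is not JSON: keep it as-is
    return value


# Simulate a double-encoded object by dumping a dict twice.
inner = {"target": {"gene_info": {"symbol": "PIK3CA"}, "id": "ENSG00000121879"}}
double_encoded = json.dumps(json.dumps(inner))

once = json.loads(double_encoded)       # first decode still yields a str
record = decode_nested(double_encoded)  # keeps decoding until it is a dict

print(type(once).__name__)    # str
print(type(record).__name__)  # dict
print(record["target"]["gene_info"])
```

For the file in the question, this would correspond to calling json.loads on the value returned by json.load(f).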

Answer 2

Try putting well-formed JSON into the file C:\Users\Caroline\Desktop\test.json, for example:

{
  "target": {
    "tractability": {
      "smallmolecule": {
        "top_category": "Clinical_Precedence",
        "small_molecule_genome_member": true,
        "buckets": [
          1,
          4,
          5,
          7,
          8
        ],
        "high_quality_compounds": 3141,
        "ensemble": 0.79167199,
        "categories": {
          "clinical_precedence": 1.0,
          "predicted_tractable": 1.0,
          "discovery_precedence": 1.0
        }
      },
      "antibody": {
        "top_category": "Predicted Tractable - Medium to low confidence",
        "buckets": [
          8
        ],
        "categories": {
          "predicted_tractable_med_low_confidence": 0.25,
          "clinical_precedence": 0.0,
          "predicted_tractable_high_confidence": 0.0
        }
      }
    },
    "gene_info": {
      "symbol": "PIK3CA",
      "name": "phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha"
    },
    "id": "ENSG00000121879"
  },
  "association_score": {
    "datatypes": {
      "literature": 0.3247602795788519,
      "rna_expression": 0.0,
      "genetic_association": 1.0,
      "somatic_mutation": 1.0,
      "known_drug": 1.0,
      "animal_model": 0.0,
      "affected_pathway": 1.0
    },
    "overall": 1.0,
    "datasources": {
      "progeny": 0.724214062011764,
      "sysbio": 0.0,
      "expression_atlas": 0.0,
      "europepmc": 0.3247602795788519,
      "intogen": 0.663641617866191,
      "phewas_catalog": 0.0,
      "uniprot_literature": 1,
      "phenodigm": 0.0,
      "eva": 0.9967753031181473,
      "gene2phenotype": 1.0,
      "gwas_catalog": 0.0,
      "slapenrich": 0.8174919500924461,
      "genomics_england": 1,
      "postgap": 0.0,
      "uniprot": 1,
      "chembl": 1,
      "cancer_gene_census": 0.9062532321909299,
      "reactome": 1,
      "uniprot_somatic": 0.8682924638231565,
      "eva_somatic": 0.9083243889916071,
      "crispr": 0.983372489229025
    }
  },
  "disease": {
    "efo_info": {
      "therapeutic_area": {
        "labels": [
          "neoplasm"
        ],
        "codes": [
          "EFO_0000616"
        ]
      },
      "path": [
        [
          "EFO_0000616"
        ]
      ],
      "label": "neoplasm"
    },
    "id": "EFO_0000616"
  },
  "is_direct": true,
  "evidence_count": {
    "datatypes": {
      "literature": 6053.0,
      "rna_expression": 0.0,
      "genetic_association": 54.0,
      "somatic_mutation": 999.0,
      "known_drug": 295.0,
      "animal_model": 0.0,
      "affected_pathway": 534.0
    },
    "total": 7935.0,
    "datasources": {
      "progeny": 11.0,
      "sysbio": 0.0,
      "expression_atlas": 0.0,
      "europepmc": 6053.0,
      "intogen": 19.0,
      "phewas_catalog": 0.0,
      "uniprot_literature": 5.0,
      "phenodigm": 0.0,
      "eva": 17.0,
      "gene2phenotype": 1.0,
      "gwas_catalog": 0.0,
      "slapenrich": 480.0,
      "genomics_england": 2.0,
      "postgap": 0.0,
      "uniprot": 29.0,
      "chembl": 295.0,
      "cancer_gene_census": 321.0,
      "reactome": 36.0,
      "uniprot_somatic": 21.0,
      "eva_somatic": 638.0,
      "crispr": 7.0
    }
  },
  "id": "ENSG00000121879-EFO_0000616"
}

Then your code should work as expected:

import json
with open(r"C:\Users\Caroline\Desktop\test.json") as f:
    dic = json.load(f)
print(type(dic))
print("gene_info : ", dic["target"]["gene_info"])
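
If rewriting the file is not an option, the question's description (one independent, dumped object per line) suggests reading it line by line instead, since json.load expects the whole file to be a single JSON document. A rough sketch, assuming each line is a JSON string that wraps the real object; iter_records is a hypothetical helper, and the in-memory file below only stands in for test.json:

```python
import io
import json


def iter_records(lines):
    """Yield one decoded object per non-empty line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        obj = json.loads(line)
        # Assumption: a str result means the line was dumped twice,
        # so decode once more to get the actual dict.
        if isinstance(obj, str):
            obj = json.loads(obj)
        yield obj


# Stand-in for the real file: the same record, double-encoded, on two lines.
sample = {
    "target": {
        "gene_info": {
            "symbol": "PIK3CA",
            "name": "phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha",
        },
        "id": "ENSG00000121879",
    }
}
fake_file = io.StringIO(
    "\n".join(json.dumps(json.dumps(sample)) for _ in range(2)) + "\n"
)

gene_infos = [rec["target"]["gene_info"] for rec in iter_records(fake_file)]
for info in gene_infos:
    print("gene_info : ", info)
```

With the real file, the open file object can be passed directly, for example iter_records(f) inside the with open(...) block.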

How do I parse/access keys from a dumped (serialized as a string) JSON object?
