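My data is a deeply nested JSON file. Each line is an independent JSON object that has been fully dumped (serialized to a string).
I want to get "gene_info" out of my data, so I think I have to convert it to a dictionary before I can access "gene_info". I have tried json.load and json.dump, but the result is still of type str. dict() and eval do not work either. Thank you very much for your help!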
import json

with open(r"C:\Users\Caroline\Desktop\test.json") as f:
    dic = json.load(f)
print(type(dic))
Also, here is the nested string data:
{\"target\": {\"tractability\": {\"smallmolecule\": {\"top_category\": \"Clinical_Precedence\", \"small_molecule_genome_member\": true, \"buckets\": [1, 4, 5, 7, 8],
\"high_quality_compounds\": 3141, \"ensemble\": 0.79167199, \"categories\": {\"clinical_precedence\": 1.0, \"predicted_tractable\": 1.0, \"discovery_precedence\": 1.0}
}, \"antibody\": {\"top_category\": \"Predicted Tractable - Medium to low confidence\", \"buckets\": [8], \"categories\": {\"predicted_tractable_med_low_confidence\":
0.25, \"clinical_precedence\": 0.0, \"predicted_tractable_high_confidence\": 0.0}}}, \"gene_info\": {\"symbol\": \"PIK3CA\", \"name\": \"phosphatidylinositol-4,5-bisph
osphate 3-kinase catalytic subunit alpha\"}, \"id\": \"ENSG00000121879\"}, \"association_score\": {\"datatypes\": {\"literature\": 0.3247602795788519, \"rna_expression
\": 0.0, \"genetic_association\": 1.0, \"somatic_mutation\": 1.0, \"known_drug\": 1.0, \"animal_model\": 0.0, \"affected_pathway\": 1.0}, \"overall\": 1.0, \"datasourc
es\": {\"progeny\": 0.724214062011764, \"sysbio\": 0.0, \"expression_atlas\": 0.0, \"europepmc\": 0.3247602795788519, \"intogen\": 0.663641617866191, \"phewas_catalog\
": 0.0, \"uniprot_literature\": 1, \"phenodigm\": 0.0, \"eva\": 0.9967753031181473, \"gene2phenotype\": 1.0, \"gwas_catalog\": 0.0, \"slapenrich\": 0.8174919500924461,
\"genomics_england\": 1, \"postgap\": 0.0, \"uniprot\": 1, \"chembl\": 1, \"cancer_gene_census\": 0.9062532321909299, \"reactome\": 1, \"uniprot_somatic\": 0.86829246
38231565, \"eva_somatic\": 0.9083243889916071, \"crispr\": 0.983372489229025}}, \"disease\": {\"efo_info\": {\"therapeutic_area\": {\"labels\": [\"neoplasm\"], \"codes
\": [\"EFO_0000616\"]}, \"path\": [[\"EFO_0000616\"]], \"label\": \"neoplasm\"}, \"id\": \"EFO_0000616\"}, \"is_direct\": true, \"evidence_count\": {\"datatypes\": {\"
literature\": 6053.0, \"rna_expression\": 0.0, \"genetic_association\": 54.0, \"somatic_mutation\": 999.0, \"known_drug\": 295.0, \"animal_model\": 0.0, \"affected_pat
hway\": 534.0}, \"total\": 7935.0, \"datasources\": {\"progeny\": 11.0, \"sysbio\": 0.0, \"expression_atlas\": 0.0, \"europepmc\": 6053.0, \"intogen\": 19.0, \"phewas_
catalog\": 0.0, \"uniprot_literature\": 5.0, \"phenodigm\": 0.0, \"eva\": 17.0, \"gene2phenotype\": 1.0, \"gwas_catalog\": 0.0, \"slapenrich\": 480.0, \"genomics_engla
nd\": 2.0, \"postgap\": 0.0, \"uniprot\": 29.0, \"chembl\": 295.0, \"cancer_gene_census\": 321.0, \"reactome\": 36.0, \"uniprot_somatic\": 21.0, \"eva_somatic\": 638.0
, \"crispr\": 7.0}}, \"id\": \"ENSG00000121879-EFO_0000616\"}
Answer 0 (score: 0)
json.load(file_object) takes a file object as its argument:
import json

with open(r"C:\Users\Caroline\Desktop\test.json") as f:
    dic = json.load(f)
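A possible explanation for why the result is still a str, offered as a sketch rather than a diagnosis: the backslash-escaped quotes in the question suggest that each record was serialized twice. If so, the first decode only strips the outer layer and returns a string, and a second json.loads is needed to get a dict. The `record` below is a hypothetical, shortened stand-in for one line of the file, not the actual file contents.

```python
import json

# Hypothetical, shortened stand-in for one record from the question.
record = {"target": {"gene_info": {"symbol": "PIK3CA"}, "id": "ENSG00000121879"}}

# Serializing twice yields a JSON string whose content is itself JSON,
# with the inner quotes escaped as \" (as in the question's data).
double_encoded = json.dumps(json.dumps(record))

first = json.loads(double_encoded)   # outer layer removed: still a str
second = json.loads(first)           # inner layer decoded: now a dict

print(type(first))
print(type(second))
print(second["target"]["gene_info"])
```

If the file really is double-encoded, the same idea applies to the value returned by json.load(f): pass it to json.loads once more.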
Answer 1 (score: 0)
Try putting well-formed JSON in the file C:\Users\Caroline\Desktop\test.json, for example:
{
"target": {
"tractability": {
"smallmolecule": {
"top_category": "Clinical_Precedence",
"small_molecule_genome_member": true,
"buckets": [
1,
4,
5,
7,
8
],
"high_quality_compounds": 3141,
"ensemble": 0.79167199,
"categories": {
"clinical_precedence": 1.0,
"predicted_tractable": 1.0,
"discovery_precedence": 1.0
}
},
"antibody": {
"top_category": "Predicted Tractable - Medium to low confidence",
"buckets": [
8
],
"categories": {
"predicted_tractable_med_low_confidence": 0.25,
"clinical_precedence": 0.0,
"predicted_tractable_high_confidence": 0.0
}
}
},
"gene_info": {
"symbol": "PIK3CA",
"name": "phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha"
},
"id": "ENSG00000121879"
},
"association_score": {
"datatypes": {
"literature": 0.3247602795788519,
"rna_expression": 0.0,
"genetic_association": 1.0,
"somatic_mutation": 1.0,
"known_drug": 1.0,
"animal_model": 0.0,
"affected_pathway": 1.0
},
"overall": 1.0,
"datasources": {
"progeny": 0.724214062011764,
"sysbio": 0.0,
"expression_atlas": 0.0,
"europepmc": 0.3247602795788519,
"intogen": 0.663641617866191,
"phewas_catalog": 0.0,
"uniprot_literature": 1,
"phenodigm": 0.0,
"eva": 0.9967753031181473,
"gene2phenotype": 1.0,
"gwas_catalog": 0.0,
"slapenrich": 0.8174919500924461,
"genomics_england": 1,
"postgap": 0.0,
"uniprot": 1,
"chembl": 1,
"cancer_gene_census": 0.9062532321909299,
"reactome": 1,
"uniprot_somatic": 0.8682924638231565,
"eva_somatic": 0.9083243889916071,
"crispr": 0.983372489229025
}
},
"disease": {
"efo_info": {
"therapeutic_area": {
"labels": [
"neoplasm"
],
"codes": [
"EFO_0000616"
]
},
"path": [
[
"EFO_0000616"
]
],
"label": "neoplasm"
},
"id": "EFO_0000616"
},
"is_direct": true,
"evidence_count": {
"datatypes": {
"literature": 6053.0,
"rna_expression": 0.0,
"genetic_association": 54.0,
"somatic_mutation": 999.0,
"known_drug": 295.0,
"animal_model": 0.0,
"affected_pathway": 534.0
},
"total": 7935.0,
"datasources": {
"progeny": 11.0,
"sysbio": 0.0,
"expression_atlas": 0.0,
"europepmc": 6053.0,
"intogen": 19.0,
"phewas_catalog": 0.0,
"uniprot_literature": 5.0,
"phenodigm": 0.0,
"eva": 17.0,
"gene2phenotype": 1.0,
"gwas_catalog": 0.0,
"slapenrich": 480.0,
"genomics_england": 2.0,
"postgap": 0.0,
"uniprot": 29.0,
"chembl": 295.0,
"cancer_gene_census": 321.0,
"reactome": 36.0,
"uniprot_somatic": 21.0,
"eva_somatic": 638.0,
"crispr": 7.0
}
},
"id": "ENSG00000121879-EFO_0000616"
}
Then your code should work:
import json

with open(r"C:\Users\Caroline\Desktop\test.json") as f:
    dic = json.load(f)
print(type(dic))
print("gene_info : ", dic["target"]["gene_info"])
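If rewriting the file by hand is not practical, an alternative is to decode each line as it is read. The following is a minimal sketch that assumes the file holds one record per line, as the question describes, and that some or all records may have been serialized more than once. The helper names `decode_nested` and `iter_gene_info` and the sample lines are made up for illustration.

```python
import json


def decode_nested(text):
    """Decode JSON repeatedly until the result is no longer a str."""
    value = json.loads(text)
    while isinstance(value, str):
        # A str result means there is another serialized layer inside.
        # Raises json.JSONDecodeError if that string is not valid JSON.
        value = json.loads(value)
    return value


def iter_gene_info(lines):
    """Yield the "gene_info" dict from each non-empty line."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = decode_nested(line)
        yield record["target"]["gene_info"]


# Hypothetical sample input: one double-encoded and one single-encoded line.
sample = {"target": {"gene_info": {"symbol": "PIK3CA"}}}
sample_lines = [json.dumps(json.dumps(sample)), "", json.dumps(sample)]

results = list(iter_gene_info(sample_lines))
print(results)
```

With the real file, an open file object can be passed in place of `sample_lines`, since iterating over it yields one line at a time:

    with open(r"C:\Users\Caroline\Desktop\test.json") as f:
        for info in iter_gene_info(f):
            print(info)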