Data exploration: nested JSON data in pandas

Time: 2016-06-25 18:26:13

Tags: python json pandas dataframe

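How can I get my JSON data into a sensible DataFrame? I have a deeply nested file, and my goal is to turn it into one large DataFrame. Everything is described in the GitHub repository below: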

http://www.github.com/simongraham/dataExplore.git

1 answer:

Answer 0 (score: 1)

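With nested JSON, you need to walk through the levels and extract the segments you need. For the nutrition section of the larger JSON, consider iterating over each nutritionPortions level, running pandas' json_normalize on it, and concatenating the results into a final DataFrame: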

import json

import pandas as pd

with open('/Users/simongraham/Desktop/Kaido/Data/kaidoData.json') as f:
    data = json.load(f)

# PORTION-LEVEL FIELDS REPEATED ON EVERY NUTRIENT ROW
meta = ['vcNutritionId', 'vcUserId', 'vcPortionId', 'vcPortionName', 'vcPortionSize',
        'ftEnergyKcal', 'vcPortionUnit', 'dtConsumedDate']

# NORMALIZE EACH PORTION, THEN CONCATENATE ONCE
frames = []
for item in data[0]["nutritionPortions"]:
    if 'ftEnergyKcal' in item:      # MISSING IN 3 OF 53 LEVELS
        # pd.json_normalize (pandas >= 1.0) replaces the deprecated pd.io.json.json_normalize
        frames.append(pd.json_normalize(item, 'nutritionNutrients', meta))

nutrition = pd.concat(frames) if frames else pd.DataFrame()

nutrition.head()

Output

   ftValue  nPercentRI    vcNutrient                  vcNutritionPortionId  \
0     0.00         0.0       alcohol  c993ac30-ecb4-4154-a2ea-d51dbb293f66   
1     0.00         0.0          bcfa  c993ac30-ecb4-4154-a2ea-d51dbb293f66   
2     7.80         6.0        biotin  c993ac30-ecb4-4154-a2ea-d51dbb293f66   
3    49.40         2.0       calcium  c993ac30-ecb4-4154-a2ea-d51dbb293f66   
4     1.82         0.0  carbohydrate  c993ac30-ecb4-4154-a2ea-d51dbb293f66   

  vcTrafficLight vcUnit       dtConsumedDate  \
0                     g  2016-04-12T00:00:00   
1                     g  2016-04-12T00:00:00   
2                   µg  2016-04-12T00:00:00   
3                    mg  2016-04-12T00:00:00   
4                     g  2016-04-12T00:00:00   

                          vcNutritionId  ftEnergyKcal  \
0  070b97a4-d562-427d-94a8-1de1481df5d1          18.2   
1  070b97a4-d562-427d-94a8-1de1481df5d1          18.2   
2  070b97a4-d562-427d-94a8-1de1481df5d1          18.2   
3  070b97a4-d562-427d-94a8-1de1481df5d1          18.2   
4  070b97a4-d562-427d-94a8-1de1481df5d1          18.2   

                               vcUserId vcPortionName vcPortionSize  \
0  fe585e3d-2863-46fe-a41f-290bf58ad169         1 mug           260   
1  fe585e3d-2863-46fe-a41f-290bf58ad169         1 mug           260   
2  fe585e3d-2863-46fe-a41f-290bf58ad169         1 mug           260   
3  fe585e3d-2863-46fe-a41f-290bf58ad169         1 mug           260   
4  fe585e3d-2863-46fe-a41f-290bf58ad169         1 mug           260   

  vcPortionId vcPortionUnit  
0           2            ml  
1           2            ml  
2           2            ml  
3           2            ml  
4           2            ml
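
As a possible variation on the loop above, json_normalize can also take the whole list of portions in one call, and its errors="ignore" option fills a missing meta key with NaN instead of raising, so portions without ftEnergyKcal would not need to be skipped. The following is only a minimal sketch under assumptions: the `portions` list is a made-up stand-in for data[0]["nutritionPortions"] (field names taken from the answer, values invented for illustration), and it assumes pandas 1.0 or later, where pd.json_normalize is available at the top level.

```python
import pandas as pd

# Hypothetical sample mimicking the shape of data[0]["nutritionPortions"];
# field names come from the answer, the values are invented for illustration.
portions = [
    {
        "vcPortionId": "1",
        "vcPortionName": "1 mug",
        "ftEnergyKcal": 18.2,
        "nutritionNutrients": [
            {"vcNutrient": "calcium", "ftValue": 49.4, "vcUnit": "mg"},
            {"vcNutrient": "biotin", "ftValue": 7.8, "vcUnit": "µg"},
        ],
    },
    {
        # No ftEnergyKcal here, like the levels the answer's loop skips.
        "vcPortionId": "2",
        "vcPortionName": "1 slice",
        "nutritionNutrients": [
            {"vcNutrient": "calcium", "ftValue": 12.0, "vcUnit": "mg"},
        ],
    },
]

# One row per nutrient; the meta fields are repeated on each of its rows.
# errors="ignore" stores NaN for a meta key that a portion does not have,
# instead of raising KeyError.
nutrition = pd.json_normalize(
    portions,
    record_path="nutritionNutrients",
    meta=["vcPortionId", "vcPortionName", "ftEnergyKcal"],
    errors="ignore",
)

print(nutrition)
```

Whether keeping the incomplete portions (with NaN energy values) or dropping them, as the loop does, is preferable depends on the analysis you have in mind.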