将Pandas Dataframe转换为嵌套字典

时间:2018-09-27 12:01:46

标签: python python-3.x pandas dictionary dataframe

我正在尝试将数据框转换为嵌套字典,但到目前为止没有成功。

数据框:clean_data['Model', 'Problem', 'Size']

这是我的数据的样子:

 Model                Problem             Size
 lenovo a6020         screen broken         1
 lenovo a6020a40      battery              60
                      bluetooth            60
                      buttons              60
 lenovo k4            wi-fi                 3
                      bluetooth             3

我想要的输出:

{
  "name": "Brand",
  "children": [
     {
         "name": "Lenovo",
         "children": [
             {
              "name": "lenovo a6020",
              "children": {
                  "name": "screen broken",
                  "size": 1
               }
             },
             {
              "name": "lenovo a6020a40",
              "children": [
                 {
                   "name": "battery",
                   "size": 60
                 },
                 {
                   "name": "bluetooth",
                   "size": 60
                 },
                 {
                   "name": "buttons",
                   "size": 60
                 }
               ]
             },
             {
              "name": "lenovo k4",
              "children": [
                {
                  "name": "wi-fi",
                  "size": 3
                },
                {
                  "name": "bluetooth",
                  "size": 3
                }
               ]
            }
         ]
      }
   ]
 }

我尝试了pandas.DataFrame.to_dict方法,但是它返回的是一个简单的字典,但我希望它像上面提到的那样。

1 个答案:

答案 0 :(得分:6)

使用:

print (df)
             Model        Problem  size
0     lenovo a6020  screen broken     1
1  lenovo a6020a40        battery    60
2              NaN      bluetooth    60
3              NaN        buttons    60
4        lenovo k4          wi-fi     3
5              NaN      bluetooth     3

 #repalce missing values by forward filling
df = df.ffill()
#split Model column by first whitesapces to 2 columns 
df[['a','b']] = df['Model'].str.split(n=1, expand=True)

#each level convert to list of dictionaries
#for correct keys use rename
L = (df.rename(columns={'Problem':'name'})
        .groupby(['a','b'])['name','size']
        .apply(lambda x: x.to_dict('r'))
        .rename('children')
        .reset_index()
        .rename(columns={'b':'name'})
        .groupby('a')['name','children']
        .apply(lambda x: x.to_dict('r'))
        .rename('children')
        .reset_index()
        .rename(columns={'a':'name'})
        .to_dict('r')
        )
#print (L)

#create outer level by contructor
d = { "name": "Brand", "children": L}

print (d)

{
    'name': 'Brand',
    'children': [{
        'name': 'lenovo',
        'children': [{
            'name': 'a6020',
            'children': [{
                'name': 'screen broken',
                'size': 1
            }]
        }, {
            'name': 'a6020a40',
            'children': [{
                'name': 'battery',
                'size': 60
            }, {
                'name': 'bluetooth',
                'size': 60
            }, {
                'name': 'buttons',
                'size': 60
            }]
        }, {
            'name': 'k4',
            'children': [{
                'name': 'wi-fi',
                'size': 3
            }, {
                'name': 'bluetooth',
                'size': 3
            }]
        }]
    }]
}