How to convert pandas DataFrame results to a user-defined JSON format

Time: 2016-02-18 05:41:42

Tags: python json pandas dataframe

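import numpy as np
import pandas

data_df = pandas.read_csv('details.csv')
# Treat the literal string 'Null' as missing so that count() skips it
data_df = data_df.replace('Null', np.nan)
# Count the non-missing values of every column per (country, branch)
df = data_df.groupby(['country', 'branch']).count()
df = df.drop('sales', axis=1)
df = df.reset_index()
print(df)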
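
To illustrate why the replace step matters: count() only tallies non-missing values, and the capitalized string 'Null' is not among the markers that read_csv treats as missing by default, so without the replacement it would be counted as ordinary data. A minimal sketch, assuming a made-up three-row CSV (the column names and values here are hypothetical, not the contents of details.csv):

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for details.csv; the real file's contents are not shown.
csv_text = """country,branch,email
x,a,a1@example.com
x,a,Null
x,b,b1@example.com
"""

raw = pd.read_csv(io.StringIO(csv_text))
cleaned = raw.replace('Null', np.nan)

# count() tallies the non-missing values of each column per group
count_raw = raw.groupby(['country', 'branch']).count()
count_cleaned = cleaned.groupby(['country', 'branch']).count()

print(count_raw)      # 'Null' is still counted as a value here
print(count_cleaned)  # 'Null' has become NaN and is skipped
```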

I want to convert the result of the dataframe (df) into the user-defined JSON format described below. Printing df gives the following result:
country     branch      no_of_employee     total_salary    count_DOB   count_email
  x            a            30                 2500000        20            25
  x            b            20                 350000         15            20
  y            c            30                 4500000        30            30
  z            d            40                 5500000        40            40
  z            e            10                 1000000        10            10
  z            f            15                 1500000        15            15

I want to convert this to JSON. The format I want is:

x
{
    a
    {
        no.of employees:30
        total salary:2500000
        count_email:25
    }
    b
    {
        no.of employees:20
        total salary:350000
        count_email:20
    }
}
y
{
    c
    {
        no.of employees:30
        total salary:4500000
        count_email:30
    }
}
z
{
    d
    {
        no.of employees:40
        total salary:5500000
        count_email:40
    }
    e
    {
        no.of employees:10
        total salary:1000000
        count_email:10
    }
    f
    {
        no.of employees:15
        total salary:1500000
        count_email:15
    }
}

Note that I do not want every field of the dataframe result in the JSON (for example, count_DOB should be left out).

1 answer:

Answer 0 (score: 2)

You can combine groupby and apply with to_dict and to_json:

  country branch  no_of_employee  total_salary  count_DOB  count_email
0       x      a              30       2500000         20           25
1       x      b              20        350000         15           20
2       y      c              30       4500000         30           30
3       z      d              40       5500000         40           40
4       z      e              10       1000000         10           10
5       z      f              15       1500000         15           15

# Keep only the wanted columns, then build a {branch: {column: value}} dict per country
g = (df.groupby('country')[["branch", "no_of_employee", "total_salary", "count_email"]]
       .apply(lambda x: x.set_index('branch').to_dict(orient='index')))
print(g.to_json())
{
    "x": {
        "a": {
            "total_salary": 2500000,
            "no_of_employee": 30,
            "count_email": 25
        },
        "b": {
            "total_salary": 350000,
            "no_of_employee": 20,
            "count_email": 20
        }
    },
    "y": {
        "c": {
            "total_salary": 4500000,
            "no_of_employee": 30,
            "count_email": 30
        }
    },
    "z": {
        "e": {
            "total_salary": 1000000,
            "no_of_employee": 10,
            "count_email": 10
        },
        "d": {
            "total_salary": 5500000,
            "no_of_employee": 40,
            "count_email": 40
        },
        "f": {
            "total_salary": 1500000,
            "no_of_employee": 15,
            "count_email": 15
        }
    }
}
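
For readers who want something they can paste and run, here is a self-contained sketch of the same idea. It rebuilds the frame printed above by hand, iterates over the groups instead of calling apply, and serializes with the standard json module. The names wanted, nested and json_text are only illustrative, and the explicit int() cast is a precaution in case to_dict returns NumPy integers on some pandas versions.

```python
import json

import pandas as pd

# Rebuild the grouped frame printed above
df = pd.DataFrame({
    'country': ['x', 'x', 'y', 'z', 'z', 'z'],
    'branch': ['a', 'b', 'c', 'd', 'e', 'f'],
    'no_of_employee': [30, 20, 30, 40, 10, 15],
    'total_salary': [2500000, 350000, 4500000, 5500000, 1000000, 1500000],
    'count_DOB': [20, 15, 30, 40, 10, 15],
    'count_email': [25, 20, 30, 40, 10, 15],
})

# Columns to keep in the JSON; count_DOB is deliberately left out
wanted = ['no_of_employee', 'total_salary', 'count_email']

nested = {}
for country, group in df.groupby('country'):
    # {branch: {column: value}} for the rows of this country
    per_branch = group.set_index('branch')[wanted].to_dict(orient='index')
    # Cast to plain int so json.dumps never sees NumPy integer types
    nested[country] = {
        branch: {col: int(val) for col, val in fields.items()}
        for branch, fields in per_branch.items()
    }

json_text = json.dumps(nested, indent=4)
print(json_text)
```

Building a plain dict first also makes it easy to rename keys or drop further fields before serializing.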

I also tried g.to_dict(), but printing it does not give valid JSON (check here).
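
A likely reason (an assumption, since the printed output is not shown here) is that printing a dict displays Python's own representation, which quotes strings with single quotes, whereas JSON requires double quotes. The small sketch below uses a hand-written dict and a hypothetical helper is_valid_json to show the difference:

```python
import json

# Hand-written sample in the same shape as the result above
nested = {'y': {'c': {'total_salary': 4500000, 'no_of_employee': 30, 'count_email': 30}}}

as_repr = str(nested)         # what printing a dict shows: single-quoted keys
as_json = json.dumps(nested)  # real JSON text: double-quoted keys


def is_valid_json(text):
    """Return True if text parses as JSON (hypothetical helper for this sketch)."""
    try:
        json.loads(text)
        return True
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        return False


print(is_valid_json(as_repr))  # False
print(is_valid_json(as_json))  # True
```

So if a dict is what you have, pass it through json.dumps rather than printing it directly.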