Convert a pandas DataFrame to nested dictionary / JSON format

Time: 2019-10-29 13:57:56

Tags: python json dataframe dictionary

I am trying to convert the following pandas DataFrame (Python) to a nested dictionary format. Input pandas DataFrame:

statetraffic | state | act | traffic | reward | header | time          | id
stateacttraf | 1     | 1   | 12      | 22     | str1   | 1572340221000 | 34022100
stateacttraf | 1     | 2   | 87      | 30     | str1   | 1572340221000 | 34022100
stateacttraf | 1     | 3   | 1       | 48     | str1   | 1572340221000 | 34022100
stateacttraf | 2     | 1   | 10      | 13     | str1   | 1572340221000 | 34022100
stateacttraf | 2     | 2   | 80      | 27     | str1   | 1572340221000 | 34022100
stateacttraf | 2     | 3   | 10      | 60     | str1   | 1572340221000 | 34022100

I tried the following code, but it did not work:

1) final_op = input_df.to_dict(orient='records')  -> does not give the desired output
2) from jsonmerge import merge
   message = {'statetraffic': 'stateacttraf'}
   message1 = {'time': time.time()}
   result = merge(final_op, message, message1)  -> does not give the desired output either

I need some form of nested dictionary.

The expected dictionary / JSON output looks like this:

[
  {
    "statetraffic": "stateacttraf",
    "time": 1572340221000,
    "str1": {
      "id": 34022100,
      "state": 1,
      "act": 1,
      "trafficSplit": 12,
      "reward": 22
    }
  },
  {
    "statetraffic": "stateacttraf",
    "time": 1572340221000,
    "str1": {
      "id": 34022100,
      "state": 1,
      "act": 2,
      "trafficSplit": 87,
      "reward": 30
    }
  },
  {
    "statetraffic": "stateacttraf",
    "time": 1572340221000,
    "str1": {
      "id": 34022100,
      "state": 1,
      "act": 3,
      "trafficSplit": 1,
      "reward": 48
    }
  },
  {
    "statetraffic": "stateacttraf",
    "time": 1572340221000,
    "str1": {
      "id": 34022100,
      "state": 2,
      "act": 1,
      "trafficSplit": 10,
      "reward": 13
    }
  }
]

I urgently need the output in this format, so any help would be greatly appreciated.

1 answer:

Answer 0 (score: 1)

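Try this, assuming your DataFrame is df:

    # Column-oriented dict: {column: {row_index: value}}.
    # Assumes df has the default integer index 0..n-1.
    main_dict = df.to_dict()
    uprow = ["statetraffic", "time", "header"]            # columns handled at the top level
    drow = ["id", "state", "act", "traffic", "reward"]    # columns moved into the nested dict
    datalist = []
    for c in range(df.shape[0]):
        # Collect the values of row c into a flat dict.
        subd = {}
        for k, v in main_dict.items():
            subd[k] = v[c]
        subd_ = subd.copy()
        tmp = subd.get("header")    # e.g. "str1", used as the key of the nested dict
        subd[tmp] = 0
        for i in uprow:
            del subd_[i]            # the nested dict keeps only the drow columns
        subd[tmp] = subd_
        for i in drow:
            del subd[i]             # the top level keeps only statetraffic and time
        del subd["header"]
        datalist.append(subd)
    print(datalist)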
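The same reshaping can also be written more compactly by starting from `to_dict(orient="records")` and building each nested record in one step. This is only a sketch, not part of the original answer: the sample DataFrame is rebuilt from the first three rows of the question, and the names `TOP_LEVEL`, `NESTED`, and `to_nested_records` are hypothetical helpers introduced for illustration.

```python
import json

import pandas as pd

# Sample data rebuilt from the first three rows shown in the question.
df = pd.DataFrame({
    "statetraffic": ["stateacttraf"] * 3,
    "state": [1, 1, 1],
    "act": [1, 2, 3],
    "traffic": [12, 87, 1],
    "reward": [22, 30, 48],
    "header": ["str1"] * 3,
    "time": [1572340221000] * 3,
    "id": [34022100] * 3,
})

# Hypothetical names: which columns stay at the top level and which
# ones move under the key taken from the "header" column.
TOP_LEVEL = ["statetraffic", "time"]
NESTED = ["id", "state", "act", "traffic", "reward"]


def to_nested_records(frame):
    """Return one nested dict per row of `frame`."""
    records = []
    for row in frame.to_dict(orient="records"):
        record = {key: row[key] for key in TOP_LEVEL}
        # The value of "header" (e.g. "str1") becomes the nested key.
        record[row["header"]] = {key: row[key] for key in NESTED}
        records.append(record)
    return records


nested = to_nested_records(df)
print(json.dumps(nested, indent=2))
```

The result is a list of dicts, which serializes to a JSON array. The nested key is named `traffic` after the column; if `trafficSplit` is required as in the question, the column could be renamed first, for example with `df.rename(columns={"traffic": "trafficSplit"})`, and the `NESTED` list adjusted to match.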
