Pandas DataFrame to JSON hierarchy

Time: 2019-07-29 20:28:47

Tags: python json pandas

My pandas DataFrame looks like this:

tree    nodes   classes cues    directions  thresholds  exits
1   1   4   i;i;n;i PLC2hrOGTT;Age;BMI;TimesPregnant    >;>;>;> 126;29;29.7;6   1;0;1;0.5
2   2   3   i;i;n   PLC2hrOGTT;Age;BMI  >;>;>   126;29;29.7 0;1;0.5
3   3   4   i;i;n;i PLC2hrOGTT;Age;BMI;TimesPregnant    >;>;>;> 126;29;29.7;6   1;0;0;0.5
4   4   4   i;i;n;i PLC2hrOGTT;Age;BMI;TimesPregnant    >;>;>;> 126;29;29.7;6   1;1;0;0.5
5   5   4   i;i;n;i PLC2hrOGTT;Age;BMI;TimesPregnant    >;>;>;> 126;29;29.7;6   0;1;0;0.5
6   6   3   i;i;n   PLC2hrOGTT;Age;BMI  >;>;>   126;29;29.7 0;0;0.5
7   7   4   i;i;n;i PLC2hrOGTT;Age;BMI;TimesPregnant    >;>;>;> 126;29;29.7;6   1;1;1;0.5
8   8   4   i;i;n;i PLC2hrOGTT;Age;BMI;TimesPregnant    >;>;>;> 126;29;29.7;6   0;0;0;0.5

I would like to convert it to JSON like this (example for the first row only):

[
    {
            "cues": "PLC2hrOGTT", "directions": ">", "thresholds": "126",
            "parent": "null",
            "children": [
              {
                "cues": "Age", "directions": ">", "thresholds": "29",
                "parent": "PLC2hrOGTT",
                "children": [
                  {
                    "cues": "BMI", "directions": ">", "thresholds": "29.7",
                    "parent": "Age",
                    "children": [
                      {
                        "cues": "TimesPregnant", "directions": ">", "thresholds": "6",
                        "parent": "BMI",
                        "children": [
                          {
                            "cues": "False",
                            "parent": "TimesPregnant"
                          },
                          {
                            "cues": "True",
                            "parent": "TimesPregnant"
                          }
                        ]
                      },
                      {
                        "cues": "True",
                        "parent": "BMI"
                      }
                    ]
                  },
                  {
                    "cues": "False",
                    "parent": "Age"
                  }
                ]
              },
              {
                "cues": "True",
                "parent": "PLC2hrOGTT"
              }
            ]
          }
        ]

And so on for the remaining rows.
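The mapping from one row to the nested structure above can be sketched as follows. This is only a minimal sketch: the reading of the exits column (1 gives a "True" leaf, 0 gives a "False" leaf, and the final 0.5 gives both leaves) is an assumption inferred from the example for the first row, and the names build_tree and row_to_tree are hypothetical helpers, not part of pandas.

```python
import json


def build_tree(cues, directions, thresholds, exits, parent="null"):
    """Recursively nest the ';'-split fields of one row (hypothetical helper)."""
    cue = cues[0]
    node = {
        "cues": cue,
        "directions": directions[0],
        "thresholds": thresholds[0],
        "parent": parent,
        "children": [],
    }
    if len(cues) > 1:
        # Non-final node: the first child continues the chain of cues,
        # the second child is the exit leaf for this node.
        node["children"].append(
            build_tree(cues[1:], directions[1:], thresholds[1:], exits[1:], parent=cue)
        )
        # Assumption: exit "1" means a "True" leaf, anything else a "False" leaf.
        leaf = "True" if exits[0] == "1" else "False"
        node["children"].append({"cues": leaf, "parent": cue})
    else:
        # Final node (exit "0.5" in the data): assumed to carry both leaves.
        node["children"] = [
            {"cues": "False", "parent": cue},
            {"cues": "True", "parent": cue},
        ]
    return node


def row_to_tree(row):
    """Split the four ';'-separated fields of a row and nest them."""
    fields = [row[key].split(";") for key in ("cues", "directions", "thresholds", "exits")]
    return build_tree(*fields)


# First row of the DataFrame from the question, written as a plain dict.
row = {
    "cues": "PLC2hrOGTT;Age;BMI;TimesPregnant",
    "directions": ">;>;>;>",
    "thresholds": "126;29;29.7;6",
    "exits": "1;0;1;0.5",
}

result = [row_to_tree(row)]
print(json.dumps(result, indent=2))
```

For the whole DataFrame, the same helper could be applied to each record, for example with [row_to_tree(r) for r in tree_definitions.to_dict(orient="records")].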

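Currently, return tree_definitions.to_json(orient='records') does not produce this structure. So I am wondering whether there is a way to do this with to_json, or in any other way. How can I do it?

Output of tree_definitions.to_json(orient='records'):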

[{"tree":1,"nodes":4,"classes":"i;i;n;i","cues":"PLC2hrOGTT;Age;BMI;TimesPregnant","directions":">;>;>;>","thresholds":"126;29;29.7;6","exits":"1;0;1;0.5"},{"tree":2,"nodes":3,"classes":"i;i;n","cues":"PLC2hrOGTT;Age;BMI","directions":">;>;>","thresholds":"126;29;29.7","exits":"0;1;0.5"},{"tree":3,"nodes":4,"classes":"i;i;n;i","cues":"PLC2hrOGTT;Age;BMI;TimesPregnant","directions":">;>;>;>","thresholds":"126;29;29.7;6","exits":"1;0;0;0.5"},{"tree":4,"nodes":4,"classes":"i;i;n;i","cues":"PLC2hrOGTT;Age;BMI;TimesPregnant","directions":">;>;>;>","thresholds":"126;29;29.7;6","exits":"1;1;0;0.5"},{"tree":5,"nodes":4,"classes":"i;i;n;i","cues":"PLC2hrOGTT;Age;BMI;TimesPregnant","directions":">;>;>;>","thresholds":"126;29;29.7;6","exits":"0;1;0;0.5"},{"tree":6,"nodes":3,"classes":"i;i;n","cues":"PLC2hrOGTT;Age;BMI","directions":">;>;>","thresholds":"126;29;29.7","exits":"0;0;0.5"},{"tree":7,"nodes":4,"classes":"i;i;n;i","cues":"PLC2hrOGTT;Age;BMI;TimesPregnant","directions":">;>;>;>","thresholds":"126;29;29.7;6","exits":"1;1;1;0.5"},{"tree":8,"nodes":4,"classes":"i;i;n;i","cues":"PLC2hrOGTT;Age;BMI;TimesPregnant","directions":">;>;>;>","thresholds":"126;29;29.7;6","exits":"0;0;0;0.5"}]

Another view of the DataFrame I get; it consists of 8 different binary trees.

1 answer:

Answer 0 (score: 0)

You will need to process the data further. Each of ["cues", "exits", "directions", "thresholds"] needs to be split into 4 separate columns. You can then use groupby to work on (I assume it would be) "cues0", and so on. Once the data is grouped the way you want, take a look at this great code [link not preserved]. I am not sure how this behaves with missing values (for example, those in the "exits3" and "directions3" columns), so YMMV. Hope this helps.
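The column-splitting step described above might look roughly like the sketch below. It uses two rows copied from the question. The numbered names such as "cues0" follow the answer's wording, and the variable name wide is just an illustrative choice. Rows with only three nodes end up with missing values in the fourth column of each group, which is the case the answer warns about.

```python
import pandas as pd

# Two sample rows copied from the question (trees 1 and 2).
tree_definitions = pd.DataFrame({
    "tree": [1, 2],
    "nodes": [4, 3],
    "cues": ["PLC2hrOGTT;Age;BMI;TimesPregnant", "PLC2hrOGTT;Age;BMI"],
    "directions": [">;>;>;>", ">;>;>"],
    "thresholds": ["126;29;29.7;6", "126;29;29.7"],
    "exits": ["1;0;1;0.5", "0;1;0.5"],
})

# Start from the scalar columns and append the split pieces next to them.
wide = tree_definitions[["tree", "nodes"]].copy()
for col in ["cues", "exits", "directions", "thresholds"]:
    # expand=True returns one column per ';'-separated piece;
    # shorter rows are padded with missing values.
    parts = tree_definitions[col].str.split(";", expand=True)
    parts.columns = [f"{col}{i}" for i in parts.columns]
    wide = wide.join(parts)

print(wide.columns.tolist())
# The three-node tree has no fourth exit, so this cell is missing.
print(pd.isna(wide.loc[1, "exits3"]))
```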