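Converting tabular data into a nested dictionary that represents a tree

Time: 2015-11-02 18:25:40

Tags: python-2.7

I am trying to convert data from a table into a nested dictionary using Python 2.7.

Here is a sample of the data: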

P  C  GC  GGC  
--------------
P1 C1 GC1 GGC1
P1 C1 GC1 GGC2
P1 C1 GC2 GGC3
P1 C1 GC2 GGC4
P1 C1 GC2 GGC5
P1 C2 GC3 GGC6
P1 C2 GC3 GGC7
P1 C2 GC4 GGC8
P1 C2 GC4 GGC9
P2 C3 GC5 GGC10
P2 C3 GC5 GGC11
P2 C3 GC5 GGC12

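The columns here represent parent (P), child (C), grandchild (GC), and great-grandchild (GGC), so the hierarchy is 4 levels deep. (I have a case that may be 5 or 6 levels deep, but I don't need a solution that adjusts dynamically to the number of levels. A solution hard-coded to 4 levels is fine.)

I need to convert this data into a nested dictionary that represents a tree. (It will later feed a tree-view style UI element.)

The expected output is: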

[
  {
    text: "P1",
    nodes: [
      {
        text: "C1",
        nodes: [
          {
            text: "GC1",
            nodes: [
              {
                text: "GGC1"
              },
              {
                text: "GGC2"
              }
            ]
          },
          {
            text: "GC2",
            nodes: [
              {
                text: "GGC3"
              },
              {
                text: "GGC4"
              },
              {
                text: "GGC5"
              }
            ]
          }
        ]
      },
      {
        text: "C2",
        nodes: [
          {
            text: "GC3",
            nodes: [
              {
                text: "GGC6"
              },
              {
                text: "GGC7"
              }
            ]
          },
          {
            text: "GC4",
            nodes: [
              {
                text: "GGC8"
              },
              {
                text: "GGC9"
              }
            ]
          }
        ]
      }
    ]
  },
  {
    text: "P2",
    nodes: [
      {
        text: "C3",
        nodes: [
          {
            text: "GC5",
            nodes: [
              {
                text: "GGC10"
              },
              {
                text: "GGC11"
              },
              {
                text: "GGC12"
              }
            ]
          }
        ]
      }
    ]
  }
];

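Each column in a row maps to one level of the hierarchy, with the parent at the top level.

What is the best way to solve this? (Solutions using pandas < 0.15.1 are fine too.)

PS: Python newbie here.

1 answer:

Answer 0 (score: 1)

You can iterate over the df and print each row to get a similar structure.

I used a pandas DataFrame for the table, but that is not required; a plain list works too.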

df
Out[36]: 
        km  price   1b  1c   1d
0   240000   3650  hey  yo  OMG
1   139800   3800  hey  yo  OMG
2   150500   4400  hey  yo  OMG
3   185530   4450  hey  yo  OMG
4   176000   5250  hey  yo  OMG
5   114800   5350  hey  yo  OMG
6   166800   5800  hey  yo  OMG
7    89000   5990  hey  yo  OMG
8   144500   5999  hey  yo  OMG
9    84000   6200  hey  yo  OMG
10   82029   6390  hey  yo  OMG



# Each row prints as its own standalone chain; rows are not merged into a shared tree.
for _, row in df.iterrows():
    print('{text: "%s",nodes: [{text: "%s",nodes: [{text: "%s",nodes: [{text: "%s" },{text: "%s"}]},]}]},' % tuple(row.iloc[:5]))

{text: "240000",nodes: [{text: "3650",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "139800",nodes: [{text: "3800",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "150500",nodes: [{text: "4400",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "185530",nodes: [{text: "4450",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "176000",nodes: [{text: "5250",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "114800",nodes: [{text: "5350",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "166800",nodes: [{text: "5800",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "89000",nodes: [{text: "5990",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "144500",nodes: [{text: "5999",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "84000",nodes: [{text: "6200",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
{text: "82029",nodes: [{text: "6390",nodes: [{text: "hey",nodes: [{text: "yo" },{text: "OMG"}]},]}]},
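
Note that the loop above prints one separate chain per row, so repeated parents are not merged. If a single merged tree is wanted, as in the expected output from the question, one possible approach is to group the rows level by level with nested dictionaries and then convert those into text/nodes entries. The following is only a sketch under some assumptions: the rows are plain tuples ordered from parent to great-grandchild, and the names `rows`, `build_tree`, and `to_nodes` are made up for illustration. It sticks to the standard library and should run on Python 2.7 as well as Python 3.

```python
from collections import OrderedDict
import json

# Sample rows from the question: (parent, child, grandchild, great-grandchild).
rows = [
    ("P1", "C1", "GC1", "GGC1"),
    ("P1", "C1", "GC1", "GGC2"),
    ("P1", "C1", "GC2", "GGC3"),
    ("P1", "C1", "GC2", "GGC4"),
    ("P1", "C1", "GC2", "GGC5"),
    ("P1", "C2", "GC3", "GGC6"),
    ("P1", "C2", "GC3", "GGC7"),
    ("P1", "C2", "GC4", "GGC8"),
    ("P1", "C2", "GC4", "GGC9"),
    ("P2", "C3", "GC5", "GGC10"),
    ("P2", "C3", "GC5", "GGC11"),
    ("P2", "C3", "GC5", "GGC12"),
]


def build_tree(rows):
    """Merge rows into a nested list of {"text": ..., "nodes": [...]} entries."""
    # Step 1: walk each row down a nested OrderedDict, creating levels as needed.
    # OrderedDict keeps labels in first-seen order on Python 2.7 as well.
    root = OrderedDict()
    for row in rows:
        level = root
        for value in row:
            level = level.setdefault(value, OrderedDict())

    # Step 2: turn the nested dictionaries into text/nodes entries.
    def to_nodes(level):
        nodes = []
        for text, children in level.items():
            node = OrderedDict([("text", text)])
            if children:  # leaves carry no "nodes" key, as in the expected output
                node["nodes"] = to_nodes(children)
            nodes.append(node)
        return nodes

    return to_nodes(root)


tree = build_tree(rows)
print(json.dumps(tree, indent=2))
```

With a DataFrame, `build_tree(df.values.tolist())` should feed the same function. The sketch does not depend on the number of columns, so a 5- or 6-level table would go through unchanged.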