Nested dictionary of variable-length lists to a pandas DataFrame

Time: 2016-01-11 11:36:08

Tags: python dictionary pandas

My dictionary looks like this:

{ I1 : [['A',1],['B',2],['C',3]],
  I2 : [['B',2],['D',4]],
  I3 : [['A',2],['E',5]]
}

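That is, I have an index (the key) followed by a variable number of pairs. I want to create a pandas DataFrame with the same index as the dictionary, where the columns are the first values of the pairs, the cell values are the second values, and NaN fills any missing entries (for example, row I2 would have NaN in column 'A'). Is there a slick way to do this?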

3 answers:

Answer 0 (score: 3)

import pandas as pd

a={ 'I1' : [['A',1],['B',2],['C',3]],
    'I2' : [['B',2],['D',4]],
    'I3' : [['A',2],['E',5]]
  }


# Build one dict per row, visiting the keys in numeric order.
# Sorting is needed because key order in a dict is not maintained
# (it is only guaranteed from Python 3.7 onwards).
# The sort key strips the leading 'I' and converts the rest to int:
# sorting on the raw strings would put 'I15' before 'I4' in the more
# general case of having more than just 3 indexes.
row_dict = [dict(a[idx]) for idx in sorted(a, key=lambda k: int(k[1:]))]

df=pd.DataFrame(row_dict)


    A   B   C   D   E
0   1   2   3 NaN NaN
1 NaN   2 NaN   4 NaN
2   2 NaN NaN NaN   5
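
Note that the frame above has a default 0..2 index rather than the dictionary keys the question asked for. A minimal sketch of one way to keep them, assuming Python 3.7+ (so the dict keeps insertion order) and string keys; the name `df_keyed` is only illustrative:

```python
import pandas as pd

a = {'I1': [['A', 1], ['B', 2], ['C', 3]],
     'I2': [['B', 2], ['D', 4]],
     'I3': [['A', 2], ['E', 5]]}

# One dict per row (pair[0] -> pair[1]); the dict keys become the row labels.
# Pairs missing from a row are filled with NaN by the DataFrame constructor.
df_keyed = pd.DataFrame([dict(pairs) for pairs in a.values()],
                        index=list(a.keys()))
print(df_keyed)
```

With this layout a cell can be looked up by label, for example `df_keyed.loc['I2', 'A']`, which is NaN here.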

Answer 1 (score: 2)

Assuming I1, I2, I3 are strings, you can use this:

import pandas as pd

a={ 'I1' : [['A',1],['B',2],['C',3]],
  'I2' : [['B',2],['D',4]],
  'I3' : [['A',2],['E',5]]
}

df=pd.DataFrame([dict(val) for key,val in a.items()])
print(df)

    A   B   C   D   E
0   1   2   3 NaN NaN
1   2 NaN NaN NaN   5
2 NaN   2 NaN   4 NaN

Answer 2 (score: 2)

You could use @manu190455's solution, but sort the items first, using sorted with its key argument, before passing them to pandas.DataFrame:

import pandas as pd

d = { 'I1' : [['A',1],['B',2],['C',3]],
    'I2' : [['B',2],['D',4]],
    'I3' : [['A',2],['E',5]]}

sorted_d = sorted(d.items(), key = lambda x: x[0])

In [263]: sorted_d
Out[263]:
[('I1', [['A', 1], ['B', 2], ['C', 3]]),
 ('I2', [['B', 2], ['D', 4]]),
 ('I3', [['A', 2], ['E', 5]])]

df = pd.DataFrame([dict(val) for key, val in sorted_d])

In [265]: df
Out[265]:
    A   B   C   D   E
0   1   2   3 NaN NaN
1 NaN   2 NaN   4 NaN
2   2 NaN NaN NaN   5
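
Sorting on the key string works for I1 to I3, but the comparison is lexicographic, so with more rows 'I15' would sort before 'I4'. A sketch of a numeric sort key, assuming every key is a single letter followed by digits; the keys `I4` and `I15` and their values are made up for illustration:

```python
import pandas as pd

d = {'I15': [['A', 7]],
     'I4': [['B', 2], ['D', 4]],
     'I1': [['A', 1], ['B', 2], ['C', 3]]}

# Plain string sorting compares character by character: 'I15' < 'I4'.
string_order = sorted(d)

# Sort on the integer that follows the leading letter instead.
ordered_keys = sorted(d, key=lambda k: int(k[1:]))

# Build the rows in that order and keep the keys as the index.
df_sorted = pd.DataFrame([dict(d[k]) for k in ordered_keys],
                         index=ordered_keys)
print(df_sorted)
```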