Dictionary with tuples into a DataFrame

Asked: 2017-01-30 19:59:13

Tags: python pandas numpy dictionary dataframe

I have a Python dictionary whose keys are dates and whose values are tuples, as shown below.

dct = {'01/24/2017 01:10:23.1230':('a',12),
        '12/25/2016 10:12:45.128':('b',23),
        '11/16/2016 09:39:55.459':('c',45),
        '01/12/2017 15:55:20.783':('d',34)}

I want to write it to a DataFrame with a constant column (userid), as shown below.

   userid   Date                            value1                value2
0  123     '01/24/2017 01:10:23.1230'        a                     12
1  123     '12/25/2016 10:12:45.128'         b                     23
2  123     '11/16/2016 09:39:55.459'         c                     45
3  123     '01/12/2017 15:55:20.783'         d                     34

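I tried converting the dictionary to a list or a NumPy array before writing it to a DataFrame, but I could not split the tuples in the dictionary into separate columns. Any ideas?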

2 Answers:

Answer 0: (score: 3)

If you need to choose the position of the new column, you can use DataFrame.from_dict together with DataFrame.insert:

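import pandas as pd

d = {'01/24/2017 01:10:23.1230':('a',12),'12/25/2016 10:12:45.128':('b',23),'11/16/2016 09:39:55.459':('c',45),'01/12/2017 15:55:20.783':('d',34)}
# orient='index' makes each key a row; reset_index moves the keys into a column
df = pd.DataFrame.from_dict(d, orient='index').reset_index()
df.columns = ['Date','value1','value2']
# insert the constant column at position 0 (modifies df in place)
df.insert(0, 'userid', 123)
print (df)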
   userid                      Date value1  value2
0     123  01/24/2017 01:10:23.1230      a      12
1     123   12/25/2016 10:12:45.128      b      23
2     123   01/12/2017 15:55:20.783      d      34
3     123   11/16/2016 09:39:55.459      c      45

If you only need the new column appended at the end of the DataFrame:

df['userid'] = 123
print (df)
                       Date value1  value2  userid
0  01/24/2017 01:10:23.1230      a      12     123
1   12/25/2016 10:12:45.128      b      23     123
2   01/12/2017 15:55:20.783      d      34     123
3   11/16/2016 09:39:55.459      c      45     123

Or a solution using assign:

df = df.assign(userid=123)
print (df)
                       Date value1  value2  userid
0  01/24/2017 01:10:23.1230      a      12     123
1   12/25/2016 10:12:45.128      b      23     123
2   01/12/2017 15:55:20.783      d      34     123
3   11/16/2016 09:39:55.459      c      45     123
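The three ways of adding the constant column differ mainly in where the column lands and whether the original frame is modified. The sketch below uses a shortened copy of the dictionary, and the names base, with_userid and front are illustrative only. It shows that assign returns a new DataFrame and leaves the original untouched, while insert changes the frame in place.

```python
import pandas as pd

# Shortened sample data, same shape as the dictionary in the question
d = {'01/24/2017 01:10:23.1230': ('a', 12),
     '12/25/2016 10:12:45.128': ('b', 23)}

base = pd.DataFrame.from_dict(d, orient='index').reset_index()
base.columns = ['Date', 'value1', 'value2']

# assign returns a new DataFrame with the column appended; base is unchanged
with_userid = base.assign(userid=123)

# insert modifies the frame in place, so work on a copy to keep base intact
front = base.copy()
front.insert(0, 'userid', 123)

print(list(base.columns))
print(list(with_userid.columns))
print(list(front.columns))
```

If the column order matters, as in the question, insert (or reordering the columns afterwards) is the more direct choice.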

Edit, following a comment:

Use a dict comprehension to add the new value 123 to each tuple:

d1 = {k:(123, v[0], v[1]) for k,v in d.items()}
print (d1)
{'01/24/2017 01:10:23.1230': (123, 'a', 12), 
'11/16/2016 09:39:55.459': (123, 'c', 45), 
'01/12/2017 15:55:20.783': (123, 'd', 34), 
'12/25/2016 10:12:45.128': (123, 'b', 23)}

df = pd.DataFrame.from_dict(d1, orient='index').reset_index()
df.columns = ['Date','userid','value1','value2']
print (df)
                       Date  userid value1  value2
0  01/24/2017 01:10:23.1230     123      a      12
1   11/16/2016 09:39:55.459     123      c      45
2   01/12/2017 15:55:20.783     123      d      34
3   12/25/2016 10:12:45.128     123      b      23
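As a possible alternative to rebuilding the dictionary, the rows can be assembled directly as a list of tuples, which gives the column order from the question without a later rename. This is a minimal sketch, not part of the original answer; the names userid, rows and df_rows are made up for illustration, and the row order relies on dictionaries preserving insertion order (Python 3.7 and later).

```python
import pandas as pd

d = {'01/24/2017 01:10:23.1230': ('a', 12),
     '12/25/2016 10:12:45.128': ('b', 23),
     '11/16/2016 09:39:55.459': ('c', 45),
     '01/12/2017 15:55:20.783': ('d', 34)}

userid = 123  # the constant value requested in the question

# One flat tuple per row: (userid, date key, unpacked tuple values)
rows = [(userid, date, *values) for date, values in d.items()]

df_rows = pd.DataFrame(rows, columns=['userid', 'Date', 'value1', 'value2'])
print(df_rows)
```

Because each tuple is unpacked with *values, the same pattern should also work if the tuples hold more than two items, provided the list of column names is extended to match.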

Answer 1: (score: 0)

Something like this:

pd.DataFrame(data=dct).T.reset_index()
Out[13]: 
                      index  0   1
0   01/12/2017 15:55:20.783  d  34
1  01/24/2017 01:10:23.1230  a  12
2   11/16/2016 09:39:55.459  c  45
3   12/25/2016 10:12:45.128  b  23
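The result above still has the generic column names index, 0 and 1, and no userid column. The following sketch shows one way it might be finished; the name out is illustrative. Note that each column mixes strings and integers before the transpose, so the transposed columns have object dtype, and the numeric column is converted back explicitly here.

```python
import pandas as pd

dct = {'01/24/2017 01:10:23.1230': ('a', 12),
       '12/25/2016 10:12:45.128': ('b', 23),
       '11/16/2016 09:39:55.459': ('c', 45),
       '01/12/2017 15:55:20.783': ('d', 34)}

# Keys become columns first, then rows after the transpose
out = pd.DataFrame(data=dct).T.reset_index()
out.columns = ['Date', 'value1', 'value2']

# The transpose leaves object dtype; restore an integer dtype for value2
out = out.astype({'value2': 'int64'})

# Add the constant column at the front
out.insert(0, 'userid', 123)
print(out.dtypes)
```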

PS: Don't use dict as a variable name, or you will shadow the built-in dict class.