Question

Let's say I have the following Pandas DataFrame:

Name     Day    Earnings
Aaron    1      100
Aaron    3      250
Aaron    4      125
Bill     2      55
Bill     3      62
Bill     5      1000

I would like to end up with:

Name         Series
Aaron       [1:100, 2:0, 3:250, 4:125]
Bill        [1:0, 2:55, 3:62, 4:0, 5:1000]

Can I do this with plain old apply and groupby? (If so, I can't find the right combination.) Or is there a better way?

The closest I've gotten so far is:

>>> for Name, Info in df.groupby('Name'):
...    print(list(zip(Info['Day'], Info['Earnings'])))
... 
[(1, 100), (3, 250), (4, 125)]
[(2, 55), (3, 62), (5, 1000)]

Here is the csv I used to generate the DataFrame:

Name,Day,Earnings
Aaron,1,100
Aaron,3,250
Aaron,4,125
Bill,2,55
Bill,3,62
Bill,5,1000

Answer 1

You could do it like this:

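import re
import itertools

names = '''Aaron    1      100
Aaron    3      250
Aaron    4      125
Bill     2      55
Bill     3       62
Bill     5       1000'''

# Split each row into [name, day, earnings].
rows = [re.split(r'\W+', line) for line in names.split('\n')]

# itertools.groupby only groups consecutive rows, so the rows must already
# be sorted by name. Days missing from the data are not filled in with 0.
print([(name, ['%s:%s' % (row[1], row[2]) for row in group])
       for name, group in itertools.groupby(rows, key=lambda row: row[0])])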
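
The snippet above pairs each day with its earnings but leaves out the days that have no row. As a rough sketch of one way to get the zero-filled output from the question with pandas itself, the following reindexes each person's earnings over days 1 through their last recorded day. The helper name fill_days and the variable series_by_name are made up for illustration, and the choice to stop at each person's own last day is an assumption read off the desired output.

```python
import io
import pandas as pd

# The CSV from the question, inlined so the example is self-contained.
csv_text = """Name,Day,Earnings
Aaron,1,100
Aaron,3,250
Aaron,4,125
Bill,2,55
Bill,3,62
Bill,5,1000
"""

df = pd.read_csv(io.StringIO(csv_text))

def fill_days(group):
    # Hypothetical helper: index earnings by day, then reindex over
    # 1..last recorded day so that missing days appear with 0.
    earnings = group.set_index('Day')['Earnings']
    full = earnings.reindex(range(1, earnings.index.max() + 1), fill_value=0)
    return ['%d:%d' % (day, value) for day, value in full.items()]

# One zero-filled list of "day:earnings" strings per name.
series_by_name = {name: fill_days(group) for name, group in df.groupby('Name')}

for name, series in series_by_name.items():
    print(name, series)
```

If every person should instead cover the same range of days, the range passed to reindex could be built from df['Day'].max() rather than from each group's own maximum.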

Best way to generate an ordered series from one or more columns of a pandas DataFrame?
