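Get lists of rows that share the same name from a DataFrame using pandas

Time: 2017-11-22 08:26:23

Tags: python-3.x pandas

I am looking for a way to collect, for each name, the list of rows that carry that name.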

Name    x   y   r
  a     9   81  63
  a     98  5   89
  b     51  50  73
  b     41  22  14
  c     6   18  1
  c     1   93  55
  d     57  2   90
  d     58  24  20
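
To make the snippets in the answers reproducible, the table above can be loaded into a DataFrame. This is a minimal sketch, assuming the frame is called `df` (the name the answers use) and that `x`, `y` and `r` hold plain integers:

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["a", "a", "b", "b", "c", "c", "d", "d"],
    "x": [9, 98, 51, 41, 6, 1, 57, 58],
    "y": [81, 5, 50, 22, 18, 93, 2, 24],
    "r": [63, 89, 73, 14, 1, 55, 90, 20],
})

print(df)
```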

So I am trying to get a dictionary like the following:

di = {'a': {0: [9, 81, 63], 1: [98, 5, 89]},
      'b': {0: [51, 50, 73], 1: [41, 22, 14]},
      'c': {0: [6, 18, 1], 1: [1, 93, 55]},
      'd': {0: [57, 2, 90], 1: [58, 24, 20]}}

2 answers:

Answer 0: (score: 5)

Sometimes it is best to minimize footprint and overhead. Use itertools.count and collections.defaultdict:

from itertools import count
from collections import defaultdict

counts = {k: count(0) for k in df.Name.unique()}
d = defaultdict(dict)

for k, *v in df.values.tolist():
    d[k][next(counts[k])] = v

dict(d)

{'a': {0: [9, 81, 63], 1: [98, 5, 89]},
 'b': {0: [51, 50, 73], 1: [41, 22, 14]},
 'c': {0: [6, 18, 1], 1: [1, 93, 55]},
 'd': {0: [57, 2, 90], 1: [58, 24, 20]}}
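
For anyone who wants to run this end to end, the loop can be wrapped in a function. The following is only a sketch under a few assumptions: the helper name `rows_by_name` and its `key` parameter are hypothetical, the sample frame is rebuilt from the question's table, and `defaultdict(count)` is used to create one counter per name lazily instead of building the `counts` dict up front.

```python
from collections import defaultdict
from itertools import count

import pandas as pd


def rows_by_name(df, key="Name"):
    """Hypothetical helper: map each key value to {0: row, 1: row, ...}."""
    value_cols = [c for c in df.columns if c != key]
    counters = defaultdict(count)  # a fresh 0, 1, 2, ... counter per key value
    result = defaultdict(dict)
    # Put the key column first so it can be unpacked as k, with the rest in v.
    for k, *v in df[[key] + value_cols].values.tolist():
        result[k][next(counters[k])] = v
    return dict(result)


# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["a", "a", "b", "b", "c", "c", "d", "d"],
    "x": [9, 98, 51, 41, 6, 1, 57, 58],
    "y": [81, 5, 50, 22, 18, 93, 2, 24],
    "r": [63, 89, 73, 14, 1, 55, 90, 20],
})

di = rows_by_name(df)
print(di)
```

Because the rows are visited in their original order, the inner keys follow the order in which each name appears in the frame.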

Answer 1: (score: 4)

Use groupby with a custom function that numbers the row lists of each group, then convert the resulting Series with to_dict:

# Select the columns with a list (double brackets), not a bare tuple of names.
di = (df.groupby('Name')[['x', 'y', 'r']]
        .apply(lambda x: dict(zip(range(len(x)), x.values.tolist())))
        .to_dict())

print (di)
{'b': {0: [51, 50, 73], 1: [41, 22, 14]}, 
 'a': {0: [9, 81, 63], 1: [98, 5, 89]}, 
 'c': {0: [6, 18, 1], 1: [1, 93, 55]}, 
 'd': {0: [57, 2, 90], 1: [58, 24, 20]}}

Details:

print (df.groupby('Name')[['x', 'y', 'r']]
         .apply(lambda x: dict(zip(range(len(x)), x.values.tolist()))))
Name
a      {0: [9, 81, 63], 1: [98, 5, 89]}
b    {0: [51, 50, 73], 1: [41, 22, 14]}
c       {0: [6, 18, 1], 1: [1, 93, 55]}
d     {0: [57, 2, 90], 1: [58, 24, 20]}
dtype: object

Thanks to volcano for suggesting enumerate:

di = (df.groupby('Name')[['x', 'y', 'r']]
       .apply(lambda x: dict(enumerate(x.values.tolist())))
       .to_dict())
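
A self-contained sketch of the enumerate variant follows, with the sample frame rebuilt from the question's table. It selects the columns with a list (`[['x', 'y', 'r']]`), which is the form newer pandas releases expect. The names `cols` and `expected` are introduced here only for illustration.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "Name": ["a", "a", "b", "b", "c", "c", "d", "d"],
    "x": [9, 98, 51, 41, 6, 1, 57, 58],
    "y": [81, 5, 50, 22, 18, 93, 2, 24],
    "r": [63, 89, 73, 14, 1, 55, 90, 20],
})

cols = ["x", "y", "r"]  # value columns to collect for each name

# For every group, number its rows 0, 1, 2, ... and store them in a dict;
# the result is a Series of dicts indexed by Name, converted to a plain dict.
di = (df.groupby("Name")[cols]
        .apply(lambda g: dict(enumerate(g.values.tolist())))
        .to_dict())

# The structure requested in the question.
expected = {
    "a": {0: [9, 81, 63], 1: [98, 5, 89]},
    "b": {0: [51, 50, 73], 1: [41, 22, 14]},
    "c": {0: [6, 18, 1], 1: [1, 93, 55]},
    "d": {0: [57, 2, 90], 1: [58, 24, 20]},
}
print(di == expected)
```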

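To inspect what happens for each group, a custom function can be used:

def f(x):
    # x is the sub-DataFrame for one group; uncomment to inspect it
    #print (x)
    a = range(len(x))       # positions 0..n-1 within the group
    b = x.values.tolist()   # rows of the group as lists
    print (a)
    print (b)
    return dict(zip(a,b))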

[[9, 81, 63], [98, 5, 89]]
range(0, 2)
[[9, 81, 63], [98, 5, 89]]
range(0, 2)
[[51, 50, 73], [41, 22, 14]]
range(0, 2)
[[6, 18, 1], [1, 93, 55]]
range(0, 2)
[[57, 2, 90], [58, 24, 20]]

di = df.groupby('Name')[['x', 'y', 'r']].apply(f).to_dict()
print (di)