Problem using count, groupby, np.repeat and agg with pandas

Time: 2018-10-23 15:28:25

Tags: python python-3.x pandas numpy pandas-groupby

I have a pandas DataFrame similar to this one:

df = pd.DataFrame({'x': np.random.rand(61800), 'y':np.random.rand(61800), 'z':np.random.rand(61800)})

I need to compute the following result from this dataset:

extract = df.assign(count=np.repeat(range(10),10)).groupby('count',as_index=False).agg(['mean','min', 'max'])


However, if I use np.repeat(range(150), 150) instead, I get an error.

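1 Answer:

Answer 0: (score: 1)

This does not work because the array you pass to .assign must contain exactly as many values as the original DataFrame has rows: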

In [81]: df = pd.DataFrame({'x': np.random.rand(61800), 'y':np.random.rand(61800), 'z':np.random.rand(61800)})

In [82]: df.assign(count=np.repeat(range(10),10))

ValueError: Length of values does not match length of index
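
To make the mismatch concrete, here is a minimal sketch that only compares lengths. The variable names labels_10 and labels_150 are illustrative and not from the original post. np.repeat(range(n), k) produces n * k values, so neither call from the question yields the 61,800 values that the DataFrame needs.

```python
import numpy as np
import pandas as pd

# Same shape as the DataFrame in the question (random values, 61,800 rows).
df = pd.DataFrame({'x': np.random.rand(61800),
                   'y': np.random.rand(61800),
                   'z': np.random.rand(61800)})

# np.repeat(range(n), k) returns n * k values.
labels_10 = np.repeat(range(10), 10)      # 10 * 10 = 100 values
labels_150 = np.repeat(range(150), 150)   # 150 * 150 = 22,500 values

# Neither length matches the number of rows, so .assign raises ValueError.
print(len(df), len(labels_10), len(labels_150))
```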

In this case, if we build 10 groups and repeat each label 6,180 times (10 × 6,180 = 61,800), everything works:

In [83]: df.assign(count=np.repeat(range(10),6180))
Out[83]:
              x         y         z  count
0      0.781364  0.996545  0.756592      0
1      0.609127  0.981688  0.626721      0
2      0.547029  0.167678  0.198857      0
3      0.184405  0.484623  0.219722      0
4      0.451698  0.535085  0.045942      0
...         ...       ...       ...    ...
61795  0.783192  0.969306  0.974836      9
61796  0.890720  0.286384  0.744779      9
61797  0.512688  0.945516  0.907192      9
61798  0.526564  0.165620  0.766733      9
61799  0.683092  0.976219  0.524048      9

[61800 rows x 4 columns]

In [84]: extract = df.assign(count=np.repeat(range(10),6180)).groupby('count',as_index=False).agg(['mean','min', 'max'])

In [85]: extract
Out[85]:
              x                             y                             z
           mean       min       max      mean       min       max      mean       min       max
count
0      0.502338  0.000230  0.999546  0.501603  0.000263  0.999842  0.503807  0.000113  0.999826
1      0.500392  0.000059  0.999979  0.499935  0.000012  0.999767  0.500114  0.000230  0.999811
2      0.498377  0.000023  0.999832  0.496921  0.000003  0.999475  0.502887  0.000028  0.999828
3      0.504970  0.000637  0.999680  0.500943  0.000256  0.999902  0.497370  0.000257  0.999969
4      0.501195  0.000290  0.999992  0.498617  0.000149  0.999779  0.497895  0.000022  0.999877
5      0.499476  0.000186  0.999956  0.503227  0.000308  0.999907  0.504688  0.000100  0.999756
6      0.495488  0.000378  0.999606  0.499893  0.000119  0.999740  0.495924  0.000031  0.999556
7      0.498443  0.000005  0.999417  0.495728  0.000262  0.999972  0.501255  0.000087  0.999978
8      0.494110  0.000014  0.999888  0.495197  0.000074  0.999970  0.493215  0.000166  0.999718
9      0.496333  0.000365  0.999307  0.502074  0.000110  0.999856  0.499164  0.000035  0.999927
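
The same idea can be applied to the 150 groups from the question. The following is a sketch, not part of the original answer: it assumes the row count divides evenly by the number of groups (61,800 / 150 = 412), and the names n_groups, rows_per_group and labels are illustrative. It derives the repeat count from len(df) rather than hard-coding it, and uses the default as_index=True so that count becomes the index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'x': np.random.rand(61800),
                   'y': np.random.rand(61800),
                   'z': np.random.rand(61800)})

n_groups = 150
# Derive how many consecutive rows fall into each group: 61800 / 150 = 412.
rows_per_group, remainder = divmod(len(df), n_groups)
if remainder != 0:
    raise ValueError("row count is not a multiple of the number of groups")

# One label per row: 0 repeated 412 times, then 1 repeated 412 times, and so on.
labels = np.repeat(np.arange(n_groups), rows_per_group)

# Equivalent labels via integer division of the row position.
labels_alt = np.arange(len(df)) // rows_per_group

extract = (df.assign(count=labels)
             .groupby('count')
             .agg(['mean', 'min', 'max']))
print(extract.shape)
```

The integer-division variant is handy when the row count is not an exact multiple of the chunk size, since it always returns one label per row; the last group is then simply smaller.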