How do I group and sum at the same time with pandas?

Time: 2017-10-16 20:14:40

Tags: python pandas

I have a DataFrame, for example

Year    Age     Count
1999    0       80
        1       80
        2       80
        3       80
        4       90
        5       100
        ...
2000    0       60
        ....

I want to group the ages into ranges such as [0,5), [5,10), ... and get the total count for each range. So the above would become

Year    Age     Count
1999    0-4     410
        5-9     ...
        ...
2000    0-4     ...
        ...

Is there a simple way to do this with groupby and sum?

1 answer:

Answer 0 (score: 0)

You can use pd.cut() (as @MaxU suggested) to create an intermediate Age_Range column:

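    cut_points = range(0, df.Age.max() + 6, 5)
    df['Age_Range'] = pd.cut(df.Age, cut_points, right=False)
    df.groupby(['Year', 'Age_Range'])['Count'].sum()

The range() call creates the cut points: from 0 up to just past the maximum age, in steps of 5. Passing right=False makes each bin closed on the left and open on the right ([0, 5), [5, 10), ...), which is what the question asks for. With the default right=True the bins would be (0, 5], (5, 10], ... and an age of 0 would not fall into any bin.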
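For anyone who wants to try this end to end, here is a minimal self-contained sketch. The sample values are made up to resemble the question's table, and it assumes Year and Age are ordinary columns (if they are index levels, call df.reset_index() first). The labels list, which produces names like "0-4", is an addition for illustration and not part of the original answer.

```python
import pandas as pd

# Hypothetical sample data shaped like the table in the question.
df = pd.DataFrame({
    "Year": [1999] * 7 + [2000] * 3,
    "Age": [0, 1, 2, 3, 4, 5, 10, 0, 4, 5],
    "Count": [80, 80, 80, 80, 90, 100, 30, 60, 70, 50],
})

# Bin edges 0, 5, 10, ... The last edge must be greater than the maximum
# age, because right=False leaves every bin open on the right.
edges = list(range(0, int(df["Age"].max()) + 6, 5))

# One label per bin, e.g. "0-4" for [0, 5) and "5-9" for [5, 10).
labels = [f"{lo}-{lo + 4}" for lo in edges[:-1]]

df["Age_Range"] = pd.cut(df["Age"], bins=edges, right=False, labels=labels)

# observed=True keeps only the (Year, Age_Range) pairs present in the data.
totals = df.groupby(["Year", "Age_Range"], observed=True)["Count"].sum()
print(totals)
```

With this sample data the 1999 / 0-4 total comes out to 410, matching the expected output in the question. If you would rather see every age range for every year, including empty ones with a sum of 0, pass observed=False instead.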