How to dense rank within groups with a sum limit?

Time: 2018-12-27 03:35:40

Tags: python pandas numpy

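I want to implement this with a groupby and rank, with one condition: if a container's total weight would exceed 2000, the load should go into the next container. Can pandas do this?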

I have the following data:

+----------+------+--------+
| Load No. | Code | Weight |
+----------+------+--------+
| 1        | 4000 |    200 |
| 2        | 4000 |   1800 |
| 3        | 4000 |    400 |
| 4        | 4000 |   1000 |
| 5        | 5000 |   1000 |
| 6        | 5000 |    800 |
| 7        | 5000 |   1200 |
+----------+------+--------+

Expected output:

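+----------+------+--------+-----------+-----------+
| Load No. | Code | Weight | Container | Total Sum |
+----------+------+--------+-----------+-----------+
| 1        | 4000 |    200 |         1 |      2000 |
| 2        | 4000 |   1800 |         1 |      2000 |
| 3        | 4000 |    400 |         2 |      1400 |
| 4        | 4000 |   1000 |         2 |      1400 |
| 5        | 5000 |   1000 |         3 |      1800 |
| 6        | 5000 |    800 |         3 |      1800 |
| 7        | 5000 |   1200 |         4 |      1200 |
+----------+------+--------+-----------+-----------+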

1 Answer:

Answer 0: (score: 0)

One way to get the Container column is to bin the cumulative weight into steps of 2000:

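import numpy as np
import pandas as pd

# Running total of Weight, expressed in units of the 2000 limit
s = df.Weight.cumsum() / 2000
# Bin into (0, 1], (1, 2], ...; the bin code plus 1 is the container number
pd.cut(s, np.arange(0, max(s) + 1, 1)).cat.codes + 1
0    1
1    1
2    2
3    2
4    3
5    3
6    4
dtype: int8
df['container'] = pd.cut(s, np.arange(0, max(s) + 1, 1)).cat.codes + 1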
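To make the binning step reproducible, here is a minimal self-contained sketch that rebuilds the sample frame from the question and applies the same `cumsum` and `pd.cut` logic. The names `LIMIT` and `edges` are introduced here for illustration only and do not come from the answer.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Load No.": [1, 2, 3, 4, 5, 6, 7],
    "Code": [4000, 4000, 4000, 4000, 5000, 5000, 5000],
    "Weight": [200, 1800, 400, 1000, 1000, 800, 1200],
})

LIMIT = 2000  # container capacity from the question (illustrative name)

# Running total in units of LIMIT: 0.1, 1.0, 1.2, 1.7, 2.2, 2.6, 3.2
s = df["Weight"].cumsum() / LIMIT

# Bin edges 0, 1, 2, ...; pd.cut uses right-closed intervals (0, 1], (1, 2], ...
edges = np.arange(0, s.max() + 1, 1)

# Category codes start at 0, so add 1 to get container numbers
container = pd.cut(s, edges).cat.codes + 1
print(container.tolist())
```

Because the intervals are closed on the right, a running total of exactly 2000 still lands in container 1, which is why the second load stays with the first.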

Then we use transform to get each container's total:

df['total sum'] = df.groupby('container').Weight.transform('sum')
df
   LoadNo.  Code  Weight  container  total sum
0        1  4000     200          1       2000
1        2  4000    1800          1       2000
2        3  4000     400          2       1400
3        4  4000    1000          2       1400
4        5  5000    1000          3       1800
5        6  5000     800          3       1800
6        7  5000    1200          4       1200
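
Note that the cumulative-sum binning above never resets the running total when a container closes, and it does not look at Code. It reproduces the expected output for this sample, but it may not for other data. As a hedged alternative, the sketch below fills containers greedily, row by row. It assumes that loads with different Code values must not share a container, which the question suggests but does not state outright. The function name `assign_containers` and its parameters are hypothetical.

```python
import pandas as pd

def assign_containers(df, limit=2000, weight_col="Weight", group_col="Code"):
    """Greedy fill: open a new container when the group changes
    or when adding the next weight would exceed the limit."""
    containers = []
    current = 0            # current container number (0 means none opened yet)
    running = 0            # weight already placed in the current container
    previous_group = None
    for group, weight in zip(df[group_col], df[weight_col]):
        if current == 0 or group != previous_group or running + weight > limit:
            current += 1
            running = 0
        running += weight
        containers.append(current)
        previous_group = group
    return pd.Series(containers, index=df.index, name="container")

# Sample data copied from the question
df = pd.DataFrame({
    "Load No.": [1, 2, 3, 4, 5, 6, 7],
    "Code": [4000, 4000, 4000, 4000, 5000, 5000, 5000],
    "Weight": [200, 1800, 400, 1000, 1000, 800, 1200],
})

df["container"] = assign_containers(df)
df["total sum"] = df.groupby("container")["Weight"].transform("sum")
print(df)
```

On the sample data this gives the same containers (1, 1, 2, 2, 3, 3, 4) and totals as the expected output. The two approaches diverge on other inputs: with four loads of 1200 under one code, the running totals are 1200, 2400, 3600, 4800, so the binning approach puts loads 2 and 3 into one container with a total of 2400, while the greedy loop gives each load its own container. A single load heavier than the limit still gets a container of its own here; how to handle that case is left open.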