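I am trying to group my data by the Department attribute and then fill the blank rows above and below the populated values in two fields (Rating and Number) that sit in the middle of the dataset.
I have tried to get groupby working, but without success. My plan was to get groupby working first and then apply the code below to see whether I could get the fill to work.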
# This won't work on its own because I need to group the data first.
df = df.mask(df == 0).ffill()
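For reference, a grouped version of that forward fill might look like the sketch below. The small `demo` frame and the `Filled` and `Plain` column names are made up for illustration. Note that a grouped `ffill` only copies existing values downward within each group, so on its own it would not produce new labels such as "1. Too Low" or "3. Too High".

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: zeros stand for "missing" values.
demo = pd.DataFrame({
    "Department": ["Admin", "Admin", "Admin", "Distribution", "Distribution"],
    "Number": [2, 0, 0, 0, 5],
})

# Turn zeros into NaN so that they can be filled.
demo["Number"] = demo["Number"].mask(demo["Number"] == 0)

# Plain forward fill: values leak across the Department boundary.
demo["Plain"] = demo["Number"].ffill()

# Grouped forward fill: each Department is filled independently.
demo["Filled"] = demo.groupby("Department")["Number"].ffill()

print(demo)
```

In the grouped column, the first Distribution row stays empty because there is no earlier value inside that group to carry forward.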
This is what I am starting with:
| Department   | Range    | Rating       | Number |
|--------------|----------|--------------|--------|
| Admin        | 0 (None) |              |        |
| Admin        | 01 to 3  |              |        |
| Admin        | 01 to 3  |              |        |
| Admin        | 01 to 3  |              |        |
| Admin        | 04 to 6  | 2. On Target | 2      |
| Admin        | 04 to 6  | 2. On Target | 2      |
| Admin        | 04 to 6  | 2. On Target | 2      |
| Admin        | 07 to 10 |              |        |
| Admin        | 07 to 10 |              |        |
| Admin        | 07 to 10 |              |        |
| Admin        | 07 to 10 |              |        |
| Distribution | 0 (None) |              |        |
| Distribution | 01 to 3  |              |        |
| Distribution | 01 to 3  |              |        |
| Distribution | 01 to 3  |              |        |
| Distribution | 04 to 6  | 2. On Target | 2      |
| Distribution | 04 to 6  | 2. On Target | 2      |
| Distribution | 04 to 6  | 2. On Target | 2      |
| Distribution | 07 to 10 |              |        |
| Distribution | 07 to 10 |              |        |
| Distribution | 07 to 10 |              |        |
| Distribution | 07 to 10 |              |        |
This is what I want:
| Department   | Range    | Rating       | Number |
|--------------|----------|--------------|--------|
| Admin        | 0 (None) | 1. Too Low   | 1      |
| Admin        | 01 to 3  | 1. Too Low   | 1      |
| Admin        | 01 to 3  | 1. Too Low   | 1      |
| Admin        | 01 to 3  | 1. Too Low   | 1      |
| Admin        | 04 to 6  | 2. On Target | 2      |
| Admin        | 04 to 6  | 2. On Target | 2      |
| Admin        | 04 to 6  | 2. On Target | 2      |
| Admin        | 07 to 10 | 3. Too High  | 3      |
| Admin        | 07 to 10 | 3. Too High  | 3      |
| Admin        | 07 to 10 | 3. Too High  | 3      |
| Admin        | 07 to 10 | 3. Too High  | 3      |
| Distribution | 0 (None) | 1. Too Low   | 1      |
| Distribution | 01 to 3  | 1. Too Low   | 1      |
| Distribution | 01 to 3  | 1. Too Low   | 1      |
| Distribution | 01 to 3  | 1. Too Low   | 1      |
| Distribution | 04 to 6  | 2. On Target | 2      |
| Distribution | 04 to 6  | 2. On Target | 2      |
| Distribution | 04 to 6  | 2. On Target | 2      |
| Distribution | 07 to 10 | 3. Too High  | 3      |
| Distribution | 07 to 10 | 3. Too High  | 3      |
| Distribution | 07 to 10 | 3. Too High  | 3      |
| Distribution | 07 to 10 | 3. Too High  | 3      |
Is there a dynamic way to do this?
Answer 0 (score: 1)
You can combine pd.concat with groupby and use a custom function for the fill logic:
import numpy as np
import pandas as pd

# convert to numeric
df['Number'] = pd.to_numeric(df['Number'])

# assign values by index
def filler(x):
    # positions of the rows in this group that already have a Number
    idx = np.where(x['Number'].notnull())[0]
    # the last two columns are assumed to be Rating and Number
    x.iloc[:idx[0], -2:] = ['1. Too Low', 1]
    x.iloc[idx[-1]+1:, -2:] = ['3. Too High', 3]
    return x

# concatenate transformed dataframe slices
res = pd.concat(df_slice.pipe(filler) for _, df_slice in df.groupby('Department'))
Result:
print(res)
Department Range Rating Number
0 Admin 0 (None) 1. Too Low 1.0
1 Admin 01 to 3 1. Too Low 1.0
2 Admin 01 to 3 1. Too Low 1.0
3 Admin 01 to 3 1. Too Low 1.0
4 Admin 04 to 6 2. On Target 2.0
5 Admin 04 to 6 2. On Target 2.0
6 Admin 04 to 6 2. On Target 2.0
7 Admin 07 to 10 3. Too High 3.0
8 Admin 07 to 10 3. Too High 3.0
9 Admin 07 to 10 3. Too High 3.0
10 Admin 07 to 10 3. Too High 3.0
11 Distribution 0 (None) 1. Too Low 1.0
12 Distribution 01 to 3 1. Too Low 1.0
13 Distribution 01 to 3 1. Too Low 1.0
14 Distribution 01 to 3 1. Too Low 1.0
15 Distribution 04 to 6 2. On Target 2.0
16 Distribution 04 to 6 2. On Target 2.0
17 Distribution 04 to 6 2. On Target 2.0
18 Distribution 07 to 10 3. Too High 3.0
19 Distribution 07 to 10 3. Too High 3.0
20 Distribution 07 to 10 3. Too High 3.0
21 Distribution 07 to 10 3. Too High 3.0
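To try the approach end to end, the following is a minimal self-contained sketch rather than the answer's exact code. It rebuilds the sample data from the question by hand, and the helper name `fill_edges` is made up for illustration. It differs from the function above in two ways that are my own assumptions about what is useful: it addresses the Rating and Number columns by name instead of by position, and it leaves a group untouched when that group has no populated Number at all.

```python
import numpy as np
import pandas as pd

# Rebuild the sample data from the question (11 rows per department).
ranges = ["0 (None)"] + ["01 to 3"] * 3 + ["04 to 6"] * 3 + ["07 to 10"] * 4
ratings = [None] * 4 + ["2. On Target"] * 3 + [None] * 4
numbers = [np.nan] * 4 + [2] * 3 + [np.nan] * 4

df = pd.DataFrame({
    "Department": ["Admin"] * 11 + ["Distribution"] * 11,
    "Range": ranges * 2,
    "Rating": ratings * 2,
    "Number": numbers * 2,
})


def fill_edges(group):
    """Label the rows before and after the populated block of one group."""
    group = group.copy()  # work on a copy so the input slice is not modified
    # Positions (within the group) of rows that already have a Number.
    pos = np.flatnonzero(group["Number"].notna().to_numpy())
    if pos.size == 0:
        return group  # no populated rows to anchor on, so leave as-is
    first, last = pos[0], pos[-1]
    above = group.index[:first]      # labels of rows before the block
    below = group.index[last + 1:]   # labels of rows after the block
    group.loc[above, "Rating"] = "1. Too Low"
    group.loc[above, "Number"] = 1
    group.loc[below, "Rating"] = "3. Too High"
    group.loc[below, "Number"] = 3
    return group


# Apply the helper to each department and stitch the pieces back together.
res = pd.concat([fill_edges(g) for _, g in df.groupby("Department", sort=False)])
print(res)
```

This sketch assumes the index labels are unique, since the rows are selected with `.loc`; if they are not, calling `reset_index(drop=True)` first is one way around it.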