Is there a simpler or more correct way to assign dynamic groups? Consider this df:
group days(int, >0)
A 1
B 12
A 14
A 16
A 19
B 23
C 92
C 12
I want to assign subgroups according to the following rules:
if days >20 then subgroup = 4
if days <= 20 then subgroup = 3
if days <= 10 then subgroup = 2
if days == 0 then subgroup = 1
Here is what I do now:
df['subgroup'] = 4
df.loc[df['days'] >20,'subgroup'] = 4
df.loc[df['days'] <=20,'subgroup'] = 3
df.loc[df['days'] <=10,'subgroup'] = 2
df.loc[df['days'] ==0,'subgroup'] = 1
df = df.reset_index()
df['dynamic_subgroup'] = df.groupby(['group'])['subgroup'].rank(method='dense')
This is the resulting table:
group days(int, >0) dynamic_subgroup
A 1 1
B 12 1
A 14 2
A 16 3
A 19 4
B 23 2
C 92 2
C 12 1
Is there an easier or better way to get the same result in pandas? More generally, any corrections to the code are appreciated.
Answer 0 (score: 3)
You can use cut for binning:
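import numpy as np
import pandas as pd

# Bins are right-closed by default: (-1, 0], (0, 10], (10, 20], (20, inf]
bins = [-1, 0, 10, 20, np.inf]
labels = [1, 2, 3, 4]
df['subgroup'] = pd.cut(df['days'], bins=bins, labels=labels)
print(df)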
group days subgroup
0 A 1 2
1 B 12 3
2 A 14 3
3 A 16 3
4 A 19 3
5 B 23 4
6 C 92 4
7 C 12 3
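For anyone who wants to run this end to end, here is a minimal self-contained sketch. It rebuilds the sample frame from the question and applies the same bins. The `.astype(int)` call and the `boundary` check are my additions, not part of the answer above: `pd.cut` with `labels` returns a categorical column, and the cast assumes no value falls outside the bins (otherwise there would be NaN and the cast would fail).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "group": list("ABAAABCC"),
    "days": [1, 12, 14, 16, 19, 23, 92, 12],
})

# Right-closed intervals: (-1, 0], (0, 10], (10, 20], (20, inf]
bins = [-1, 0, 10, 20, np.inf]
labels = [1, 2, 3, 4]

# pd.cut returns a categorical; cast to int for a plain numeric column
df["subgroup"] = pd.cut(df["days"], bins=bins, labels=labels).astype(int)
print(df)

# Boundary values land in the lower bin because intervals are right-closed
boundary = pd.cut(pd.Series([0, 10, 20, 21]), bins=bins, labels=labels).astype(int)
print(boundary.tolist())
```

Because the intervals are closed on the right, days == 10 maps to subgroup 2 and days == 20 maps to subgroup 3, which matches the `<=` rules in the question.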
Answer 1 (score: 2)
df.assign(subgroup=np.searchsorted([0, 10, 20], df.days.values) + 1)
group days subgroup
0 A 1 2
1 B 12 3
2 A 14 3
3 A 16 3
4 A 19 3
5 B 23 4
6 C 92 4
7 C 12 3
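A self-contained sketch of the same idea, in case it helps to see why the `+ 1` works. With the default `side='left'`, `np.searchsorted` returns the index of the first edge that is greater than or equal to each value, so 0 gives index 0, values in (0, 10] give 1, values in (10, 20] give 2, and anything above 20 gives 3. The names `edges` and `boundary` are mine, used only for illustration.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "group": list("ABAAABCC"),
    "days": [1, 12, 14, 16, 19, 23, 92, 12],
})

# Upper edges of subgroups 1, 2 and 3; everything above 20 falls into subgroup 4
edges = [0, 10, 20]

# side='left' (the default) returns the first index i with edges[i] >= value
df = df.assign(subgroup=np.searchsorted(edges, df["days"].to_numpy()) + 1)
print(df)

# Boundary values: 0 -> 1, 10 -> 2, 20 -> 3, 21 -> 4
boundary = np.searchsorted(edges, [0, 10, 20, 21]) + 1
print(boundary.tolist())
```

One difference from the cut approach worth keeping in mind: a negative value would get subgroup 1 here, whereas `pd.cut` with bins starting at -1 would return NaN for anything at or below -1. With days constrained to non-negative integers this should not matter.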