I have data in the following format:
   col_a  col_b  col_c
0     10     12     11
1      8      6     99
I want to turn it into:
   col_a  col_b  col_c  0-10  11-20  >20
0     10     12     11     1      2    0
1      8      6     99     2      0    1
Answer 0 (score: 1)
Create a boolean mask for each condition, then count the True values in each row with sum:
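import pandas as pd

df = pd.DataFrame({'col_a': [10, 8], 'col_b': [12, 6], 'col_c': [11, 99]})
# Build every mask before adding columns, so the new count columns are not counted themselves
m1 = (df > 0) & (df <= 10)
m2 = (df > 10) & (df <= 20)
m3 = df > 20

# True counts as 1, so a row-wise sum gives the number of values in each range
df['0-10'] = m1.sum(axis=1)
df['11-20'] = m2.sum(axis=1)
df['>20'] = m3.sum(axis=1)
print(df)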
   col_a  col_b  col_c  0-10  11-20  >20
0     10     12     11     1      2    0
1      8      6     99     2      0    1
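If there are more than a few ranges, the mask approach can be wrapped in a loop. This is only a minimal sketch: count_in_ranges and the ranges dict are hypothetical names introduced here for illustration, and each range is treated as (low, high], the same as the masks in this answer.

```python
import pandas as pd


def count_in_ranges(frame, ranges):
    """Count, per row, how many values fall in each (low, high] range.

    `ranges` maps a column label to a (low, high) tuple. This helper is
    hypothetical and only mirrors the mask logic of the answer.
    """
    counts = {}
    for label, (low, high) in ranges.items():
        # Same condition as (frame > low) & (frame <= high)
        mask = frame.gt(low) & frame.le(high)
        counts[label] = mask.sum(axis=1)
    return pd.DataFrame(counts, index=frame.index)


df = pd.DataFrame({'col_a': [10, 8], 'col_b': [12, 6], 'col_c': [11, 99]})
ranges = {'0-10': (0, 10), '11-20': (10, 20), '>20': (20, float('inf'))}

counts = count_in_ranges(df, ranges)
result = df.join(counts)
print(result)
```

Because the counts are built in a separate DataFrame and joined at the end, the new columns cannot leak into the masks.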
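A solution with cut is also possible, but the data has to be reshaped with DataFrame.stack, counted with SeriesGroupBy.value_counts, and reshaped back with Series.unstack. Start from the original three columns, because the count columns added above would otherwise be binned as well: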
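import numpy as np
import pandas as pd

# Start again from the original three columns
df = pd.DataFrame({'col_a': [10, 8], 'col_b': [12, 6], 'col_c': [11, 99]})

# Stack into one long Series, bin each value, then count the bins per original row
df1 = (pd.cut(df.stack(),
              bins=[0, 10, 20, np.inf],
              labels=['0-10', '11-20', '>20'])
         .groupby(level=0)
         .value_counts()
         .unstack(fill_value=0))
# Attach the per-row counts to the original columns
df = df.join(df1)
print(df)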
   col_a  col_b  col_c  0-10  11-20  >20
0     10     12     11     1      2    0
1      8      6     99     2      0    1
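One detail worth checking with either approach is how values on the bin edges are treated. With its default settings, pd.cut uses right-closed intervals and does not include the lowest edge, so the bins above are (0, 10], (10, 20] and (20, inf]. That matches the masks, which also exclude 0. The values in this sketch are made up purely to probe the edges:

```python
import numpy as np
import pandas as pd

# Hypothetical edge values, not taken from the question's data
values = pd.Series([0, 10, 10.5, 20, 21])

binned = pd.cut(values,
                bins=[0, 10, 20, np.inf],
                labels=['0-10', '11-20', '>20'])

# 0 lies on the open left edge, so it gets no bin (NaN);
# 10 and 20 lie on closed right edges and stay in the lower bin
print(binned)
```

If 0 should be counted in the first range, passing include_lowest=True to pd.cut (or using df >= 0 in the first mask) would be one way to do it.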