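I want to generate a column after the index and before the percentile values that shows which variable from Var1 each row was computed for. Any ideas?
dropped_df is basically the same thing, except that all 0s are removed before qcut is used to find the percentile values.
Sorry, I don't know how to edit in the expected output. Basically, a column with the values ['A_Spend','A_Spend_drop','B_Spend',............,'score','score_drop'] should appear to the left of the 10% column.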
import pandas as pd

Var1 = ['A_Spend', 'B_Spend', 'C_Spend', 'D_Spend', 'completed_count', 'score']
labels = ["10%", "20%", "30%", "40%", "50%", "60%", "70%", "80%", "90%", "100%"]
rename_map = dict(enumerate(labels))  # positional column 0..9 -> percentile label
df_drop_percentile_total = pd.DataFrame(columns=labels)

for i in Var1:
    # Right edges of the decile bins for the full column (df_drop is my existing DataFrame).
    a = pd.qcut(df_drop[i], 10, duplicates='drop').cat.categories.right
    # pd.concat is used here because DataFrame.append was removed in pandas 2.0.
    df_drop_percentile_total = pd.concat([df_drop_percentile_total, pd.DataFrame([a]).rename(columns=rename_map)], ignore_index=True, sort=False)
    # Same calculation after removing the rows where this column is 0.
    dropped_df = df_drop[df_drop[i] != 0]
    a = pd.qcut(dropped_df[i], 10, duplicates='drop').cat.categories.right
    df_drop_percentile_total = pd.concat([df_drop_percentile_total, pd.DataFrame([a]).rename(columns=rename_map)], ignore_index=True, sort=False)
    10%       20%       30%       40%       50%       60%       70%       80%       90%       100%
0   3.39      5.887     8.829     12.415    17.05     23.434    32.978    49.039    85.088    2963.267
1   3.524     6.02      8.963     12.574    17.223    23.626    33.208    49.318    85.477    2963.267
2   9.18      1207.051
3   3.843     5.284     7.109     9.146     11.548    14.929    19.55     27.424    43.493    1207.051
4   1         2         3         4         5         7         11        19        5499
5   2         3         4         5         7         11        19        5499
6   393       427       449       463       476       488       502       525       556       756
7   393       427       449       463       476       488       502       525       556       756
8   31.394    62.76     95.253    128.522   164.541   204.317   252.899   316.975   425.442   2963.267
9   31.481    62.879    95.352    128.602   164.632   204.359   252.985   317.03    425.598   2963.267
10  7.493     16.16     34.357    572.296
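
One possible way to get that label column is to build each row as a dict that carries the variable name along with the bin edges, then create the frame once at the end. This is only a minimal sketch: the small random `df_drop` below is a made-up stand-in for the real data, and the column name `variable` and the helper `decile_edges` are hypothetical names chosen for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_drop; the real data is not shown in the post.
rng = np.random.default_rng(0)
df_drop = pd.DataFrame({
    "A_Spend": rng.integers(0, 50, size=200).astype(float),
    "score": rng.integers(300, 800, size=200).astype(float),
})
Var1 = ["A_Spend", "score"]
labels = [f"{p}%" for p in range(10, 101, 10)]


def decile_edges(series):
    """Right edges of the qcut bins, labelled left to right.

    With duplicates='drop' there may be fewer than 10 bins, so zip()
    simply stops at the last available edge.
    """
    edges = pd.qcut(series, 10, duplicates="drop").cat.categories.right
    return dict(zip(labels, edges))


rows = []
for name in Var1:
    # Row for the full column.
    rows.append({"variable": name, **decile_edges(df_drop[name])})
    # Row for the same column with the zeros removed.
    nonzero = df_drop.loc[df_drop[name] != 0, name]
    rows.append({"variable": f"{name}_drop", **decile_edges(nonzero)})

# Passing columns= fixes the order: the label column sits left of "10%".
result = pd.DataFrame(rows, columns=["variable"] + labels)
print(result)
```

Collecting the rows in a list and constructing the DataFrame once also avoids growing the frame inside the loop. If the names should replace the integer index rather than sit beside it, `result.set_index("variable")` would be one option.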