Question

I want to select the columns of a DataFrame whose value in a particular row satisfies a condition. My df looks like this.

                                      1           2           3
size                         135.000000   34.000000    1.000000
rel_size                       0.115582    0.029110    0.000856
mean_score_exam               60.903704   84.647059   64.000000
overall_mean_score_exam       68.234589   68.234589   68.234589
mean_score_non_exam          510.911111  643.117647  489.000000
overall_mean_score_non_exam  547.501712  547.501712  547.501712
pass_rate                      0.814815    1.000000    1.000000
overall_pass_rate              0.872432    0.872432    0.872432
derivation_from_pass_rate     -0.057617    0.127568    0.127568

Now I want to drop the columns whose size is less than 5.

                                      1           2
size                         135.000000   34.000000
rel_size                       0.115582    0.029110
mean_score_exam               60.903704   84.647059
overall_mean_score_exam       68.234589   68.234589
mean_score_non_exam          510.911111  643.117647
overall_mean_score_non_exam  547.501712  547.501712
pass_rate                      0.814815    1.000000
overall_pass_rate              0.872432    0.872432
derivation_from_pass_rate     -0.057617    0.127568

This seems like a very simple task, but I can't figure out how to do it. I have already tried masking the columns like this:

results.iloc[[0]] > 5

         0     1     2      3      4      5     6      7      8     9
size  True  True  True  False  False  False  True  False  False  True

But I don't know how to apply this mask to the DataFrame.

Answer 1

Option 1:

res = df.loc[:, df.loc["size"] >= 5]

Option 2:

res = df.drop(columns=df.columns[df.loc["size"] < 5])

Result:

In [25]: res
Out[25]:
                                      1           2
size                         135.000000   34.000000
rel_size                       0.115582    0.029110
mean_score_exam               60.903704   84.647059
overall_mean_score_exam       68.234589   68.234589
mean_score_non_exam          510.911111  643.117647
overall_mean_score_non_exam  547.501712  547.501712
pass_rate                      0.814815    1.000000
overall_pass_rate              0.872432    0.872432
derivation_from_pass_rate     -0.057617    0.127568
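
For anyone who wants to run this end to end, here is a minimal sketch. It assumes a small frame rebuilt from three of the rows shown in the question; the names mask, kept and dropped are only illustrative. The key point is that df.loc["size"] is a Series indexed by column label, so comparing it gives one boolean per column, whereas the question's results.iloc[[0]] > 5 (double brackets) gives a one-row DataFrame instead.

```python
import pandas as pd

# Assumed subset of the question's data: three rows, three columns.
df = pd.DataFrame(
    {
        1: [135.0, 0.115582, 0.814815],
        2: [34.0, 0.029110, 1.0],
        3: [1.0, 0.000856, 1.0],
    },
    index=["size", "rel_size", "pass_rate"],
)

# One boolean per column: True where the "size" row is at least 5.
mask = df.loc["size"] >= 5

# Option 1: keep the columns where the mask is True.
kept = df.loc[:, mask]

# Option 2: drop the columns where the mask is False.
dropped = df.drop(columns=df.columns[~mask])

print(kept)
print(kept.equals(dropped))
```

Both options should leave only columns 1 and 2. Option 1 reads a bit more directly when you already have the condition for what to keep.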

Answer 2

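Ideally, you should transpose the DataFrame so that size becomes a column, filter the rows on it, and then transpose back:

df1_transposed = df1.T
df1_transposed = df1_transposed[df1_transposed['size'] >= 5]
res = df1_transposed.T  # transpose back to the original orientation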
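
As a quick sanity check, the sketch below compares the transpose route with a direct column mask. It assumes a small all-float frame rebuilt from two rows of the question; via_transpose and via_mask are illustrative names. One caveat: if the rows hold different dtypes, transposing can upcast everything to object, so the round trip is safest on purely numeric frames like this one.

```python
import pandas as pd

# Assumed subset of the question's data (all floats).
df1 = pd.DataFrame(
    {1: [135.0, 0.115582], 2: [34.0, 0.029110], 3: [1.0, 0.000856]},
    index=["size", "rel_size"],
)

# Transpose route: original columns become rows, "size" becomes a column.
transposed = df1.T
via_transpose = transposed[transposed["size"] >= 5].T

# Direct route: boolean mask over the columns, no transpose needed.
via_mask = df1.loc[:, df1.loc["size"] >= 5]

print(via_transpose)
print(via_transpose.equals(via_mask))
```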

