How can I get all rows belonging to the top three and bottom three values of a column?

Time: 2019-07-12 16:06:09

Tags: python pandas dataframe

My df looks like this:

BacksGas_Flow_sccm  ContextID   StepID  Time_Elapsed        iso_forest  anomaly_score           alarm
96.875              7296124     19      39.798              -1          -0.22435033280902072    3
96.875              7296125     19      39.993              -1          -0.22435033280902072    3
96.875              7296406     19      39.829              -1          -0.22435033280902072    3
96.875              7296405     19      39.243              -1          -0.22435033280902072    3
96.6796875          7317148     19      38.801              -1          -0.22435033280902072    3
96.6796875          7317149     19      38.801              -1          -0.22435033280902072    3
96.58203125         7293851     19      40.226              -1          -0.22435033280902072    3
96.58203125         7293852     19      40.031000000000006  -1          -0.22435033280902072    3
96.38671875         7293732     19      39.945              -1          -0.22435033280902072    3
96.38671875         7293731     19      39.945              -1          -0.22435033280902072    3
95.80078125         7297416     19      39.666000000000004  -1          -0.22435033280902072    3
95.80078125         7297415     19      39.541000000000004  -1          -0.22435033280902072    3
18.5546875          7321507     19      38.107              -1          -0.25368125176672074    -3
18.5546875          7322950     19      37.734              -1          -0.25368125176672074    -3
18.45703125         7320222     19      37.906000000000006  -1          -0.25368125176672074    -3
18.45703125         7323150     19      37.755              -1          -0.25368125176672074    -3
18.45703125         7323151     19      38.02               -1          -0.25368125176672074    -3
18.45703125         7320221     19      38.069              -1          -0.25368125176672074    -3
18.359375           7291023     19      37.718              -1          -0.25420996401901275    -3
18.359375           7291024     19      37.933              -1          -0.25420996401901275    -3
18.26171875         7316192     19      38.741              -1          -0.25420996401901275    -3
18.26171875         7312681     19      38.084              -1          -0.25420996401901275    -3
18.26171875         7312682     19      37.830000000000005  -1          -0.25420996401901275    -3
18.26171875         7316191     19      37.679              -1          -0.25420996401901275    -3
18.1640625          7291050     19      38.299              -1          -0.25420996401901275    -3
18.1640625          7311617     19      38.031000000000006  -1          -0.25420996401901275    -3
18.1640625          7324929     19      38.119              -1          -0.25420996401901275    -3
18.1640625          7291049     19      37.841              -1          -0.25420996401901275    -3
18.1640625          7311618     19      38.031000000000006  -1          -0.25420996401901275    -3
18.1640625          7324930     19      38.119              -1          -0.25420996401901275    -3
18.06640625         7306076     19      38.098              -1          -0.25420996401901275    -3
18.06640625         7317385     19      37.967000000000006  -1          -0.25420996401901275    -3
18.06640625         7316312     19      38.169000000000004  -1          -0.25420996401901275    -3
18.06640625         7306077     19      38.098              -1          -0.25420996401901275    -3
18.06640625         7317386     19      37.967000000000006  -1          -0.25420996401901275    -3
18.06640625         7316311     19      38.169000000000004  -1          -0.25420996401901275    -3

I want to get all rows whose BacksGas_Flow_sccm value is one of the 3 highest or 3 lowest distinct values in that column.

In the df above:

The 3 highest values in the BacksGas_Flow_sccm column are: 96.875, 96.6796875, 96.58203125

The 3 lowest values in the BacksGas_Flow_sccm column are: 18.06640625, 18.1640625, 18.26171875

Expected output:

BacksGas_Flow_sccm  ContextID   StepID  Time_Elapsed        iso_forest  anomaly_score           alarm
96.875              7296124     19      39.798              -1          -0.22435033280902072    3
96.875              7296125     19      39.993              -1          -0.22435033280902072    3
96.875              7296406     19      39.829              -1          -0.22435033280902072    3
96.875              7296405     19      39.243              -1          -0.22435033280902072    3
96.6796875          7317148     19      38.801              -1          -0.22435033280902072    3
96.6796875          7317149     19      38.801              -1          -0.22435033280902072    3
96.58203125         7293851     19      40.226              -1          -0.22435033280902072    3
96.58203125         7293852     19      40.031000000000006  -1          -0.22435033280902072    3
18.26171875         7316192     19      38.741              -1          -0.25420996401901275    -3
18.26171875         7312681     19      38.084              -1          -0.25420996401901275    -3
18.26171875         7312682     19      37.830000000000005  -1          -0.25420996401901275    -3
18.26171875         7316191     19      37.679              -1          -0.25420996401901275    -3
18.1640625          7291050     19      38.299              -1          -0.25420996401901275    -3
18.1640625          7311617     19      38.031000000000006  -1          -0.25420996401901275    -3
18.1640625          7324929     19      38.119              -1          -0.25420996401901275    -3
18.1640625          7291049     19      37.841              -1          -0.25420996401901275    -3
18.1640625          7311618     19      38.031000000000006  -1          -0.25420996401901275    -3
18.1640625          7324930     19      38.119              -1          -0.25420996401901275    -3
18.06640625         7306076     19      38.098              -1          -0.25420996401901275    -3
18.06640625         7317385     19      37.967000000000006  -1          -0.25420996401901275    -3
18.06640625         7316312     19      38.169000000000004  -1          -0.25420996401901275    -3
18.06640625         7306077     19      38.098              -1          -0.25420996401901275    -3
18.06640625         7317386     19      37.967000000000006  -1          -0.25420996401901275    -3
18.06640625         7316311     19      38.169000000000004  -1          -0.25420996401901275    -3

I tried pandas' nlargest and nsmallest, but the output is wrong.

How can this be done?

Thanks in advance.
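
A likely cause, sketched on a handful of values copied from the table above: nlargest(3) returns the three largest rows rather than the rows of the three largest distinct values, so tied rows can fill every slot. The small frame `sample` is a hypothetical stand-in for the real df, and the exact call that was tried is an assumption.

```python
import pandas as pd

# Hypothetical sample: a few BacksGas_Flow_sccm values copied from the table above.
sample = pd.DataFrame({
    "BacksGas_Flow_sccm": [96.875, 96.875, 96.875, 96.875,
                           96.6796875, 96.58203125, 18.06640625],
})

# nlargest(3) picks the three largest *rows*; the tied 96.875 rows take every slot.
top_rows = sample["BacksGas_Flow_sccm"].nlargest(3)
print(top_rows.tolist())

# Dropping duplicates first gives the three largest *distinct* values instead.
top_values = sample["BacksGas_Flow_sccm"].drop_duplicates().nlargest(3)
print(top_values.tolist())
```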

1 answer:

Answer 0 (score: 1)

You can do this by combining drop_duplicates(), nlargest, and nsmallest:

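import pandas as pd

# Distinct column values, so tied rows do not use up the top/bottom slots
s = df.BacksGas_Flow_sccm.drop_duplicates()
# Keep every row whose value is among the 3 largest or 3 smallest distinct values
df[df.BacksGas_Flow_sccm.isin(pd.concat([s.nlargest(3), s.nsmallest(3)]))].reset_index(drop=True)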

    BacksGas_Flow_sccm  ContextID   StepID  Time_Elapsed    iso_forest  anomaly_score   alarm
0   96.875000           7296124     19      39.798          -1          -0.22435        3
1   96.875000           7296125     19      39.993          -1          -0.22435        3
2   96.875000           7296406     19      39.829          -1          -0.22435        3
3   96.875000           7296405     19      39.243          -1          -0.22435        3
4   96.679688           7317148     19      38.801          -1          -0.22435        3
5   96.679688           7317149     19      38.801          -1          -0.22435        3
6   96.582031           7293851     19      40.226          -1          -0.22435        3
7   96.582031           7293852     19      40.031          -1          -0.22435        3
8   18.261719           7316192     19      38.741          -1          -0.25421        -3
9   18.261719           7312681     19      38.084          -1          -0.25421        -3
10  18.261719           7312682     19      37.830          -1          -0.25421        -3
....
....
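
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same idea. The helper names `rows_in_top_bottom` and `rows_in_top_bottom_rank` and the small `sample` frame are hypothetical and not part of the original post. The second helper is an alternative based on dense ranking, which is not the approach used above but should select the same rows.

```python
import pandas as pd


def rows_in_top_bottom(df, column, n=3):
    """Rows whose `column` value is among the n largest or n smallest distinct values."""
    distinct = df[column].drop_duplicates()
    wanted = pd.concat([distinct.nlargest(n), distinct.nsmallest(n)])
    return df[df[column].isin(wanted)].reset_index(drop=True)


def rows_in_top_bottom_rank(df, column, n=3):
    """Alternative: dense ranks give tied values the same rank, with no gaps."""
    rank_desc = df[column].rank(method="dense", ascending=False)
    rank_asc = df[column].rank(method="dense", ascending=True)
    return df[(rank_desc <= n) | (rank_asc <= n)].reset_index(drop=True)


# Hypothetical sample built from a few rows of the question's table.
sample = pd.DataFrame({
    "BacksGas_Flow_sccm": [96.875, 96.875, 96.6796875, 96.58203125, 96.38671875,
                           18.359375, 18.26171875, 18.1640625, 18.1640625, 18.06640625],
    "ContextID": [7296124, 7296125, 7317148, 7293851, 7293732,
                  7291023, 7316192, 7291050, 7311617, 7306076],
})

result = rows_in_top_bottom(sample, "BacksGas_Flow_sccm")
result_rank = rows_in_top_bottom_rank(sample, "BacksGas_Flow_sccm")
print(result)
```

In this sample the rows with 96.38671875 and 18.359375 are dropped, since they are the fourth-highest and fourth-lowest distinct values. Both helpers compare floats exactly, which works here because tied values are bit-identical; data with rounding noise may need rounding first.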