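Using Python to remove rows that have the same value across certain columns of a DataFrame

Time: 2018-03-02 04:50:33

Tags: python pandas dataframe rows

I have a DataFrame with 5 columns (filename, #line_changed-hist, #line_changed-myers, #line_changed-min and #line_changed-pat) and thousands of rows. I want to drop every row in which the last four columns (all the line_changed ones) hold the same value. Say my DataFrame is named datamerge3: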

filename                    #line_changed-hist #line_changed-myers #line_changed-min #line_changed-pat
---------------------------------------------------------------------------------------------------------------
.../util/HBaseFsck.java     1808                1806                1806              1806
.../hfile/HFileBlock.java   1036                1032                1032              1040
.../HConnectionManager.java  794                 772                 772               774
.../TestCompatibility.java   762                 762                 762               762
.../master/MockServer.java   605                 605                 605               605
.../TestRowEndpoint.java     598                 598                 598               598
.../TestHBaseFsck.java       576                 572                 572               572
.../TestEndLevel.java         11                   0                   0                 0

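I need to drop all rows whose last four columns (#line_changed) hold the same value, for example rows 4, 5 and 6 above, and then save the result to a new CSV file. This is the code I wrote: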

for nn in range(0,len(datamerge3)-1):
    dmhist = datamerge3.iloc[nn]['#line_changed-hist']
    dmmyers = datamerge3.iloc[nn]['#line_changed-myers']
    dmmin = datamerge3.iloc[nn]['#line_changed-min']
    dmpat = datamerge3.iloc[nn]['#line_changed-pat']
    if ((dmhist == dmmyers) and (dmhist == dmmin) and (dmhist == dmpat)):
        datamerge3.drop([nn])
    else:
        pass

datamerge3.to_csv('diff_file.csv')

But the code doesn't work. Is there something I'm missing?

2 Answers:

Answer 0: (score: 3)

IIUC, you can use diff and any together with boolean indexing:

df[df.iloc[:,-4:].diff(axis=1).fillna(0).any(axis=1)]

Output:

                      filename  #line_changed-hist  #line_changed-myers  #line_changed-min  #line_changed-pat
1      .../util/HBaseFsck.java              1808.0               1806.0             1806.0             1806.0
2    .../hfile/HFileBlock.java              1036.0               1032.0             1032.0             1040.0
3  .../HConnectionManager.java               794.0                772.0              772.0              774.0
7       .../TestHBaseFsck.java               576.0                572.0              572.0              572.0
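
To make the one-liner above easier to follow, here is a minimal sketch that rebuilds the sample rows from the question and applies the same diff/any idea step by step. The variable names `df`, `counts`, `mask_diff` and `mask_nunique` are assumptions for illustration, and the `nunique` variant is an alternative that is not part of the answer. It also hints at why the loop in the question has no effect: `DataFrame.drop` returns a new frame rather than modifying `datamerge3`, so the result of `datamerge3.drop([nn])` is discarded, and `range(0, len(datamerge3)-1)` never visits the last row.

```python
import pandas as pd

# Sample rows copied from the question; `df` stands in for `datamerge3`.
df = pd.DataFrame({
    "filename": [
        ".../util/HBaseFsck.java",
        ".../hfile/HFileBlock.java",
        ".../HConnectionManager.java",
        ".../TestCompatibility.java",
        ".../master/MockServer.java",
        ".../TestRowEndpoint.java",
        ".../TestHBaseFsck.java",
        ".../TestEndLevel.java",
    ],
    "#line_changed-hist": [1808, 1036, 794, 762, 605, 598, 576, 11],
    "#line_changed-myers": [1806, 1032, 772, 762, 605, 598, 572, 0],
    "#line_changed-min": [1806, 1032, 772, 762, 605, 598, 572, 0],
    "#line_changed-pat": [1806, 1040, 774, 762, 605, 598, 572, 0],
})

# The last four columns hold the line counts to compare.
counts = df.iloc[:, -4:]

# diff(axis=1) subtracts each column from its left neighbour, so a row whose
# four values are identical yields only zeros (the first column is NaN and is
# filled with 0). A row is kept if any difference is non-zero.
mask_diff = counts.diff(axis=1).fillna(0).ne(0).any(axis=1)

# Alternative (an assumption, not from the answer): keep rows that contain
# more than one distinct value among the four columns.
mask_nunique = counts.nunique(axis=1) > 1

# Boolean indexing returns a new, filtered frame; the original is untouched.
result = df[mask_diff]
print(result)
```

The filtered frame can then be saved with `result.to_csv('diff_file.csv', index=False)`. If the loop from the question is preferred, the dropped frame would have to be assigned back (for example `datamerge3 = datamerge3.drop(...)` with the collected index labels) for the change to persist.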

Answer 1: (score: 1)

You can also use query, but you need to type out the column names explicitly.
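
As a rough illustration of the query idea mentioned in this answer, the sketch below is one possible way to do it, not the answerer's code. Because the original column names contain `#` and `-`, which are awkward inside a query expression, it assumes the columns are first renamed to plain identifiers. The short names `c_hist`, `c_myers`, `c_min` and `c_pat` are hypothetical and chosen only for this example.

```python
import pandas as pd

# A few sample rows from the question; `df` stands in for `datamerge3`.
df = pd.DataFrame({
    "filename": [
        ".../util/HBaseFsck.java",
        ".../TestCompatibility.java",
        ".../TestEndLevel.java",
    ],
    "#line_changed-hist": [1808, 762, 11],
    "#line_changed-myers": [1806, 762, 0],
    "#line_changed-min": [1806, 762, 0],
    "#line_changed-pat": [1806, 762, 0],
})

# Hypothetical short names so the columns can be used as plain identifiers
# inside the query string.
short_names = {
    "#line_changed-hist": "c_hist",
    "#line_changed-myers": "c_myers",
    "#line_changed-min": "c_min",
    "#line_changed-pat": "c_pat",
}
renamed = df.rename(columns=short_names)

# Keep a row when at least one count differs from the first one.
kept = renamed.query("c_hist != c_myers or c_hist != c_min or c_hist != c_pat")

# Select the surviving rows from the original frame to keep its column names.
result = df.loc[kept.index]
print(result)
```

The trade-off compared with the positional `iloc[:, -4:]` approach is that every column has to be named, but the condition reads closer to the original `if` statement in the question.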