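I have a DataFrame with 5 columns: filename, #line_changed-hist, #line_changed-myers, #line_changed-min and #line_changed-pat, and several thousand rows. What I want to do is drop the rows in which the last four columns (all the line_changed columns) have the same value. Suppose my DataFrame is named datamerge3: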
filename #line_changed-hist #line_changed-myers #line_changed-min #line_changed-pat
---------------------------------------------------------------------------------------------------------------
.../util/HBaseFsck.java 1808 1806 1806 1806
.../hfile/HFileBlock.java 1036 1032 1032 1040
.../HConnectionManager.java 794 772 772 774
.../TestCompatibility.java 762 762 762 762
.../master/MockServer.java 605 605 605 605
.../TestRowEndpoint.java 598 598 598 598
.../TestHBaseFsck.java 576 572 572 572
.../TestEndLevel.java 11 0 0 0
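I need to delete every row in which the last four columns (#line_changed) all hold the same value, for example rows 4, 5 and 6, and then save the result to a new CSV file. This is the code I wrote: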
for nn in range(0,len(datamerge3)-1):
    dmhist = datamerge3.iloc[nn]['#line_changed-hist']
    dmmyers = datamerge3.iloc[nn]['#line_changed-myers']
    dmmin = datamerge3.iloc[nn]['#line_changed-min']
    dmpat = datamerge3.iloc[nn]['#line_changed-pat']
    if ((dmhist == dmmyers) and (dmhist == dmmin) and (dmhist == dmpat)):
        datamerge3.drop([nn])
    else:
        pass
datamerge3.to_csv('diff_file.csv')
But the code doesn't work. Is there something I'm missing in the code?
Answer 0 (score: 3)
IIUC, you can use diff and any together with boolean indexing:
df[df.iloc[:,-4:].diff(axis=1).fillna(0).any(axis=1)]
Output:
filename #line_changed-hist #line_changed-myers #line_changed-min #line_changed-pat
1 .../util/HBaseFsck.java 1808.0 1806.0 1806.0 1806.0
2 .../hfile/HFileBlock.java 1036.0 1032.0 1032.0 1040.0
3 .../HConnectionManager.java 794.0 772.0 772.0 774.0
7 .../TestHBaseFsck.java 576.0 572.0 572.0 572.0
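To see how this mask behaves, here is a minimal sketch that rebuilds the eight sample rows from the question by hand (an assumption for illustration; the real frame would be loaded from the asker's own data). It also addresses why the loop in the question has no effect: DataFrame.drop returns a new frame instead of modifying the original, so the result has to be assigned. Filtering with a boolean mask avoids the loop altogether.

```python
import pandas as pd

# The four count columns, named as in the question.
cols = [
    "#line_changed-hist",
    "#line_changed-myers",
    "#line_changed-min",
    "#line_changed-pat",
]

# Sample rows copied from the question; built by hand for illustration.
rows = [
    (".../util/HBaseFsck.java", 1808, 1806, 1806, 1806),
    (".../hfile/HFileBlock.java", 1036, 1032, 1032, 1040),
    (".../HConnectionManager.java", 794, 772, 772, 774),
    (".../TestCompatibility.java", 762, 762, 762, 762),
    (".../master/MockServer.java", 605, 605, 605, 605),
    (".../TestRowEndpoint.java", 598, 598, 598, 598),
    (".../TestHBaseFsck.java", 576, 572, 572, 572),
    (".../TestEndLevel.java", 11, 0, 0, 0),
]
datamerge3 = pd.DataFrame(rows, columns=["filename"] + cols)

# diff(axis=1) subtracts each column from its right-hand neighbour;
# the first column has no neighbour and becomes NaN, hence fillna(0).
# A row is kept when at least one of those differences is non-zero.
differs = datamerge3[cols].diff(axis=1).fillna(0).ne(0).any(axis=1)
filtered = datamerge3[differs]

# Save the remaining rows, as requested in the question.
filtered.to_csv("diff_file.csv", index=False)
```

With zero-based positions, the three rows whose four counts are identical (762, 605 and 598) are dropped and the other five are written to the file.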
Answer 1 (score: 1)
You can use query, but you need to type out the column names.
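A minimal sketch of the query approach follows. It is not the answerer's original code. Because the column names in the question contain `#` and `-`, which are awkward inside a query string, the sketch first renames them to hypothetical short names (n_hist, n_myers, n_min, n_pat), an assumption made only for illustration, and then uses the surviving index to select rows from the original frame.

```python
import pandas as pd

# A few sample rows from the question; built by hand for illustration.
df = pd.DataFrame(
    {
        "filename": [
            ".../util/HBaseFsck.java",
            ".../TestCompatibility.java",
            ".../master/MockServer.java",
            ".../TestEndLevel.java",
        ],
        "#line_changed-hist": [1808, 762, 605, 11],
        "#line_changed-myers": [1806, 762, 605, 0],
        "#line_changed-min": [1806, 762, 605, 0],
        "#line_changed-pat": [1806, 762, 605, 0],
    }
)

# Hypothetical query-friendly names (not part of the original data).
short_names = {
    "#line_changed-hist": "n_hist",
    "#line_changed-myers": "n_myers",
    "#line_changed-min": "n_min",
    "#line_changed-pat": "n_pat",
}
renamed = df.rename(columns=short_names)

# Keep a row when the first count differs from any of the other three,
# i.e. when the four counts are not all equal.
kept = renamed.query("n_hist != n_myers or n_hist != n_min or n_hist != n_pat")

# query preserves the index, so it can select from the original frame
# and keep the original column names.
result = df.loc[kept.index]
```

Every column has to be spelled out in the expression, which is the trade-off this answer mentions compared with a positional approach such as iloc[:, -4:].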