Get the row indices to drop where either of two columns is zero

Time: 2018-10-01 18:36:36

Tags: python dataframe filter

These are my columns:

'CD Block_Code','Total Population Female','Illiterate Female','Total/Rural/Urban'

I want to drop the rows where 'Total Population Female' is zero or 'Illiterate Female' is zero.

Code:

df_cleaned = df.copy(deep = True)

entry_to_remove = [] ;

for index, col in  df.iterrows():

    if (col['Total Population Female'] == '0') or col['Illiterate Female'] == '0':      
        entry_to_remove.append(index)   

    print("entry_to_remove: {}".format(len(entry_to_remove)))

df_cleaned.drop(entry_to_remove, axis = 0, inplace = True)

df_cleaned.head(3)

When I run the last block of code, it gives me zero rows, even though 634 rows actually contain a zero.

So there will be 4 clusters, and I want to get the data for each of the 4 clusters separately for further analysis.

1 answer:

Answer 0: (score: 0)

A simpler approach is boolean indexing with two conditions:

df[(df['Illiterate Female']!=0) & (df['Total Population Female']!=0)]

Example:

>>> df
   CD Block_Code  Illiterate Female  Total Population Female
0              0                  1                        1
1              0                  1                        1
2              0                  1                        0
3              0                  0                        1

>>> df[(df['Illiterate Female']!=0) & (df['Total Population Female']!=0)]
   CD Block_Code  Illiterate Female  Total Population Female
0              0                  1                        1
1              0                  1                        1
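
If the row indices themselves are needed, as the question title asks, the same condition can be inverted into a mask of rows to drop. The following is a minimal sketch that assumes the two columns hold integers. The loop in the question compares against the string '0', which never equals the integer 0, and that may be why it finds nothing to remove. The sample values mirror the example above and are illustrative only.

```python
import pandas as pd

# Sample data mirroring the example above (illustrative values only).
df = pd.DataFrame({
    "CD Block_Code": [0, 0, 0, 0],
    "Illiterate Female": [1, 1, 1, 0],
    "Total Population Female": [1, 1, 0, 1],
})

# True for every row where either column is zero.
zero_mask = (df["Illiterate Female"] == 0) | (df["Total Population Female"] == 0)

# Index labels of the rows to remove.
entry_to_remove = df.index[zero_mask]

# drop() returns a new DataFrame; the original df is left unchanged.
df_cleaned = df.drop(entry_to_remove)

print(list(entry_to_remove))
print(df_cleaned)
```

Here `len(entry_to_remove)` gives the number of rows that will be dropped, which replaces the counter printed inside the original loop.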

You can also filter on the underlying NumPy array, which may be faster for large DataFrames but is less readable:

df[(df[['Illiterate Female','Total Population Female']].values != 0).all(1)]

   CD Block_Code  Illiterate Female  Total Population Female
0              0                  1                        1
1              0                  1                        1
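
Both filters above compare against the integer 0, so they only work if the columns are numeric. The question does not show the dtypes, so the following is a sketch under the assumption that the columns were read as strings (for example from a CSV with mixed content). It converts them with `pd.to_numeric` before filtering. The sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical data where the counts were loaded as strings.
df = pd.DataFrame({
    "Illiterate Female": ["1", "0", "5"],
    "Total Population Female": ["10", "3", "0"],
})

cols = ["Illiterate Female", "Total Population Female"]

# Convert both columns to numbers; values that cannot be parsed become NaN.
numeric = df[cols].apply(pd.to_numeric, errors="coerce")

# Keep rows where neither column is zero.
# Note: NaN != 0 evaluates to True, so unparsable rows are kept here.
df_cleaned = df[(numeric != 0).all(axis=1)]

print(df_cleaned)
```

Checking `df.dtypes` first shows whether this conversion is needed at all.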