Drop columns based on their values in Python?

Time: 2016-05-19 05:09:25

Tags: python dataframe

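How can I drop DataFrame columns based on the values they contain? I want to drop every column that contains a null, an empty string (''), or a zero. Suppose we have the pandas DataFrame df: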

import numpy as np
import pandas as pd

df = pd.DataFrame()
df['c1']=[1,2,3,3,4]
df['c2']=["a1","a2","a2","a2","a1"]
df['c3']=[1,2,3,3,5]
df['c4']=['','',0,0,0]
df['c5']=[np.nan,np.nan,0,0,0]
print(df)

Output:

    c1  c2  c3 c4   c5
 0   1  a1   1     NaN
 1   2  a2   2     NaN
 2   3  a2   3  0  0.0
 3   3  a2   3  0  0.0
 4   4  a1   5  0  0.0

I want the code to find columns c4 and c5 and drop them.

2 answers:

Answer 0: (score: 5)

This does the trick for the example DataFrame: it keeps only the columns in which no value is 0 or NaN.

badvalues = [0, np.nan]
goodcolumns = [n for n in df.columns 
               if not df[n].isin(badvalues).any()]
df = df[goodcolumns]

If isin does not pick up your NaN values, you can test for them explicitly with isnull:

goodcolumns = [n for n in df.columns
               if not ((df[n] == 0) | df[n].isnull()).any()]
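
The question also asks for columns containing an empty string to be dropped, which the list of bad values above does not cover (c4 is only caught because it also contains zeros). A minimal sketch of the same idea applied to the whole frame at once follows; the helper name `drop_columns_with_bad_values` and its `bad_values` parameter are made up for illustration and are not part of pandas.

```python
import numpy as np
import pandas as pd

def drop_columns_with_bad_values(df, bad_values=(0, "")):
    """Return df without the columns that hold any bad value or NaN.

    Hypothetical helper for illustration only.
    """
    # True where a cell equals one of bad_values or is missing (NaN/None).
    bad_mask = df.isin(list(bad_values)) | df.isnull()
    # bad_mask.any() is True for each column with at least one flagged cell;
    # keep the remaining columns.
    return df.loc[:, ~bad_mask.any()]

df = pd.DataFrame({
    "c1": [1, 2, 3, 3, 4],
    "c2": ["a1", "a2", "a2", "a2", "a1"],
    "c3": [1, 2, 3, 3, 5],
    "c4": ["", "", 0, 0, 0],
    "c5": [np.nan, np.nan, 0, 0, 0],
})
result = drop_columns_with_bad_values(df)
print(list(result.columns))
```

Testing for missing values with isnull rather than putting np.nan in the list keeps the NaN check independent of how isin treats NaN.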

Answer 1: (score: 0)

You can use applymap together with drop:

>>> cols_to_drop = df.applymap(lambda x: x in [0, np.nan]).any()
>>> cols_to_drop
c1    False
c2    False
c3    False
c4     True
c5     True
dtype: bool
>>> df.drop(df.columns[cols_to_drop], axis=1)
   c1  c2  c3
0   1  a1   1
1   2  a2   2
2   3  a2   3
3   3  a2   3
4   4  a1   5
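
One caveat with the element-wise test: `x in [0, np.nan]` matches NaN only when the cell is the very same object as `np.nan`, because NaN never compares equal to itself. In the example c5 is still caught because it also contains zeros, but a column holding NaN and no zero may slip through. Also, as far as I know, `applymap` is deprecated in pandas 2.1 and later in favor of `DataFrame.map`. The sketch below assumes a small made-up frame and a hypothetical predicate `is_bad` that uses `pd.isna` instead of a membership test.

```python
import numpy as np
import pandas as pd

def is_bad(x):
    # Hypothetical predicate: pd.isna recognises NaN/None by value,
    # not by object identity.
    return bool(pd.isna(x)) or x == 0 or x == ""

# Made-up data: column b has a NaN but no zero.
df = pd.DataFrame({
    "a": [1, 2],
    "b": [np.nan, 3.0],
    "c": ["", "x"],
})

# Use DataFrame.map where it exists, otherwise fall back to applymap.
elementwise = df.map if hasattr(pd.DataFrame, "map") else df.applymap
cols_to_drop = elementwise(is_bad).any()

# Select the labels flagged True and drop those columns.
cleaned = df.drop(columns=cols_to_drop[cols_to_drop].index)
print(list(cleaned.columns))
```

With this frame only column a should remain, since b is flagged for its NaN and c for its empty string.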