pandas: how to drop rows when all float columns are NaN

Asked: 2019-09-08 12:42:55

Tags: python pandas dataframe

I have the following df:

  AAA BBB CCC DDD  ID1  ID2  ID3  ID4
0 txt txt txt txt  10   NaN  12   NaN
1 txt txt txt txt  10   NaN  12   13
2 txt txt txt txt  NaN  NaN  NaN  NaN

with the following dtypes:

AAA          object
BBB          object
CCC          object
DDD          object
ID1          float64
ID2          float64
ID3          float64
ID4          float64

Is there a way to drop a row only when all of its float columns are NaN?

Desired output:

  AAA BBB CCC DDD  ID1  ID2  ID3  ID4
0 txt txt txt txt  10   NaN  12   NaN
1 txt txt txt txt  10   NaN  12   13

I can't do this with df.dropna(subset=['ID1','ID2','ID3','ID4']), because my actual df has several float columns whose names change dynamically.

Thanks

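3 answers:

Answer 0: (score: 3)

Use DataFrame.select_dtypes to get all float columns, test them for non-missing values with notna, and use DataFrame.any to keep the rows that have at least one non-missing value. This drops the rows in which every float column is NaN: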

df1 = df[df.select_dtypes(float).notna().any(axis=1)]
print(df1)
   AAA  BBB  CCC  DDD   ID1  ID2   ID3   ID4
0  txt  txt  txt  txt  10.0  NaN  12.0   NaN
1  txt  txt  txt  txt  10.0  NaN  12.0  13.0
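
For anyone who wants to try this, here is a minimal self-contained sketch of the mask approach above. The values are my reading of the table printed in the question, and the intermediate names float_part and keep are only illustrative:

```python
import numpy as np
import pandas as pd

# Rebuild the question's frame (values assumed from the printed table).
df = pd.DataFrame({
    "AAA": ["txt"] * 3,
    "BBB": ["txt"] * 3,
    "CCC": ["txt"] * 3,
    "DDD": ["txt"] * 3,
    "ID1": [10, 10, np.nan],
    "ID2": [np.nan, np.nan, np.nan],
    "ID3": [12, 12, np.nan],
    "ID4": [np.nan, 13, np.nan],
})

# Keep only the float columns, whatever their names are.
float_part = df.select_dtypes(float)

# True for rows that have at least one non-missing float value.
keep = float_part.notna().any(axis=1)

# Boolean indexing keeps rows 0 and 1 and drops the all-NaN row 2.
df1 = df[keep]
print(df1)
```

Because the columns are picked by dtype, nothing has to change when the ID columns are renamed or new ones are added.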

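Alternatively, adapt your DataFrame.dropna solution: pass the float columns as subset and set how='all', so that a row is dropped only if all of those columns are NaN: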

df1 = df.dropna(subset=df.select_dtypes(float).columns, how='all')
# to modify the same DataFrame in place instead of returning a new one:
#df.dropna(subset=df.select_dtypes(float).columns, how='all', inplace=True)

If the frame may contain more than one float dtype, select them all with np.floating (this needs import numpy as np):

df1 = df.dropna(subset=df.select_dtypes(np.floating).columns, how='all')
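
As a rough check that the dropna variant and the mask variant agree, here is a small sketch on a reduced frame. The frame and the float16 column ID3 are made up for illustration and are not the asker's data:

```python
import numpy as np
import pandas as pd

# Reduced, hypothetical version of the question's frame.
df = pd.DataFrame({
    "AAA": ["txt", "txt", "txt"],
    "ID1": [10.0, 10.0, np.nan],
    "ID2": [np.nan, 13.0, np.nan],
})
# Hypothetical extra column with a narrower float dtype.
df["ID3"] = np.array([1, 2, np.nan], dtype=np.float16)

# np.floating matches float16, float32 and float64 columns alike.
float_cols = df.select_dtypes(np.floating).columns

# Variant 1: dropna restricted to the float columns, how='all'.
by_dropna = df.dropna(subset=float_cols, how="all")

# Variant 2: boolean mask over the same columns.
by_mask = df[df[float_cols].notna().any(axis=1)]

# Both variants keep the same rows.
print(by_dropna.index.tolist(), by_mask.index.tolist())
```

Either form works; dropna reads a little closer to the intent, while the mask is handy if you also want to inspect which rows would be removed.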

Answer 1: (score: 1)

Use:

df.dropna(subset=df.select_dtypes(include=np.number).columns, how='all')

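I recommend include=np.number because it covers all float dtypes, and any of them can hold NaN. With include=float you only get the standard np.float64 dtype (depending on the pandas version, np.float32 may be matched as well, but narrower types such as float16 are not). Note that np.number also matches plain integer columns such as int64, which never hold NaN; if the frame has any, how='all' will not drop a row, and np.floating is the safer selector.

For example: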

df['ID5'] = np.array([1,2,np.nan], dtype=np.float16)


>>> df.select_dtypes(include=float).columns.tolist()
['ID1', 'ID2', 'ID3', 'ID4']

>>> df.select_dtypes(include=np.number).columns.tolist()
['ID1', 'ID2', 'ID3', 'ID4', 'ID5']
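
One caveat worth checking before relying on np.number: it also selects integer columns. The sketch below uses a made-up frame with a hypothetical integer column N to show the difference from np.floating:

```python
import numpy as np
import pandas as pd

# Hypothetical frame: one integer column next to two float columns.
df = pd.DataFrame({
    "AAA": ["txt", "txt", "txt"],
    "N": [1, 2, 3],                  # integer dtype, can never be NaN
    "ID1": [10.0, 10.0, np.nan],
    "ID2": [np.nan, 13.0, np.nan],
})

number_cols = df.select_dtypes(include=np.number).columns      # includes N
floating_cols = df.select_dtypes(include=np.floating).columns  # floats only

# With N in the subset no row is ever all-NaN, so nothing is dropped.
kept_number = df.dropna(subset=number_cols, how="all")

# Restricting the subset to float columns drops the last row as intended.
kept_floating = df.dropna(subset=floating_cols, how="all")

print(len(kept_number), len(kept_floating))
```

If the frame has only float numeric columns, as in the question, both selectors behave the same.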

Answer 2: (score: 0)

You can replace 0 with NaN and then drop the columns that contain only NaN:

  

df.loc[:,~df.replace(0,np.nan).isna().all()]