Suppose I have the following DataFrame:
import pandas as pd
import numpy as np
df = pd.DataFrame({"A":[11,21,31], "B":[12,22,32], "C":[np.nan,23,33], "D":[np.nan,24,34], "E":[15,25,35]})
which returns:
>>> df
    A   B     C     D   E
0  11  12   NaN   NaN  15
1  21  22  23.0  24.0  25
2  31  32  33.0  34.0  35
I know how to drop every column that contains a NaN value in any row, as follows:
out1 = df.dropna(axis=1, how="any")
which returns:
>>> out1
    A   B   E
0  11  12  15
1  21  22  25
2  31  32  35
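However, I want to drop all columns from the first NaN value onward, including columns to its right that contain no NaN. For the toy example, the expected output is:
    A   B
0  11  12
1  21  22
2  31  32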
How can I drop all columns after a NaN is found in any row of a pandas DataFrame?
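Answer 0 (score: 4)
What I would do is take cumsum along each row and then any. The cumulative sum of the NaN mask is nonzero from the first NaN onward, so any flags every column at or after it: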
df.loc[:, ~df.isna().cumsum(axis=1).any(axis=0)]
which gives:
    A   B
0  11  12
1  21  22
2  31  32
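To make the one-liner easier to follow, here is a minimal sketch that splits it into named steps. The helper name drop_from_first_nan is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def drop_from_first_nan(frame):
    """Keep only the columns left of the first column that contains NaN.

    Hypothetical helper for illustration; not a pandas API.
    """
    # Running count of NaNs along each row, left to right.
    nan_seen = frame.isna().cumsum(axis=1)
    # Drop a column if any row has met a NaN at or before it.
    drop_mask = nan_seen.any(axis=0)
    return frame.loc[:, ~drop_mask]


df = pd.DataFrame({"A": [11, 21, 31], "B": [12, 22, 32],
                   "C": [np.nan, 23, 33], "D": [np.nan, 24, 34],
                   "E": [15, 25, 35]})
out = drop_from_first_nan(df)
print(out.columns.tolist())  # ['A', 'B']
```

Column E contains no NaN itself, but it is still dropped because the running count in row 0 is already nonzero by the time it reaches E.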
Answer 1 (score: 0)
I found a way to get the expected output:
hasNaN = df.isna().any(axis=0)  # True for each column that contains a NaN in any row
# Position of the first such column; if no column has a NaN, keep every column
posFirstNaN = hasNaN.to_numpy().argmax() if hasNaN.any() else len(df.columns)
out2 = df.iloc[:, :posFirstNaN]  # positional slice stops just before the first NaN column
The output is then:
>>> out2
    A   B
0  11  12
1  21  22
2  31  32
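Two edge cases are worth checking with this kind of slicing: a frame whose first column already contains NaN, and a frame with no NaN at all. Below is a minimal sketch that handles both by slicing on position rather than on label. The helper name columns_before_first_nan is hypothetical and not part of pandas.

```python
import numpy as np
import pandas as pd


def columns_before_first_nan(frame):
    # Hypothetical helper for illustration; not a pandas API.
    has_nan = frame.isna().any(axis=0).to_numpy()
    # argmax gives the position of the first True; when there is no True,
    # fall back to the number of columns so that nothing is dropped.
    stop = int(has_nan.argmax()) if has_nan.any() else frame.shape[1]
    return frame.iloc[:, :stop]


nan_first = pd.DataFrame({"A": [np.nan, 1.0], "B": [2, 3]})
no_nan = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

print(columns_before_first_nan(nan_first).shape)  # (2, 0)
print(columns_before_first_nan(no_nan).shape)     # (2, 2)
```

Slicing by position avoids subtracting 1 from a column index, which would wrap around to the last column when the first column is the one containing NaN.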