Question

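After reading an Excel file with pandas read_excel, I end up with rows that contain the string 'nan'. I tried to drop them using all the available methods discussed here, but it doesn't seem to work:

Here are the attempts: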

df.dropna(subset=['A'], inplace=True)

I thought this would work. It reduced the number of rows in the dataframe, but it did not remove the rows that contain 'nan'.

df = df[df.A.str.match('nan') == False]
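A likely explanation for the first attempt: dropna only removes real missing values (NaN/None), not the literal string 'nan'. The sketch below uses made-up data (the values in column A are an assumption, not the asker's spreadsheet) to show the difference.

```python
import numpy as np
import pandas as pd

# Hypothetical data: one real missing value and one literal string 'nan'.
df = pd.DataFrame({"A": ["x", np.nan, "nan", "y"]})

# dropna removes only the real missing value, so the row count shrinks
# while the string 'nan' stays in place.
after_dropna = df.dropna(subset=["A"])

# Comparing against the string removes the literal 'nan' as well.
cleaned = after_dropna[after_dropna["A"] != "nan"]

print(after_dropna["A"].tolist())
print(cleaned["A"].tolist())
```

This would match the behaviour described above: fewer rows after dropna, with the 'nan' strings still present.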

Answer 1

We can replace the 'nan' strings with real NaN first, then use dropna

import numpy as np
df = df.replace({'A': {'nan': np.nan}}).dropna(subset=['A'])  # assign the result back to df
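A minimal sketch of this approach on sample data (the frame below is an assumption, borrowed from the shape used elsewhere in this thread). Since replace returns a new DataFrame, the result of the chain has to be assigned to a name for the change to be visible.

```python
import numpy as np
import pandas as pd

# Sample frame: column A holds a literal 'nan' string in its first row.
df = pd.DataFrame({"A": ["nan", 1, 2, 3], "B": [1, 2, 3, "nan"]})

# Turn the string 'nan' in column A into a real NaN, then drop those rows.
# Column B is left untouched because only A is named in the mapping.
result = df.replace({"A": {"nan": np.nan}}).dropna(subset=["A"])

print(result)
```

The original df still has all four rows here; only result has the 'nan' row in A removed.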

Answer 2

A better approach is boolean indexing, since the values are strings, i.e.

import pandas as pd

df = pd.DataFrame({"A":['nan',1,2,3],'B':[1,2,3,'nan']})

# To remove 'nan's from only A
print(df[(df.A!='nan')])

#   A    B
#1  1    2
#2  2    3
#3  3  nan


#For removing all the rows that hold `nan`
print(df[(df!='nan').all(axis=1)])
#   A  B
#1  1  2
#2  2  3
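To see how the row-wise filter works, it can help to look at the intermediate boolean mask. This is only an illustrative breakdown of the expression above; the names mask, keep and result are introduced here for the example.

```python
import pandas as pd

df = pd.DataFrame({"A": ["nan", 1, 2, 3], "B": [1, 2, 3, "nan"]})

# Element-wise comparison: a DataFrame of booleans with the same shape as df.
mask = df != "nan"

# Reduce across columns: True only for rows where no cell equals 'nan'.
keep = mask.all(axis=1)

# Boolean indexing keeps the rows flagged True.
result = df[keep]

print(mask)
print(keep)
print(result)
```

Swapping all for any would instead keep rows where at least one cell differs from 'nan', which is usually not what is wanted here.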

Pandas dataframe drop rows 'nan' by column name
