Question

I have a DataFrame in which some missing values are stored as the string 'none'.

import pandas as pd

df = pd.DataFrame({
    'Category': ['none', 'women', 'kids'],
    'Sales': ['none', 'none', '40'],
    '# of customers': ['30', 'none', '50'],
})

I want to drop the rows or columns in which most of the values are 'none'. How can I do this? Thanks.

Answer 1

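The first solution assumes 'none' is a plain string rather than NaN: use eq together with sum to count the 'none' entries per column (to drop rows instead, count with sum(axis=1)).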

df.loc[:,df.eq('none').sum().lt(2)]
Out[559]: 
  # of customers Category
0             30     none
1           none    women
2             50     kids
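The row-wise variant mentioned above can be sketched as follows. This is a minimal example that assumes the same sample data as the question and the same cutoff of fewer than two 'none' entries; the names none_per_row and rows_kept are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['none', 'women', 'kids'],
    'Sales': ['none', 'none', '40'],
    '# of customers': ['30', 'none', '50'],
})

# Count the 'none' strings in each row (axis=1 sums across columns).
none_per_row = df.eq('none').sum(axis=1)

# Keep only the rows that contain fewer than 2 'none' entries.
rows_kept = df.loc[none_per_row.lt(2)]
print(rows_kept)
```

With this data, rows 0 and 1 each contain two 'none' entries, so only row 2 remains.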

The second solution assumes your 'none' values are actually np.nan, and uses dropna with thresh.

# If 'none' is still a string, convert it first (needs: import numpy as np):
# df = df.replace('none', np.nan)

df.dropna(axis=0, thresh=2)  # thresh: keep only rows with at least this many non-NA values
Out[563]: 
  # of customers Category Sales
2             50     kids    40
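For completeness, here is a self-contained sketch of the dropna approach, including the conversion step and a column-wise call. It assumes the sample data from the question; df_nan, rows_kept, and cols_kept are illustrative names, not part of the original answer.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Category': ['none', 'women', 'kids'],
    'Sales': ['none', 'none', '40'],
    '# of customers': ['30', 'none', '50'],
})

# Turn the 'none' strings into real missing values so dropna can see them.
df_nan = df.replace('none', np.nan)

# Keep rows that have at least 2 non-NA values.
rows_kept = df_nan.dropna(axis=0, thresh=2)

# Keep columns that have at least 2 non-NA values.
cols_kept = df_nan.dropna(axis=1, thresh=2)

print(rows_kept)
print(cols_kept)
```

Note that thresh counts the values that are present, not the missing ones, so the cutoff is expressed the opposite way from the eq/sum approach.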

Answer 2

Alternatively:

df.loc[:,(df=='none').sum()<=1]

Output:

  # of customers Category
0             30     none
1           none    women
2             50     kids
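The answers here hard-code the cutoff for this three-row example. If "most" is read as "more than half", one possible generalization is to compare the fraction of 'none' entries against 0.5. This is only a sketch: the 0.5 cutoff and the variable names are assumptions, not part of the original answers.

```python
import pandas as pd

df = pd.DataFrame({
    'Category': ['none', 'women', 'kids'],
    'Sales': ['none', 'none', '40'],
    '# of customers': ['30', 'none', '50'],
})

is_none = df.eq('none')

# Fraction of 'none' entries per column; keep columns where it is at most half.
col_fraction = is_none.mean()
cols_kept = df.loc[:, col_fraction.le(0.5)]

# Fraction of 'none' entries per row; keep rows where it is at most half.
row_fraction = is_none.mean(axis=1)
rows_kept = df.loc[row_fraction.le(0.5)]

print(cols_kept)
print(rows_kept)
```

Because the mean of a boolean frame is a proportion, the same code works unchanged for a DataFrame of any size.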

Pandas: how to drop rows or columns if most of their values are a specific value?
