Question

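I am working with Excel data. Several columns in my DataFrame df are of object dtype and contain blank values. I want to write code that replaces every blank value in any column of df with "NA". How can I do this with pandas? Can it also be done with applymap?

Here are the column types: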

id                           object
name                         object
year_founded                float64
city                         object
country                      object
type                         object

dtype: object

Sample data:

import pandas as pd

df = pd.DataFrame({'id': ['apple_inc'], 'name': ['Apple Inc'], 'year_founded': [''],
                   'city': [''], 'country': ['US'], 'type': ['']})

Answer 1

There are two places where you can handle NA values.

One is when loading the file: pd.read_excel accepts parameters for handling NA values, such as na_values.

pd.read_excel(file, na_values=['', ' '])
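
To illustrate the load-time option without needing an actual workbook, here is a minimal sketch that swaps in pd.read_csv on an in-memory string instead of pd.read_excel; both readers accept na_values. The sample text and its column names are made up for illustration.

```python
import io

import pandas as pd

# Made-up CSV text standing in for the Excel sheet; the city cell is a single space.
raw = "id,city,country\napple_inc, ,US\n"

# Treat a lone space as missing, in addition to the default NA markers.
df = pd.read_csv(io.StringIO(raw), na_values=[' '])

# Show which cells were read as missing.
print(df.isna())
```

Once the blanks are loaded as missing values, df.fillna('NA') would put the literal string in their place.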

The other is after loading: pandas provides functions for working with missing data, such as replace and fillna.

df.replace('', np.nan)
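
A minimal sketch of the replace and fillna route, assuming the blanks are exact empty strings. The one-row frame is made up to mirror the question, and the variable names are only illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical frame mirroring the question; the blank cell is an empty string.
df = pd.DataFrame({
    'id': ['apple_inc'],
    'name': ['Apple Inc'],
    'city': [''],
    'country': ['US'],
})

# Step 1: turn empty strings into real missing values.
with_nan = df.replace('', np.nan)

# Step 2: fill every missing value with the literal string 'NA'.
filled = with_nan.fillna('NA')

print(filled)
```

Note that both calls return a new DataFrame, so the result has to be assigned for the change to stick.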

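Another thing to watch is what your blank values actually are: they may be '', ' ', '\t', and so on. If you are not sure, or if several kinds of blank value are present, you can use a regular expression: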

df.replace(r'^\s*$', np.nan, regex=True)
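
As a sketch of the regular-expression approach applied to the goal in the question, the call below writes 'NA' directly instead of np.nan. The three-row frame is made up so that each kind of blank appears at least once.

```python
import pandas as pd

# Hypothetical data with an empty string, runs of spaces, and a tab as blanks.
df = pd.DataFrame({
    'name': ['Apple Inc', 'Foo', 'Bar'],
    'city': ['', '   ', '\t'],
    'country': ['US', ' ', 'DE'],
})

# The raw string keeps the backslash intact; ^\s*$ matches cells that are
# empty or contain only whitespace.
cleaned = df.replace(r'^\s*$', 'NA', regex=True)

print(cleaned)
```

Cells that contain any non-whitespace character are left alone, so 'Apple Inc' keeps its inner space.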

Thanks.

Answer 2

IIUC, you can do it this way:

In [217]: df
Out[217]:
  city country         id       name type year_founded
0           US  apple_inc  Apple Inc

In [218]: df = df.replace('', 'NA')

In [219]: df
Out[219]:
  city country         id       name type year_founded
0   NA      US  apple_inc  Apple Inc   NA           NA
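
On the applymap part of the question: an element-wise version is possible, although replace is usually the simpler choice. The sketch below assumes blanks are either NaN or strings that are empty or whitespace-only; to_na is a hypothetical helper, and the frame is made up. Since pandas 2.1 applymap is deprecated in favor of DataFrame.map, so the code picks whichever method is available.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one NaN and one whitespace-only string.
df = pd.DataFrame({
    'id': ['apple_inc'],
    'year_founded': [np.nan],
    'city': [' '],
    'country': ['US'],
})

def to_na(value):
    """Hypothetical helper: map NaN and blank strings to 'NA'."""
    if pd.isna(value):
        return 'NA'
    if isinstance(value, str) and value.strip() == '':
        return 'NA'
    return value

# Use DataFrame.map where it exists (pandas 2.1+), otherwise fall back to applymap.
elementwise = df.map if hasattr(df, 'map') else df.applymap
result = elementwise(to_na)

print(result)
```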
