Question

I am using apply on my DataFrame my_df, as follows:

my_df['column_C'] = my_df.apply(lambda x : 'hello' if x['column_B'] is None else x['column_B'] )

I want:

  if x['column_B'] = None -> return 'hello'
  if x['column_B'] != None -> return x['column_B']

But I get the following error:

<ipython-input-31-aa087c9a635e> in <lambda>(x)
----> 1 my_df['column_C'] = my_df.apply(lambda x : 'hello' if x['column_B'] is None else x['column_B'] )

/usr/local/lib/python3.4/dist-packages/pandas/core/series.py in __getitem__(self, key)
    599         key = com._apply_if_callable(key, self)
    600         try:
--> 601             result = self.index.get_value(self, key)
    602 
    603             if not is_scalar(result):

/usr/local/lib/python3.4/dist-packages/pandas/indexes/base.py in get_value(self, series, key)
   2187             # python 3
   2188             if is_scalar(key):  # pragma: no cover
-> 2189                 raise IndexError(key)
   2190             raise InvalidIndexError(key)
   2191 

IndexError: ('column_B', 'occurred at index column_A')

Does anyone know what I am doing wrong here?

Answer 1

You need to pass axis=1 so that the function is applied to each row rather than to each column. See the documentation for DataFrame.apply:

axis : {0 or 'index', 1 or 'columns'}, default 0

* 0 or 'index': apply function to each column
* 1 or 'columns': apply function to each row

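In your current call, apply uses the default axis=0, so the lambda is handed the pd.Series for column_A. That Series is indexed by row labels, not column names, so the lookup x['column_B'] fails.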
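As a rough illustration of the difference, the sketch below checks whether the label 'column_B' is available to the lambda under each axis. The sample data is made up for the example; only the column names come from the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
my_df = pd.DataFrame({
    'column_A': [1, 2, 3],
    'column_B': ['x', None, 'z'],
})

# Default axis=0: the lambda is called once per column, and each x is a
# Series indexed by the row labels, so 'column_B' is not a valid key.
has_b_axis0 = my_df.apply(lambda x: 'column_B' in x.index)

# axis=1: the lambda is called once per row, and each x is a Series
# indexed by the column names, so x['column_B'] can be looked up.
has_b_axis1 = my_df.apply(lambda x: 'column_B' in x.index, axis=1)

print(has_b_axis0)
print(has_b_axis1)
```

The first result has one entry per column and the second has one entry per row, which mirrors how apply iterates in each case.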

So if you use the following instead, it will work.

my_df['column_C'] = my_df.apply(lambda x : 'hello' 
                                if x['column_B'] is None
                                else x['column_B'], axis=1)
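For a version that can be run end to end, here is a minimal sketch on made-up sample data (the values are assumptions, not the asker's data). It swaps the `is None` test for pd.isna, on the assumption that missing entries might be stored as NaN rather than a literal None; pd.isna treats both as missing.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
my_df = pd.DataFrame({
    'column_A': [1, 2, 3],
    'column_B': ['x', None, 'z'],
})

# axis=1 passes each row to the lambda, so x['column_B'] is a scalar.
# pd.isna is used here instead of `is None` so that NaN is also caught.
my_df['column_C'] = my_df.apply(
    lambda x: 'hello' if pd.isna(x['column_B']) else x['column_B'],
    axis=1,
)

print(my_df)
```

If column_B is known to hold only literal None values, the original `is None` check behaves the same way on this data.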

Note: as pointed out in the comments above, DataFrame.fillna is better suited to this task.
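A minimal sketch of the fillna approach, again on made-up sample data. It calls fillna on the single column (Series.fillna) rather than on the whole DataFrame, since only column_B needs a default value.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
my_df = pd.DataFrame({
    'column_A': [1, 2, 3],
    'column_B': ['x', None, 'z'],
})

# Replace missing entries in column_B with 'hello' and store the result
# in a new column; column_B itself is left unchanged.
my_df['column_C'] = my_df['column_B'].fillna('hello')

print(my_df)
```

This avoids the row-by-row Python call that apply makes, and it handles both None and NaN as missing values.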

python: pandas apply function: InvalidIndexError
