What is happening here?

Question

I have a simple DataFrame:

       a         b
0  horse       cat
1    dog  elephant

Running:

df.loc[:,'a'].apply(lambda x: x.upper())

or

df.loc[:,'b'].apply(lambda x: x.upper())

uppercases the animals in the respective column. However, running

df.loc[:,'a':'b'].apply(lambda x: x.upper())

or

df.loc[:,['a','b']].apply(lambda x: x.upper())

results in: AttributeError: ("'Series' object has no attribute 'upper'", 'occurred at index a').

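Obviously, I'd like to know how to fix it (i.e., be able to uppercase both columns at once). But I'd also like to know how a column can have the attribute 'upper' on its own, yet lose it when the lambda is applied to it as part of multiple columns.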

Answer 1

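Use applymap, which the documentation describes as follows:

Apply a function to a DataFrame that is intended to operate elementwise, i.e. like doing map(func, series) for each series in the DataFrame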

df[['a', 'b']].applymap(lambda x: x.upper())

       a         b
0  HORSE       CAT
1    DOG  ELEPHANT
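
For anyone who wants to reproduce this end to end, here is a minimal sketch. The DataFrame is rebuilt from the question, and the variable names (sub, upper_df) are mine, not from the answer. One caveat as far as I know: applymap was deprecated in pandas 2.1 in favor of DataFrame.map, so the sketch picks whichever method the installed version provides.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({"a": ["horse", "dog"], "b": ["cat", "elephant"]})

sub = df[["a", "b"]]

# DataFrame.map is the newer name for elementwise application;
# older pandas versions only have applymap.
if hasattr(sub, "map"):
    upper_df = sub.map(lambda x: x.upper())
else:
    upper_df = sub.applymap(lambda x: x.upper())

print(upper_df)
```

Either branch calls the lambda once per cell, so x is always a plain Python string there.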

Answer 2

Use the str accessor:

df.loc[:,'a':'b'].apply(lambda x: x.str.upper())

Output:

       a         b
0  HORSE       CAT
1    DOG  ELEPHANT

OK, let's do a little debugging:

def f(x):
    print(type(x))
    print(type(x[0]))

df.loc[:,'a':'b'].apply(f)

Output:

<class 'pandas.core.series.Series'>
<class 'str'>
<class 'pandas.core.series.Series'>
<class 'str'>

Here we are using pd.DataFrame.apply.

In this case, each column is passed to the function f as a pandas Series, so we can call the string method upper through the .str accessor.
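
A small self-contained sketch of the same point, recording what the function receives instead of printing it. The helper name record_and_upper and the list seen_types are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"a": ["horse", "dog"], "b": ["cat", "elephant"]})

seen_types = []  # type names of the arguments the function receives

def record_and_upper(column):
    # With DataFrame.apply (default axis=0), `column` is a whole Series.
    seen_types.append(type(column).__name__)
    return column.str.upper()

result = df.loc[:, "a":"b"].apply(record_and_upper)

print(seen_types)
print(result)
```

Because the argument is a Series, .str.upper() works, while a bare .upper() would raise the AttributeError from the question.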

Now, let's look at the first case:

def f(x):
    print(type(x))
    print(type(x[0]))

df.loc[:,'a'].apply(f)

Output:

<class 'str'>
<class 'str'>
<class 'str'>
<class 'str'>

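Here, pd.Series.apply is being used, and each value is passed to the function individually. Therefore, we can call the string method upper directly on each value.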
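
To make the contrast concrete, here is a hedged sketch with a single column. The names s, value_types and raised are hypothetical. It also shows that the Series itself never had an upper attribute; only the individual strings do.

```python
import pandas as pd

s = pd.Series(["horse", "dog"], name="a")

value_types = []  # type names of the arguments the function receives

def record_and_upper(value):
    # With Series.apply, `value` is one element at a time.
    value_types.append(type(value).__name__)
    return value.upper()

upper_s = s.apply(record_and_upper)

# Calling upper on the Series object itself fails, as in the question.
try:
    s.upper()
    raised = False
except AttributeError:
    raised = True

print(value_types, upper_s.tolist(), raised)
```

So the column does not "lose" upper in the multi-column case; the difference is only in what each apply hands to the lambda.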

You can also use pd.DataFrame.applymap, as @chrisz shows in his solution, which passes each cell value of the DataFrame to the function.