Question

This may not be the best way to arrange data in a real-world case, but it makes a good example:

In [16]:
import operator
import pandas as pd
In [17]:
DF=pd.DataFrame({'Val1':[[2013, 37722.322],[1998, 32323.232]],
                 'Val2':[[2013, 37722.322],[1998, 32323.232]]})
In [18]:
print(DF)
                Val1               Val2
0  [2013, 37722.322]  [2013, 37722.322]
1  [1998, 32323.232]  [1998, 32323.232]

[2 rows x 2 columns]

apply gives the wrong result:

In [19]:
print(DF.apply(operator.itemgetter(-1), axis=1))
   Val1       Val2
0  2013  37722.322
1  1998  32323.232

[2 rows x 2 columns]

But applymap gives the correct result!

In [20]:
print(DF.applymap(operator.itemgetter(-1)))
        Val1       Val2
0  37722.322  37722.322
1  32323.232  32323.232

[2 rows x 2 columns]

Why does this happen?

Answer 1

It is easier to see what is happening if you use:

df = pd.DataFrame({'Val1':[[1, 2],[3, 4]],
                   'Val2':[[5, 6],[7, 8]]})

     Val1    Val2
0  [1, 2]  [5, 6]
1  [3, 4]  [7, 8]

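df.apply(operator.itemgetter(-1), axis=1) calls operator.itemgetter(-1) on each row.

For example, on the first row, operator.itemgetter(-1) returns the last item in that row, which is the cell [5, 6]. Since this value is iterable, its elements are assigned to the two columns Val1 and Val2. The result is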

In [149]: df.apply(operator.itemgetter(-1), axis=1)
Out[149]: 
   Val1  Val2
0     5     6
1     7     8

In contrast, applymap operates on each cell of the DataFrame individually, so operator.itemgetter(-1) returns the last item of each cell.

In [150]: df.applymap(operator.itemgetter(-1))
Out[150]: 
   Val1  Val2
0     2     6
1     4     8
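
The same distinction can be reproduced without pandas. The sketch below is only a stand-in: `rows`, `row_wise` and `cell_wise` are hypothetical names, and a row is modelled as a plain list of cells rather than a pandas Series. It shows what operator.itemgetter(-1) picks when it is handed a whole row versus a single cell.

```python
import operator

# Hypothetical stand-in for the DataFrame: each row is a list of cells,
# and each cell is itself a two-element list.
rows = [
    [[1, 2], [5, 6]],  # row 0: Val1, Val2
    [[3, 4], [7, 8]],  # row 1: Val1, Val2
]

get_last = operator.itemgetter(-1)

# Row-wise, as with apply(..., axis=1): the function receives a whole row,
# so "last item" means the last cell of that row.
row_wise = [get_last(row) for row in rows]

# Element-wise, as with applymap: the function receives one cell at a time,
# so "last item" means the last element inside each cell.
cell_wise = [[get_last(cell) for cell in row] for row in rows]

print(row_wise)   # [[5, 6], [7, 8]]
print(cell_wise)  # [[2, 6], [4, 8]]
```

The row-wise values match the apply output above once each returned list is spread across Val1 and Val2, and the element-wise values match the applymap output.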

Answer 2

Just adding to what @unutbu and @jeff said: if there are 3 columns to begin with:

In [26]:

print(DF)
                Val1               Val2               Val3
0  [2013, 37722.322]  [2014, 37722.322]  [2015, 37722.322]
1  [1997, 32323.232]  [1998, 32323.232]  [1999, 32323.232]

[2 rows x 3 columns]
In [27]:

print(DF.apply(operator.itemgetter(-1), axis=1))
0    [2015, 37722.322]
1    [1999, 32323.232]
dtype: object

The resulting lists (of length 2) cannot be coerced into a sequence of length 3, so the result is now a Series of lists.
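
As a rough illustration of that length rule, here is a toy model in plain Python. It is not pandas' actual implementation: `row_wise_apply` is a hypothetical helper that only mimics the behaviour described in this thread, where a per-row result is laid out as a table only if it has one value per column.

```python
import operator

def row_wise_apply(rows, func):
    """Toy model of the behaviour described above; not pandas' real logic."""
    n_columns = len(rows[0])
    results = [func(row) for row in rows]
    # A result with exactly one value per column can be spread across
    # the columns; anything else is kept as one object per row.
    if all(len(result) == n_columns for result in results):
        return ("table", results)
    return ("series of lists", results)

get_last = operator.itemgetter(-1)

two_columns = [
    [[1, 2], [5, 6]],
    [[3, 4], [7, 8]],
]
three_columns = [
    [[1, 2], [5, 6], [9, 10]],
    [[3, 4], [7, 8], [11, 12]],
]

two_result = row_wise_apply(two_columns, get_last)
three_result = row_wise_apply(three_columns, get_last)

print(two_result)    # ('table', [[5, 6], [7, 8]])
print(three_result)  # ('series of lists', [[9, 10], [11, 12]])
```

With two columns the length-2 lists happen to fit the column count, which is why the original example looked like a table of wrong values instead of a Series of lists.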

Inconsistent behavior of apply with operator.itemgetter vs. applymap with operator.itemgetter
