Changing certain values in multiple columns of a pandas DataFrame at once

Time: 2013-11-08 20:11:36

Tags: python pandas

Suppose I have the following DataFrame:

In [1]: df
Out[1]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

This works as expected:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1   NaN      4    bad
2     2      5   good

But this does not:

In [2]: df[['apple', 'banana']][df.cherry == 'bad'] = np.nan
In [3]: df
Out[3]:
  apple banana cherry
0     0      3   good
1     1      4    bad
2     2      5   good

Why is that? How can I set the 'apple' and 'banana' values in one statement, instead of writing two lines like this:

In [2]: df['apple'][df.cherry == 'bad'] = np.nan
In [3]: df['banana'][df.cherry == 'bad'] = np.nan

2 answers:

Answer 0: (score: 33)

You should use loc and do the assignment in a single step, without chaining:

In [11]: df.loc[df.cherry == 'bad', ['apple', 'banana']] = np.nan

In [12]: df
Out[12]: 
   apple  banana cherry
0      0       3   good
1    NaN     NaN    bad
2      2       5   good

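See the docs on returning a view vs a copy. If you chain the indexing operations, the assignment goes to a copy, which is then thrown away. If you do it in a single loc call, pandas knows you want to assign to the original DataFrame.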
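The difference can be checked directly. The following is a minimal sketch, not part of the original answer: it rebuilds the DataFrame from the question, and the names `mask`, `cols`, `nan_after_chained`, and `nan_after_loc` are chosen here for illustration only. The warning filter is there because pandas normally warns about chained assignment.

```python
import warnings

import numpy as np
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "apple": [0, 1, 2],
    "banana": [3, 4, 5],
    "cherry": ["good", "bad", "good"],
})
mask = df["cherry"] == "bad"
cols = ["apple", "banana"]

# Chained indexing: df[cols] returns a new DataFrame, so the assignment
# modifies that temporary object, which is then discarded.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    df[cols][mask] = np.nan
nan_after_chained = int(df[cols].isna().sum().sum())

# Single .loc call: the row and column selectors are passed to one
# assignment on df itself, so the original frame is modified.
df.loc[mask, cols] = np.nan
nan_after_loc = int(df[cols].isna().sum().sum())

print(nan_after_chained, nan_after_loc)
```

The first count should show that the chained form left `df` untouched, while the second should show both cells in the 'bad' row set to NaN.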

Answer 1: (score: 4)

This is because df[['apple', 'banana']][df.cherry == 'bad'] = np.nan assigns to a copy of the DataFrame. Try this:

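df.ix[df.cherry == 'bad', ['apple', 'banana']] = np.nan

(Note: .ix was deprecated in pandas 0.20 and removed in pandas 1.0. On current versions, use df.loc with the same arguments.)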