Question

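The actual use case is that I want to replace all values in certain named columns with zero whenever they are less than zero, but leave the other columns alone. Say in the dataframe below I want to floor all of the values in columns a and b at zero, but leave column d alone.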

import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar'], 'd': [1, -2, 1]})
df
   a  b    c  d
0  0 -3  foo  1
1 -1  2  goo -2
2  2  1  bar  1

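The second paragraph of the accepted answer to this question: How to replace negative numbers in Pandas Data Frame by zero does provide a workaround. I can set the dtype of column d to a non-numeric type and then change it back afterwards: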

df['d'] = df['d'].astype(object)
num = df._get_numeric_data()
num[num <0] = 0
df['d'] = df['d'].astype('int64')
df
   a  b    c  d
0  0  0  foo  1
1  0  2  goo -2
2  2  1  bar  1

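But this seems really messy, and it means I need to know the list of columns I don't want to change rather than the list I do want to change.

Is there a way to just specify the column names directly?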

Answer 1

You can use mask together with column selection:

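# build the condition from the same numeric columns, so the string column 'c' is never compared with 0
df[['a','b']] = df[['a','b']].mask(df[['a','b']] < 0, 0)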
df

Output

   a  b    c  d
0  0  0  foo  1
1  0  2  goo -2
2  2  1  bar  1
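
Since the goal here is only to floor negative values at zero, the same column selection can also be combined with clip. This is a minimal sketch of an alternative, not part of the answer above; the variable name cols is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar'], 'd': [1, -2, 1]})

# only the columns listed here are touched; 'c' and 'd' are left as they are
cols = ['a', 'b']

# clip(lower=0) raises every value below zero up to zero
df[cols] = df[cols].clip(lower=0)
print(df)
```

This only covers the "replace with the threshold itself" case; for replacing with some other value, mask or np.where is the more general tool.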

Answer 2

Using np.where:

import numpy as np

# 'd' is included in this example, so its negative value is zeroed as well
cols_to_change = ['a', 'b', 'd']

df.loc[:, cols_to_change] = np.where(df[cols_to_change]<0, 0, df[cols_to_change])

   a  b    c  d
0  0  0  foo  1
1  0  2  goo  0
2  2  1  bar  1
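
To match the question exactly (change a and b, leave d alone), the same np.where idea can be wrapped in a small helper. This is only a sketch: the function name zero_negatives and its parameters are hypothetical, not from the answer, and it works on a copy so the original frame is not modified.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [0, -1, 2], 'b': [-3, 2, 1],
                   'c': ['foo', 'goo', 'bar'], 'd': [1, -2, 1]})

def zero_negatives(frame, columns):
    """Return a copy of frame with negative values in the given columns set to 0."""
    out = frame.copy()
    # np.where picks 0 where the condition holds and the original value elsewhere
    out[columns] = np.where(out[columns] < 0, 0, out[columns])
    return out

# hypothetical usage: only 'a' and 'b' are changed, 'd' keeps its -2
result = zero_negatives(df, ['a', 'b'])
print(result)
```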

Replace values in a specified list of columns based on a condition