Question

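I know how to create a new column with apply or np.where based on the values of another column, but a way of selectively changing the values of an existing column is escaping me; I suspect df.ix is involved? Am I close?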

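For example, here is a simple DataFrame (my real data has tens of thousands of rows). I would like to change the value in the 'flag' column (say, to 'Blue') if the name ends with the letter 'e':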

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({'name': ['Mick', 'John', 'Christine', 'Stevie', 'Lindsey'],
...                    'flag': ['Purple', 'Red', np.nan, np.nan, np.nan]})[['name', 'flag']]
>>> print(df)

        name    flag
0       Mick  Purple
1       John     Red
2  Christine     NaN
3     Stevie     NaN
4    Lindsey     NaN
[5 rows x 2 columns]

I can make a boolean Series based on my criteria:

>>> boolean_result = df.name.str.contains('e$')
>>> print(boolean_result)
0    False
1    False
2     True
3     True
4    False
Name: name, dtype: bool

I just need the crucial step to get the following result:

>>> print(result_wanted)
        name    flag
0       Mick  Purple
1       John     Red
2  Christine    Blue
3     Stevie    Blue
4    Lindsey     NaN

Answer 1

df.loc[df.name.str.contains('e$'), 'flag'] = 'Blue'
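
A minimal, self-contained sketch of this mask-based assignment, using the DataFrame from the question. It is written with df.loc so that the rows (the boolean mask) and the column (the label 'flag') are selected in one indexing call. Chained forms such as df['flag'][mask] = ... can raise SettingWithCopyWarning, and are not expected to update df when pandas copy-on-write is enabled. The variable name ends_with_e is only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'name': ['Mick', 'John', 'Christine', 'Stevie', 'Lindsey'],
    'flag': ['Purple', 'Red', np.nan, np.nan, np.nan],
})[['name', 'flag']]

# Boolean Series: True where the name ends with the letter 'e'.
ends_with_e = df['name'].str.contains('e$')

# One .loc call: rows chosen by the mask, column chosen by label.
df.loc[ends_with_e, 'flag'] = 'Blue'

print(df)
```

Rows where the mask is False keep whatever they had before, including NaN.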

Answer 2

pandas.DataFrame.mask(cond, other=nan) does what you want.

Where the condition is True, it replaces the value with the value of other.

df['flag'].mask(boolean_result, other='Blue', inplace=True)

inplace=True means the operation is performed on the data in place.

If you want to replace values where the condition is False instead, consider using pandas.DataFrame.where().
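
To make the relationship between mask and where concrete, here is a small sketch on the question's data. Since df['flag'] is a single column, the calls below are Series.mask and Series.where. As a stylistic assumption, the result is assigned back to the column rather than relying on inplace=True on a column selection, which may not propagate to df under pandas copy-on-write. The names masked and whered are illustrative only.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'name': ['Mick', 'John', 'Christine', 'Stevie', 'Lindsey'],
    'flag': ['Purple', 'Red', np.nan, np.nan, np.nan],
})[['name', 'flag']]

boolean_result = df['name'].str.contains('e$')

# mask: replace the value where the condition is True.
masked = df['flag'].mask(boolean_result, other='Blue')

# where: keep the value where the condition is True and replace it
# where the condition is False, so the condition is inverted here.
whered = df['flag'].where(~boolean_result, other='Blue')

# Both spellings give the same Series (NaN positions included).
assert masked.equals(whered)

# Write the result back to the column explicitly.
df['flag'] = masked
print(df)
```

Either call returns a new Series, so the original column is only changed by the final assignment.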

Pandas: modify column values based on a boolean array
