Suppose I have a pandas.DataFrame, say
In [1]: df = pd.DataFrame([['a', 'x'], ['b', 'y'], ['c', 'z']],
   ...:                   index=[10, 20, 30],
   ...:                   columns=['first', 'second'])
In [2]: df
Out[2]:
   first second
10     a      x
20     b      y
30     c      z
I want to update the first two entries of the first column with the corresponding entries of the second column. First I tried
to_change = df.index <= 20
df[to_change]['first'] = df[to_change]['second']
but this does not work. However,
df['first'][to_change] = df['second'][to_change]
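works fine.
Can someone explain this? What is the rationale behind this behavior? Although I use pandas a lot, issues like this sometimes make it hard to predict what a particular piece of pandas code will actually do. Perhaps someone can offer some insight that would help me improve my mental model of how pandas works internally.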
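Answer 0 (score: 2)
In master / 0.13 (to be released soon), this will now warn you that you are modifying a copy (an option controls whether it raises, warns, or is ignored):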
In [1]: df = pd.DataFrame([['a', 'x'], ['b', 'y'], ['c', 'z']],
...: index=[10, 20, 30],
...: columns=['first', 'second'])
In [2]: df
Out[2]:
   first second
10     a      x
20     b      y
30     c      z
In [3]: to_change = df.index <= 20
In [4]: df[to_change]['first'] = df[to_change]['second']
pandas/core/generic.py:1008: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
warnings.warn(t,SettingWithCopyWarning)
In [5]: df['first'][to_change] = df['second'][to_change]
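As for why the two forms differ: df[to_change] with a boolean mask returns a new object (a copy), so the subsequent ['first'] = ... assigns into that temporary copy and df itself is never touched. df['first'], by contrast, has historically returned the column of the original frame, so assigning into it with the mask did reach df. That is an implementation detail rather than a guarantee, and with Copy-on-Write in newer pandas versions the second form may stop updating df as well. The form suggested by the warning does both selections in a single .loc call and avoids the ambiguity. A minimal sketch, reusing the frame from the question:

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame([['a', 'x'], ['b', 'y'], ['c', 'z']],
                  index=[10, 20, 30],
                  columns=['first', 'second'])

# Boolean mask selecting the rows whose index label is <= 20.
to_change = df.index <= 20

# Select rows and column in ONE indexing operation, so the assignment
# targets df itself rather than an intermediate copy.
df.loc[to_change, 'first'] = df.loc[to_change, 'second']

print(df)
```

Because the row mask and the column label are passed together, pandas performs a single set operation on df, and the right-hand side is aligned on the index labels 10 and 20.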