Question

I have a few lines of code:

df.loc[:,'amount_cumulative'] = df.groupby(by=['col_A','col_B'])['float_col_c'].apply(lambda x: x.cumsum())

which throws this warning:

/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py:362: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[key] = _infer_fill_value(value)
/anaconda3/lib/python3.6/site-packages/pandas/core/indexing.py:543: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s

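Usually, when I see this warning, I can resolve it by changing something to .loc[], but in this case the warning seems to point to a different problem. I know I can suppress the warning, but I would rather understand what I am getting wrong in the pandas syntax. Any suggestions on how to correct this syntax would be appreciated.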

Answer 1

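~~I believe this is caused by the .loc[:, 'amount_cumulative'] indexing, which returns a slice of df rather than a reference to the new column.~~

Update: df itself is a copy, as @QuangHoang correctly pointed out, and in that case the following still raises the warning.

You can get the expected result without the warning with this simpler form: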

df['amount_cumulative'] = df.groupby(['col_A','col_B'])['float_col_c'].cumsum()
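To make the one-liner above concrete, here is a minimal sketch on a small made-up frame. The data values are hypothetical; only the column names come from the question. It relies on the grouped cumsum() returning a Series indexed like the original frame, which is why it can be assigned directly as a new column.

```python
import pandas as pd

# Hypothetical toy data; only the column names follow the question.
df = pd.DataFrame({
    'col_A': ['x', 'x', 'y', 'x', 'y'],
    'col_B': [1, 1, 2, 1, 2],
    'float_col_c': [1.0, 2.0, 3.0, 4.0, 5.0],
})

# cumsum() on the grouped column returns a Series with the same index as df,
# so it lines up row by row when assigned as a new column.
df['amount_cumulative'] = df.groupby(['col_A', 'col_B'])['float_col_c'].cumsum()

print(df)
```

Because df here is built directly rather than sliced from another frame, this sketch does not reproduce the situation described in the update.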

Answer 2

Your df_rev_melt_trim is most likely already a copy of another DataFrame. Your naming also suggests as much. Testing with

import numpy as np
import pandas as pd
old_df = pd.DataFrame({'A':np.random.randint(1,10,1000), 'B':np.random.randint(1,10,1000), 'C':np.random.uniform(0,1,1000)})
df = old_df[old_df['A'] > 5]
df['amount_cumulative'] = df.groupby(by=['A','B'])['C'].cumsum()

yields the same warning. Instead, you can do the following:

old_df.loc[df.index,'amount_cumulative'] = df.groupby(by=['A','B'])['C'].cumsum()

and no warning is shown.
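The approach in this answer can be checked on a small deterministic frame. The sketch below uses made-up values in place of the random data. It also shows one alternative that the answer does not mention and that is added here as an assumption: taking an explicit .copy() of the filtered frame before adding the column. Whether the warning appears at all depends on the pandas version, so the sketch only checks the resulting values.

```python
import pandas as pd

# Hypothetical small frame standing in for the answer's random old_df.
old_df = pd.DataFrame({
    'A': [6, 7, 3, 6, 7],
    'B': [1, 1, 1, 1, 1],
    'C': [0.5, 1.0, 2.0, 1.5, 3.0],
})

# Boolean filtering produces a subset derived from old_df.
df = old_df[old_df['A'] > 5]

# Alternative (an assumption, not from the answer): an explicit copy makes
# the subset an independent frame, so a column can be added to it directly.
sub = old_df[old_df['A'] > 5].copy()
sub['amount_cumulative'] = sub.groupby(by=['A', 'B'])['C'].cumsum()

# The answer's approach: write into the parent frame, selecting rows by the
# subset's index. Rows outside the subset get NaN in the new column.
old_df.loc[df.index, 'amount_cumulative'] = df.groupby(by=['A', 'B'])['C'].cumsum()

print(old_df)
print(sub)
```

The two variants differ in where the result lives: the answer's version adds the column to old_df, leaving NaN for filtered-out rows, while the copy variant keeps the column only on the subset.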

Pandas groupby then assign throws warning
