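Understanding the Pandas SettingWithCopyWarning

Date: 2014-10-23 19:56:21

Tags: python python-2.7 pandas

I have the following code, but I don't quite understand why it raises a warning. I have read the documentation, but I still can't see why this usage results in the warning. Any insight would be appreciated.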

>>> df = pandas.DataFrame({'a': [1,2,3,4,5,6,7], 'b': [11,22,33,44,55,66,77]})
>>> reduced_df = df[df['a'] > 3]
>>> reduced_df
   a   b
3  4  44
4  5  55
5  6  66
6  7  77
>>> reduced_df['a'] /= 3

Warning (from warnings module):
   File "__main__", line 1
SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
>>> reduced_df
          a   b
3  1.333333  44
4  1.666667  55
5  2.000000  66
6  2.333333  77

1 Answer:

Answer 0: (score: 6)

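The warning here is telling you that reduced_df, despite appearances, is not a reference to a slice of df but is in fact a copy. This differs from normal Python semantics, where one would expect the assignment to produce a reference, so that modifications made through it would affect both the reference and the original object (for mutable objects, of course):

In [14]:

foo = [0]
bar = foo
bar.append(1)
print(foo,bar)
[0, 1] [0, 1]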

So if you want to modify a specific slice of df, you should do what the warning suggests:

In [18]:

df.loc[df['a']>3,'a'] =df['a']/3
df
Out[18]:
          a   b
0  1.000000  11
1  2.000000  22
2  3.000000  33
3  1.333333  44
4  1.666667  55
5  2.000000  66
6  2.333333  77
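As a self-contained sketch of the .loc approach, the snippet below rebuilds the frame from the question and divides only the selected rows in place. It assumes pandas is imported as pd. Creating column a as float and the name mask are my own choices for illustration, made so that assigning fractional values does not require a dtype change on newer pandas versions.

```python
import pandas as pd

# Same data as in the question; 'a' is float here (an assumption for this
# sketch) so that storing fractional results needs no dtype change.
df = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                   'b': [11, 22, 33, 44, 55, 66, 77]})

# Boolean mask selecting the rows to modify.
mask = df['a'] > 3

# A single .loc call with row mask and column label writes into df itself.
df.loc[mask, 'a'] = df.loc[mask, 'a'] / 3

print(df)
```

Because the selection and the assignment happen in one indexing operation on df, there is no intermediate object whose copy-or-view status pandas would have to guess.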

Or make an explicit deep copy by calling copy() and modify the copy, which does not produce any warning:

In [20]:

reduced_df = df[df['a'] > 3].copy()
reduced_df['a'] /=3
reduced_df
Out[20]:
          a   b
3  1.333333  44
4  1.666667  55
5  2.000000  66
6  2.333333  77

In [21]:

# orig df is unmodified
df
Out[21]:
   a   b
0  1  11
1  2  22
2  3  33
3  4  44
4  5  55
5  6  66
6  7  77
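To check the copy() approach yourself, a minimal sketch along these lines records any warnings raised while modifying the copy and confirms that the original frame is left alone. It assumes pandas is imported as pd. The float column and the names caught and copy_warnings are my own, for illustration only.

```python
import warnings

import pandas as pd

df = pd.DataFrame({'a': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
                   'b': [11, 22, 33, 44, 55, 66, 77]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Explicit copy: reduced_df is an independent DataFrame.
    reduced_df = df[df['a'] > 3].copy()
    reduced_df['a'] /= 3

# Keep only warnings of the kind discussed in the question, matched by name.
copy_warnings = [w for w in caught
                 if w.category.__name__ == 'SettingWithCopyWarning']

print(len(copy_warnings))   # number of SettingWithCopyWarning instances seen
print(df['a'].tolist())     # original column values
print(reduced_df)
```

The trade-off is that changes made to reduced_df never reach df, so this fits cases where the reduced frame is meant to be a separate working set.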