How do I get rid of the pandas SettingWithCopyWarning?

Asked: 2019-08-09 14:43:42

Tags: python pandas dataframe

I have this code that triggers the warning, but I don't know why it happens or how to fix it:

  • self.data is a pandas DataFrame
  • self.tag_freq is a pandas DataFrame built by calling value_counts() on a column of self.data

def edit_tag(lst, old_set, new):
    return [elem if elem not in old_set else new for elem in lst]

toretag = self.tag_freq['count'] < lim
count = self.tag_freq['count'][toretag].sum()

classlist = list(self.tag_freq[toretag].index)
self.data['newtags'] = self.data['tags'].apply(lambda x: edit_tag(lst=x, old_set=classlist, new='underrep'))

SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.data['newtags'] = self.data['tags'].apply(lambda x: edit_tag(lst=x, old_set=classlist, new='underrep'))
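
For reference, here is a minimal self-contained sketch of the retagging logic above, outside the class. The sample tags, the threshold `lim = 2`, and the way `tag_freq` is built (via `explode()` and `value_counts()`) are assumptions for illustration, not the original data. Because `data` is constructed directly here rather than derived from another frame, I would not expect this version to warn.

```python
import pandas as pd

def edit_tag(lst, old_set, new):
    # Replace every tag found in old_set with the placeholder `new`.
    return [new if elem in old_set else elem for elem in lst]

# Hypothetical data: each row holds a list of tags.
data = pd.DataFrame({"tags": [["a", "b"], ["a", "c"], ["a", "b", "d"]]})

# One row per tag, then count occurrences; name the column "count" explicitly.
tag_freq = data["tags"].explode().value_counts().to_frame(name="count")

lim = 2  # hypothetical threshold for "under-represented"
rare = set(tag_freq[tag_freq["count"] < lim].index)  # a set gives fast membership tests

data["newtags"] = data["tags"].apply(lambda tags: edit_tag(tags, rare, "underrep"))
```

Using a set instead of a list for the rare tags does not change the result; it only makes the `in` check cheaper when there are many tags.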

I tried changing the last line of the code above to use .loc, but it still triggers the warning.

self.data.loc[:, 'newtags'] = self.data['tags'].apply(lambda x: edit_tag(lst=x, old_set=classlist, new='underrep'))

I tried to reproduce the warning with the following code, but without success.

import pandas as pd

df = pd.DataFrame(data=None)

df['col0'] = list(range(100))
df['col0'] = df['col0'].apply(lambda x: x*2)
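
A likely reason the snippet above stays silent is that `df` is created directly, whereas the warning usually concerns a frame that was itself obtained by slicing or filtering another frame. The sketch below is one way to try reproducing it under that assumption; the even-number filter and the name `subset` are made up for illustration. It records whatever warnings are raised instead of assuming a result, since the behaviour depends on the pandas version (versions that use Copy-on-Write by default may raise nothing).

```python
import warnings
import pandas as pd

df = pd.DataFrame({"col0": list(range(100))})

# Boolean filtering returns a new object derived from df.
subset = df[df["col0"] % 2 == 0]

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Assigning a column on the derived frame is the pattern that can warn.
    subset["col1"] = subset["col0"].apply(lambda x: x * 2)

# Names of the warning categories that were raised, if any.
raised = [w.category.__name__ for w in caught]
print(raised)
```

Either way, the assignment lands on `subset` only; `df` does not gain a `col1` column.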

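Edit: The post (Getting SettingWithCopyWarning warning even after using .loc in pandas [duplicate]) says the problem is that I am operating on a copy of a DataFrame. So, to fix it, I made sure the DataFrame I am working on is an independent object rather than a copy of a slice: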

data = self.data.copy()
data['newtags'] = data['tags'].apply(lambda x: edit_tag(lst=x, old_set=classlist, new='underrep'))

Then I update self.data with the data object.
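
A variation on the same idea is to call `.copy()` at the point where the frame is first derived, so that later assignments need no workaround. This is only a sketch: the `keep` column, the filter, and the upper-casing step are hypothetical stand-ins, since the question does not show where `self.data` comes from.

```python
import warnings
import pandas as pd

df = pd.DataFrame({"tags": [["a", "b"], ["c"], ["a"]], "keep": [True, False, True]})

# .copy() right after filtering yields an independent frame.
subset = df[df["keep"]].copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    subset["newtags"] = subset["tags"].apply(lambda tags: [t.upper() for t in tags])

# Names of any warning categories raised during the assignment.
raised = [w.category.__name__ for w in caught]
print(raised)
```

With the explicit copy, the assignment should not raise SettingWithCopyWarning, and the original `df` is left unchanged.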

0 answers:

No answers yet