Where is this pandas warning coming from?

Time: 2017-12-12 15:49:12

Tags: python python-3.x pandas dataframe chained-assignment

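I have a DataFrame. To run a conditional statistical test, I split it into two parts on a boolean column ('mar'). Using the ratio of counts between the two tables, I want to add a column giving the proportion of rows where 'mar' is true for each combination of the other columns, as follows.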

>>> df_nomar
   alc  cig  mar  cnt
1    1    1    0  538
3    1    0    0  456
5    0    1    0   43
7    0    0    0  279
>>> df_mar
   alc  cig  mar  cnt
0    1    1    1  911
2    1    0    1   44
4    0    1    1    3
6    0    0    1    2
>>> df_mar.loc[:, 'prop'] = np.array(df_mar['cnt'])/(np.array(df_mar['cnt']) + np.array(df_nomar['cnt']))
/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py:296: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[key] = _infer_fill_value(value)
/usr/local/lib/python3.5/dist-packages/pandas/core/indexing.py:476: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  self.obj[item] = s
>>> df_mar
   alc  cig  mar  cnt      prop
0    1    1    1  911  0.628709
2    1    0    1   44  0.088000
4    0    1    1    3  0.065217
6    0    0    1    2  0.007117

I have gone to the suggested page to look into the warning. When I assign the new column I am already using the form .loc[row_indexer,col_indexer] = value, as suggested.

So why am I still getting this warning?

1 answer:

Answer 0 (score: 1)

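To align the data, it seems you need reset_index so that both columns share the same index. Arithmetic between two Series matches rows by index label, and here the labels of df_mar (0, 2, 4, 6) and df_nomar (1, 3, 5, 7) do not overlap: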

a = df_mar['cnt'].reset_index(drop=True)
b = df_nomar['cnt'].reset_index(drop=True)
df_mar['prop'] = (a/(a + b)).values
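
To illustrate why the reset_index step matters, here is a minimal sketch. The two Series are rebuilt by hand from the tables printed in the question (an assumption, since the original df is not shown), and the names misaligned, aligned, and demo are only illustrative.

```python
import pandas as pd

# 'cnt' columns rebuilt by hand from the tables in the question (assumed values).
a = pd.Series([911, 44, 3, 2], index=[0, 2, 4, 6])      # stands in for df_mar['cnt']
b = pd.Series([538, 456, 43, 279], index=[1, 3, 5, 7])  # stands in for df_nomar['cnt']

# Series arithmetic aligns on index labels. No label is shared here,
# so every row of the result is NaN.
misaligned = a / (a + b)

# After reset_index(drop=True) both Series are labelled 0..3,
# so the rows pair up by position.
a0 = a.reset_index(drop=True)
b0 = b.reset_index(drop=True)
aligned = a0 / (a0 + b0)

# Column assignment aligns on the index too, which is why the answer
# takes .values before assigning back to a frame indexed 0, 2, 4, 6.
demo = pd.DataFrame({"cnt": a})
demo["prop_series"] = aligned          # aligned by label: only labels 0 and 2 match
demo["prop_values"] = aligned.values   # assigned by position

print(misaligned)
print(demo)
```

In this sketch only prop_values reproduces the proportions shown in the question; prop_series ends up with missing values on the rows labelled 4 and 6.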

Another solution is to convert to a numpy array with values:

a = df_mar['cnt'].values
b = df_nomar['cnt'].values
df_mar['prop'] = a / (a + b)

print (df_mar)
   alc  cig  mar  cnt      prop
0    1    1    1  911  0.628709
2    1    0    1   44  0.088000
4    0    1    1    3  0.065217
6    0    0    1    2  0.007117
  

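Where is this pandas warning coming from?

It evidently comes from the code above the snippet you posted, where the DataFrame was filtered. When you filter a DataFrame, you need copy: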

df_nomar = df[df['mar'] == 0].copy()
df_mar = df[df['mar'] == 1].copy()

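If you later modify values in df_nomar or df_mar, you will find that the modifications do not propagate back to the original data (df), and pandas issues the warning.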
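
Putting the pieces together, a minimal end-to-end sketch might look like the following. The contents of df are an assumption, rebuilt from the two tables printed in the question. The warning is matched by class name so the snippet does not depend on where a given pandas version exposes the class, and some newer pandas releases may not emit this warning at all.

```python
import warnings

import pandas as pd

# Assumed original frame, rebuilt from the tables printed in the question.
df = pd.DataFrame({
    "alc": [1, 1, 1, 1, 0, 0, 0, 0],
    "cig": [1, 1, 0, 0, 1, 1, 0, 0],
    "mar": [1, 0, 1, 0, 1, 0, 1, 0],
    "cnt": [911, 538, 44, 456, 3, 43, 2, 279],
})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")

    # Explicit copies: later assignments target independent frames.
    df_nomar = df[df["mar"] == 0].copy()
    df_mar = df[df["mar"] == 1].copy()

    # Plain arrays are combined by position, so the differing index
    # labels of the two frames do not matter.
    a = df_mar["cnt"].values
    b = df_nomar["cnt"].values
    df_mar["prop"] = a / (a + b)

# Collect any SettingWithCopyWarning raised inside the block.
copy_warnings = [w for w in caught
                 if w.category.__name__ == "SettingWithCopyWarning"]

print(df_mar)
print("SettingWithCopyWarning count:", len(copy_warnings))
```

Because df_mar is a copy, the new prop column exists only there and the original df is left unchanged.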