Pandas SettingWithCopyWarning for fillna with MultiIndex DataFrame

Time: 2019-05-31 11:58:14

Tags: python pandas dataframe warnings nan

The fillna() line below raises SettingWithCopyWarning even though it is not performed in place. Why is that?

import pandas as pd
import numpy as np


tuples = [('foo', 1), ('foo', 2), ('bar', 1), ('bar', 2)]
index = pd.MultiIndex.from_tuples(tuples)

df = pd.DataFrame(np.random.randn(2, 4), columns=index)
df.loc[0, ('foo', 1)] = np.nan

# this works without warning
# df = pd.DataFrame({'foo': [1, np.nan, 3], 'bar': [np.nan, 22, 33]})

df1 = df[['foo', 'bar']]
# df1 = df[['foo', 'bar']].copy()  # this does not help
filled = df1.fillna({'foo': 100, 'bar': 200}, inplace=False)

The problem does not appear if foo and bar are ordinary columns, not multiindexed.

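1 answer:

Answer 0 (score: 0)

This is a false positive; the warning should not be raised here. I think the problem is that fillna does not understand that 'foo' and 'bar' refer to a specific level of your MultiIndex columns.

As a workaround until this is implemented, I suggest calling fillna inside a GroupBy.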

fill = {'foo': 100, 'bar': 200}
df1.groupby(level=0, axis=1).apply(lambda x: x.fillna(fill[x.name]))

          foo                 bar          
            1         2         1         2
0  100.000000  1.040531 -1.516983 -0.866276
1   -0.055035 -0.107310  1.365467 -0.097696
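
The groupby call above relies on axis=1, which newer pandas releases deprecate. As a hedged sketch of the same per-level idea without GroupBy, the level-0 labels can be mapped to their fill values and broadcast over the NaN positions. The sample data and the names fill_row and filled are illustrative assumptions, not part of the original answer.

```python
import numpy as np
import pandas as pd

tuples = [('foo', 1), ('foo', 2), ('bar', 1), ('bar', 2)]
columns = pd.MultiIndex.from_tuples(tuples)

# Deterministic sample data (assumption) with one NaN under each top-level label.
df = pd.DataFrame(np.arange(8, dtype=float).reshape(2, 4), columns=columns)
df.loc[0, ('foo', 1)] = np.nan
df.loc[1, ('bar', 2)] = np.nan

fill = {'foo': 100, 'bar': 200}

# One fill value per column, looked up from the level-0 label of that column.
fill_row = df.columns.get_level_values(0).map(fill).to_numpy()

# Take the fill value where the cell is NaN, otherwise keep the original value.
# fill_row has one entry per column and is broadcast across the rows.
filled = pd.DataFrame(
    np.where(df.isna().to_numpy(), fill_row, df.to_numpy()),
    index=df.index,
    columns=df.columns,
)
print(filled)
```

This builds a new frame and leaves df untouched, so no chained assignment is involved. It assumes every level-0 label has an entry in fill.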

Alternatively, to use fillna directly, specify a dict keyed by tuples (because of the MultiIndex):

df1.fillna({('foo', 1): 100, ('foo', 2): 100})

          foo                 bar          
            1         2         1         2
0  100.000000  1.040531 -1.516983 -0.866276
1   -0.055035 -0.107310  1.365467 -0.097696
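
The tuple-keyed dict does not have to be typed out by hand. A minimal sketch, assuming a level-0 mapping like fill above: expand it to one entry per column tuple and pass the result to fillna. The name fill_by_column and the sample values are hypothetical.

```python
import numpy as np
import pandas as pd

tuples = [('foo', 1), ('foo', 2), ('bar', 1), ('bar', 2)]
columns = pd.MultiIndex.from_tuples(tuples)

# Deterministic sample data (assumption) with one NaN under each top-level label.
df = pd.DataFrame(np.arange(8, dtype=float).reshape(2, 4), columns=columns)
df.loc[0, ('foo', 1)] = np.nan
df.loc[1, ('bar', 2)] = np.nan

fill = {'foo': 100, 'bar': 200}

# Expand the level-0 mapping to full column tuples, skipping labels not in fill.
fill_by_column = {col: fill[col[0]] for col in df.columns if col[0] in fill}

filled = df.fillna(fill_by_column)
print(filled)
```

Calling fillna on df itself, rather than on a slice such as df[['foo', 'bar']], keeps the example independent of how pandas tracks copies.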