First, create this DataFrame:
import numpy as np
import pandas as pd

df = pd.DataFrame([[1,-2,3],[4,5,-6],[-7,8,9]],
                  columns=pd.MultiIndex.from_tuples(
                      [('foo', 'bar'), ('foo', 'baz'), ('ignore', 'other')]))
That gives:
  foo     ignore
  bar baz  other
0   1  -2      3
1   4   5     -6
2  -7   8      9
Now, try to replace the negative values under foo with NaN:
df.foo[df.foo < 0] = np.nan
This does nothing except print a warning:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
OK, let's do that:
df.loc[:,'foo'][df.foo < 0] = np.nan
That doesn't print a warning, but it also does nothing!
However, if we use a non-NaN value, it works:
df.loc[:,'foo'][df.foo < 0] = 666
Now I have:
  foo      ignore
  bar  baz  other
0    1  666     3
1    4    5    -6
2  666    8     9
But I want to fill with NaN, not 666. Is there a simple way to do this?
Answer 0 (score: 0)
You can use slicers together with DataFrame.mask:
idx = pd.IndexSlice
sliced = df.loc[:, idx['foo',:]]
print(sliced)
  foo
  bar baz
0   1  -2
1   4   5
2  -7   8
df.loc[:, idx['foo',:]] = sliced.mask(sliced < 0)
print(df)
   foo      ignore
   bar  baz  other
0  1.0  NaN      3
1  4.0  5.0     -6
2  NaN  8.0      9
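As a self-contained variant of the same idea, the sketch below builds one boolean condition for the whole frame and passes it to DataFrame.mask, so no assignment back into df is needed. This is not the code from the answer above: the names is_foo, cond and result are illustrative, and it assumes the top column level is the one that holds 'foo'.

```python
import pandas as pd

df = pd.DataFrame(
    [[1, -2, 3], [4, 5, -6], [-7, 8, 9]],
    columns=pd.MultiIndex.from_tuples(
        [("foo", "bar"), ("foo", "baz"), ("ignore", "other")]
    ),
)

# Boolean vector with one entry per column: True where the top level is "foo".
is_foo = df.columns.get_level_values(0) == "foo"

# NumPy broadcasting combines the (3, 3) "is negative" array with the
# length-3 column selector, so only negative cells under "foo" are True.
cond = (df.to_numpy() < 0) & is_foo

# mask() returns a new frame with NaN wherever cond is True; df is untouched.
result = df.mask(cond)
print(result)
```

Because mask returns a new object, the result has to be bound to a name (or back to df); the original frame keeps its integer values.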
Another solution, using concat:
idx = pd.IndexSlice
df1 = df.loc[:, idx['foo',:]]
print(df1)
  foo
  bar baz
0   1  -2
1   4   5
2  -7   8
df1 = df1.mask(df1 < 0)
print(df1)
   foo
   bar  baz
0  1.0  NaN
1  4.0  5.0
2  NaN  8.0
print(pd.concat([df1, df.drop('foo', axis=1, level=0)], axis=1))
   foo      ignore
   bar  baz  other
0  1.0  NaN      3
1  4.0  5.0     -6
2  NaN  8.0      9
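One thing to keep in mind with the concat approach is that the masked 'foo' block always ends up first, which only matches the original layout when 'foo' was already the leading group. A minimal sketch of the same steps, with an added reindex to restore the original column order, might look like this (the names masked, rest and out are illustrative and not from the answer):

```python
import pandas as pd

df = pd.DataFrame(
    [[1, -2, 3], [4, 5, -6], [-7, 8, 9]],
    columns=pd.MultiIndex.from_tuples(
        [("foo", "bar"), ("foo", "baz"), ("ignore", "other")]
    ),
)

idx = pd.IndexSlice

# Select every column under "foo" and replace its negative values with NaN.
foo_part = df.loc[:, idx["foo", :]]
masked = foo_part.mask(foo_part < 0)

# Everything that is not under "foo" is kept unchanged.
rest = df.drop("foo", axis=1, level=0)

# Glue the two pieces together, then restore the original column order.
out = pd.concat([masked, rest], axis=1).reindex(columns=df.columns)
print(out)
```

Here the reindex is a no-op because 'foo' is already first, but it keeps the result aligned with df.columns if the groups were ordered differently.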