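I have the following table in a Pandas DataFrame with a MultiIndex (row, attribute). I also have similar DataFrames that contain the values for "class" and "probability", but those have a single index (row).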
                 1  2  3  4  5  6  7  8  9  10  ...  69  70  71  72  73  74  75  76  77  78
row attribute
0   class        -  -  -  -  -  -  -  -  -   -  ...   -   -   -   -   -   -   -   -   -   -
    probability  -  -  -  -  -  -  -  -  -   -  ...   -   -   -   -   -   -   -   -   -   -
1   class        -  -  -  -  -  -  -  -  -   -  ...   -   -   -   -   -   -   -   -   -   -
    probability  -  -  -  -  -  -  -  -  -   -  ...   -   -   -   -   -   -   -   -   -   -
2   class        -  -  -  -  -  -  -  -  -   -  ...   -   -   -   -   -   -   -   -   -   -
    probability  -  -  -  -  -  -  -  -  -   -  ...   -   -   -   -   -   -   -   -   -   -
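How can I set the values of all rows with attribute == 'class' to the values from another DataFrame of the correct shape? The same question applies to "probability". I tried the following: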
df.loc[df.attribute == "class"] = labels[sorted.values]
which results in
AttributeError: 'DataFrame' object has no attribute 'attribute'
I am still quite new to MultiIndex, so I would appreciate any hints. Thanks a lot!
Answer 0 (score: 0)
I think you need:
df.loc[df.index.get_level_values("attribute") == "class"] = labels[sorted.values]
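The original attempt fails because "attribute" is a level of the index, not a column, so it cannot be reached as df.attribute. The following is a minimal sketch of how the boolean mask is built from the index level instead; the small frame and the column name 'a' are made up for illustration only.

```python
import pandas as pd

# Tiny illustrative frame (hypothetical data, not from the question)
mux = pd.MultiIndex.from_product([[0, 1], ['class', 'probability']],
                                 names=('row', 'attribute'))
df = pd.DataFrame({'a': [1, 2, 3, 4]}, index=mux)

# get_level_values returns one label per row for the named index level
levels = df.index.get_level_values('attribute')

# Comparing against a scalar gives a boolean array usable with df.loc
mask = levels == 'class'
print(mask)
```

The resulting mask has one entry per row of df, so df.loc[mask] selects exactly the "class" rows.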
Example:
import numpy as np
import pandas as pd

np.random.seed(789)
# Build a (row, attribute) MultiIndex: 3 rows x 2 attributes
mux = pd.MultiIndex.from_product([np.arange(3), ['class','probability']],
                                 names=('row','attribute'))
df = pd.DataFrame(np.random.randint(10, size=(6, 10)), index=mux)
print(df)
                 0  1  2  3  4  5  6  7  8  9
row attribute
0   class        3  2  1  3  4  8  4  1  8  0
    probability  1  1  9  8  9  4  1  4  1  3
1   class        8  1  4  9  6  5  3  5  4  9
    probability  7  6  6  5  0  8  5  4  8  1
2   class        1  4  2  6  5  9  0  6  2  8
    probability  8  8  9  1  4  2  1  5  5  9
# Replacement values: one row for 'class', one for 'probability'
labels = pd.DataFrame(np.random.randint(2, size=(2, 10)), index=['class','probability'])
print(labels)
             0  1  2  3  4  5  6  7  8  9
class        0  0  1  1  0  0  0  0  0  1
probability  1  1  0  0  0  0  0  1  0  0
If you want to replace the values with a repeated row, use numpy.repeat:
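# Boolean mask selecting every row whose 'attribute' level is 'class'
mask = df.index.get_level_values("attribute") == "class"
# Repeat the single 'class' row of labels once per selected row
df.loc[mask] = np.repeat(labels.loc[['class']].values, mask.sum(), axis=0)
print(df)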
                 0  1  2  3  4  5  6  7  8  9
row attribute
0   class        0  0  1  1  0  0  0  0  0  1
    probability  1  1  9  8  9  4  1  4  1  3
1   class        0  0  1  1  0  0  0  0  0  1
    probability  7  6  6  5  0  8  5  4  8  1
2   class        0  0  1  1  0  0  0  0  0  1
    probability  8  8  9  1  4  2  1  5  5  9
Details:
print(np.repeat(labels.loc[['class']].values, mask.sum(), axis=0))
[[0 0 1 1 0 0 0 0 0 1]
 [0 0 1 1 0 0 0 0 0 1]
 [0 0 1 1 0 0 0 0 0 1]]
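The question describes source frames that already have one row per value of the "row" level, in which case no repetition should be needed. Below is a rough sketch of that case. The name class_df and its contents are hypothetical, and it assumes class_df is ordered the same way as the "row" level of df, because assigning the raw values skips index alignment.

```python
import numpy as np
import pandas as pd

mux = pd.MultiIndex.from_product([np.arange(3), ['class', 'probability']],
                                 names=('row', 'attribute'))
df = pd.DataFrame(np.zeros((6, 4), dtype=int), index=mux)

# Hypothetical single-index frame: one row per value of the 'row' level
class_df = pd.DataFrame(np.arange(12).reshape(3, 4) + 1, index=np.arange(3))

mask = df.index.get_level_values('attribute') == 'class'

# Assign the underlying array; its shape (3, 4) matches the selected rows.
# Rows are matched by position, not by index label.
df.loc[mask] = class_df.values
print(df)
```

The same pattern with a mask for 'probability' and a second source frame would fill in the remaining rows.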