The following seems like it should work, but it doesn't:
import pandas as pd
import numpy as np
df = pd.DataFrame()
for l1 in ('a', 'b'):
    for l2 in ('one', 'two'):
        df[l1, l2] = np.random.random(size=5)
df.columns = pd.MultiIndex.from_tuples(df.columns, names=['L1', 'L2'])
df['difference'] = df['b']-df['a']
I get the following error:
ValueError: Wrong number of items passed 2, placement implies 1
I can work around this with:
difference = df['b']-df['a']
df['difference', 'one'] = difference['one']
df['difference', 'two'] = difference['two']
But this seems inefficient. Is there a better way to do it?
Answer 0 (score: 0)
You can do this in a single assignment:
In [11]: df[[("difference", "one"), ("difference", "two")]] = df['b'] - df['a']
In [12]: df
Out[12]:
L1         a                   b          difference
L2       one       two       one       two       one       two
0   0.585409  0.563870  0.535770  0.868020 -0.049639  0.304150
1   0.404546  0.102884  0.254945  0.362751 -0.149601  0.259868
2   0.475362  0.601632  0.476761  0.665126  0.001400  0.063495
3   0.926288  0.615655  0.257977  0.668778 -0.668311  0.053123
4   0.509069  0.706685  0.355842  0.891862 -0.153227  0.185177
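As a quick sanity check, the single assignment can be compared against the two-step workaround from the question. This is only a sketch: the seeded generator and the names `stepwise` and `one_shot` are illustrative and not part of the original post.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen arbitrarily for a repeatable run
columns = pd.MultiIndex.from_product([["a", "b"], ["one", "two"]], names=["L1", "L2"])
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# Workaround from the question: one assignment per sub-column.
stepwise = df.copy()
difference = stepwise["b"] - stepwise["a"]
stepwise["difference", "one"] = difference["one"]
stepwise["difference", "two"] = difference["two"]

# Single assignment keyed by a list of column tuples.
one_shot = df.copy()
one_shot[[("difference", "one"), ("difference", "two")]] = one_shot["b"] - one_shot["a"]

# Both routes should produce the same frame.
print(one_shot.equals(stepwise))
```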
More generally, you can assign with a MultiIndex, for example one built with from_product:
In [21]: m = pd.MultiIndex.from_product([["difference"], ["one", "two"]])
In [22]: df[m] = df['b'] - df['a']
Here the right-hand list, ["one", "two"], can instead be the .columns of the result.
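A minimal sketch of that last point, assuming the same two-level frame as in the question. The names `result` and `m` and the seeded generator are illustrative only; the second level of the new index is read from the result's own columns rather than typed out.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, only so the run is repeatable

# Build a two-level frame shaped like the one in the question.
columns = pd.MultiIndex.from_product([["a", "b"], ["one", "two"]], names=["L1", "L2"])
df = pd.DataFrame(rng.random((5, 4)), columns=columns)

# Selecting a top-level label drops that level, so the result's
# columns are the remaining labels: 'one' and 'two'.
result = df["b"] - df["a"]

# Pair the new top-level label with whatever columns the result has.
m = pd.MultiIndex.from_product([["difference"], result.columns])
df[m] = result
```

Because the second level comes from `result.columns`, the same two lines keep working if the frame later gains more sub-columns under 'a' and 'b'.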