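I have a multi-index DataFrame (indexed by customer and year). I want the year-over-year difference within each customer. I could reset the index and do a groupby, but that seems like a lot of work.
Is there a way to do something like df.diff by index level?
For example, here merchant and year form the index. Can I compute the difference in Members using DataFrame operations?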
merchant year Members
A 2015 10
A 2016 20
B 2015 11
B 2016 7
C 2015 1
C 2016 0
Expected output
merchant year Members
A 2015 NaN
A 2016 10
B 2015 NaN
B 2016 -4
C 2015 NaN
C 2016 -1
Answer 0 (score: 2)
Use groupby on the merchant level together with diff:
df = df.groupby(level='merchant')['Members'].diff().reset_index()
print(df)
merchant year Members
0 A 2015 NaN
1 A 2016 10.0
2 B 2015 NaN
3 B 2016 -4.0
4 C 2015 NaN
5 C 2016 -1.0
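To try this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question, rebuilt by hand with a (merchant, year) MultiIndex; the variable names members_diff and result are only illustrative.

```python
import pandas as pd

# Rebuild the question's sample data with a (merchant, year) MultiIndex.
df = pd.DataFrame(
    {
        "merchant": ["A", "A", "B", "B", "C", "C"],
        "year": [2015, 2016, 2015, 2016, 2015, 2016],
        "Members": [10, 20, 11, 7, 1, 0],
    }
).set_index(["merchant", "year"])

# diff() is applied within each merchant group, so the first year of
# every merchant is NaN. The result keeps the original MultiIndex.
members_diff = df.groupby(level="merchant")["Members"].diff()

# Turn the index levels back into ordinary columns, as in the answer.
result = members_diff.reset_index()
print(result)
```

If you want to keep the MultiIndex, skip reset_index() and use members_diff directly.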
Answer 1 (score: 0)
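Maybe you could try unstack(). Note that stack() puts merchant back as the inner index level, so the levels are reordered here to match df's (merchant, year) index before assigning:
df['Members'] = df['Members'].unstack('merchant').diff().stack().reorder_levels(['merchant', 'year'])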
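The unstack approach can be sketched step by step as follows. This assumes the sample data from the question; the column name Members_diff is a hypothetical name used here so the original values are not overwritten, and the reorder_levels call is added so that the stacked result lines up with the (merchant, year) index on assignment.

```python
import pandas as pd

# Rebuild the question's sample data with a (merchant, year) MultiIndex.
df = pd.DataFrame(
    {
        "merchant": ["A", "A", "B", "B", "C", "C"],
        "year": [2015, 2016, 2015, 2016, 2015, 2016],
        "Members": [10, 20, 11, 7, 1, 0],
    }
).set_index(["merchant", "year"])

# Pivot merchants into columns: one row per year, one column per merchant.
wide = df["Members"].unstack("merchant")

# Row-wise diff now compares each year with the previous year, per merchant.
diffed = wide.diff()

# Back to long form; stack() yields (year, merchant), so reorder the levels
# to (merchant, year) to match df's index.
long_form = diffed.stack().reorder_levels(["merchant", "year"])

# Assignment aligns on the index; rows missing from long_form become NaN.
df["Members_diff"] = long_form
print(df)
```

Compared with groupby(level='merchant').diff(), this relies on every merchant sharing the same set of years; if a merchant is missing a year, the wide table has a gap there and the differences around it come out as NaN.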