Suppose I have data like this:
id X Y Z Z_previous Z_index
---------------------------------------
0 1 2 10 0 0
0 1 2 20 10 1
0 1 3 30 0 0
0 1 4 40 0 0
0 2 2 50 0 0
0 2 2 60 50 1
0 2 2 70 60 2
0 2 3 80 0 0
0 2 3 90 80 1
0 2 3 100 90 2
0 2 3 110 100 3
0 2 4 120 0 0
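I want to compute the previous value of Z and an "index" within each (X, Y) pair. The final result should look like the Z_previous and Z_index columns above.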
df['Z_previous'] = df.Z.shift(1)
df['X_previous'] = df.X.shift(1)
df['Y_previous'] = df.Y.shift(1)
So I created three new shifted columns (above).
# pseudocode, applied to each row
if X != X_previous or Y != Y_previous:
    Z_previous = 0
Now I would do something like this:
ActiveWorkbook.ActiveSheet.Rows(1).Find("location", lookat:=xlWhole).Select
ActiveCell.Offset(0, 0).Select
ActiveCell.EntireColumn.Select
Range(Selection, ActiveCell.SpecialCells(xlLastCell)).Select
Selection.Delete
I'm not sure how to do this with a DataFrame.
Is there a better way?
Answer 0 (score: 1)
You can do it like this:
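import pandas as pd

# df2 is the input frame with columns id, X, Y, Z
# row index within each (X, Y) group, starting at 1
df2['index'] = df2.groupby(['X', 'Y']).cumcount() + 1
# shift Z and the row index by one row within each group;
# the first row of a group has no predecessor (NaN), so fill it with 0
xf = (df2.groupby(['X', 'Y'])[['Z', 'index']].shift().fillna(0)
         .rename(columns={'Z': 'Z_previous', 'index': 'Z_index'}))
# join the result
df2 = pd.concat([df2.drop(columns='index'), xf], axis=1)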
print(df2)
id X Y Z Z_previous Z_index
0 0 1 2 10 0.0 0.0
1 0 1 2 20 10.0 1.0
2 0 1 3 30 0.0 0.0
3 0 1 4 40 0.0 0.0
4 0 2 2 50 0.0 0.0
5 0 2 2 60 50.0 1.0
6 0 2 2 70 60.0 2.0
7 0 2 3 80 0.0 0.0
8 0 2 3 90 80.0 1.0
9 0 2 3 100 90.0 2.0
10 0 2 3 110 100.0 3.0
11 0 2 4 120 0.0 0.0
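A more compact variant of the same idea, as a minimal sketch: shift Z within each (X, Y) group and use `cumcount()` directly as the 0-based index, which keeps both new columns as integers. The helper name `add_previous_and_index` is hypothetical, the sample frame is rebuilt from the question's data, and the sketch assumes that grouping by (X, Y) over the whole frame is what is wanted (rather than only comparing adjacent rows).

```python
import pandas as pd

# Sample data from the question (id, X, Y, Z only).
df = pd.DataFrame({
    "id": [0] * 12,
    "X": [1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2],
    "Y": [2, 2, 3, 4, 2, 2, 2, 3, 3, 3, 3, 4],
    "Z": [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, 110, 120],
})


def add_previous_and_index(frame):
    """Hypothetical helper: return a copy with Z_previous and Z_index added."""
    out = frame.copy()
    groups = out.groupby(["X", "Y"])
    # Previous Z within the group; the first row of each group has no
    # predecessor (NaN), so it is filled with 0 and cast back to int.
    z_previous = groups["Z"].shift().fillna(0).astype(int)
    # 0-based position of each row within its (X, Y) group.
    z_index = groups.cumcount()
    out["Z_previous"] = z_previous
    out["Z_index"] = z_index
    return out


result = add_previous_and_index(df)
print(result)
```

Because `cumcount()` already starts at 0, no temporary `index` column or `concat` step is needed here.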