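So, I want to compute a week-over-week change/comparison (I can use shift for that), but the catch is that I only want to compare weeks within each category. When the next category starts at week 1, I don't want it compared with week 51 of the previous category.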
| Category | Weeknumber | BruttoTonnes |
|----------|------------|--------------|
| Apple    | 1          | 15           |
| ...      | ...        | ...          |
| Apple    | 51         | 8            |
| Pear     | 1          | 5            |
| ...      | ...        | ...          |
| Pear     | 51         | 12           |
Here is my solution, which sadly, for reasons I don't understand, does nothing to my dataframe:
for element in df.Category.unique():
    print(df[df.Category == str(element)]['Category'])  # This one works, so that is all good
    df[df.Category == str(element)]['WeekOverWeek%'] = ((df[df.Category == str(element)]['BruttoTonnes'].shift(1)/df[df.Category == str(element)]['BruttoTonnes'])-1)*100
Nothing happens. No errors, but no results either.
Answer 0: (score: 1)
By merging the DataFrame with itself I avoid any loops, so the whole process is vectorized and should therefore be fast.
import pandas as pd
# set some sample dummy data
df = pd.DataFrame([['Apple',51,20],['Apple',52,19],['Apple',1,14],['Apple',2,15.2],
['Apple',3,17],['Apple',4,17],['Apple',5,18],
['Orange',51,10.5],['Orange',52,9],['Orange',1,4],['Orange',2,7],
['Orange',3,8]],
columns=['Category','WeekNum','Tonnes'])
# Set previous week's week number
df['PrevWeekNum']= df['WeekNum']-1
# roll back to week 52 if 0
df.loc[df['PrevWeekNum']==0,['PrevWeekNum']]=52
# Get the previous week's tonnage by doing a left outer merge to itself
df['PrevTonnes']=df.merge( df, left_on=['Category','PrevWeekNum'], right_on=['Category','WeekNum'], how='left' )['Tonnes_y']
# Calculate the difference
df['WeekDelta']= df['Tonnes']-df['PrevTonnes']
Result:
   Category  WeekNum  Tonnes  PrevWeekNum  PrevTonnes  WeekDelta
0     Apple       51    20.0           50         NaN        NaN
1     Apple       52    19.0           51        20.0       -1.0
2     Apple        1    14.0           52        19.0       -5.0
3     Apple        2    15.2            1        14.0        1.2
4     Apple        3    17.0            2        15.2        1.8
5     Apple        4    17.0            3        17.0        0.0
6     Apple        5    18.0            4        17.0        1.0
7    Orange       51    10.5           50         NaN        NaN
8    Orange       52     9.0           51        10.5       -1.5
9    Orange        1     4.0           52         9.0       -5.0
10   Orange        2     7.0            1         4.0        3.0
11   Orange        3     8.0            2         7.0        1.0
Use df.drop() to remove any columns you don't need.
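For instance, a minimal sketch of dropping the helper columns once the delta has been computed. The two-row frame here is made-up illustration data that only mirrors the column names used above:

```python
import pandas as pd

# Small stand-in for the frame built above (values are illustrative only).
df = pd.DataFrame(
    {
        "Category": ["Apple", "Apple"],
        "WeekNum": [51, 52],
        "Tonnes": [20.0, 19.0],
        "PrevWeekNum": [50, 51],
        "PrevTonnes": [float("nan"), 20.0],
    }
)
df["WeekDelta"] = df["Tonnes"] - df["PrevTonnes"]

# drop() returns a new DataFrame by default; df itself keeps all columns.
result = df.drop(columns=["PrevWeekNum", "PrevTonnes"])
print(result)
```

Assigning the return value (rather than relying on in-place modification) keeps the original frame available if the helper columns are needed again.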
Ideally, you should also include a date or year in your data, so that the lookup cannot pick up a weekly tonnage from the wrong year.
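If a year is available, one possible alternative to the self-merge is a grouped shift. The sketch below is not the method used above. It assumes a hypothetical `Year` column, and it assumes there are no missing weeks, because `shift(1)` compares each row with the previous row rather than with the previous calendar week. It also sidesteps the likely cause of the loop in the question doing nothing: `df[mask]['col'] = ...` is chained indexing, so the assignment lands on a temporary copy rather than on `df`.

```python
import pandas as pd

# Dummy data; the Year column is an assumption and is not in the original data.
df = pd.DataFrame(
    [
        ["Apple", 2018, 51, 20.0],
        ["Apple", 2018, 52, 19.0],
        ["Apple", 2019, 1, 14.0],
        ["Apple", 2019, 2, 15.2],
        ["Orange", 2018, 52, 9.0],
        ["Orange", 2019, 1, 4.0],
        ["Orange", 2019, 2, 7.0],
    ],
    columns=["Category", "Year", "WeekNum", "Tonnes"],
)

# Put each category's rows in chronological order.
df = df.sort_values(["Category", "Year", "WeekNum"]).reset_index(drop=True)

# shift(1) is applied per group, so the first row of every category gets NaN
# instead of the last value of the previous category.
df["PrevTonnes"] = df.groupby("Category")["Tonnes"].shift(1)

# Absolute and percentage change versus the previous row of the same category.
# The percentage is taken as current / previous here (an assumption; the
# formula in the question divides the other way round).
df["WeekDelta"] = df["Tonnes"] - df["PrevTonnes"]
df["WeekOverWeek%"] = (df["Tonnes"] / df["PrevTonnes"] - 1) * 100

print(df)
```

Because the result is assigned to a column of `df` directly, no per-category loop is needed and the week 52 to week 1 rollover is handled by the sort order.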