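I have a DataFrame like the one below. For each ID and product, I want to find the maximum number of consecutive weeks in which orders declined.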
import pandas as pd
raw_data = {'ID': ['101', '101', '101','101', '101', '101', '102', '102', '102', '102','102', '103', '103', '103', '103','104', '104', '104', '104','104','104'],
'product':['x','x','x','x','x','x','z','z','z','z','z','y','y','y','y','x','x','x','x','x','x'],
'Week': ['201828','201829','201830','201831','201832','201833','201829','201830','201831','201832','201830','201831','201832','201833','201830','201831','201832','201833','201834','201835','201836'],
'Orders': ['-15%','-4%','-6%','6%','-10%','15%','-26%','-15%','-56%','-15%','-4%', '5%', '-10%', '-10%', '15%', '-20%', '-11%','10%', '-15%', '-20%','-26%']}
df2 = pd.DataFrame(raw_data, columns = ['ID','product','Week','Orders'])
Desired output:
Answer 0 (score: 4)
One way is to use cumsum to create an additional grouping key:
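# True for rows whose Orders value is negative (a declining week)
s = df2['Orders'].str.contains('-')
# Each non-declining row bumps a per-ID counter, so every run of consecutive declines shares one key.
# Series.max(level=...) was removed in pandas 2.0; groupby(level=...).max() is the equivalent.
df2[s].groupby([df2.ID, (~s).groupby(df2['ID']).cumsum(), df2['product']]).size().groupby(level=[0, 2]).max()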
Out[202]:
ID   product
101  x          3
102  z          5
103  y          2
104  x          3
dtype: int64
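The same idea can be unpacked into a small helper, which may be easier to follow step by step. This is only a sketch under a few assumptions: the function name `max_decline_streak` and the temporary columns `declining` and `block` are made up for illustration, the percentages are parsed to numbers rather than matched on the '-' character, the block counter restarts per (ID, product) rather than per ID, and the rows are assumed to already be in week order within each group. The Week column is left out because the computation does not use it.

```python
import pandas as pd

def max_decline_streak(df, keys=("ID", "product"), col="Orders"):
    """Longest run of consecutive negative values in `col` within each group."""
    keys = list(keys)
    # Parse '-15%' -> -15.0 so the decline test is numeric, not a string match
    pct = df[col].str.rstrip("%").astype(float)
    declining = pct < 0
    # Every non-declining row increments the counter, so each run of
    # consecutive declines inside a group ends up with its own block id
    block = (~declining).astype(int).groupby([df[k] for k in keys]).cumsum()
    tmp = df[keys].assign(declining=declining, block=block)
    # Count declining rows per (group, block), then keep the longest block
    runs = tmp[tmp["declining"]].groupby(keys + ["block"]).size()
    return runs.groupby(level=keys).max()

# Same ID / product / Orders values as in the question
data = pd.DataFrame({
    "ID": ["101"] * 6 + ["102"] * 5 + ["103"] * 4 + ["104"] * 6,
    "product": ["x"] * 6 + ["z"] * 5 + ["y"] * 4 + ["x"] * 6,
    "Orders": ["-15%", "-4%", "-6%", "6%", "-10%", "15%",
               "-26%", "-15%", "-56%", "-15%", "-4%",
               "5%", "-10%", "-10%", "15%",
               "-20%", "-11%", "10%", "-15%", "-20%", "-26%"],
})
result = max_decline_streak(data)
print(result)
```

On this data the helper gives the same counts as the output above (3, 5, 2, 3). Note that a group with no declining week at all would simply be missing from the result rather than showing 0; reindexing with a fill value of 0 would be one way to handle that if needed.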