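How can I insert rows for the missing YEARs and interpolate the annual SALES?
The code below computes the interpolated sales values for one pair of rows. However, addressing every pair of rows explicitly with iloc like this would take forever.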
import pandas as pd
data = {"YEAR": [1990, 1995, 2000, 1990, 1995, 2000],
        "COUNTRY": ["USA", "USA", "USA", "USA", "USA", "USA"],
        "STATE": ["AZ", "AZ", "AZ", "AZ", "AZ", "AZ"],
        "BRANCH": ["Bed", "Bed", "Bed", "Kitchen", "Kitchen", "Kitchen"],
        "SALES": [50, 80, 100, 10, 20, 50]}
df = pd.DataFrame(data)
value_first = df.iloc[0, 4]   # SALES for 1990 (column 4 is SALES)
value_second = df.iloc[1, 4]  # SALES for 1995
delta_step = (value_second - value_first) / 5  # 5 years between 1990 and 1995
for x in range(0, 6):
    print((x * delta_step) + value_first)
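For reference, the loop above is plain linear interpolation between the two known points (1990, 50) and (1995, 80). A minimal sketch of the same arithmetic using NumPy's `np.interp` (NumPy is an assumption here; the question itself only imports pandas):

```python
import numpy as np

# Known data points for the "Bed" branch, taken from the question's data.
known_years = [1990, 1995]
known_sales = [50, 80]

# Every year from 1990 through 1995 inclusive.
years = np.arange(1990, 1996)

# Linear interpolation between the two known points.
interpolated = np.interp(years, known_years, known_sales)

for year, sales in zip(years, interpolated):
    print(year, sales)
```

This still handles only one pair of rows, which is the scaling problem the question is asking about.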
Answer 0 (score: 0)
First, find the missing years for each group so they can be merged back into df:
# Build every year from the first to the last observed year within each group
idx = (
    df.groupby(['COUNTRY', 'STATE', 'BRANCH'])['YEAR']
    .apply(lambda x: pd.Series(range(min(x), max(x) + 1)))
    .reset_index(level=[0, 1, 2])
)
Then merge:
yourdf = idx.merge(df, how='left')
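To see why the left merge matters, here is a small sketch on hypothetical miniature data (one group, one missing year; the frames `small`, `grid`, and `merged` are made up for illustration). Every row of the year grid survives the merge, and years that have no match in the original data end up with NaN in SALES:

```python
import pandas as pd

# Hypothetical miniature data: one group with a gap at 1991.
small = pd.DataFrame({"BRANCH": ["Bed", "Bed"],
                      "YEAR": [1990, 1992],
                      "SALES": [50, 60]})

# Complete year grid for the same group (written by hand here for brevity).
grid = pd.DataFrame({"BRANCH": ["Bed"] * 3,
                     "YEAR": [1990, 1991, 1992]})

# how="left" keeps all rows of grid; with no `on` argument, merge joins on
# the columns the two frames share (BRANCH and YEAR).
merged = grid.merge(small, how="left")
print(merged)
```

Those NaN rows are the placeholders that get filled in afterwards.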
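Then use interpolate to estimate the missing values:
# transform keeps the original row index, so the result can be assigned straight back
yourdf['SALES'] = yourdf.groupby(['COUNTRY', 'STATE', 'BRANCH'])['SALES'].transform(lambda s: s.interpolate())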
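Putting the three steps together, a minimal end-to-end sketch might look like the following. The helper name `fill_missing_years` is hypothetical, and the sketch uses `transform` for the last step on the assumption that a recent pandas version is in use, where `groupby(...).apply` can prepend the group keys to the index and break the column assignment:

```python
import pandas as pd

data = {"YEAR": [1990, 1995, 2000, 1990, 1995, 2000],
        "COUNTRY": ["USA"] * 6,
        "STATE": ["AZ"] * 6,
        "BRANCH": ["Bed", "Bed", "Bed", "Kitchen", "Kitchen", "Kitchen"],
        "SALES": [50, 80, 100, 10, 20, 50]}
df = pd.DataFrame(data)


def fill_missing_years(frame, keys):
    """Hypothetical helper: add a row per missing YEAR and interpolate SALES."""
    # Step 1: full range of years between the first and last year of each group.
    grid = (
        frame.groupby(keys)["YEAR"]
        .apply(lambda s: pd.Series(range(s.min(), s.max() + 1)))
        .reset_index(level=list(range(len(keys))))
    )
    # Step 2: left merge so the newly added years carry NaN in SALES.
    out = grid.merge(frame, how="left", on=keys + ["YEAR"])
    # Step 3: linear interpolation within each group, aligned to out's index.
    out["SALES"] = out.groupby(keys)["SALES"].transform(lambda s: s.interpolate())
    return out


yourdf = fill_missing_years(df, ["COUNTRY", "STATE", "BRANCH"])
print(yourdf)
```

Because every year is present after the merge, the default linear interpolation treats the rows as evenly spaced, which matches the per-year step computed by hand in the question.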