I have this Pandas dataframe, which is a one-year snapshot:
import pandas as pd

data = pd.DataFrame({'ID': (1, 2),
                     'area': (2, 3),
                     'population': (100, 200),
                     'demand': (100, 200)})
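I want to turn it into a time series in which population grows by 10% per year and demand grows by 20% per year. In this example, I do this for two additional years.
This should be the output (note: it includes an added 'year' column):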
output = pd.DataFrame({'ID': (1,2,1,2,1,2),
'year': (1,1,2,2,3,3),
'area': (2,3,2,3,2,3),
'population': (100,200,110,220,121,242),
'demand': (100,200,120,240,144,288)})
Answer 0 (score: 0)
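Using numpy:
Create an array of growth factors [1.2, 1.1] (demand, population), repeat it k times, and take the cumprod.
Prepend [1.0, 1.0] to account for the initial condition.
Multiply by the values of the unstacked pd.Series.
Pass the result to the pd.DataFrame constructor.

import numpy as np

k = 5
cols = ['ID', 'area']
# Cumulative growth factors in the order [demand, population];
# [:, [0, 0, 1, 1]] repeats each factor once per row of data (two rows here)
cum_ret = np.vstack(
    [np.ones((1, 2)), np.array([[1.2, 1.1]]
    )[[0] * k].cumprod(0)])[:, [0, 0, 1, 1]]
# Select the value columns explicitly so their order matches the factors
s = data.set_index(cols)[['demand', 'population']].unstack(cols)
pd.DataFrame(
    cum_ret * s.values,
    columns=s.index
).stack(cols).reset_index(cols).reset_index(drop=True)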
    ID  area   demand  population
0    1     2  100.000     100.000
1    2     3  200.000     200.000
2    1     2  120.000     110.000
3    2     3  240.000     220.000
4    1     2  144.000     121.000
5    2     3  288.000     242.000
6    1     2  172.800     133.100
7    2     3  345.600     266.200
8    1     2  207.360     146.410
9    2     3  414.720     292.820
10   1     2  248.832     161.051
11   2     3  497.664     322.102
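
The result above has no 'year' column, which the question asks for. The following is a minimal sketch, not part of the original answer, of one way to get it with numpy broadcasting. The names value_cols, rates, factors and out are introduced here for illustration only, and k = 3 is assumed to mean three years in total, including the snapshot year.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'ID': (1, 2),
                     'area': (2, 3),
                     'population': (100, 200),
                     'demand': (100, 200)})

k = 3  # assumed: total number of years, including the snapshot year
value_cols = ['population', 'demand']
rates = np.array([1.10, 1.20])  # growth factors, same order as value_cols

# factors[t] is the cumulative growth after t years; row 0 is all ones
factors = rates ** np.arange(k)[:, None]  # shape (k, 2)

# Broadcast to (k, n_rows, 2), then flatten so rows are grouped by year
values = (factors[:, None, :] * data[value_cols].to_numpy()).reshape(-1, 2)

out = pd.DataFrame(values, columns=value_cols)
out.insert(0, 'ID', np.tile(data['ID'].to_numpy(), k))
out.insert(1, 'year', np.repeat(np.arange(1, k + 1), len(data)))
out.insert(2, 'area', np.tile(data['area'].to_numpy(), k))
print(out)
```

Because the value columns are selected by name, this sketch does not depend on the column order of data.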
Answer 1 (score: 0)
Set up the variables:
k = 5 #Number of years to forecast
a = 1.20 #Demand Growth
b = 1.10 #Population Growth
Build the forecast dataframe:
df_out = (data[['ID', 'area']]
          .merge(pd.concat([data[['demand', 'population']].mul([pow(a, i), pow(b, i)]).assign(year=i + 1)
                            for i in range(k)]),
                 left_index=True, right_index=True)
          .sort_values(by='year'))
print(df_out)
Output:
   ID  area  demand  population  year
0   1     2  100.00      100.00     1
1   2     3  200.00      200.00     1
0   1     2  120.00      110.00     2
1   2     3  240.00      220.00     2
0   1     2  144.00      121.00     3
1   2     3  288.00      242.00     3
0   1     2  172.80      133.10     4
1   2     3  345.60      266.20     4
0   1     2  207.36      146.41     5
1   2     3  414.72      292.82     5
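
As a sanity check, the same merge-and-concat approach can be compared with the expected frame from the question. This is a sketch under a few assumptions: k = 3 matches the question's three years, and the sort by ID, the index reset and the column reordering are added here only so that the two frames line up. The names frames and expected are illustrative.

```python
import pandas as pd

data = pd.DataFrame({'ID': (1, 2),
                     'area': (2, 3),
                     'population': (100, 200),
                     'demand': (100, 200)})

# Expected output as given in the question
expected = pd.DataFrame({'ID': (1, 2, 1, 2, 1, 2),
                         'year': (1, 1, 2, 2, 3, 3),
                         'area': (2, 3, 2, 3, 2, 3),
                         'population': (100, 200, 110, 220, 121, 242),
                         'demand': (100, 200, 120, 240, 144, 288)})

k = 3     # three years in total, as in the question
a = 1.20  # demand growth
b = 1.10  # population growth

# One frame per year; year 1 uses exponent 0, i.e. the snapshot itself
frames = [data[['demand', 'population']].mul([a ** i, b ** i]).assign(year=i + 1)
          for i in range(k)]

df_out = (data[['ID', 'area']]
          .merge(pd.concat(frames), left_index=True, right_index=True)
          .sort_values(by=['year', 'ID'])
          .reset_index(drop=True)
          [['ID', 'year', 'area', 'population', 'demand']])

# Raises AssertionError on mismatch; dtypes differ (float vs int), so skip that check
pd.testing.assert_frame_equal(df_out, expected, check_dtype=False)
print(df_out)
```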