             A         B        C   D
0   2002-01-12  10:00:00     John  19
1   2002-01-12  11:00:00   Africa  15
2   2002-01-12  12:00:00     Mary  30
3   2002-01-13  09:00:00    Billy   5
4   2002-01-13  11:00:00     Mira   6
5   2002-01-13  12:00:00  Hillary  50
6   2002-01-13  12:00:00   Romina  50
7   2002-01-14  10:00:00   George  30
8   2002-01-14  11:00:00   Denzel  12
9   2002-01-14  11:00:00  Michael  12
10  2002-01-14  12:00:00     Bisc  25
11  2002-01-16  10:00:00   Virgin  16
12  2002-01-16  11:00:00  Antonio  10
13  2002-01-16  12:00:00     Sito   5
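To reproduce the frame shown above, something like the following sketch should work. It assumes that column A holds date strings and that column B holds datetime.time objects (the comparison with time(12) used later in this thread suggests the latter); neither dtype is stated in the post.

```python
from datetime import time

import pandas as pd

# Assumption: A holds date strings and B holds datetime.time objects.
df = pd.DataFrame({
    "A": ["2002-01-12"] * 3 + ["2002-01-13"] * 4
         + ["2002-01-14"] * 4 + ["2002-01-16"] * 3,
    "B": [time(h) for h in (10, 11, 12, 9, 11, 12, 12, 10, 11, 11, 12, 10, 11, 12)],
    "C": ["John", "Africa", "Mary", "Billy", "Mira", "Hillary", "Romina",
          "George", "Denzel", "Michael", "Bisc", "Virgin", "Antonio", "Sito"],
    "D": [19, 15, 30, 5, 6, 50, 50, 30, 12, 12, 25, 16, 10, 5],
})
print(df)
```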
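I want to create two new columns, df['E'] and df['F'], knowing that identical A and B values always correspond to the same D value:
df['E']: the percentage change of D relative to the previous D value.
df['F']: the percentage difference between D and the D value at 12:00:00 on the previous date.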
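As a quick check of the two definitions, here is the arithmetic for two rows of the sample data. The helper name pct_diff is made up for this illustration and is not part of pandas.

```python
def pct_diff(current, reference):
    """Percentage difference of `current` relative to `reference`, rounded to 2 decimals."""
    return round(100 * (current - reference) / reference, 2)

# E for row 1: D is 15 and the previous D value is 19.
e_row1 = pct_diff(15, 19)
# F for row 4: D is 6 and D at 12:00:00 on the previous date (2002-01-12) is 30.
f_row4 = pct_diff(6, 30)

print(e_row1)  # -21.05
print(f_row4)  # -80.0
```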
The output should be:
             A         B        C   D       E       F
0   2002-01-12  10:00:00     John  19       0       0
1   2002-01-12  11:00:00   Africa  15  -21.05       0
2   2002-01-12  12:00:00     Mary  30  100.00       0
3   2002-01-13  09:00:00    Billy   5  -83.33  -83.33
4   2002-01-13  11:00:00     Mira   6   20.00  -80.00
5   2002-01-13  12:00:00  Hillary  50  733.33   66.66
6   2002-01-13  12:00:00   Romina  50  733.33   66.66
7   2002-01-14  10:00:00   George  30  -40.00  -40.00
8   2002-01-14  11:00:00   Denzel  12  -60.00  -76.00
9   2002-01-14  11:00:00  Michael  12  -60.00  -76.00
10  2002-01-14  12:00:00     Bisc  25  108.33  -50.00
11  2002-01-16  10:00:00   Virgin  16  -36.00  -36.00
12  2002-01-16  11:00:00  Antonio  10  -37.50  -60.00
13  2002-01-16  12:00:00     Sito   5  -50.00  -80.00
Is it possible to use map to get this?
I tried:
x = df[df['B'].eq(time(12))].drop_duplicates(subset=['A']).set_index('A')['D'](100 * (df.D - df.D.shift(1)) / df.D.shift(1)).fillna(0)
df['F'] = df['A'].map(x)
Answer 0: (score: 1)
Use:
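import numpy as np
from datetime import time
# E: percentage change of D; a 0% change (repeated D) is replaced by the previous non-zero change
df['E'] = df['D'].pct_change().mul(100).replace(0,np.nan).ffill().fillna(0).round(2)
# s: D at 12:00:00 for each date; s.shift() pairs each date with the previous date's value
s = df[df['B'].eq(time(12))].drop_duplicates(subset=['A']).set_index('A')['D']
df['F'] = (df['D'].div(df['A'].map(s.shift()))).sub(1).mul(100).round(2).fillna(0)
print(df)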
             A         B        C   D       E       F
0   2002-01-12  10:00:00     John  19    0.00    0.00
1   2002-01-12  11:00:00   Africa  15  -21.05    0.00
2   2002-01-12  12:00:00     Mary  30  100.00    0.00
3   2002-01-13  09:00:00    Billy   5  -83.33  -83.33
4   2002-01-13  11:00:00     Mira   6   20.00  -80.00
5   2002-01-13  12:00:00  Hillary  50  733.33   66.67
6   2002-01-13  12:00:00   Romina  50  733.33   66.67
7   2002-01-14  10:00:00   George  30  -40.00  -40.00
8   2002-01-14  11:00:00   Denzel  12  -60.00  -76.00
9   2002-01-14  11:00:00  Michael  12  -60.00  -76.00
10  2002-01-14  12:00:00     Bisc  25  108.33  -50.00
11  2002-01-16  10:00:00   Virgin  16  -36.00  -36.00
12  2002-01-16  11:00:00  Antonio  10  -37.50  -60.00
13  2002-01-16  12:00:00     Sito   5  -50.00  -80.00
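Explanation:
Column E uses pct_change, then replaces 0 with NaN and forward-fills the NaNs, so a row that repeats the previous D value inherits the previous percentage change.
For column F, column A is mapped onto the Series s, which is built from the rows with 12:00:00 in column B and shifted by one position with shift, so each D is divided by the 12:00:00 value of the previous date.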
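The intermediate steps can be inspected with a self-contained sketch like the one below. It rebuilds the sample data (column C is left out because it plays no role) and assumes A holds date strings and B holds datetime.time objects; the names raw, prev_noon, e and f are introduced here only for illustration.

```python
from datetime import time

import numpy as np
import pandas as pd

# Sample data from the question; dtypes of A and B are assumptions.
df = pd.DataFrame({
    "A": ["2002-01-12"] * 3 + ["2002-01-13"] * 4
         + ["2002-01-14"] * 4 + ["2002-01-16"] * 3,
    "B": [time(h) for h in (10, 11, 12, 9, 11, 12, 12, 10, 11, 11, 12, 10, 11, 12)],
    "D": [19, 15, 30, 5, 6, 50, 50, 30, 12, 12, 25, 16, 10, 5],
})

# E: a repeated D value gives a 0% change; replacing 0 with NaN and
# forward-filling lets the duplicate row inherit the change of the first one.
raw = df["D"].pct_change().mul(100)
e = raw.replace(0, np.nan).ffill().fillna(0).round(2)

# F: take one D value per date from the 12:00:00 rows, indexed by date ...
s = df[df["B"].eq(time(12))].drop_duplicates(subset=["A"]).set_index("A")["D"]
# ... and shift it so each date points at the previous date's 12:00:00 value.
prev_noon = s.shift()
f = df["D"].div(df["A"].map(prev_noon)).sub(1).mul(100).round(2).fillna(0)

print(prev_noon)
print(pd.DataFrame({"E": e, "F": f}))
```

Note that the first date has no earlier 12:00:00 value, so prev_noon is NaN there and fillna(0) turns the resulting F values into 0.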