My DataFrame has the columns col1, col2 and col3. I want to add another column, col4, containing col2[n+3] / col2[n] - 1, computed separately for each group in col1.
+-----+------+-----+
|col1 | col2 | col3|
+-----+------+-----+
| A | 2 | 4 |
+-----+------+-----+
| A | 4 | 5 |
+-----+------+-----+
| A | 7 | 7 |
+-----+------+-----+
| A | 3 | 8 |
+-----+------+-----+
| A | 7 | 3 |
+-----+------+-----+
| B | 8 | 9 |
+-----+------+-----+
| B | 10 | 10 |
+-----+------+-----+
| B | 8 | 9 |
+-----+------+-----+
| B | 20 | 15 |
+-----+------+-----+
The expected output is:
+-----+------+-----+-----+
|col1 | col2 | col3| col4|
+-----+------+-----+-----+
| A | 2 | 4 | 0.5| #(3/2-1)
+-----+------+-----+-----+
| A | 4 | 5 | 0.75| #(7/4-1)
+-----+------+-----+-----+
| A | 7 | 7 | NA |
+-----+------+-----+-----+
| A | 3 | 8 | NA |
+-----+------+-----+-----+
| A | 7 | 3 | NA |
+-----+------+-----+-----+
| B | 8 | 9 | 1.5 |
+-----+------+-----+-----+
| B | 10 | 10 | NA |
+-----+------+-----+-----+
| B | 8 | 9 | NA |
+-----+------+-----+-----+
| B | 20 | 15 | NA |
+-----+------+-----+-----+
My code is:
df['col4']= df.groupby('col1').apply(lambda x:a['col2'].shift(-3)/a['col2']-1)
The result is a col4 in which every entry is "NA".
I also tried:
df['col4']= df.groupby('col1').pipe(lambda x:a['col2'].shift(-3)/a['col2']-1)
This ignores the groups "A" and "B", and the result is:
+-----+------+-----+-------+
|col1 | col2 | col3| col4 |
+-----+------+-----+-------+
| A | 2 | 4 | 0.5 |
+-----+------+-----+-------+
| A | 4 | 5 | 0.75 |
+-----+------+-----+-------+
| A | 7 | 7 | 0.1428|
+-----+------+-----+-------+
| A | 3 | 8 | 2.33 |
+-----+------+-----+-------+
| A | 7 | 3 | 0.1428|
+-----+------+-----+-------+
| B | 8 | 9 | 1.5 |
+-----+------+-----+-------+
| B | 10 | 10 | NA |
+-----+------+-----+-------+
| B | 8 | 9 | NA |
+-----+------+-----+-------+
| B | 20 | 15 | NA |
+-----+------+-----+-------+
Does anyone know how to do this, or how to fix my code?
Answer 0 (score: 0)
IIUC:
df['col4'] = df.groupby('col1')['col2'].transform(lambda x: x.shift(-3)) / df['col2'] - 1
Output:
col1 col2 col3 col4
0 A 2 4 0.50
1 A 4 5 0.75
2 A 7 7 NaN
3 A 3 8 NaN
4 A 7 3 NaN
5 B 8 9 1.50
6 B 10 10 NaN
7 B 8 9 NaN
8 B 20 15 NaN
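Use transform to shift "col2" within each group, then divide by the original "col2" and subtract 1. Because transform returns a Series aligned to the original index, the result can be assigned straight back to the DataFrame.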
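For completeness, here is a self-contained sketch of the same idea that rebuilds the sample data from the question. It uses the groupby object's own shift method instead of transform with a lambda; that swap is my choice for brevity, not something the answer above does, and the variable name shifted is just illustrative. Both approaches should give the same col4.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col1": ["A"] * 5 + ["B"] * 4,
    "col2": [2, 4, 7, 3, 7, 8, 10, 8, 20],
    "col3": [4, 5, 7, 8, 3, 9, 10, 9, 15],
})

# Shift col2 up by 3 rows inside each col1 group.
# The result keeps df's original index, so it aligns row by row with df.
shifted = df.groupby("col1")["col2"].shift(-3)

# col2[n+3] / col2[n] - 1; rows with no value 3 steps ahead in their group become NaN.
df["col4"] = shifted / df["col2"] - 1

print(df)
```

The key point in either version is that the shift happens per group while the division uses the full, index-aligned col2 column, so no value leaks from group "A" into group "B".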