I have a DataFrame like this:
| | Vowel | Number |
|---:|:--------|---------:|
| 0 | a | 2 |
| 1 | b | 3 |
| 2 | c | 4 |
| 3 | a | 4 |
| 4 | a | 8 |
| 5 | b | 2 |
| 6 | c | 5 |
| 7 | c | 9 |
I want to create a column of differences based on the Vowel and Number columns. This is the output I want:
| | Vowel | Number | Diff |
|---:|:--------|---------:|-------:|
| 0 | a | 2 | nan |
| 1 | b | 3 | nan |
| 2 | c | 4 | nan |
| 3 | a | 4 | 2 |
| 4 | a | 8 | 4 |
| 5 | b | 2 | -1 |
| 6 | c | 5 | 1 |
| 7 | c | 9 | 4 |
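So, for the value 'a' in the Vowel column, the first 'a' gets NaN because there is no earlier 'a' row to take a value from in the 'Number' column. The second 'a' gets 2, because 4 - 2 = 2 (values from the Number column).
I am doing something like this: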
for i in list(set(df['Vowel'])):
    one_vowel = df[df['Vowel'] == i]
    for n in one_vowel['Number'].diff():
        print(f'{i} and {n}')
Result:
b and nan
b and -1.0
a and nan
a and 2.0
a and 4.0
c and nan
c and 1.0
c and 4.0
But I want the results in the correct order, matching the rows of the original column.
Can you help me?
Answer 0 (score: 1)
Try this:
df['Diff'] = df.groupby('Vowel')['Number'].diff()
Output:
0 NaN
1 NaN
2 NaN
3 2.0
4 4.0
5 -1.0
6 1.0
7 4.0
Name: Diff, dtype: float64
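For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the DataFrame from the question and applies the same `groupby(...).diff()` call. The variable name `alt` and the `shift`-based variant are my own additions for illustration, not part of the original answer.

```python
import pandas as pd

# Rebuild the DataFrame from the question.
df = pd.DataFrame({
    "Vowel": ["a", "b", "c", "a", "a", "b", "c", "c"],
    "Number": [2, 3, 4, 4, 8, 2, 5, 9],
})

# Difference from the previous row within the same 'Vowel' group.
# The result is aligned to the original index, so row order is preserved.
df["Diff"] = df.groupby("Vowel")["Number"].diff()

# Illustrative alternative (an assumption, not from the answer):
# subtract the previous value within each group explicitly.
alt = df["Number"] - df.groupby("Vowel")["Number"].shift()

print(df)
```

The loop in the question prints values group by group, and iterating over a `set` has no guaranteed order, which is likely why the output order looked wrong. The grouped `diff` returns one value per original row, so assigning it to a column keeps everything in place.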