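Pandas noob here. I couldn't find an answer on SO. Any help is much appreciated.
I have a DataFrame with 2 columns. One column is just a value; the other is a rolling Min() over the last 5 values of the first column (the current row and up to 4 rows before it).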
+-------+-------+------+
| Index | Value | Min5 |
+-------+-------+------+
| 0 | 1.5 | 1.5 |
| 1 | 1 | 1 |
| 2 | 0.8 | 0.8 |
| 3 | 2 | 0.8 | --> Ex."0.8" is the min of (1.5, 1, 0.8, 2)
| 4 | 1.3 | 0.8 |
| 5 | 0.9 | 0.8 |
| 6 | 1 | 0.8 |
| 7 | 1.3 | 0.9 |
| 8 | 0.5 | 0.5 |
| 9 | 1.7 | 0.5 |
| 10 | 2.1 | 0.5 |
+-------+-------+------+
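For anyone who wants to reproduce the frame above, here is a minimal sketch. It assumes Min5 was built with `rolling(5, min_periods=1).min()`, which matches the table but is not stated in the question.

```python
import pandas as pd

# Values copied from the table above
df = pd.DataFrame({"Value": [1.5, 1, 0.8, 2, 1.3, 0.9, 1, 1.3, 0.5, 1.7, 2.1]})

# Rolling minimum over the current row and up to 4 rows before it;
# min_periods=1 lets the first rows use a shorter window instead of NaN
df["Min5"] = df["Value"].rolling(5, min_periods=1).min()
print(df)
```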
I would like to create a column that tells me how many rows back the current Min value occurred. My goal is to end up with a DataFrame like this:
+-------+-------+------+----------+
| Index | Value | Min5 | Distance |
+-------+-------+------+----------+
| 0 | 1.5 | 1.5 | 0 |
| 1 | 1 | 1 | 0 |
| 2 | 0.8 | 0.8 | 0 |
| 3 | 2 | 0.8 | 1 |
| 4 | 1.3 | 0.8 | 2 | --> Ex. 0.8 is 2 rows away (up)
| 5 | 0.9 | 0.8 | 3 |
| 6 | 1 | 0.8 | 4 |
| 7 | 1.3 | 0.9 | 2 |
| 8 | 0.5 | 0.5 | 0 |
| 9 | 1.7 | 0.5 | 1 |
| 10 | 2.1 | 0.5 | 2 |
+-------+-------+------+----------+
Thanks!
Answer 0 (score: 2)
You are looking for idxmin:
df.index-df.Value.rolling(5,min_periods=1).apply(pd.Series.idxmin,raw=False)
Out[27]:
0 0.0
1 0.0
2 0.0
3 1.0
4 2.0
5 3.0
6 4.0
7 2.0
8 0.0
9 1.0
10 2.0
dtype: float64
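A self-contained version of the same idea, as a sketch. It assumes the default 0, 1, 2, ... index, since subtracting index labels only yields a row count in that case; the cast to int and the `min_label` name are additions for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Value": [1.5, 1, 0.8, 2, 1.3, 0.9, 1, 1.3, 0.5, 1.7, 2.1]})

# raw=False passes each window as a Series that keeps its index labels,
# so idxmin returns the label of the window's minimum
min_label = df["Value"].rolling(5, min_periods=1).apply(pd.Series.idxmin, raw=False)

# Subtracting labels only gives a row distance when the index is 0, 1, 2, ...
df["Distance"] = (df.index - min_label).astype(int)
print(df)
```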
Answer 1 (score: 1)
df.Value.rolling(5, min_periods=1).apply(lambda s: np.argmin(s[::-1]), raw=True).astype(int)
0 0
1 0
2 0
3 1
4 2
5 3
6 4
7 2
8 0
9 1
10 2
Name: Value, dtype: int64
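The idxmin and argmin approaches agree on the question's data, but they can differ when the minimum appears more than once in a window: idxmin reports the earliest occurrence, while argmin on the reversed window reports the most recent one. A small sketch with made-up values (not from the question) to show the difference:

```python
import numpy as np
import pandas as pd

# Hypothetical data with a repeated minimum, chosen only to show the tie
s = pd.Series([1.0, 3.0, 1.0, 2.0])

# Label-based: idxmin returns the first (oldest) minimum in each window
oldest = (s.index - s.rolling(5, min_periods=1).apply(pd.Series.idxmin, raw=False)).astype(int)

# Position-based: reversing the window makes argmin count back from the current row,
# so it returns the distance to the most recent minimum
newest = s.rolling(5, min_periods=1).apply(lambda w: np.argmin(w[::-1]), raw=True).astype(int)

print(pd.DataFrame({"Value": s, "oldest": oldest, "newest": newest}))
```

Which one is wanted depends on whether "how far back" should mean the first or the latest time the minimum was seen.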
Answer 2 (score: 0)
I found that pandas.Series.idxmax works.
# create the Value column with index in range(len(Value))
import pandas as pd
Value = [1.5, 1, 0.8, 2, 1.3, 0.9, 1, 1.3, 0.5, 1.7, 2.1]
df = pd.DataFrame({
'Value': Value,
})
# Calculate values for the Min5 column
cal_Min5 = lambda x: [min(x[0: i + 1]) if i < 4 else min(x[i - 4: i + 1]) for i in range(len(x))]
df['Min5'] = cal_Min5(Value)
# Calculate values for the Distance column using the idxmax() method
cal_Distance =lambda x: [i - (x == x[i]).idxmax() for i in range(len(x))]
df['Distance'] = cal_Distance(df['Min5'])
print(df)
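This outputs the following. Note that row 7 shows a Distance of 0 rather than the expected 2, because the lookup searches the Min5 column for its own first occurrence instead of searching Value: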
Value Min5 Distance
0 1.5 1.5 0
1 1.0 1.0 0
2 0.8 0.8 0
3 2.0 0.8 1
4 1.3 0.8 2
5 0.9 0.8 3
6 1.0 0.8 4
7 1.3 0.9 0
8 0.5 0.5 0
9 1.7 0.5 1
10 2.1 0.5 2
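One possible way to keep the idxmax idea while matching the Distance column from the question is to search Value inside each 5-row window. This is only a sketch: `distance_to_window_min` is a hypothetical helper name, and it assumes the default 0, 1, 2, ... index so that labels equal row positions.

```python
import pandas as pd

df = pd.DataFrame({"Value": [1.5, 1, 0.8, 2, 1.3, 0.9, 1, 1.3, 0.5, 1.7, 2.1]})
df["Min5"] = df["Value"].rolling(5, min_periods=1).min()

def distance_to_window_min(values, window=5):
    """Rows between each position and the first minimum of its trailing window."""
    out = []
    for i in range(len(values)):
        # Trailing window: the current row and up to window - 1 rows before it
        chunk = values.iloc[max(0, i - window + 1): i + 1]
        # idxmax on a boolean Series returns the label of the first True
        out.append(i - (chunk == chunk.min()).idxmax())
    return out

df["Distance"] = distance_to_window_min(df["Value"])
print(df)
```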