I have a DataFrame:
| type | val
-----------------
0 | low | 0.5
1 | high | 1.2
2 | NaN | NaN
3 | low | 1
4 | NaN | NaN
5 | high | 3
6 | NaN | NaN
7 | low | 2
8 | high | 4
9 | NaN | NaN
10| low | 3
..............
98| low | 0.5
99| NaN | NaN
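What I want to do is find a pattern like low1 -> high1 -> low2 -> high2 -> low3 in the DataFrame above, while also checking that low2 > low1, high2 > high1, and so on, and extract the matching values into a new DataFrame.
If only part of the pattern is satisfied (for example low1 -> high1 but not the rest), I also want to restart the iteration from that point so that I don't miss any pattern in between.
I tried using iloc to take five rows at a time and compare them in one long if statement, but this does not seem like the most efficient way to code it: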
sh = 'high'
sl = 'low'
for idx in range(df.shape[0] - 4):
    # types and values of the five consecutive rows idx .. idx+4
    t = df['type'].iloc[idx:idx + 5].astype(str).tolist()
    l1, h1, l2, h2, l3 = df['val'].iloc[idx:idx + 5].tolist()
    if (sl in t[0] and sh in t[1] and sl in t[2] and sh in t[3] and sl in t[4]
            and l3 > l2 and h2 > h1 and l2 > l1
            and 0.3 * (h1 - l1) < (h1 - l2)
            and 0.3 * (h2 - l2) < (h2 - l3)):
        # get the 5 values here and append them to the new DataFrame
        pass
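For comparison, the same five-row check could be evaluated for every starting position at once with NumPy slices instead of repeated iloc lookups. This is only a sketch under assumptions: the function name `window_starts` and the `ratio` parameter are hypothetical, and the conditions are copied from the loop above.

```python
import numpy as np
import pandas as pd


def window_starts(df, ratio=0.3):
    """Return positions idx where rows idx..idx+4 pass the loop's condition."""
    # Boolean masks per row; NaN entries in 'type' count as neither low nor high.
    is_low = np.array([isinstance(t, str) and "low" in t for t in df["type"]])
    is_high = np.array([isinstance(t, str) and "high" in t for t in df["type"]])
    v = df["val"].to_numpy(dtype=float)

    n = len(df)
    if n < 5:
        return np.array([], dtype=int)
    m = n - 4  # number of candidate starting positions

    # v[k:k + m] lines up row idx + k with starting position idx.
    l1, h1, l2, h2, l3 = (v[k:k + m] for k in range(5))

    ok = is_low[0:m] & is_high[1:1 + m] & is_low[2:2 + m]
    ok &= is_high[3:3 + m] & is_low[4:4 + m]
    # Comparisons involving NaN evaluate to False, so such windows drop out.
    ok &= (l3 > l2) & (h2 > h1) & (l2 > l1)
    ok &= ratio * (h1 - l1) < (h1 - l2)
    ok &= ratio * (h2 - l2) < (h2 - l3)
    return np.flatnonzero(ok)


# Hypothetical data with no gaps between the marked rows.
demo = pd.DataFrame({
    "type": ["low", "high", "low", "high", "low", "high"],
    "val": [1, 3, 2, 5, 3, 6],
})
starts = window_starts(demo)
```

Like the loop, this only looks at five consecutive rows, so a window that contains a NaN row never matches.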
An example of the final result should be:
| type | val | pattern
--------------------------
0 | low | 0.5 | l1
1 | high | 1.2 | h1
2 | NaN | NaN | NaN
3 | low | 1 | l2
4 | NaN | NaN | NaN
5 | high | 3 | h2
6 | NaN | NaN | NaN
7 | low | 2 | l3
8 | high | 4 | NaN #NaN since this doesn't form a pattern (Our pattern always starts with a low)
9 | NaN | NaN | NaN
10| low | 3 | l1
..............
98| low | 0.5
99| NaN | NaN