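I am trying to loop through a dataframe and store index values in a dictionary whenever two conditions are met, and then resume iterating over the rows from the last index value that satisfied the conditions.

This is what I have so far: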

    d = {}
    index_number = 12
    for i, r in df.iloc[index_number:].iterrows():
        print(index_number)
        if r['Entry'] == 'Y':
            print(i)
            ix_num = i + 1
            for e, ro in df.iloc[ix_num:].iterrows():
                if ro['Exit'] == 'E':
                    d[i] = e
                    index_number = e
                    print(e)
                    input('Check')
                    break

df:

        Entry Exit
    12      Y  NaN
    13      Y  NaN
    14      Y    E
    15      Y    E
    16      Y  NaN
    17      Y  NaN
    18      Y  NaN
    19    NaN    E
    20      Y  NaN
    21    NaN    E
    22    NaN    E
    23    NaN    E
    24      Y  NaN
    25      Y  NaN
    26    NaN    E
    27      Y  NaN
    28    NaN    E
    29    NaN    E

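The problem I am having is that, for some reason, once both conditions are met, the updated index_number is not picked up by the first (outer) loop.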
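
A likely explanation, sketched here with a plain list instead of the DataFrame: Python evaluates the iterable expression of a `for` statement once, before the first iteration, so rebinding a variable used in that expression does not move the loop. The names `data`, `start`, `seen`, `pos` and `visited` are made up for this illustration. The second half shows a `while` loop with an explicit position, which is one way to get the "jump ahead" behaviour.

```python
data = list(range(10))

# The slice data[start:] is built once, before the loop starts.
start = 0
seen = []
for x in data[start:]:
    seen.append(x)
    if x == 2:
        start = 7  # rebinding start has no effect on the slice being iterated

print(seen)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# A while loop with an explicit position can jump ahead.
pos = 0
visited = []
while pos < len(data):
    visited.append(data[pos])
    if data[pos] == 2:
        pos = 7  # continue from position 7
    else:
        pos += 1

print(visited)  # [0, 1, 2, 7, 8, 9]
```

The same reasoning would apply to `df.iloc[index_number:].iterrows()`: the slice is taken with the value `index_number` had when the loop began.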
Expected output:

    d = {12: 14, 15: 19, 20: 21, 24: 26, 27: 28}

Thanks for your help.
Edit:

I am now using the following:

    v = []
    x = []
    for i, r in df.iterrows():
        if r['Entry'] == 'Y':
            x.append(i)
        if r['Exit'] == 'E':
            v.append(i)
    d = {}
    exce = []
    check_val = 0
    for i in x:
        if i > check_val:
            for e in v:
                if e > i and e not in exce:
                    d[i] = e
                    exce.append(e)
                    check_val = e
                    break

Answer 0 (score: 1)

A vectorized approach:
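
    import pandas as pd

    # Stack the 'E' rows and the 'Y' rows into one Series of labels.
    df = pd.concat([df[df.Exit=='E']['Exit'],df[df.Entry=='Y']['Entry']])
    # Note: rows 14 and 15 appear twice (both 'Y' and 'E'); the default
    # sort does not guarantee the relative order of such ties.
    df = df.reset_index().rename(columns = {0:'label'}).sort_values('index')
    # Collapse runs of identical labels, keeping the first of each run.
    df = df[df['label']!=df['label'].shift(1)]
    df['E_index'] = df['index'].shift(-1)
    # Keep only a 'Y' that is immediately followed by an 'E'.
    df = df[(df['label']+df['label'].shift(-1))=='YE']
    d = dict(zip(df['index'].astype(int), df['E_index'].astype(int)))
    print(d)

    {12: 14, 15: 19, 20: 21, 24: 26, 27: 28}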
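
For comparison, the pairing in the expected output can also be produced in a single pass that remembers whether an entry is currently open. This is only a sketch under assumptions: the sample frame is rebuilt by hand from the table in the question, and the names `open_entry`, `entry_col` and `exit_col` are made up for this illustration. An exit on the same row as the entry is deliberately ignored, which matches the expected `15: 19`.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the table in the question (index 12..29).
entry_col = ['Y'] * 7 + [np.nan, 'Y', np.nan, np.nan, np.nan,
                         'Y', 'Y', np.nan, 'Y', np.nan, np.nan]
exit_col = [np.nan, np.nan, 'E', 'E', np.nan, np.nan, np.nan, 'E', np.nan,
            'E', 'E', 'E', np.nan, np.nan, 'E', np.nan, 'E', 'E']
df = pd.DataFrame({'Entry': entry_col, 'Exit': exit_col}, index=range(12, 30))

d = {}
open_entry = None  # index of the entry still waiting for its exit
for idx, entry, exit_ in zip(df.index, df['Entry'], df['Exit']):
    if open_entry is None:
        # Looking for the next entry; exits are ignored in this state.
        if entry == 'Y':
            open_entry = int(idx)
    elif exit_ == 'E':
        # First exit strictly after the open entry closes the pair.
        d[open_entry] = int(idx)
        open_entry = None

print(d)  # {12: 14, 15: 19, 20: 21, 24: 26, 27: 28}
```

Because each row is visited once, there is no need to restart the loop from a new index; the `open_entry` state plays the role that `index_number` was meant to play.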