Iterate over a DataFrame and, once a condition is met, restart the loop from the row where it was met?

Asked: 2018-12-15 11:54:50

Tags: python pandas

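I am trying to loop over a DataFrame and store index values in a dictionary whenever two conditions are met, then resume iterating over the rows from the last index value at which the conditions were met.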

This is what I have so far:

d = {}
index_number = 12
for i, r in df.iloc[index_number:].iterrows():
    print(index_number)
    if r['Entry'] == 'Y':
        print(i)
        ix_num = i + 1

        for e, ro in df.iloc[ix_num:].iterrows():
            if ro['Exit'] == 'E':
                d[i] = e
                index_number = e

                print(e)

                input('Check')
                break

df:    
    Entry   Exit
12       Y  NaN
13       Y  NaN
14       Y    E
15       Y    E
16       Y  NaN
17       Y  NaN
18       Y  NaN
19     NaN    E
20       Y  NaN
21     NaN    E
22     NaN    E
23     NaN    E
24       Y  NaN
25       Y  NaN
26     NaN    E
27       Y  NaN
28     NaN    E
29     NaN    E

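The problem I am running into is that, for some reason, once both conditions are met, the updated index_number is not picked up by the first (outer) loop.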
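A likely cause is that `df.iloc[index_number:]` is evaluated only once, when the outer `for` statement starts; assigning a new value to `index_number` inside the loop does not change the iterator that is already running. The sketch below first demonstrates this on a tiny frame, then shows one possible way to get the restart behaviour with a `while` loop over row positions. The sample frame is rebuilt from the table above, and the names `demo`, `start`, `visited`, `pos` and `nxt` are illustrative, not part of the original code.

```python
import numpy as np
import pandas as pd

# Part 1: rebinding the slice start inside the loop has no effect.
demo = pd.DataFrame({'Entry': ['Y', 'Y', 'Y', 'Y']})
start = 0
visited = []
for i, row in demo.iloc[start:].iterrows():
    visited.append(i)
    start = 3  # the iterator above was already built from the old value
print(visited)  # every row is still visited

# Part 2: sample data from the question (index labels 12..29).
idx = range(12, 30)
entry_rows = {12, 13, 14, 15, 16, 17, 18, 20, 24, 25, 27}
exit_rows = {14, 15, 19, 21, 22, 23, 26, 28, 29}
df = pd.DataFrame(
    {
        'Entry': ['Y' if i in entry_rows else np.nan for i in idx],
        'Exit': ['E' if i in exit_rows else np.nan for i in idx],
    },
    index=idx,
)

d = {}
pos = 0          # positional row number, independent of the index labels
n = len(df)
while pos < n:
    if df['Entry'].iloc[pos] == 'Y':
        # Look for the first exit strictly after the entry row.
        for nxt in range(pos + 1, n):
            if df['Exit'].iloc[nxt] == 'E':
                d[int(df.index[pos])] = int(df.index[nxt])
                pos = nxt  # jump to the exit row; the increment below moves past it
                break
    pos += 1
print(d)
```

Because the `while` loop re-reads `pos` on every pass, moving it forward actually changes where the scan continues, which the `for` loop over a pre-built slice cannot do.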

Expected output:

d = {12:14,15:19,20:21,24:26,27:28}

Thanks for your help.

Edit:

I am now using the following:

v = []
x = []
for i, r in df.iterrows():
    if r['Entry'] == 'Y':
        x.append(i)
    if r['Exit'] == 'E':
        v.append(i)


d = {}
exce = []
check_val = 0
for i in x:
    if i > check_val:
        for e in v:
            if e>i and e not in exce:
                d[i] = e
                exce.append(e)
                check_val = e
                break
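The two loops above can probably be shortened. As a sketch, assuming the index is sorted in ascending order, the entry and exit labels can be collected with boolean masks instead of `iterrows()`, and the next exit can be found with the standard-library `bisect` module rather than a nested scan. The sample frame is rebuilt from the question's table, and `last_exit` is an illustrative name standing in for `check_val`.

```python
import bisect

import numpy as np
import pandas as pd

# Sample data from the question (index labels 12..29).
idx = range(12, 30)
entry_rows = {12, 13, 14, 15, 16, 17, 18, 20, 24, 25, 27}
exit_rows = {14, 15, 19, 21, 22, 23, 26, 28, 29}
df = pd.DataFrame(
    {
        'Entry': ['Y' if i in entry_rows else np.nan for i in idx],
        'Exit': ['E' if i in exit_rows else np.nan for i in idx],
    },
    index=idx,
)

# Index labels of entry and exit rows, without iterating row by row.
x = df.index[df['Entry'] == 'Y'].tolist()
v = df.index[df['Exit'] == 'E'].tolist()

d = {}
last_exit = float('-inf')  # label of the most recent exit that was used
for i in x:
    if i <= last_exit:
        continue  # this entry falls inside an already matched entry/exit pair
    k = bisect.bisect_right(v, i)  # position of the first exit strictly after i
    if k == len(v):
        break  # no exit left for this or any later entry
    d[i] = v[k]
    last_exit = v[k]
print(d)
```

The `exce` list from the original loops is not needed here: any exit found is strictly greater than `i`, and `i` is already greater than every exit used so far.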

1 Answer:

Answer 0 (score: 1)

A vectorized approach:

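import pandas as pd
# Stack the 'E' exits and the 'Y' entries into one Series keyed by the original row index
df = pd.concat([df[df.Exit=='E']['Exit'],df[df.Entry=='Y']['Entry']])
# Move the row index into an 'index' column, name the value column 'label', order by row index
df = df.reset_index().rename(columns = {0:'label'}).sort_values('index')
# Drop consecutive repeats so the labels alternate between 'Y' and 'E'
df = df[df['label']!=df['label'].shift(1)]
# Attach to each row the row index of the row that follows it
df['E_index'] = df['index'].shift(-1)
# Keep only 'Y' rows that are immediately followed by an 'E' row
df = df[(df['label']+df['label'].shift(-1))=='YE']
d = dict(zip(df['index'].astype(int), df['E_index'].astype(int)))
print(d)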

{12: 14, 15: 19, 20: 21, 24: 26, 27: 28}
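For comparison, the same pairing can be written as a single pass that tracks whether an entry is currently open. This is only a sketch of an alternative, not the answer's method. It treats rows where both columns are set (14 and 15 in the sample) explicitly, so it does not depend on how a sort orders duplicate index values. The sample frame is rebuilt from the question's table, and `open_entry` is an illustrative name.

```python
import numpy as np
import pandas as pd

# Sample data from the question (index labels 12..29).
idx = range(12, 30)
entry_rows = {12, 13, 14, 15, 16, 17, 18, 20, 24, 25, 27}
exit_rows = {14, 15, 19, 21, 22, 23, 26, 28, 29}
df = pd.DataFrame(
    {
        'Entry': ['Y' if i in entry_rows else np.nan for i in idx],
        'Exit': ['E' if i in exit_rows else np.nan for i in idx],
    },
    index=idx,
)

d = {}
open_entry = None  # label of the entry waiting for its exit, if any
for i, is_entry, is_exit in zip(df.index, df['Entry'].eq('Y'), df['Exit'].eq('E')):
    if open_entry is None:
        # Not in a pair: only an entry matters; an exit on the same row is ignored.
        if is_entry:
            open_entry = i
    elif is_exit:
        # In a pair: close it here and do not reopen on this same row.
        d[int(open_entry)] = int(i)
        open_entry = None
print(d)
```

Each row is visited once, so the work grows linearly with the number of rows, and an entry with no later exit is simply left out of `d`.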