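Modify code to capture values greater than a target, instead of an exact match

Asked: 2016-08-20 08:31:04

Tags: python pandas

The following code identifies whether a value is hit or missed in subsequent rows, and makes the output columns show the time at which the condition was met.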

import datetime,numpy as np,pandas as pd;
nan = np.nan;

a = pd.DataFrame(  {'price': {datetime.time(9, 0): 1,   datetime.time(10, 0): 0,   datetime.time(11, 0): 3,   datetime.time(12, 0): 4,   datetime.time(13, 0): 7,   datetime.time(14, 0): 6,   datetime.time(15, 0): 5,   datetime.time(16, 0): 4,   datetime.time(17, 0): 0,   datetime.time(18, 0): 2,   datetime.time(19, 0): 4,   datetime.time(20, 0): 7},  'reversal': {datetime.time(9, 0): nan,   datetime.time(10, 0): nan,   datetime.time(11, 0): nan,   datetime.time(12, 0): nan,   datetime.time(13, 0): nan,
  datetime.time(14, 0): 6.0,   datetime.time(15, 0): nan,   datetime.time(16, 0): nan,   datetime.time(17, 0): nan,   datetime.time(18, 0): nan,   datetime.time(19, 0): nan,   datetime.time(20, 0): nan}});


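a['target_hit_time']=a['target_miss_time']=nan;
a['target1']=a['reversal']+1;
a['target2']=a['reversal']-a['reversal'];
# sort the columns alphabetically (axis=1)
a.sort_index(axis=1,inplace=True);

# rows that have a reversal value, leaving out the two (still empty) result columns
hits = a.iloc[:,:-2].dropna();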

for row,hit in hits.iterrows():

        forwardRows = [row]<a['price'].index.values

        targetHit = a.index.values[(hit['target1']==a['price'].values) & forwardRows][0];
        targetMiss = a.index.values[(hit['target2']==a['price'].values) & forwardRows][0];

        if targetHit>targetMiss:
            a.loc[row,"target_miss_time"] = targetMiss;
        else:
            a.loc[row,"target_hit_time"] = targetHit;


a

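This image shows the output of the code above, which can easily be reproduced by running it:

[image: current working code]

The problem I have is that when this code is used on real data, the price may not match the target exactly and/or may gap through the value. So if we look at the following image:

[image: desired]

If, for target1, we looked for values >= 7.5 instead of only an exact match, the 7.5 condition would count as met. Can anyone help modify the code to achieve this?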
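The difference between the two conditions can be sketched on a bare NumPy array. The prices and the name `first_cross` below are illustrative assumptions, not values taken from the real data: an equality test never fires when the price jumps past the target, while a `>=` test fires on the first bar at or beyond it.

```python
import numpy as np

# Illustrative prices for the rows after the reversal row (assumed values).
prices = np.array([5, 4, 2, 2, 4, 8])
target = 7.5

# Exact match: no bar prints exactly 7.5, so this mask is all False.
exact = prices == target

# Threshold match: True for every bar at or above the target.
crossed = prices >= target

# Position of the first crossing, or None when the target is never reached.
first_cross = int(np.argmax(crossed)) if crossed.any() else None
```

The `crossed.any()` guard matters because `np.argmax` returns 0 for an all-False mask, which would otherwise look like a crossing on the first bar.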

2 answers:

Answer 0 (score: 1)

A few ifs and that's all :D ...

import datetime,numpy as np,pandas as pd;
nan = np.nan;

a = pd.DataFrame(  {'price': {datetime.time(9, 0): 1,   datetime.time(10, 0): 0,   datetime.time(11, 0): 3,   datetime.time(12, 0): 4,   datetime.time(13, 0): 7,   datetime.time(14, 0): 6,   datetime.time(15, 0): 5,   datetime.time(16, 0): 4,   datetime.time(17, 0): 2,   datetime.time(18, 0): 2,   datetime.time(19, 0): 4,   datetime.time(20, 0): 8},  'reversal': {datetime.time(9, 0): nan,   datetime.time(10, 0): nan,   datetime.time(11, 0): nan,   datetime.time(12, 0): nan,   datetime.time(13, 0): nan,
  datetime.time(14, 0): 6.0,   datetime.time(15, 0): nan,   datetime.time(16, 0): nan,   datetime.time(17, 0): nan,   datetime.time(18, 0): nan,   datetime.time(19, 0): nan,   datetime.time(20, 0): nan}});


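a['target_hit_time']=a['target_miss_time']=nan;
a['target1']=a['reversal']+1;
a['target2']=a['reversal']-a['reversal'];
# sort the columns alphabetically (axis=1)
a.sort_index(axis=1,inplace=True);

# rows that have a reversal value, leaving out the two (still empty) result columns
hits = a.iloc[:,:-2].dropna();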

for row,hit in hits.iterrows():

        # only the rows strictly after the reversal row
        forwardRows = a[a.index.values > row];
        # hit: price reaches or passes target1; miss: price equals target2 exactly
        targetHit = hit['target1']<=forwardRows['price'].values;
        targetMiss = hit['target2']==forwardRows['price'].values;
        targetHit = forwardRows[targetHit].head(1).index.values;
        targetMiss = forwardRows[targetMiss].head(1).index.values;

        # use len() instead of array truthiness, which is ambiguous for an empty array
        targetHit, targetMiss = \
        targetHit[0] if len(targetHit) else None, \
        targetMiss[0] if len(targetMiss) else None;

        goMiss,goHit = False,False
        if targetHit is not None and targetMiss is not None:
            # both events occur: whichever comes first wins
            if targetHit>targetMiss: goMiss=True;
            else: goHit=True;
        elif targetHit is not None:goHit = True;
        elif targetMiss is not None:goMiss = True;

        if goMiss:a.loc[row,"target_miss_time"] = targetMiss;
        elif goHit:a.loc[row,"target_hit_time"] = targetHit;



print('#'*50)
print(a)
'''
##################################################
          price  reversal  target1  target2 target_hit_time  target_miss_time
09:00:00      1       NaN      NaN      NaN             NaN               NaN
10:00:00      0       NaN      NaN      NaN             NaN               NaN
11:00:00      3       NaN      NaN      NaN             NaN               NaN
12:00:00      4       NaN      NaN      NaN             NaN               NaN
13:00:00      7       NaN      NaN      NaN             NaN               NaN
14:00:00      6       6.0      7.0      0.0        20:00:00               NaN
15:00:00      5       NaN      NaN      NaN             NaN               NaN
16:00:00      4       NaN      NaN      NaN             NaN               NaN
17:00:00      2       NaN      NaN      NaN             NaN               NaN
18:00:00      2       NaN      NaN      NaN             NaN               NaN
19:00:00      4       NaN      NaN      NaN             NaN               NaN
20:00:00      8       NaN      NaN      NaN             NaN               NaN
'''

Answer 1 (score: 0)

Without modifying your code much, I came up with this:

import numpy as np

for row,hit in hits.iterrows():
        print ("row", row)
        print ("hit",hit)

        forwardRows = a[a.index.values > row]

        targetHit = forwardRows[(hit['target1'] <= forwardRows['price'].values)].head(1).index.values

        targetMiss = forwardRows[(hit['target2'] >= forwardRows['price'].values)].head(1).index.values

        if targetHit>targetMiss:
            a.loc[row,"target_miss_time"] = targetMiss
        else:
            a.loc[row,"target_hit_time"] = targetHit

          price  reversal  target1  target2 target_hit_time  target_miss_time
09:00:00      1       NaN      NaN      NaN             NaN               NaN
10:00:00      0       NaN      NaN      NaN             NaN               NaN
11:00:00      3       NaN      NaN      NaN             NaN               NaN
12:00:00      4       NaN      NaN      NaN             NaN               NaN
13:00:00      7       NaN      NaN      NaN             NaN               NaN
14:00:00      6       6.5      7.5      0.0      [20:00:00]               NaN
15:00:00      5       NaN      NaN      NaN             NaN               NaN
16:00:00      4       NaN      NaN      NaN             NaN               NaN
17:00:00      2       NaN      NaN      NaN             NaN               NaN
18:00:00      2       NaN      NaN      NaN             NaN               NaN
19:00:00      4       NaN      NaN      NaN             NaN               NaN
20:00:00      8       NaN      NaN      NaN             NaN               NaN

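This still needs improvement: targetHit and targetMiss are each returned as an array, so you need to check whether each array contains any elements, and if both arrays do, you need to compare their first elements. As written, it only works when one of the arrays is empty.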
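One way to handle the remaining cases is to reduce each array to a single time (or `None`) before comparing. The following is only a minimal sketch: the helper name `first_event` and the sample series are assumptions made for illustration, and the miss condition uses `<=` as in this answer rather than an exact match.

```python
import datetime

import pandas as pd

# Sample prices keyed by time of day (assumed data, 09:00 to 20:00).
times = [datetime.time(h, 0) for h in range(9, 21)]
prices = pd.Series([1, 0, 3, 4, 7, 6, 5, 4, 2, 2, 4, 8], index=times)


def first_event(prices, row, target_hit, target_miss):
    """Return ('hit' | 'miss' | None, time) for the first event after `row`.

    A hit is price >= target_hit; a miss is price <= target_miss.
    """
    # Keep only the rows strictly after the reversal row.
    forward = prices[prices.index.values > row]
    hit_times = forward.index.values[forward.values >= target_hit]
    miss_times = forward.index.values[forward.values <= target_miss]

    # Collapse each (possibly empty) array to its first element or None.
    hit = hit_times[0] if hit_times.size else None
    miss = miss_times[0] if miss_times.size else None

    if hit is None and miss is None:
        return None, None
    # The hit wins when there is no miss, or when it happens first.
    if miss is None or (hit is not None and hit <= miss):
        return "hit", hit
    return "miss", miss


kind, when = first_event(prices, datetime.time(14, 0), 7.5, 0.0)
```

Inside the original loop, the returned pair could then decide whether `target_hit_time` or `target_miss_time` is written for the row, and a `(None, None)` result leaves both columns untouched.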