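If a condition is true, fill the row with the opposite of the value from the x-th previous row

Time: 2019-03-20 14:49:40

Tags: python pandas

Here is the dataframe I am starting from: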

import pandas as pd
import numpy as np

d= {'PX_LAST':[1,2,3,3,3,1,2,1,1,1,3,3],'ma':[2,2,2,2,2,2,2,2,2,2,2,2],'action':[0,0,1,0,0,-1,0,1,0,0,-1,0]}
df_zinc = pd.DataFrame(data=d)

df_zinc

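Now I need to add a column called 'buy_sell' that:

  • when 'action' == 1, is filled with 1 if 'PX_LAST' > 'ma' and with -1 if 'PX_LAST' < 'ma';
  • when 'action' == -1, is filled with the opposite of the previously filled non-zero value.

FYI: in my data, the row that must be filled with the opposite of the last non-zero item is always at the same distance from that item (i.e., 2 rows in between in the current example, which is why the code below looks 3 rows back). This should make the code easier to write.

The code I have written so far is below. It seems correct to me. Do you have any suggestions for fixing it?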

while index < df_zinc.shape[0]:
    if df_zinc['action'][index] == 1:
        if df_zinc['PX_LAST'][index] < df_zinc['ma'][index]:
            df_zinc.loc[index, 'buy_sell'] = -1
        else:
            df_zinc.loc[index, 'buy_sell'] = 1
    elif df_zinc['action'][index] == -1:
        df_zinc['buy_sell'][index] = df_zinc['buy_sell'][index-3]*-1
    index = index + 1
df_zinc

The resulting dataframe should look like this:

    df_zinc['buy_sell'] = [0,0,1,0,0,-1,0,-1,0,0,1,0]

    df_zinc

4 Answers:

Answer 0 (score: 1)

So, based on the example output, this would be my suggestion (assuming I understood the question correctly):

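def buy_sell(row):
    # Rows without an action get 0.
    if row['action'] == 0:
        return 0
    # For any non-zero action, compare the price with the moving average.
    if row['PX_LAST'] > row['ma']:
        return 1
    elif row['PX_LAST'] < row['ma']:
        return -1
    # PX_LAST == ma: no rule was given, so fall back to 0.
    return 0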

df_zinc = df_zinc.assign(buy_sell=df_zinc.apply(buy_sell, axis=1))      
df_zinc

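This should match what the rules expect. It does not account for the possibility that 'PX_LAST' equals 'ma' and returns 0 by default in that case, since it is unclear which rule should apply then.

Edit

OK, after the new logic was explained, I think this should do the trick: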

def assign_buysell(df):
    last_nonzero = None
    def buy_sell(row):
        nonlocal last_nonzero
        if row['action'] == 0:
            return 0
        if row['action'] == 1:
            if row['PX_LAST'] < row['ma']:
                last_nonzero = -1
            elif row['PX_LAST'] > row['ma']:
                last_nonzero = 1
        elif row['action'] == -1:
            last_nonzero = last_nonzero * -1
        return last_nonzero
    return df.assign(buy_sell=df.apply(buy_sell, axis=1))
df_zinc = assign_buysell(df_zinc)

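This solution does not depend on how many rows ago the non-zero value was seen; it simply remembers the last non-zero value and returns its opposite when action is -1.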
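For comparison, the same idea of remembering the last signal can be sketched without a row-wise apply, using where and ffill. This is only a sketch under assumptions that are not part of the answer above: every action == -1 row reverses the most recent action == 1 signal (so two -1 rows in a row would get the same value instead of flipping twice), and a tie between PX_LAST and ma gives 0 through np.sign.

```python
import numpy as np
import pandas as pd

# Same sample data as in the question.
df_zinc = pd.DataFrame({
    'PX_LAST': [1, 2, 3, 3, 3, 1, 2, 1, 1, 1, 3, 3],
    'ma':      [2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2],
    'action':  [0, 0, 1, 0, 0, -1, 0, 1, 0, 0, -1, 0],
})

# Signal on action == 1 rows: sign of (PX_LAST - ma); NaN everywhere else.
entry = np.sign(df_zinc['PX_LAST'] - df_zinc['ma']).where(df_zinc['action'].eq(1))

# Carry the most recent action == 1 signal forward to the following rows.
last_entry = entry.ffill()

# action == 1 keeps its own signal, action == -1 gets the opposite of the
# carried signal, every other row gets 0.
values = np.select(
    [df_zinc['action'].eq(1), df_zinc['action'].eq(-1)],
    [entry, -last_entry],
    default=0,
)

# A -1 action before any 1 action has nothing to reverse (NaN); assumed to be 0 here.
df_zinc['buy_sell'] = pd.Series(values, index=df_zinc.index).fillna(0).astype(int)
print(df_zinc)
```

On the sample data this gives the same buy_sell column as the one requested in the question, and it does not rely on the offset between the 1 and -1 rows being fixed.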

Answer 1 (score: 1)

You can use np.select, with np.nan as the label for the rows that satisfy the third condition:

c1 = df_zinc.action.eq(1) & df_zinc.PX_LAST.gt(df_zinc.ma)
c2 = df_zinc.action.eq(1) & df_zinc.PX_LAST.lt(df_zinc.ma)
c3 = df_zinc.action.eq(-1)

df_zinc['buy_sell'] = np.select([c1,c2, c3], [1, -1, np.nan])

Now, to fill the NaNs with the value from n rows above (3 in this case), you can use fillna with a shifted version of the column:

df_zinc['buy_sell'] = df_zinc.buy_sell.fillna(df_zinc.buy_sell.shift(3)*-1)

Output

    PX_LAST  ma  action  buy_sell
0         1   2       0       0.0
1         2   2       0       0.0
2         3   2       1       1.0
3         3   2       0       0.0
4         3   2       0       0.0
5         1   2      -1      -1.0
6         2   2       0       0.0
7         1   2       1      -1.0
8         1   2       0       0.0
9         1   2       0       0.0
10        3   2      -1       1.0
11        3   2       0       0.0
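To make the shift step concrete, here is a small sketch on a toy Series (the values are made up for illustration and are not taken from the question). shift(3) moves every value three rows down, so each NaN placeholder lines up with the value three rows above it, and fillna only touches the NaN positions.

```python
import numpy as np
import pandas as pd

# Toy column: NaN marks the row that still needs a value (hypothetical data).
s = pd.Series([1.0, 0.0, 0.0, np.nan, 0.0, -1.0])

# shift(3) moves every value three rows down; the first three rows become NaN.
shifted = s.shift(3)

# fillna replaces only the NaN positions, here with the negated value
# found three rows above.
filled = s.fillna(-shifted)

print(filled.tolist())
```

Only the fourth row changes: it becomes the opposite of the first row. Note that this relies on the offset being constant, as stated in the question.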

Answer 2 (score: 0)

Since you have multiple conditions, I would use np.select:

conditions = [
    (df_zinc['action'] == 1) & (df_zinc['PX_LAST'] > df_zinc['ma']),
    (df_zinc['action'] == 1) & (df_zinc['PX_LAST'] < df_zinc['ma']),
    (df_zinc['action'] == -1) & (df_zinc['PX_LAST'] > df_zinc['ma']),
    (df_zinc['action'] == -1) & (df_zinc['PX_LAST'] < df_zinc['ma'])
]

choices = [1, -1, 1, -1]

df_zinc['buy_sell'] = np.select(conditions, choices, default=0)

Result

print(df_zinc)
    PX_LAST  ma  action  buy_sell
0         1   2       0         0
1         2   2       0         0
2         3   2       1         1
3         3   2       0         0
4         3   2       0         0
5         1   2      -1        -1
6         2   2       0         0
7         1   2       1        -1
8         1   2       0         0
9         1   2       0         0
10        3   2      -1         1
11        3   2       0         0
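A small sketch of how np.select resolves several conditions, on made-up numbers rather than the question's data: the conditions are checked in order, the first one that is true picks the choice, and elements matching none of them get default. In the answer above, this is why rows where PX_LAST equals ma, or where action is 0, end up as 0.

```python
import numpy as np

# Hypothetical input values, for illustration only.
x = np.array([-2, 0, 3, 7])

# Conditions are evaluated in order; 7 satisfies both, so the first one wins.
conditions = [x > 5, x > 0]
choices = [2, 1]

# Elements that satisfy no condition fall back to the default.
result = np.select(conditions, choices, default=0)

print(result.tolist())
```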

Answer 3 (score: 0)

Here is my solution, which uses the shift() function to pick up the value from three rows back:

df_zinc['buy_sell'] = 0
df_zinc.loc[(df_zinc['action'] == 1) & (df_zinc['PX_LAST'] < df_zinc['ma']), 'buy_sell'] = -1
df_zinc.loc[(df_zinc['action'] == 1) & (df_zinc['PX_LAST'] > df_zinc['ma']), 'buy_sell'] = 1
df_zinc.loc[df_zinc['action'] == -1, 'buy_sell'] = -df_zinc['buy_sell'].shift(3)
df_zinc['buy_sell'] = df_zinc['buy_sell'].astype(int)

print(df_zinc)