Pandas: look up a value and return a boolean

Date: 2020-07-22 21:30:18

Tags: python pandas

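I have the dataframe below, and my goal is to determine whether a stock was held from one period to the next. To do this, I created two lookup codes, str_previous_period_code and str_current_period_code, by concatenating ticker with date converted to a string.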
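
For reference, here is a rough sketch of how codes like these could be built. It assumes that date is a datetime column and that periods are exactly one week apart (the sample data suggests this, but it is an assumption); the two-row frame is made up for illustration.

```python
import pandas as pd

# Hypothetical two-row sample; column names follow the question.
df = pd.DataFrame({
    "ticker": ["CXP", "TOL"],
    "date": pd.to_datetime(["2001-04-27", "2001-04-27"]),
})

# Assumption: periods are weekly, so the previous period is 7 days earlier.
date_str = df["date"].dt.strftime("%Y-%m-%d")
prev_date_str = (df["date"] - pd.Timedelta(days=7)).dt.strftime("%Y-%m-%d")

# Concatenate ticker and date string into the two lookup codes.
df["str_current_period_code"] = df["ticker"] + date_str
df["str_previous_period_code"] = df["ticker"] + prev_date_str
```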

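I need a new column that returns a boolean 1 or 0 (1 if the stock was held in the previous period). The logic is therefore:

  • Look up str_previous_period_code
  • If it is found in the dataframe, df['value'] = 1; otherwise df['value'] = 0

I tried .lookup() as a starting point for this logic: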

df['value'] = df.lookup(df['str_previous_period_code'], df['str_current_period_code'])

However, I get the following KeyError:

KeyError: 'CXP2001-04-27' 
    ticker  date    close   next_period_close   NATR    score   return  str_previous_period_code    str_current_period_code
0   CXP 2001-04-27  4.615000    4.585000    3.700552    9   -0.006501   CXP2001-04-20   CXP2001-04-27
1   TOL 2001-04-27  1.851068    1.862219    3.174988    9   0.006024    TOL2001-04-20   TOL2001-04-27
2   WOW 2001-04-27  8.832543    8.941464    2.560720    9   0.012332    WOW2001-04-20   WOW2001-04-27
3   WES 2001-04-27  13.205642   12.771989   2.448139    9   -0.032839   WES2001-04-20   WES2001-04-27
4   PPT 2001-04-27  40.000000   40.400000   2.364224    9   0.010000    PPT2001-04-20   PPT2001-04-27
5   FLT 2001-04-27  23.398888   23.309237   2.281367    9   -0.003831   FLT2001-04-20   FLT2001-04-27
6   MIM 2001-04-27  1.260000    1.380000    5.696656    8   0.095238    MIM2001-04-20   MIM2001-04-27
7   ALL 2001-04-27  6.386961    6.113234    5.476623    8   -0.042857   ALL2001-04-20   ALL2001-04-27
8   CXP 2001-05-04  4.585000    4.650000    3.685788    9   0.014177    CXP2001-04-27   CXP2001-05-04
9   TOL 2001-05-04  1.862219    1.866679    3.139378    9   0.002395    TOL2001-04-27   TOL2001-05-04
10  WES 2001-05-04  12.771989   13.321481   2.572519    9   0.043023    WES2001-04-27   WES2001-05-04
11  WOW 2001-05-04  8.941464    9.456366    2.552963    9   0.057586    WOW2001-04-27   WOW2001-05-04
12  PPT 2001-05-04  40.400000   39.991000   2.313191    9   -0.010124   PPT2001-04-27   PPT2001-05-04
13  FLT 2001-05-04  23.309237   23.194881   2.262463    9   -0.004906   FLT2001-04-27   FLT2001-05-04
14  ALL 2001-05-04  6.113234    6.200552    5.699601    8   0.014283    ALL2001-04-27   ALL2001-05-04
15  MIM 2001-05-04  1.380000    1.340000    5.289190    8   -0.028986   MIM2001-04-27   MIM2001-05-04

1 answer:

Answer 0 (score: 1)

I think you can do the lookup with either of the following:

  • The apply method of the Series obtained with df['str_current_period_code']:
# to avoid calling the tolist method on each iteration:
previous_period_code = df['str_previous_period_code'].tolist()

# fill the 'value' column according to your logic :
df['value'] = df['str_current_period_code'].apply(
    lambda x: 1 if x in previous_period_code else 0)
  • The isin method of the df['str_current_period_code'] Series. The result will be True / False rather than the 1 and 0 you asked for, but this method is probably faster than the first:
df['value'] = df['str_current_period_code'].isin(df['str_previous_period_code'])
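
If 1 and 0 are needed, the boolean result of isin can be cast with astype(int). The sketch below also checks in the direction described in the question (each row's previous-period code is looked up among the current-period codes), which is the reverse of the two snippets above; swap the columns if the other direction is what you want. The three-row frame is made-up sample data.

```python
import pandas as pd

# Hypothetical sample: only the second row has a previous-period code
# that also appears in the current-period column.
df = pd.DataFrame({
    "str_previous_period_code": ["CXP2001-04-20", "CXP2001-04-27", "WOW2001-04-27"],
    "str_current_period_code": ["CXP2001-04-27", "CXP2001-05-04", "WOW2001-05-04"],
})

# 1 if the row's previous-period code exists among the current-period codes,
# 0 otherwise.
df["value"] = (
    df["str_previous_period_code"]
    .isin(df["str_current_period_code"])
    .astype(int)
)
```

Note that the earliest period in the data will always get 0 this way, since there is no earlier row to match against.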