Question

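I have the data frame below, and my goal is to determine whether a stock was held from one period to the next. To do this, I created two lookup codes, str_previous_period_code and str_current_period_code, each based on a concatenation of ticker and a string conversion of date.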

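I need a new column that returns a boolean 1 or 0 indicating whether the stock was held in the previous period. So the logic is:

Look up str_previous_period_code
If it is found in the data frame, then df['value'] = 1, otherwise df['value'] = 0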

I have tried .lookup() to get the logic started, as follows:

df['value'] = df.lookup(df['str_previous_period_code'], df['str_current_period_code'])

However, I get the following KeyError:

KeyError: 'CXP2001-04-27'

    ticker  date    close   next_period_close   NATR    score   return  str_previous_period_code    str_current_period_code
0   CXP 2001-04-27  4.615000    4.585000    3.700552    9   -0.006501   CXP2001-04-20   CXP2001-04-27
1   TOL 2001-04-27  1.851068    1.862219    3.174988    9   0.006024    TOL2001-04-20   TOL2001-04-27
2   WOW 2001-04-27  8.832543    8.941464    2.560720    9   0.012332    WOW2001-04-20   WOW2001-04-27
3   WES 2001-04-27  13.205642   12.771989   2.448139    9   -0.032839   WES2001-04-20   WES2001-04-27
4   PPT 2001-04-27  40.000000   40.400000   2.364224    9   0.010000    PPT2001-04-20   PPT2001-04-27
5   FLT 2001-04-27  23.398888   23.309237   2.281367    9   -0.003831   FLT2001-04-20   FLT2001-04-27
6   MIM 2001-04-27  1.260000    1.380000    5.696656    8   0.095238    MIM2001-04-20   MIM2001-04-27
7   ALL 2001-04-27  6.386961    6.113234    5.476623    8   -0.042857   ALL2001-04-20   ALL2001-04-27
8   CXP 2001-05-04  4.585000    4.650000    3.685788    9   0.014177    CXP2001-04-27   CXP2001-05-04
9   TOL 2001-05-04  1.862219    1.866679    3.139378    9   0.002395    TOL2001-04-27   TOL2001-05-04
10  WES 2001-05-04  12.771989   13.321481   2.572519    9   0.043023    WES2001-04-27   WES2001-05-04
11  WOW 2001-05-04  8.941464    9.456366    2.552963    9   0.057586    WOW2001-04-27   WOW2001-05-04
12  PPT 2001-05-04  40.400000   39.991000   2.313191    9   -0.010124   PPT2001-04-27   PPT2001-05-04
13  FLT 2001-05-04  23.309237   23.194881   2.262463    9   -0.004906   FLT2001-04-27   FLT2001-05-04
14  ALL 2001-05-04  6.113234    6.200552    5.699601    8   0.014283    ALL2001-04-27   ALL2001-05-04
15  MIM 2001-05-04  1.380000    1.340000    5.289190    8   -0.028986   MIM2001-04-27   MIM2001-05-04

Answer 1

I think you can do the lookup with either of the following:

The apply method of the df['str_current_period_code'] Series:

# to avoid calling the tolist method on each iteration:
previous_period_code = df['str_previous_period_code'].tolist()

# fill the 'value' column according to your logic :
df['value'] = df['str_current_period_code'].apply(
    lambda x: 1 if x in previous_period_code else 0)
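
A small variation on the snippet above, offered as a sketch: `x in some_list` scans the list on every call, so building a `set` once usually helps on larger frames. The three-row frame below is made up for illustration and carries only the two code columns from the question.

```python
import pandas as pd

# Toy frame (illustrative data, not the asker's full table).
df = pd.DataFrame({
    "str_previous_period_code": ["CXP2001-04-20", "CXP2001-04-27", "TOL2001-04-27"],
    "str_current_period_code": ["CXP2001-04-27", "CXP2001-05-04", "TOL2001-05-04"],
})

# Build the set once; membership is then a hash lookup rather than a list scan.
previous_codes = set(df["str_previous_period_code"])

# Same direction as the snippet above: is the current code listed among the previous codes?
df["value"] = df["str_current_period_code"].apply(
    lambda x: 1 if x in previous_codes else 0)
```

The result is the same as with the list version; only the cost of each membership test changes.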

The isin method of the df['str_current_period_code'] Series. The result will be True/False rather than the 1 and 0 you asked for (this method is likely faster than the first):

df['value'] = df['str_current_period_code'].isin(df['str_previous_period_code'])
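
If 1 and 0 are required, the boolean result can be cast with `astype(int)`. Note also that the logic stated in the question looks up each row's previous-period code among the current-period codes, which is the opposite direction to the snippets above. The sketch below shows that direction on a made-up frame; which direction is actually wanted is an assumption here, so swap the two columns if needed.

```python
import pandas as pd

# Toy frame (illustrative data): WES has no 2001-04-27 row, CXP does.
df = pd.DataFrame({
    "str_previous_period_code": ["CXP2001-04-20", "TOL2001-04-20",
                                 "CXP2001-04-27", "WES2001-04-27"],
    "str_current_period_code": ["CXP2001-04-27", "TOL2001-04-27",
                                "CXP2001-05-04", "WES2001-05-04"],
})

# True where this row's previous-period code appears among the current-period codes,
# i.e. the same ticker has a row for the previous period.
held_before = df["str_previous_period_code"].isin(df["str_current_period_code"])

# Cast True/False to 1/0.
df["value"] = held_before.astype(int)
```

As for the KeyError in the question: `DataFrame.lookup` treats its two arguments as row labels and column labels, so a code string such as 'CXP2001-04-27' is looked for in the index and not found. The method was also deprecated in pandas 1.2 and removed in 2.0, so `isin` is the safer choice for this kind of membership check.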
