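I have the DataFrame below, and my goal is to determine whether a stock was held from one period to the next. To do this, I created two lookup codes, str_previous_period_code and str_current_period_code, by concatenating ticker with the string conversion of date.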
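I need a new column that returns 1 or 0 depending on whether the stock was held in the previous period. So the logic is: if the row's str_previous_period_code shows that the stock was held in the previous period, df['value'] = 1; otherwise, df['value'] = 0.
I tried .lookup() to get started on the logic, as follows: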
df['value'] = df.lookup(df['str_previous_period_code'], df['str_current_period_code'])
However, this raises the following KeyError:
KeyError: 'CXP2001-04-27'
    ticker  date            close  next_period_close      NATR  score     return  str_previous_period_code  str_current_period_code
 0  CXP     2001-04-27   4.615000           4.585000  3.700552      9  -0.006501  CXP2001-04-20             CXP2001-04-27
 1  TOL     2001-04-27   1.851068           1.862219  3.174988      9   0.006024  TOL2001-04-20             TOL2001-04-27
 2  WOW     2001-04-27   8.832543           8.941464  2.560720      9   0.012332  WOW2001-04-20             WOW2001-04-27
 3  WES     2001-04-27  13.205642          12.771989  2.448139      9  -0.032839  WES2001-04-20             WES2001-04-27
 4  PPT     2001-04-27  40.000000          40.400000  2.364224      9   0.010000  PPT2001-04-20             PPT2001-04-27
 5  FLT     2001-04-27  23.398888          23.309237  2.281367      9  -0.003831  FLT2001-04-20             FLT2001-04-27
 6  MIM     2001-04-27   1.260000           1.380000  5.696656      8   0.095238  MIM2001-04-20             MIM2001-04-27
 7  ALL     2001-04-27   6.386961           6.113234  5.476623      8  -0.042857  ALL2001-04-20             ALL2001-04-27
 8  CXP     2001-05-04   4.585000           4.650000  3.685788      9   0.014177  CXP2001-04-27             CXP2001-05-04
 9  TOL     2001-05-04   1.862219           1.866679  3.139378      9   0.002395  TOL2001-04-27             TOL2001-05-04
10  WES     2001-05-04  12.771989          13.321481  2.572519      9   0.043023  WES2001-04-27             WES2001-05-04
11  WOW     2001-05-04   8.941464           9.456366  2.552963      9   0.057586  WOW2001-04-27             WOW2001-05-04
12  PPT     2001-05-04  40.400000          39.991000  2.313191      9  -0.010124  PPT2001-04-27             PPT2001-05-04
13  FLT     2001-05-04  23.309237          23.194881  2.262463      9  -0.004906  FLT2001-04-27             FLT2001-05-04
14  ALL     2001-05-04   6.113234           6.200552  5.699601      8   0.014283  ALL2001-04-27             ALL2001-05-04
15  MIM     2001-05-04   1.380000           1.340000  5.289190      8  -0.028986  MIM2001-04-27             MIM2001-05-04
Answer 0 (score: 1)
I think you can do the lookup with either of the following methods.
The apply method of the Series obtained from df['str_current_period_code']:
# build the list once, to avoid calling the tolist method on each iteration:
previous_period_code = df['str_previous_period_code'].tolist()
# fill the 'value' column according to your logic:
df['value'] = df['str_current_period_code'].apply(
    lambda x: 1 if x in previous_period_code else 0)
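If the frame gets large, a variant of the apply approach worth considering is to test membership against a set instead of a list, since each `in` check on a list scans the whole list. This is only a sketch: the four rows of codes are taken from your data, and the name previous_codes is mine.

```python
import pandas as pd

# Four rows' worth of codes taken from the question's data (illustration only).
df = pd.DataFrame({
    "str_previous_period_code": ["CXP2001-04-20", "TOL2001-04-20",
                                 "CXP2001-04-27", "TOL2001-04-27"],
    "str_current_period_code": ["CXP2001-04-27", "TOL2001-04-27",
                                "CXP2001-05-04", "TOL2001-05-04"],
})

# Build the set once; membership tests on a set do not scan every element.
previous_codes = set(df["str_previous_period_code"])

# Same logic as the list-based apply: 1 if the current code is among
# the previous-period codes, else 0.
df["value"] = df["str_current_period_code"].apply(
    lambda code: 1 if code in previous_codes else 0)

print(df["value"].tolist())
```

The result is the same as with the list; only the cost of each membership test changes.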
The isin method of the df['str_current_period_code'] Series, although the result will be True / False rather than the 1 and 0 you asked for (this method is probably faster than the first):
df['value'] = df['str_current_period_code'].isin(df['str_previous_period_code'])