Given the following df:
                    A         B  .....  THRESHOLD
DATE
2011-01-01        NaN       NaN  .....        NaN
2012-01-01  -0.041158 -0.161571  .....   0.329038
2013-01-01   0.238156  0.525878  .....   0.110370
2014-01-01   0.606738  0.854177  .....  -0.095147
2015-01-01   0.200166  0.385453  .....   0.166235
I have to compare each of the N columns (A, B, C, ...) against THRESHOLD and output the result:
df['A_CALC'] = np.where(df['A'] > df['THRESHOLD'], 1, -1)
df['B_CALC'] = np.where(df['B'] > df['THRESHOLD'], 1, -1)
How can I apply the above to all columns (A, B, C, ...) without writing an explicit statement for each column?
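A plain Python loop over the columns would avoid the copy-and-paste, but it still seems more verbose than necessary (a sketch, assuming the frame looks like the sample above):
import numpy as np

# build one *_CALC column per data column, skipping THRESHOLD itself
for col in df.columns.drop('THRESHOLD'):
    df[col + '_CALC'] = np.where(df[col] > df['THRESHOLD'], 1, -1)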
Answer 0 (score: 5)
You can use df.apply:
In [670]: df.iloc[:, :-1]\
.apply(lambda x: np.where(x > df.THRESHOLD, 1, -1), axis=0)\
.add_suffix('_CALC')
Out[670]:
A_CALC B_CALC
Date
2011-01-01 -1 -1
2012-01-01 -1 -1
2013-01-01 1 1
2014-01-01 1 1
2015-01-01 1 1
If THRESHOLD is not your last column, you're better off using:
df[df.columns.difference(['THRESHOLD'])]\
    .apply(lambda x: np.where(x > df.THRESHOLD, 1, -1), axis=0)\
    .add_suffix('_CALC')
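As a possible fully vectorized alternative (a sketch, not part of the original answer), the same comparison can be done in one step by broadcasting every column against THRESHOLD:
import numpy as np
import pandas as pd

cols = df.columns.difference(['THRESHOLD'])
# (n, k) values compared against an (n, 1) threshold column, broadcast row-wise
calc = pd.DataFrame(np.where(df[cols].values > df[['THRESHOLD']].values, 1, -1),
                    index=df.index, columns=cols).add_suffix('_CALC')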
Answer 1 (score: 2)
Alternatively, you can try subtract, which should be faster than apply:
(df.drop(['THRESHOLD'],axis=1).subtract(df.THRESHOLD,axis=0)>0)\
.astype(int).replace({0:-1}).add_suffix('_CALC')
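The same idea can also be written with DataFrame.gt, which compares row-wise against THRESHOLD directly (a sketch using the column names from the question):
(df.drop(['THRESHOLD'], axis=1)
   .gt(df['THRESHOLD'], axis=0)   # True where the column value exceeds THRESHOLD
   .astype(int).replace(0, -1)
   .add_suffix('_CALC'))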
Answer 2 (score: 0)
Would the following be enough?
import re
lines = '''
04/20/2009; 04/20/09; 4/20/09; 4/3/09
Mar-20-2009; Mar 20, 2009; March 20, 2009; Mar. 20, 2009; Mar 20 2009;
20 Mar 2009; 20 March 2009; 20 Mar. 2009; 20 March, 2009
Mar 20th, 2009; Mar 21st, 2009; Mar 22nd, 2009
Feb 2009; Sep 2009; Oct 2010
6/2008; 12/2009
2009; 2010
'''
# a plain raw string is needed here; with an f-string prefix the {1,2} quantifiers
# would be treated as format fields and break the pattern
regex = r'(\d{1,2})/(\d{1,2})/(\d{4}|\d{2})'
date_found = re.findall(regex, lines)
date_found
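If the matched groups then need to be real dates, one way to convert them (an assumption, not part of the original answer) is:
from datetime import datetime

# assume month/day/year order as in the sample lines; %y handles the 2-digit years
dates = [datetime.strptime('/'.join(g), '%m/%d/%Y' if len(g[2]) == 4 else '%m/%d/%y')
         for g in date_found]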
Answer 3 (score: 0)
I needed to compare several columns against a single column (changing some columns while leaving others unchanged). I used cs95's answer above and set an index first.
Data:
import pandas as pd
import numpy as np

df = pd.DataFrame({'col1': range(10, 15),
                   'col2': range(1, 6),
                   'col3': np.random.randn(5) + 3,
                   'col4': np.random.randn(5) + 3,
                   'col5': np.random.randn(5)})
col1 col2 col3 col4 col5
0 10 1 2.741873 2.402274 -1.208714
1 11 2 3.328949 2.692367 -0.813730
2 12 3 5.074692 3.155199 -0.721969
3 13 4 2.725135 3.393867 -2.452344
4 14 5 3.626220 3.002514 -0.897204
Code:
import numpy as np

# keep an untouched copy of col2, since col2 itself moves into the index below
df['col2_copy'] = df['col2']
df = df.set_index(['col1', 'col2'])
# 1 where the value exceeds col2, else 0; then restore the index and drop the helper column
df = df.apply(lambda x: np.where(x > df['col2_copy'], 1, 0), axis=0).reset_index().drop(['col2_copy'], axis=1)
Output:
col1 col2 col3 col4 col5
0 10 1 1 1 0
1 11 2 1 1 0
2 12 3 1 1 0
3 13 4 0 0 0
4 14 5 0 0 0
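For comparison, a variant that avoids the temporary copy column (a sketch under the assumption that only col3 to col5 should be compared against col2):
import numpy as np

cols = ['col3', 'col4', 'col5']   # columns to transform; col1 and col2 stay unchanged
df[cols] = np.where(df[cols].gt(df['col2'], axis=0), 1, 0)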