Suppose I have a DataFrame with 6 columns:
close high low open volume change
ts
2017-08-24 13:00:00 921.28 930.840 915.50 928.66 1270306.0 -7.38
2017-08-25 13:00:00 915.89 925.555 915.50 923.49 1053376.0 -7.6
2017-08-28 13:00:00 913.81 919.245 911.87 916.00 1086484.0 -2.19
2017-08-29 13:00:00 921.29 923.330 905.00 905.10 1185564.0 16.19
2017-08-30 13:00:00 929.57 930.819 919.65 920.05 1301225.0 9.52
2017-08-31 13:00:00 939.33 941.980 931.76 931.76 1560033.0 7.51
How can I add a column that shows 1 for each row where change > 0.0, and 0 otherwise?
Answer 0 (score: 3)
Option 1
Using a boolean comparison:
df['newCol'] = (df.change > 0).astype(int)
df['newCol']
ts
2017-08-24 13:00:00 0
2017-08-25 13:00:00 0
2017-08-28 13:00:00 0
2017-08-29 13:00:00 1
2017-08-30 13:00:00 1
2017-08-31 13:00:00 1
Name: newCol, dtype: int64
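For anyone who wants to run Option 1 end to end, here is a minimal sketch that rebuilds the sample frame from the question. The values are copied from the question; the construction itself (the `pd.DataFrame` call and the `ts` index built with `pd.DatetimeIndex`) is an assumption about how the data was loaded.

```python
import pandas as pd

# Values copied from the question; how the frame was originally loaded is assumed.
index = pd.DatetimeIndex(
    [
        "2017-08-24 13:00:00",
        "2017-08-25 13:00:00",
        "2017-08-28 13:00:00",
        "2017-08-29 13:00:00",
        "2017-08-30 13:00:00",
        "2017-08-31 13:00:00",
    ],
    name="ts",
)
df = pd.DataFrame(
    {
        "close": [921.28, 915.89, 913.81, 921.29, 929.57, 939.33],
        "high": [930.840, 925.555, 919.245, 923.330, 930.819, 941.980],
        "low": [915.50, 915.50, 911.87, 905.00, 919.65, 931.76],
        "open": [928.66, 923.49, 916.00, 905.10, 920.05, 931.76],
        "volume": [1270306.0, 1053376.0, 1086484.0, 1185564.0, 1301225.0, 1560033.0],
        "change": [-7.38, -7.6, -2.19, 16.19, 9.52, 7.51],
    },
    index=index,
)

# The comparison yields a boolean Series; astype(int) maps True/False to 1/0.
df["newCol"] = (df["change"] > 0).astype(int)
print(df["newCol"].tolist())  # [0, 0, 0, 1, 1, 1]
```

The sketch uses `df["change"]` rather than `df.change`. Both work here, but bracket access keeps working if a column name ever collides with a DataFrame attribute or method.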
Option 2
Using np.where:
df['newCol'] = np.where(df.change > 0.0, 1, 0)
df['newCol']
ts
2017-08-24 13:00:00 0
2017-08-25 13:00:00 0
2017-08-28 13:00:00 0
2017-08-29 13:00:00 1
2017-08-30 13:00:00 1
2017-08-31 13:00:00 1
Name: newCol, dtype: int64
Option 3
Using gt, the method form of the > comparison:
df['newCol'] = df.change.gt(0).astype(int)
df['newCol']
ts
2017-08-24 13:00:00 0
2017-08-25 13:00:00 0
2017-08-28 13:00:00 0
2017-08-29 13:00:00 1
2017-08-30 13:00:00 1
2017-08-31 13:00:00 1
Name: newCol, dtype: int64
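One detail worth checking before picking among the three options is how they treat missing values. In the sketch below, the small Series `s` is made-up illustration data, not taken from the question. A NaN compares as False against 0, so all three options map it to 0.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only: one negative, one NaN, one positive.
s = pd.Series([-1.5, np.nan, 2.0], name="change")

opt1 = (s > 0).astype(int)      # boolean comparison, returns a Series
opt2 = np.where(s > 0.0, 1, 0)  # returns a NumPy array, not a Series
opt3 = s.gt(0).astype(int)      # method form of >, returns a Series

print(opt1.tolist(), opt2.tolist(), opt3.tolist())
# [0, 0, 1] [0, 0, 1] [0, 0, 1]
```

If missing values should stay missing instead of becoming 0, one possibility is to mask the result afterwards, for example `opt1.where(s.notna())`, at the cost of the column becoming float.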
Performance
%timeit (df.change > 0).astype(int)
1000 loops, best of 3: 276 µs per loop
%timeit np.where(df.change > 0.0, 1, 0)
10000 loops, best of 3: 209 µs per loop
%timeit df.change.gt(0).astype(int)
1000 loops, best of 3: 351 µs per loop
df_test = pd.concat([df] * 10000, axis=0) # Setup: 60,000 rows
%timeit (df_test.change > 0).astype(int)
1000 loops, best of 3: 377 µs per loop
%timeit np.where(df_test.change > 0.0, 1, 0)
1000 loops, best of 3: 328 µs per loop
%timeit df_test.change.gt(0).astype(int)
1000 loops, best of 3: 425 µs per loop
And, for comparison, an element-by-element apply:
%timeit df_test.change.apply(lambda x: 1 if x > 0 else 0)
10 loops, best of 3: 24.5 ms per loop
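The `%timeit` lines above only work in IPython or Jupyter. As a rough sketch for a plain Python script, the same comparison can be run with the standard-library `timeit` module. The random `change` column and the `best_time` helper are assumptions made for illustration; absolute numbers depend on the machine and the pandas version, so no timings are shown here.

```python
import timeit

import numpy as np
import pandas as pd

# Assumed stand-in for df_test: 60,000 random values in a "change" column.
rng = np.random.default_rng(0)
df_test = pd.DataFrame({"change": rng.normal(size=60_000)})

candidates = {
    "boolean + astype": lambda: (df_test["change"] > 0).astype(int),
    "np.where": lambda: np.where(df_test["change"] > 0.0, 1, 0),
    "gt + astype": lambda: df_test["change"].gt(0).astype(int),
    "apply": lambda: df_test["change"].apply(lambda x: 1 if x > 0 else 0),
}


def best_time(func, repeat=3, number=10):
    """Hypothetical helper: best per-call time in seconds over several repeats."""
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


results = {name: best_time(func) for name, func in candidates.items()}
for name, seconds in results.items():
    print(f"{name:>18}: {seconds * 1e6:10.1f} µs per call")
```

Taking the minimum over repeats mirrors the "best of 3" figure that `%timeit` reported in the output above.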
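Answer 1 (score: -1)
def value_return(change_variable):
    # Return 1 for a positive change, 0 otherwise
    if change_variable > 0:
        return 1
    return 0

# The function must be defined before it is applied row by row
df['new_column'] = df.apply(lambda row: value_return(row['change']), axis=1)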