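So I have run into another annoying problem, one more obstacle in my first steps learning Python.
My outcome column holds positive/negative/zero values (winnings, losses, no commitment). I want to split it by sign into winnings and losses: in the new winnings column, zero and negative rows should be filled with zero; in the new loss column, zero and positive rows should be filled with zero.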
DATA
import pandas as pd
g = pd.DataFrame({'OUTCOME': [100, -100, 400, -200, -200, -750, -250, 1000, 0, 100, -100]}, index=[1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])
Desired output:
g['WINNINGS']=[100,0,400,0,0,0,0,1000,0,100,0]
g['LOSS']=[0,100,0,200,200,750,250,0,0,0,100]
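Answer 0 (score: 4)
There are multiple ways to do this, but basically you want to apply a function that returns 0 if the number is less than or equal to zero and the input number otherwise, then do the opposite for losses. One way is: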
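def winnings(value):
    # Keep positive outcomes; everything else becomes 0.
    return max(value, 0)

def losses(value):
    # Keep negative outcomes; everything else becomes 0.
    return min(value, 0)

df["winnings"] = df["OUTCOME"].map(winnings)
# Negate so that losses are reported as positive amounts, as in the desired output.
df["loss"] = -df["OUTCOME"].map(losses)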
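To check the map-based approach against the desired output from the question, here is a minimal sketch. It assumes the question's DataFrame (named df here) and its uppercase OUTCOME column, and it writes the loss helper so that it returns positive magnitudes directly; that variation is an assumption made for illustration, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame(
    {"OUTCOME": [100, -100, 400, -200, -200, -750, -250, 1000, 0, 100, -100]},
    index=[1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
)

def winnings(value):
    # Keep positive outcomes, replace everything else with 0.
    return max(value, 0)

def losses(value):
    # Report losses as positive magnitudes, matching the desired output.
    return max(-value, 0)

df["WINNINGS"] = df["OUTCOME"].map(winnings)
df["LOSS"] = df["OUTCOME"].map(losses)

# Expected columns copied from the question.
expected_winnings = [100, 0, 400, 0, 0, 0, 0, 1000, 0, 100, 0]
expected_loss = [0, 100, 0, 200, 200, 750, 250, 0, 0, 0, 100]
print(df["WINNINGS"].tolist() == expected_winnings)
print(df["LOSS"].tolist() == expected_loss)
```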
Answer 1 (score: 1)
You can use Series.where:
df["winnings"] = df.OUTCOME.where(df.OUTCOME > 0, 0)
df["loss"] = -1 * df.OUTCOME.where(df.OUTCOME < 0, 0)
print(df)
   OUTCOME  winnings  loss
1      100       100     0
1     -100         0   100
2      400       400     0
2     -200         0   200
2     -200         0   200
3     -750         0   750
3     -250         0   250
3     1000      1000     0
4        0         0     0
4      100       100     0
4     -100         0   100
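Series.where keeps each value where the condition is True and substitutes the second argument everywhere else. A small sketch on a hypothetical three-element Series s (the sample values are made up for illustration) shows the behaviour in isolation:

```python
import pandas as pd

s = pd.Series([100, -200, 0])  # hypothetical sample values

# Where the condition holds, the original value is kept; elsewhere 0 is used.
kept_positive = s.where(s > 0, 0)
kept_negative = s.where(s < 0, 0)

# Multiplying by -1 turns the kept negative values into positive loss amounts.
loss = -1 * kept_negative

print(kept_positive.tolist())
print(loss.tolist())
```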
A faster solution uses numpy.where:
df["winnings"] = np.where(df.OUTCOME > 0, df.OUTCOME, 0)
df["loss"] = np.where(df.OUTCOME < 0, - df.OUTCOME, 0)
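As a self-contained sketch of the numpy.where variant, the snippet below rebuilds the question's data and cross-checks the result against the Series.where version. It assumes only pandas and NumPy; the helper names by_where_w and by_where_l are hypothetical and exist only for the comparison.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"OUTCOME": [100, -100, 400, -200, -200, -750, -250, 1000, 0, 100, -100]},
    index=[1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
)

# np.where(condition, x, y) picks x where condition is True, else y.
# It returns a plain NumPy array, which is assigned to the column by position.
df["winnings"] = np.where(df.OUTCOME > 0, df.OUTCOME, 0)
df["loss"] = np.where(df.OUTCOME < 0, -df.OUTCOME, 0)

# Cross-check against the Series.where variant.
by_where_w = df.OUTCOME.where(df.OUTCOME > 0, 0)
by_where_l = -1 * df.OUTCOME.where(df.OUTCOME < 0, 0)
print(df["winnings"].tolist() == by_where_w.tolist())
print(df["loss"].tolist() == by_where_l.tolist())
```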
Timings:
In [68]: %timeit (jez1(df2))
100 loops, best of 3: 3.75 ms per loop
In [69]: %timeit (jez(df1))
100 loops, best of 3: 5.82 ms per loop
In [70]: %timeit (bat(df))
10 loops, best of 3: 134 ms per loop
Code for the timings:
import numpy as np
import pandas as pd

df = pd.DataFrame({'OUTCOME': [100, -100, 400, -200, -200, -750, -250, 1000, 0, 100, -100]}, index=[1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4])
print(df)

# Repeat the 11 sample rows 10000 times for the benchmark.
df = pd.concat([df] * 10000).reset_index(drop=True)
# [110000 rows x 1 columns]
df1 = df.copy()
df2 = df.copy()

def winnings(value):
    return max(value, 0)

def losses(value):
    return min(value, 0)

# Answer 0: element-wise Python functions via apply.
def bat(df):
    df["winnings"] = df.OUTCOME.apply(winnings)
    df["loss"] = -df.OUTCOME.apply(losses)
    return df

# Answer 1: Series.where.
def jez(df):
    df["winnings"] = df.OUTCOME.where(df.OUTCOME > 0, 0)
    df["loss"] = -1 * df.OUTCOME.where(df.OUTCOME < 0, 0)
    return df

# Answer 1: numpy.where.
def jez1(df):
    df["winnings"] = np.where(df.OUTCOME > 0, df.OUTCOME, 0)
    df["loss"] = np.where(df.OUTCOME < 0, -df.OUTCOME, 0)
    return df

print(bat(df))
print(jez(df1))
print(jez1(df2))
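The %timeit lines above are IPython magics. Outside IPython, a rough equivalent can be sketched with the standard-library timeit module. The repeat count below is an arbitrary assumption, the results dictionary is a hypothetical helper for comparing outputs, and absolute timings will differ by machine and library version.

```python
import timeit

import numpy as np
import pandas as pd

base = pd.DataFrame(
    {"OUTCOME": [100, -100, 400, -200, -200, -750, -250, 1000, 0, 100, -100]}
)
# 11 rows repeated 10000 times gives 110000 rows, as in the benchmark above.
df = pd.concat([base] * 10000).reset_index(drop=True)

def bat(df):
    # Element-wise Python functions via apply.
    df["winnings"] = df.OUTCOME.apply(lambda v: max(v, 0))
    df["loss"] = -df.OUTCOME.apply(lambda v: min(v, 0))
    return df

def jez(df):
    # Series.where variant.
    df["winnings"] = df.OUTCOME.where(df.OUTCOME > 0, 0)
    df["loss"] = -1 * df.OUTCOME.where(df.OUTCOME < 0, 0)
    return df

def jez1(df):
    # numpy.where variant.
    df["winnings"] = np.where(df.OUTCOME > 0, df.OUTCOME, 0)
    df["loss"] = np.where(df.OUTCOME < 0, -df.OUTCOME, 0)
    return df

results = {}
for func in (bat, jez, jez1):
    frame = df.copy()  # each function gets its own copy to modify
    # Best of three single calls; repeat=3 is an arbitrary choice.
    seconds = min(timeit.repeat(lambda: func(frame), number=1, repeat=3))
    results[func.__name__] = frame
    print(f"{func.__name__}: {seconds * 1000:.2f} ms per call")
```

All three functions should produce the same columns, so the comparison is only about speed.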