pandas

Question

I want to implement a classical martingale in a betting system using Python and Pandas.

Suppose the DataFrame is defined as follows:

df = pd.DataFrame(np.random.randint(0,2,100)*2-1, columns=['TossResults'])

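So it contains the toss results (-1 = lose, 1 = win).

I want to use the classical martingale to vary the stake (the amount I wager on each bet).

The initial stake is 1.

If I lose, the next stake is 2 times the previous stake (multiplier = 2).

If I win, the next stake is stake_initial.

I wrote a function: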

def stake_martingale_classical(stake_previous, result_previous, multiplier, stake_initial):
    if (result_previous==-1): # lose
        stake = stake_previous*multiplier
    elif (result_previous==1):
        stake = stake_initial
    else:
        raise(Exception('Error result_previous must be equal to 1 (win) or -1 (lose)'))
    return(stake)

But I don't know how to implement it efficiently with Pandas. I tried this:

initial_stake = 1
df['Stake'] = None
df.loc[0, 'Stake'] = initial_stake
df['TossResultsPrevious'] = df['TossResults'].shift(1) # shifting-lagging
df['StakePrevious'] = df['Stake'].shift(1) # shifting-lagging

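But now I need to apply this (multi-argument) function along axis 0.

I don't know how to proceed!

I have seen the pandas.DataFrame.applymap function, but it seems to accept only single-argument functions.

Maybe I'm wrong, and using the shift function is not a good idea.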

Answer 1

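One slight change of interpretation is needed: mark losses as 1 and wins as 0.

The first step is to find the edges of the losing runs (steps + edges). Then you take the differences between the step sizes and push those values back into the original data. When you cumsum toss2, it gives you the current length of the losing streak. Your bet is 2 ** cumsum(toss2).

The numpy version is faster than the pandas version, but the factor depends on N (~8 for N=100, ~2 for N > 10000).

pandas

Using pandas.Series: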
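The question encodes tosses as -1 (lose) and 1 (win), so a conversion step is needed before this approach applies. A minimal sketch, assuming the `TossResults` column from the question; the name `loss_flag` and the sample data are only illustrative:

```python
import pandas as pd

# Small hand-written sample in the question's encoding: -1 = lose, 1 = win
df = pd.DataFrame({'TossResults': [-1, 1, -1, -1, 1]})

# Re-encode for this answer: 1 = lose, 0 = win
loss_flag = (df['TossResults'] == -1).astype(int)
print(loss_flag.tolist())
```

The resulting series can be used in place of `toss` in the code that follows.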

import numpy as np
import pandas as pd
toss = np.random.randint(0,2,100)

toss = pd.Series(toss)

steps = (toss.cumsum() * toss).diff() # mask out the cumsum where we won [0 1 2 3 0 0 4 5 6 ... ]
edges = steps < 0 # find where the cumsum steps down -> where we won
dsteps = steps[edges].diff() # find the length of each losing streak
dsteps.iloc[0] = steps[edges].iloc[0] # fix the length of the first run, which is now NaN
toss2 = toss.copy() # get a copy of the toss series
toss2[edges] = dsteps # insert the length of the losing streaks into the copy of the toss results
bets = 2 ** (toss2).cumsum() # compute the wagers

res = pd.DataFrame({'toss': toss,
                    'toss2': toss2,
                    'runs': toss2.cumsum(),
                    'next_bet': bets})
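As a cross-check, the same streak lengths can probably be obtained more directly by grouping on the running count of wins. This is an alternative sketch, not the method used above; the function name `martingale_bets` and its parameters are hypothetical, and it assumes the 1 = lose, 0 = win encoding:

```python
import pandas as pd

def martingale_bets(losses, multiplier=2, stake_initial=1):
    """Next bet after each toss, with losses encoded as 1 and wins as 0."""
    losses = pd.Series(losses).astype(int)
    # Every win starts a new group, so each group is a win followed by a losing run
    group = (losses == 0).cumsum()
    # Cumulative sum inside each group = current losing-streak length
    streak = losses.groupby(group).cumsum()
    return stake_initial * multiplier ** streak

bets = martingale_bets([0, 1, 1, 1, 1, 0, 0, 1, 1, 0])
print(bets.tolist())
```

The sample input reproduces the first ten rows of the pandas output shown later in this answer, which makes it easy to compare against the `next_bet` column.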

numpy

Here is the pure numpy version (it is what I am native in). It is a bit finicky to line up the arrays, which pandas does for you.

import numpy as np

toss = np.random.randint(0,2,100)

steps = np.diff(np.cumsum(toss) * toss)
edges = steps < 0
edges_shift = np.append(False, edges) # np.diff drops one element; shift so the mask has the same length as toss
init_step = steps[edges][0]
toss2 = np.array(toss)
toss2[edges_shift] = np.append(init_step, np.diff(steps[edges]))
bets = 2 ** np.cumsum(toss2)

fmt_dict = {1:'l', 0:'w'}
for t, b in zip(toss, bets):
    print(fmt_dict[t] + '-> {0:d}'.format(b))
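The edge bookkeeping above can be avoided with a running maximum. The following is a sketch of a different technique (not the one used in this answer), assuming the same 1 = lose, 0 = win encoding; the function name `losing_streaks` is hypothetical:

```python
import numpy as np

def losing_streaks(toss):
    """Length of the current losing streak after each toss (1 = lose, 0 = win)."""
    toss = np.asarray(toss)
    total = np.cumsum(toss)  # losses seen so far
    # Loss count recorded at the most recent win (0 before the first win)
    at_last_win = np.maximum.accumulate(np.where(toss == 0, total, 0))
    return total - at_last_win

streaks = losing_streaks([0, 1, 1, 1, 1, 0, 0, 1, 1, 0])
bets = 2 ** streaks
print(bets.tolist())
```

Because the cumulative loss count never decreases, the running maximum always holds the count at the latest win, so the subtraction resets the streak to zero on every win. It also handles inputs with no wins at all, where indexing the first edge would fail.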

pandas output

In [65]: res
Out[65]: 
    next_bet  runs  toss  toss2
0          1     0     0      0
1          2     1     1      1
2          4     2     1      1
3          8     3     1      1
4         16     4     1      1
5          1     0     0     -4
6          1     0     0      0
7          2     1     1      1
8          4     2     1      1
9          1     0     0     -2
10         1     0     0      0
11         2     1     1      1
12         4     2     1      1
13         1     0     0     -2
14         1     0     0      0
15         2     1     1      1
16         1     0     0     -1
17         1     0     0      0
18         2     1     1      1
19         1     0     0     -1
20         1     0     0      0
21         1     0     0      0
22         2     1     1      1
23         1     0     0     -1
24         2     1     1      1
25         1     0     0     -1
26         1     0     0      0
27         1     0     0      0
28         2     1     1      1
29         4     2     1      1
30         1     0     0     -2
31         2     1     1      1
32         4     2     1      1
33         1     0     0     -2
34         1     0     0      0
35         1     0     0      0
36         1     0     0      0
37         2     1     1      1
38         4     2     1      1
39         1     0     0     -2
40         2     1     1      1
41         4     2     1      1
42         8     3     1      1
43         1     0     0     -3
44         1     0     0      0
45         1     0     0      0
46         1     0     0      0
47         2     1     1      1
48         1     0     0     -1
49         2     1     1      1
50         1     0     0     -1
51         1     0     0      0
52         1     0     0      0
53         1     0     0      0
54         1     0     0      0
55         2     1     1      1
56         1     0     0     -1
57         1     0     0      0
58         1     0     0      0
59         1     0     0      0
60         1     0     0      0
61         2     1     1      1
62         1     0     0     -1
63         2     1     1      1
64         4     2     1      1
65         8     3     1      1
66        16     4     1      1
67        32     5     1      1
68         1     0     0     -5
69         2     1     1      1
70         1     0     0     -1
71         2     1     1      1
72         4     2     1      1
73         1     0     0     -2
74         2     1     1      1
75         1     0     0     -1
76         1     0     0      0
77         2     1     1      1
78         4     2     1      1
79         1     0     0     -2
80         1     0     0      0
81         2     1     1      1
82         1     0     0     -1
83         1     0     0      0
84         1     0     0      0
85         1     0     0      0
86         2     1     1      1
87         4     2     1      1
88         8     3     1      1
89        16     4     1      1
90        32     5     1      1
91        64     6     1      1
92         1     0     0     -6
93         1     0     0      0
94         1     0     0      0
95         1     0     0      0
96         2     1     1      1
97         1     0     0     -1
98         1     0     0      0
99         1     0     0      0

numpy output

(different seed from the pandas results)

(result -> next bet):
w->  1
l->  2
w->  1
w->  1
l->  2
w->  1
l->  2
w->  1
l->  2
l->  4
w->  1
l->  2
w->  1
l->  2
l->  4
w->  1
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
l->  2
l->  4
w->  1
l->  2
l->  4
w->  1
w->  1
l->  2
w->  1
w->  1
w->  1
w->  1
l->  2
l->  4
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
w->  1
l->  2
l->  4
w->  1
w->  1
w->  1
w->  1
w->  1
w->  1
l->  2
w->  1
l->  2
w->  1
l->  2
w->  1
w->  1
w->  1
w->  1
w->  1
w->  1
l->  2
l->  4
l->  8
l->  16
w->  1
l->  2
l->  4
w->  1
w->  1
w->  1
w->  1
l->  2
w->  1
w->  1
l->  2
w->  1
w->  1
w->  1
l->  2
w->  1
w->  1
w->  1
w->  1
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
w->  1
l->  2
l->  4
l->  8
w->  1
l->  2
l->  4
w->  1
l->  2

Answer 2

Pandas is most efficient when you can use vectorized operations, but I think this problem requires iteration. A solution using pandas:

import pandas as pd
import numpy as np

df = pd.DataFrame(np.random.randint(0,2,100)*2-1, columns=['TossResults'])
initial_stake = 1
df['Stake'] = initial_stake

for i in range(1, df.shape[0]):
    # after a loss, double the previous stake; otherwise keep initial_stake
    if df.loc[i-1, 'TossResults'] == -1:
        df.loc[i, 'Stake'] = 2 * df.loc[i-1, 'Stake']
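If the loop is kept, element-wise indexing into the DataFrame is likely the slow part. A minimal sketch that iterates over a plain Python list and assigns the column once; the function name `martingale_stakes` is hypothetical, and it follows the convention used here, where row i holds the stake placed on toss i:

```python
import pandas as pd

def martingale_stakes(results, multiplier=2, stake_initial=1):
    """Stake placed on each toss, given results encoded as -1 (lose) / 1 (win)."""
    stakes = []
    stake = stake_initial
    for result in results:
        stakes.append(stake)
        # Multiply the stake after a loss, reset it after a win
        stake = stake * multiplier if result == -1 else stake_initial
    return stakes

# Small hand-written sample instead of random tosses
df = pd.DataFrame({'TossResults': [-1, -1, 1, -1, 1]})
df['Stake'] = martingale_stakes(df['TossResults'].tolist())
print(df)
```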

Implementing a classical martingale using Python and Pandas

2 answers:
