python pandas DataFrame: duration of drawdown

Time: 2018-04-17 11:52:49

Tags: python pandas numpy

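I know there are plenty of examples of computing drawdown (DD) duration. Most of them find the peak date and the trough date and then take the difference between the two. In my case, however, I need the running duration for every day, as shown below.

(I actually have Excel spreadsheets, and I am converting all of them to Python source.)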


     A     B    Cum(A)  Cum(B)  DD(A)   D(B)   Duration(A)  Duration(B)
1    3.5   2.2  3.5     2.2     0       0      0            0
2   -2.1   1.8  1.4     4      -2.1     0      1            0
3    0.7   0.7  2.1     4.7    -1.4     0      2            0
4   -1.1  -1.8  1       2.9    -2.5    -1.8    3            1
5    2.4   3.2  3.4     6.1    -0.1     0      4            0
6    1.3  -1.8  4.7     4.3     0      -1.8    0            1
7   -0.5  -0.9  4.2     3.4    -0.5    -2.7    1            2
8    0.8  -0.7  5       2.7     0      -3.4    0            3
9   -0.2   1.8  4.8     4.5    -0.2    -1.6    1            4


import pandas as pd

# The main DataFrame
data = {'A': [3.5, -2.1, 0.7, -1.1, 2.4, 1.3, -0.5, 0.8, -0.2],
        'B': [2.2, 1.8, 0.7, -1.8, 3.2, -1.8, -0.9, -0.7, 1.8]}
df_return = pd.DataFrame(data)

# Cumulative sum
df_return_cumsum = df_return.cumsum()

# Drawdown: distance of the cumulative sum below its running maximum
df_return_dd = df_return_cumsum - df_return_cumsum.cummax()

# Duration of drawdown
df_return_duration = ??  # This is the part I don't know how to generate

Please help...

1 answer:

Answer 0: (score: 1)

Here is one way to do it. There may be more efficient ways to compute Duration(A) and Duration(B).

from itertools import groupby, chain
import pandas as pd, numpy as np

data = {'A':[3.5, -2.1, 0.7, -1.1, 2.4, 1.3, -0.5, 0.8, -0.2],
        'B':[2.2, 1.8, 0.7, -1.8, 3.2, -1.8, -0.9, -0.7, 1.8]}

df = pd.DataFrame(data)

df['Cum(A)'] = df['A'].cumsum()
df['Cum(B)'] = df['B'].cumsum()
df['DD(A)'] = df['Cum(A)'] - df['Cum(A)'].cummax()
df['D(B)'] = df['Cum(B)'] - df['Cum(B)'].cummax()

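def drawdown_duration(dd):
    # Split the column into consecutive runs of "in drawdown" (dd != 0) and
    # "at peak" (dd == 0) rows; count 1, 2, 3, ... inside a drawdown run, 0 otherwise.
    runs = ((in_dd, len(list(group))) for in_dd, group in groupby(dd != 0))
    return list(chain.from_iterable(
        (np.arange(n) + 1).tolist() if in_dd else [0] * n for in_dd, n in runs))

df['Duration(A)'] = drawdown_duration(df['DD(A)'])
df['Duration(B)'] = drawdown_duration(df['D(B)'])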

Result

print(df)

     A    B  Cum(A)  Cum(B)  DD(A)  D(B)  Duration(A)  Duration(B)
0  3.5  2.2     3.5     2.2    0.0   0.0            0            0
1 -2.1  1.8     1.4     4.0   -2.1   0.0            1            0
2  0.7  0.7     2.1     4.7   -1.4   0.0            2            0
3 -1.1 -1.8     1.0     2.9   -2.5  -1.8            3            1
4  2.4  3.2     3.4     6.1   -0.1   0.0            4            0
5  1.3 -1.8     4.7     4.3    0.0  -1.8            0            1
6 -0.5 -0.9     4.2     3.4   -0.5  -2.7            1            2
7  0.8 -0.7     5.0     2.7    0.0  -3.4            0            3
8 -0.2  1.8     4.8     4.5   -0.2  -1.6            1            4
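
The answer mentions that a more efficient way may exist. One possible pandas-only sketch is shown below; it is not part of the original answer. It works on the question's `df_return_dd` frame and assumes a row counts as "in drawdown" whenever its drawdown value is below zero. The helper name `drawdown_duration` is made up for illustration. The idea is that every peak row starts a new block, so a cumulative sum over the peak flags labels the blocks, and a grouped cumulative sum of the drawdown flags counts the days within each block.

```python
import pandas as pd

data = {'A': [3.5, -2.1, 0.7, -1.1, 2.4, 1.3, -0.5, 0.8, -0.2],
        'B': [2.2, 1.8, 0.7, -1.8, 3.2, -1.8, -0.9, -0.7, 1.8]}
df_return = pd.DataFrame(data)
df_return_cumsum = df_return.cumsum()
df_return_dd = df_return_cumsum - df_return_cumsum.cummax()


def drawdown_duration(dd: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: running count of consecutive drawdown rows per column."""
    result = {}
    for col in dd.columns:
        in_dd = dd[col].lt(0)            # True while below the running peak
        block = (~in_dd).cumsum()        # each peak row (dd == 0) opens a new block
        # Within a block, summing the 0/1 flags counts the days spent in drawdown.
        result[col] = in_dd.astype(int).groupby(block).cumsum()
    return pd.DataFrame(result)


df_return_duration = drawdown_duration(df_return_dd)
print(df_return_duration)
```

For the sample data this should reproduce the Duration(A) and Duration(B) columns shown above. Testing `< 0` rather than `!= 0` is a design choice made here on the assumption that drawdown values are never positive, which holds when they are computed as the cumulative sum minus its running maximum.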