How can I vectorize the following loop in NumPy?

Time: 2016-06-23 21:34:53

Tags: python loops numpy vectorization montecarlo

"""Some simulations to predict the future portfolio value based on past distribution. x is 
   a numpy array that contains past returns. The interpolated_returns are the returns 
   generated from the cdf of the past returns to simulate future returns. The portfolio 
   starts with a value of 100. portfolio_value is filled up progressively as 
   the program goes through every loop. The value is multiplied by the returns in that 
   period and a dollar is removed."""

    portfolio_final = []
    for i in range(10000):
        portfolio_value = [100]
        rand_values = np.random.rand(600)
        interpolated_returns = np.interp(rand_values,cdf_values,x)
        interpolated_returns = np.add(interpolated_returns,1)

        for j in range(1,len(interpolated_returns)+1):
            portfolio_value.append(interpolated_returns[j-1]*portfolio_value[j-1])
            portfolio_value[j] = portfolio_value[j]-1

        portfolio_final.append(portfolio_value[-1])
    print(np.mean(portfolio_final))

I can't find a way to write this code using NumPy. I looked into iterating with nditer, but I couldn't get any further.

2 answers:

Answer 0: (score: 1)

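I guess the easiest way to figure out how to vectorize your code is to look at the equations governing the evolution, see how your portfolio actually iterates, and find patterns that can be vectorized, rather than trying to vectorize the code you already have. You will notice that cumprod shows up quite often in your iterations.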
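To make the cumprod pattern concrete, here is a minimal sketch (not from the original answer) that unrolls the recurrence V_j = r_j * V_{j-1} - 1 and checks it against a closed form. The growth factors in `r` and the names `v0`, `P`, and `closed_form` are made up for illustration.

```python
import numpy as np

# Hypothetical growth factors (1 + return) for four periods; not real data.
r = np.array([1.02, 0.99, 1.01, 1.03])
v0 = 100.0  # starting portfolio value, as in the question

# Recurrence from the question: V_j = r_j * V_{j-1} - 1
v = v0
loop_values = []
for rj in r:
    v = v * rj - 1
    loop_values.append(v)

# Dividing the recurrence by P_j = r_1 * ... * r_j gives
#   V_j / P_j = V_{j-1} / P_{j-1} - 1 / P_j,
# so V_j = P_j * (V_0 - sum over k <= j of 1 / P_k).
P = np.cumprod(r)
closed_form = P * (v0 - np.cumsum(1.0 / P))

print(np.allclose(loop_values, closed_form))
```

The closed form divides by the running product, so it presumably only behaves well while every growth factor stays safely away from zero.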

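However, below you will find semi-vectorized code. I have also included your code so that you can compare the results, along with a simple loop version that is much easier to read and translates directly into the mathematical equation. So if you share this code with others, I would definitely go with the simple loop option. If you want some fancy-pants vectorization, you can use the vector version. If you need to keep track of the individual steps, you can also add an array to the simple loop option and append pv to it at every step.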

Hope that helps.

Edit: I haven't tested the speed of any of these. That is something you can easily do with timeit.
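A timing comparison could be set up roughly like this. It is only a sketch: the function names `final_value_loop` and `final_value_cumprod` are hypothetical, and the normally distributed growth factors stand in for the interpolated returns of the question. No timings are quoted here because they depend on the machine.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)  # seeded so both versions see the same inputs
# Hypothetical growth factors standing in for np.interp(...) + 1
growth = 1.0 + rng.normal(0.0, 0.01, size=600)

def final_value_loop(growth, start=100.0):
    """Plain Python loop: multiply by the period's growth, then remove a dollar."""
    pv = start
    for g in growth:
        pv = pv * g - 1
    return pv

def final_value_cumprod(growth, start=100.0):
    """cumprod form of the same recurrence, returning only the last value."""
    rcp = np.cumprod(growth)
    return rcp[-1] * (start - np.sum(1.0 / rcp))

# Check that both versions agree before timing them
assert np.isclose(final_value_loop(growth), final_value_cumprod(growth))

t_loop = timeit.timeit(lambda: final_value_loop(growth), number=200)
t_vec = timeit.timeit(lambda: final_value_cumprod(growth), number=200)
print(f"loop: {t_loop:.4f} s, cumprod: {t_vec:.4f} s for 200 calls each")
```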

import numpy as np
from scipy.special import erf

# Prepare a simple return model: normally distributed with mu = sigma = 0.01
x = np.linspace(-10,10,100)
cdf_values = 0.5*(1+erf((x-0.01)/(0.01*np.sqrt(2))))

# Prepare setup such that every code snippet uses the same number of steps
# and the same random numbers
nSteps = 600
nIterations = 1
rnd = np.random.rand(nSteps)

# Your code - Gives the (supposedly) correct results
portfolio_final = []
for i in range(nIterations):
    portfolio_value = [100]
    rand_values = rnd
    interpolated_returns = np.interp(rand_values,cdf_values,x)
    interpolated_returns = np.add(interpolated_returns,1)

    for j in range(1,len(interpolated_returns)+1):
        portfolio_value.append(interpolated_returns[j-1]*portfolio_value[j-1])
        portfolio_value[j] = portfolio_value[j]-1

    portfolio_final.append(portfolio_value[-1])
print (np.mean(portfolio_final))

# Using vectors: closed form of pv[j] = pv[j-1]*rets[j] - 1 via cumprod and cumsum
portfolio_final = []
for i in range(nIterations):
    portfolio_values = np.ones(nSteps)*100.0
    rcp = np.cumprod(np.interp(rnd,cdf_values,x) + 1)
    portfolio_values = rcp * (portfolio_values - np.cumsum(1.0/rcp))
    portfolio_final.append(portfolio_values[-1])
print (np.mean(portfolio_final))

# Simple loop
portfolio_final = []
for i in range(nIterations):
    pv = 100
    rets = np.interp(rnd,cdf_values,x) + 1
    for j in range(nSteps):  # j, so the outer loop variable i is not shadowed
        pv = pv * rets[j] - 1
    portfolio_final.append(pv)
print (np.mean(portfolio_final))
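The loop over simulations can also be folded into the arrays. The following is a rough sketch, not part of the original answer. It assumes the uniform draws are passed in as a 2-D array with one row per simulation. The function name `final_values`, the sample `x`, and the empirical `cdf_values` are made up for illustration.

```python
import numpy as np

def final_values(u, x, cdf_values, start=100.0):
    """Final portfolio value for each row of uniform draws u (n_sims x n_steps)."""
    growth = np.interp(u, cdf_values, x) + 1.0   # np.interp keeps the 2-D shape of u
    rcp = np.cumprod(growth, axis=1)             # running product along each path
    return rcp[:, -1] * (start - np.sum(1.0 / rcp, axis=1))

# Hypothetical past returns and their empirical CDF (illustrative data only)
rng0 = np.random.default_rng(1)
x = np.sort(rng0.normal(0.005, 0.02, size=250))
cdf_values = np.arange(1, x.size + 1) / x.size

u = np.random.default_rng(42).random((2000, 600))  # 2000 simulations, 600 periods
finals = final_values(u, x, cdf_values)
print(finals.mean())
```

This trades memory for speed: with 10000 simulations of 600 steps, each intermediate array holds 6 million floats (about 48 MB), so processing the simulations in chunks may be preferable on a small machine.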

Answer 1: (score: 0)

Forget np.nditer. It does not improve iteration speed. Use it only if you intend to move on to the C version (via cython).

I'm puzzled by that inner loop. What is it supposed to do that is special? Why the loop?

In tests with simulated values, these two blocks of code produce the same thing:

interpolated_returns = np.add(interpolated_returns,1)
for j in range(1,len(interpolated_returns)+1):
    portfolio_value.append(interpolated_returns[j-1]*portfolio[j-1])
    portfolio_value[j] = portfolio_value[j]-1

interpolated_returns = (interpolated_returns+1)*portfolio - 1
portfolio_value = portfolio_value + interpolated_returns.tolist()

I'm assuming that interpolated_returns and portfolio are 1-D arrays of the same length.
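The equivalence can be checked with a small sketch under exactly that assumption, namely that `portfolio` is a fixed array the loop only reads from. The values of `returns` and `portfolio` below are hypothetical. The last part is an added observation, not something this answer states: the loop in the question reads `portfolio_value[j-1]`, the value written on the previous step, which makes it a recurrence rather than an elementwise operation.

```python
import numpy as np

rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=5)   # hypothetical per-period returns
portfolio = np.full(5, 100.0)             # assumed: fixed array, independent of the loop

# Loop form: each step reads portfolio[j-1], never a value the loop just wrote
growth = returns + 1
looped = [100.0]
for j in range(1, len(growth) + 1):
    looped.append(growth[j - 1] * portfolio[j - 1])
    looped[j] = looped[j] - 1

# Elementwise form of the same computation
vectorized = [100.0] + ((returns + 1) * portfolio - 1).tolist()
print(np.allclose(looped, vectorized))

# Recurrence as written in the question: each step depends on the previous result
recurrent = [100.0]
for g in growth:
    recurrent.append(recurrent[-1] * g - 1)
print(np.allclose(recurrent, vectorized))
```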