Question

The code below takes too long to run (more than 5 minutes).

Is there a good way to reduce the running time?

data.head() # more than 10 years of data; about 4,500,000 iterations in total
                Open      High       Low     Close  Volume  Adj Close  \
Date                                                                    
2012-07-02  125500.0  126500.0  124000.0  125000.0  118500  104996.59   
2012-07-03  126500.0  130000.0  125500.0  129500.0  239400  108776.47   
2012-07-04  130000.0  132500.0  128500.0  131000.0  180800  110036.43   
2012-07-05  129500.0  131000.0  127500.0  128500.0  118600  107936.50   
2012-07-06  128500.0  129000.0  126000.0  127000.0  149000  106676.54

My code is:

import pandas as pd
import numpy as np
from pandas.io.data import DataReader
import matplotlib.pylab as plt
from datetime import datetime        

def DataReading(code):
    start = datetime(2012,7,1)
    end = pd.to_datetime('today')
    data = DataReader(code,'yahoo',start=start,end=end) 
    data = data[data["Volume"] != 0]  
    return data

data['Cut_Off'] = 0
Cut_Pct = 0.85

for i in range(len(data['Open'])):
    if i==0:
        pass
    for j in range(0,i):
        if data['Close'][j]/data['Close'][i-1]<=Cut_Pct:
            data['Cut_Off'][j] = 1
            data['Cut_Off'][i] = 1
        else:
            pass

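The code above takes more than 5 minutes. There are also `elif` branches below it (which I have not included here); I only timed the code shown above.

Is there a good way to reduce the running time of the code above?

Additional information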

    The buying list is:
    Open      High       Low     Close  Volume  Adj Close  \
 Date                                                                    
 2012-07-02  125500.0  126500.0  124000.0  125000.0  118500  104996.59   
 2012-07-03  126500.0  130000.0  125500.0  129500.0  239400  108776.47   
 2012-07-04  130000.0  132500.0  128500.0  131000.0  180800  110036.43   
 2012-07-05  129500.0  131000.0  127500.0  128500.0  118600  107936.50   
 2012-07-06  128500.0  129000.0  126000.0  127000.0  149000  106676.54   
 2012-07-09  127000.0  133000.0  126500.0  131500.0  207500  110456.41   
 2012-07-10  131500.0  135000.0  130500.0  133000.0  240800  111716.37   
 2012-07-11  133500.0  136500.0  132500.0  136500.0  223800  114656.28   
 For example, suppose I bought 10 shares on 2012-07-02 at 125,500. As the days
 go by, if the close price drops below 85% of the buying price (125,500), I
 sell all 10 shares at 85% of the buying price.
 To reduce the running time I also built a buying list (not shown here),
 but that also takes more than 2 minutes with a for loop.
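
To make the rule concrete, here is a minimal sketch for a single purchase. It assumes the rule means "sell on the first later day whose close is at or below 85% of the buying price". The function name `first_stop_loss_date` and the sample closes are hypothetical, made up for illustration only.

```python
import pandas as pd

def first_stop_loss_date(close, buy_date, buy_price, cut_pct=0.85):
    """Return the first date after buy_date whose close is at or below
    cut_pct * buy_price, or None if the stop level is never reached.
    (Hypothetical helper, not part of the original code.)"""
    stop_price = cut_pct * buy_price
    later = close[close.index > buy_date]   # only days after the purchase
    hit = later[later <= stop_price]        # days at or below the stop level
    return hit.index[0] if not hit.empty else None

# Made-up closing prices, used only to exercise the rule.
close = pd.Series(
    [125000.0, 129500.0, 110000.0, 106000.0, 104000.0],
    index=pd.to_datetime(
        ["2012-07-02", "2012-07-03", "2012-07-04", "2012-07-05", "2012-07-06"]
    ),
)

sell_date = first_stop_loss_date(close, pd.Timestamp("2012-07-02"), 125500.0)
print(sell_date)
```

With these made-up numbers the stop level is 85% of 125,500, roughly 106,675, and the first close at or below it falls on 2012-07-05.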

Answer 1

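Rather than iterating over the 4.5MM rows in your data, use pandas' built-in indexing functionality. I rewrote the loop at the end of your code as follows: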

data.loc[data.Close/data.Close.shift(1) <= Cut_Pct,'Cut_Off'] = 1

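.loc selects the rows that meet the condition given as its first argument. .shift moves the rows up or down by the number of periods passed as its argument.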
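
As a small illustration of how the two pieces fit together, the sketch below applies the same one-liner to a toy frame. The prices are made up for the example, and `prev_close` and `mask` are names introduced here for readability only.

```python
import pandas as pd

# Hypothetical toy prices, chosen only to illustrate the mechanics.
data = pd.DataFrame(
    {"Close": [100.0, 80.0, 82.0, 60.0, 61.0]},
    index=pd.date_range("2012-07-02", periods=5, freq="D"),
)
Cut_Pct = 0.85
data["Cut_Off"] = 0

# shift(1) moves every value down one row, so each row sees the previous close.
prev_close = data["Close"].shift(1)

# True where the close is at most 85% of the previous close.
# The first row is compared against NaN, which evaluates to False.
mask = data["Close"] / prev_close <= Cut_Pct

# .loc with a boolean mask assigns to all matching rows in one vectorized step.
data.loc[mask, "Cut_Off"] = 1
print(data)
```

Note that this compares each close only with the previous day's close, whereas the nested loop in the question compares every earlier close with `Close[i-1]`. It is worth checking that the one-liner matches the rule you actually intend.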
