Replace a for loop? This function works, but it takes a long time. I'm looking for ways to improve it

Time: 2019-08-02 18:49:42

Tags: python python-3.x numpy dataframe

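It works, but it takes 40 seconds to compute one simple moving average for one stock. I'm a beginner; is there any way to replace the for loops, or a more efficient way to run this? I've been reading about numpy, but I don't understand how it could replace a loop.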

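I'm trying to create a csv that stores all the indicator values from the current period back to the start of the dataframe. I currently have only one moving average, but at this speed there's no point adding anything else :)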

import csv
import time


def runcheck(df,adress):
    row_count=len(df)
    print(row_count)
    lastp = row_count-1

    row_count2 = int(0)
    mabuild = int(0)
    ma445_count = int(0)
    ma_count2 = int(0)
    row_count5 = int(0)
    row_count3 = int(0)
    row_count4 = int(0)
    resultat = int(0)
    timside_count = int(0)
    slott_count = int(0)
    sick_count = int(0)
    rad_data = []

    startT = time.time()
##    this walks all the way back: e.g. today, then yesterday, then the day before yesterday
    for row in df.index:
        row_count2 += 1
        timside_count = row_count-row_count2
        if timside_count >= 445:
            for row in df.index:
                row_count5 = row_count-row_count2
                slott_count = row_count5-row_count3
                mabuild = mabuild+df.iloc[slott_count,5]
                row_count3 += 1
                row_count4 += 1
                if row_count4 == 445:
                    resultat = mabuild/row_count4
                    rad_data.append(resultat)
                    row_count3 = int(0)
                    row_count4 = int(0)
                    mabuild = int(0)
                    resultat = 0
                    break

##        saves to the csv before the loop starts over
        with open(adress, "a") as fp:
            wr = csv.writer(fp,)
            wr.writerow(rad_data)
        rad_data.clear()

    print('Time was :', time.time()-startT)
    stop=input('')

1 answer:

Answer 0 (score: 2)

Try this:

import csv
import time
from functools import reduce

import numpy as np


def runcheck(df,adress):
    startT = time.time()

    rad_data = map(lambda i: reduce(lambda x, y: x + y, map(lambda z: df.iloc[z, 5], np.arange(i-445, i)))/445, np.arange(445, len(df.index)))

    '''
    Explanation

    list_1 = np.arange(445, len(df.index)) -> Create an array of integers from 445 to len(df.index) - 1
    rad_data = map(lambda i: function, list_1) -> Apply function (see below) to each value (i) in the generated list_1
    function = reduce(lambda x, y: x + y, list_2)/445 -> Take 2 consecutive values (x, y) in list_2 (see below) and sum them, repeat until one value left (i.e. sum of list_2), then divide by 445
    list_2 = map(lambda z: df.iloc[z, 5], list_3) -> Map each value (z) in list_3 (see below) to df.iloc[z, 5]
    list_3 = np.arange(i-445, i) -> Create an array of integers from i-445 to i-1 (value i from list_1)
    '''
    # writing to your csv file outside the loop once you have all the values is better, as you remove the overhead of re-opening the file each time
    with open(adress, "a") as fp: 
        wr = csv.writer(fp,)
        for data in rad_data:
            wr.writerow([data])

    print('Time was :', time.time()-startT)
    stop=input('')
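
The `map`/`reduce` expression above still adds up 445 values for every window, so the work grows with the window size. One possible alternative, sketched here under assumptions, is to take a single cumulative sum and subtract two of its entries per window. The helper name `moving_average` and the small window of 3 are illustrative only and not part of the original code.

```python
import numpy as np


def moving_average(values, window):
    """Mean of every full run of `window` consecutive values, oldest first.

    Hypothetical helper for illustration; not part of the original answer.
    """
    values = np.asarray(values, dtype=float)
    if window <= 0:
        raise ValueError("window must be positive")
    if len(values) < window:
        return np.empty(0)
    # Prepend 0 so that csum[j] - csum[i] equals the sum of values[i:j].
    csum = np.concatenate(([0.0], np.cumsum(values)))
    return (csum[window:] - csum[:-window]) / window


# Tiny example with a window of 3 instead of 445.
print(moving_average([1, 2, 3, 4, 5], 3))  # [2. 3. 4.]
```

With the question's data this would presumably be called as `moving_average(df.iloc[:, 5].to_numpy(), 445)`. Note that it also returns the window ending at the last row, which `np.arange(445, len(df.index))` stops one short of. For very long series, a cumulative sum may accumulate small floating-point rounding differences compared with summing each window directly.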

I'm not sure this is correct, since I don't have sample data. Let me know if there are any errors and I'll try to debug!
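
Since no sample data was available, a small synthetic frame can serve as a check. The sketch below assumes the price sits in column position 5, as in `df.iloc[..., 5]`, and compares pandas' built-in `rolling(...).mean()` against a slow explicit loop. The function names, the 50-row random frame, and the window of 10 are made up for the test and are not from the question.

```python
import numpy as np
import pandas as pd


def rolling_means(df, column=5, window=445):
    """Simple moving average of one column (by position), full windows only."""
    ma = df.iloc[:, column].rolling(window).mean()
    # The first window-1 entries are NaN because the window is not yet full.
    return ma.dropna()


def naive_means(df, column=5, window=445):
    """Slow reference implementation with an explicit loop, for checking only."""
    col = df.iloc[:, column].to_numpy()
    return [col[i - window:i].sum() / window for i in range(window, len(col) + 1)]


# Synthetic data: 50 rows, 6 columns, so that column position 5 exists.
rng = np.random.default_rng(0)
sample = pd.DataFrame(rng.random((50, 6)))

fast = rolling_means(sample, window=10)
slow = naive_means(sample, window=10)
print(len(fast), np.allclose(fast.to_numpy(), slow))  # 41 True
```

If the two agree on real data as well, the result could be written in one go, for example with `fast.to_csv(adress, header=False)`, rather than reopening the file for every row. The values come out oldest first, so they would need reversing to match the newest-first order of the original loop.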