Identifying a memory leak in Python

Time: 2016-06-15 12:37:25

Tags: python pandas memory-leaks

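I have a method in Python that is not memory-intensive at all, but every time my code reaches it, after about 30-40 iterations the system's RAM hits 100% and my machine freezes. The same method never caused any problems before, but now I can't seem to find a way around it. Is there any specific reason? I have an i5 processor (CPU usage doesn't even reach 50%) and 8 GB of RAM. The code snippet is below: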

def modify_data(self, data, col):
    """
    Convert Raw scores to Z-scores.

    Args:
        data (Pandas DataFrame): DataFrame storing raw scores
        col (String): The column on which conversion is being done

    Returns:
        z_score (Pandas DataFrame): New DataFrame storing z-scores instead of raw scores.
    """
    print "inside modify_data"
    # Now calculate z-score based on rolling windows for columns calculated above.

    z_score = pd.DataFrame()

    score = data
    z_scores = []

    ctr = 0

    indeces = score.index.tolist()

    for idx in indeces:
        row = score.ix[idx]

        n = min(idx, self.window)

        if self.use_window:
            subdata = score.iloc[ctr-n:ctr+1]
        else:
            subdata = score.iloc[:ctr+1]

        row = score.ix[idx]
        x = row[0]

        mu = subdata.mean()
        sigma = subdata.std()
        new_score = (x-mu)/sigma

        z_scores = np.append(z_score, new_score)

        ctr += 1

        z_scores = np.array(z_scores)
        tmpDF = pd.DataFrame(data=z_scores, columns=[col])

        z_score = z_score.append(tmpDF)

    return z_score
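
One possible explanation, offered as a guess rather than a confirmed diagnosis: inside the loop, `np.append(z_score, new_score)` reads from `z_score` (the accumulated DataFrame) rather than `z_scores` (the list), and the result is then appended back onto `z_score`. If that is what happens, the accumulator roughly doubles on every iteration. The sketch below mimics only that accumulation pattern with plain NumPy arrays; `buggy_growth` is a hypothetical name, and it assumes a single-column frame so that each new score is one value.

```python
import numpy as np


def buggy_growth(n_iterations):
    """Return the accumulator size after each iteration of the suspected pattern."""
    acc = np.array([])  # stands in for the z_score DataFrame
    sizes = []
    for _ in range(n_iterations):
        # Like np.append(z_score, new_score): a flattened copy of
        # everything accumulated so far, plus one new value.
        step = np.append(acc, 0.0)
        # Like z_score = z_score.append(tmpDF): the old contents
        # followed by that copy, so the size goes from s to 2*s + 1.
        acc = np.concatenate([acc, step])
        sizes.append(acc.size)
    return sizes
```

Under these assumptions the size after k iterations is 2**k - 1, so `buggy_growth(5)` gives `[1, 3, 7, 15, 31]`. At around 30 iterations that is about a billion float64 values, roughly 8 GB, which would be consistent with the freeze described above.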

Earlier I had implemented the loop as:

for index, row in score.iterrows():

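and, thinking that iterrows was the cause of the slow performance, I switched to iterating over a list of index values. But even that doesn't seem to help. Any input would be very helpful.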

Thanks.
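
For what it's worth, switching between `iterrows` and an index list would not change how the results are accumulated. Below is a minimal sketch of two alternatives, not the original method: a loop that collects scores in a plain list and builds the DataFrame once, and a vectorized version using pandas `rolling` / `expanding`. The names `zscores_loop` and `zscores_vectorized` are hypothetical, and both assume a single-column frame with a default integer index (so the index value equals the row position), with `window=None` playing the role of `use_window` being off.

```python
import numpy as np
import pandas as pd


def zscores_loop(score, window=None):
    """Z-score each value against the rows up to and including it."""
    values = score.iloc[:, 0]
    out = []  # plain Python list: appending is cheap and never copies the history
    for ctr in range(len(values)):
        # Look back at most `window` rows, or to the start if no window is set.
        start = 0 if window is None else max(0, ctr - window)
        sub = values.iloc[start:ctr + 1]
        out.append((values.iloc[ctr] - sub.mean()) / sub.std())
    # Build the DataFrame once, after the loop.
    return pd.DataFrame({score.columns[0]: out}, index=score.index)


def zscores_vectorized(score, window=None):
    """Same calculation without an explicit Python loop."""
    values = score.iloc[:, 0]
    if window is None:
        stats = values.expanding(min_periods=1)
    else:
        # window + 1 rows: the current row plus `window` rows before it.
        stats = values.rolling(window + 1, min_periods=1)
    z = (values - stats.mean()) / stats.std()
    return z.to_frame(score.columns[0])
```

In both versions the first row comes out as NaN, because the sample standard deviation of a single value is undefined. Memory use stays proportional to the number of rows.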

0 answers:

No answers