Improving pandas speed/performance

Time: 2015-07-06 21:53:44

Tags: performance pandas

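I have the loop below and would like to know whether any of my code can be rewritten so that it runs significantly faster (it currently takes about 30 minutes to run). Any suggestions are appreciated, as I am still a beginner at both programming and pandas. Thanks!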

all_dates is a list of datetime.date objects.

loan_matches_by_month_36 is a dictionary whose keys are datetime.date objects and whose values are DataFrames of the loans issued on that date.

import pandas as pd
from dateutil.relativedelta import relativedelta

aggregate_cashflow_dataframe_platform = pd.DataFrame([])
investment_per_loan = 25

for month in all_dates:
    if month <= max_done_date:
        for loan in loan_matches_by_month_36[month].index:
            p_n = loan_matches_by_month_36[month].ix[loan,'p_n']
            fraction = investment_per_loan/(loan_matches_by_month_36[month].ix[loan,'funded'])
            p_adjust = loan_matches_by_month_36[month].ix[loan,'p_adjust']*fraction
            installment = loan_matches_by_month_36[month].ix[loan,'installment']*fraction
            cashflow_vector = {loan: [-investment_per_loan] + [installment] * (p_n-1) + 
                                  [installment + p_adjust]}
            cashflow_dataframe = pd.DataFrame(cashflow_vector)
            columns = pd.date_range(start = month, end = (month + relativedelta(
                    months=+len(cashflow_dataframe))), freq = 'M').shift(15, freq=pd.datetools.day)
            cashflow_dataframe = cashflow_dataframe.T
            cashflow_dataframe.columns = columns
            cashflow_dataframe['month'] = month
            cashflow_dataframe.set_index('month', append = True, inplace = True)
            cashflow_dataframe.index.names = ['index', 'month']
            result = aggregate_cashflow_dataframe_platform.append(cashflow_dataframe)
            aggregate_cashflow_dataframe_platform = result

    print(month)
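
Two things in the loop above are likely to dominate the run time: the repeated scalar lookups through `.ix`, and growing `aggregate_cashflow_dataframe_platform` with `append` on every loan, which copies the whole frame each time. The following is a minimal sketch of one possible restructuring, not something taken from the original post. The helper name `build_cashflows` and its parameter names are hypothetical, the sample loans are made up for illustration, and the sketch assumes a recent pandas and that no date in `all_dates` falls on a month end (so that `periods=` yields the same dates as the original `start`/`end` range).

```python
import datetime

import pandas as pd


def build_cashflows(loans_by_month, all_dates, max_done_date, investment_per_loan=25):
    """Hypothetical helper: build each loan's cash-flow row, then combine once."""
    rows = {}
    for month in all_dates:
        if month > max_done_date:
            continue
        loans = loans_by_month[month]
        # Column-wise arithmetic for the whole month instead of per-loan lookups.
        fraction = investment_per_loan / loans['funded']
        installment = loans['installment'] * fraction
        p_adjust = loans['p_adjust'] * fraction
        for loan, p_n, inst, adj in zip(loans.index, loans['p_n'], installment, p_adjust):
            values = [-investment_per_loan] + [inst] * (int(p_n) - 1) + [inst + adj]
            # Consecutive month ends starting from `month`, each moved 15 days later.
            dates = pd.date_range(start=month, periods=len(values),
                                  freq=pd.offsets.MonthEnd()) + pd.Timedelta(days=15)
            rows[(loan, month)] = pd.Series(values, index=dates, dtype='float64')
    if not rows:
        return pd.DataFrame()
    # A single concat at the end replaces the repeated DataFrame.append calls.
    result = pd.concat(rows, axis=1).T
    result.index.names = ['index', 'month']
    return result


# Illustrative data only: two made-up loans issued in the same month.
month = datetime.date(2015, 1, 1)
loans = pd.DataFrame(
    {'p_n': [2, 3], 'funded': [100.0, 50.0],
     'p_adjust': [4.0, 0.0], 'installment': [40.0, 20.0]},
    index=['A', 'B'],
)
cashflows = build_cashflows({month: loans}, [month], max_done_date=month)
print(cashflows)
```

Collecting the per-loan rows in a dictionary and concatenating once keeps the work roughly linear in the number of loans, whereas appending inside the loop recopies everything accumulated so far on each iteration. Whether this alone brings the 30 minutes down far enough depends on the size of the data.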

0 answers:

No answers