Reducing loop time in Python

Time: 2018-12-23 18:00:59

Tags: python python-3.x pandas dataframe

My loop in Python takes a long time to produce a result. The DataFrame contains about 100,000 records.

How can I reduce the run time?

df['loan_agr'] = df['loan_agr'].astype(int)

for i in range(len(df)):
    if df.loc[i, 'order_mt'] == df.loc[i, 'enr_mt']:
        df['new_N_Loan'] = 1
        df['exist_N_Loan'] = 0
        df['new_V_Loan'] = df['loan_agr']
        df['exist_V_Loan'] = 0
    else:
        df['new_N_Loan'] = 0
        df['exist_N_Loan'] = 1
        df['new_V_Loan'] = 0
        df['exist_V_Loan'] = df['loan_agr']

2 Answers:

Answer 0 (score: 5)

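You can use loc and set the new values in a vectorized way. This approach is much faster than iterating, because each operation is performed on an entire column at once rather than on individual values. See this article for more on pandas speed optimization.

For example: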

# True for rows where order_mt equals enr_mt (the "if" branch)
mask = df['order_mt'] == df['enr_mt']
df.loc[mask, ['new_N_Loan', 'exist_N_Loan', 'exist_V_Loan']] = [1, 0, 0]
df.loc[mask, ['new_V_Loan']] = df['loan_agr']

# All remaining rows (the "else" branch)
df.loc[~mask, ['new_N_Loan', 'exist_N_Loan', 'new_V_Loan']] = [0, 1, 0]
df.loc[~mask, ['exist_V_Loan']] = df['loan_agr']

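Edit:

If your pandas version does not support the ~ (bitwise NOT) operator, you can build a second mask for the "else" condition, analogous to the first one.

For example: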

mask = df['order_mt'] == df['enr_mt']
else_mask = df['order_mt'] != df['enr_mt']

Then use else_mask instead of ~mask for the second set of assignments.
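A minimal sketch of how else_mask could be applied to the "else" branch. The sample values and the step that pre-creates the result columns with 0 are assumptions made for illustration, not part of the original answer.

```python
import pandas as pd

# Small sample frame (values assumed for illustration).
df = pd.DataFrame({
    "order_mt": [1, 2, 3, 4],
    "enr_mt": [1, 2, 30, 40],
    "loan_agr": [100, 200, 300, 400],
})

# Assumption: create the result columns up front so they already exist.
for col in ["new_N_Loan", "exist_N_Loan", "new_V_Loan", "exist_V_Loan"]:
    df[col] = 0

# Explicit mask for the "else" condition, built without the ~ operator.
else_mask = df["order_mt"] != df["enr_mt"]

# Second set of assignments, using else_mask where ~mask was used before.
df.loc[else_mask, ["new_N_Loan", "exist_N_Loan", "new_V_Loan"]] = [0, 1, 0]
df.loc[else_mask, "exist_V_Loan"] = df.loc[else_mask, "loan_agr"]
```

Only the rows where order_mt differs from enr_mt are touched; the other rows keep their initial values until the first mask is applied.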

Example:

Input:

   order_mt  enr_mt new_N_Loan exist_N_Loan exist_V_Loan new_V_Loan  loan_agr
0         1       1       None         None         None       None       100
1         2       2       None         None         None       None       200
2         3      30       None         None         None       None       300
3         4      40       None         None         None       None       400

Output:

   order_mt  enr_mt  new_N_Loan  exist_N_Loan  exist_V_Loan  new_V_Loan  loan_agr
0         1       1           1             0             0         100       100
1         2       2           1             0             0         200       200
2         3      30           0             1           300           0       300
3         4      40           0             1           400           0       400
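The input and output tables above can be reproduced with a self-contained sketch along these lines. It follows the mask approach from this answer; building the frame with None placeholders and the final cast to int are assumptions added so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the input table; the result columns start out as None.
df = pd.DataFrame({
    "order_mt": [1, 2, 3, 4],
    "enr_mt": [1, 2, 30, 40],
    "new_N_Loan": None,
    "exist_N_Loan": None,
    "exist_V_Loan": None,
    "new_V_Loan": None,
    "loan_agr": [100, 200, 300, 400],
})

mask = df["order_mt"] == df["enr_mt"]

# "if" branch: rows where the two columns match.
df.loc[mask, ["new_N_Loan", "exist_N_Loan", "exist_V_Loan"]] = [1, 0, 0]
df.loc[mask, "new_V_Loan"] = df.loc[mask, "loan_agr"]

# "else" branch: all remaining rows.
df.loc[~mask, ["new_N_Loan", "exist_N_Loan", "new_V_Loan"]] = [0, 1, 0]
df.loc[~mask, "exist_V_Loan"] = df.loc[~mask, "loan_agr"]

# Assumption: cast the filled columns to int for a tidy result.
result_cols = ["new_N_Loan", "exist_N_Loan", "exist_V_Loan", "new_V_Loan"]
df[result_cols] = df[result_cols].astype(int)

print(df)
```

Each assignment runs once per column rather than once per row, which is where the speed-up over the original loop comes from.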

Answer 1 (score: 0)

You can store the result of len in a variable and pass that value to range, instead of writing range(len(...)).
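One possible reading of this suggestion, as a sketch only: the sample data, the name n_rows, the pre-created columns, and the per-cell writes with df.at are all assumptions. Since range(len(df)) already evaluates len only once, this change on its own is unlikely to save much time.

```python
import pandas as pd

# Small sample frame (values assumed for illustration).
df = pd.DataFrame({
    "order_mt": [1, 2, 3, 4],
    "enr_mt": [1, 2, 30, 40],
    "loan_agr": [100, 200, 300, 400],
})

# Assumption: pre-create the result columns so per-row writes have a target.
for col in ["new_N_Loan", "exist_N_Loan", "new_V_Loan", "exist_V_Loan"]:
    df[col] = 0

n_rows = len(df)  # length stored in a variable, as suggested (hypothetical name)
for i in range(n_rows):
    loan = df.at[i, "loan_agr"]
    if df.at[i, "order_mt"] == df.at[i, "enr_mt"]:
        # Write single cells instead of overwriting whole columns.
        df.at[i, "new_N_Loan"] = 1
        df.at[i, "new_V_Loan"] = loan
    else:
        df.at[i, "exist_N_Loan"] = 1
        df.at[i, "exist_V_Loan"] = loan
```

The loop still visits every row, so for around 100,000 records it would probably remain slower than column-wise assignment.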