Speeding up the run time of code that uses multiple if statements

Date: 2018-11-07 00:35:40

Tags: python pandas conditional

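I'm working with a dataset in which one column (time_range) gives the number of days it took to approve a building permit. I'm trying to create another column (time_frame) that groups these approval times into categories such as 1-29 days, 30-59 days, and so on. The dataset also contains denied permits, and I have already filled the time_frame column with 'denied' for those rows. For the remaining entries, I'm trying to fill in the categories I created. When I run the cell in a Jupyter notebook, it just keeps running and never produces any output. How should I rewrite the code to use fewer if-else statements and possibly get rid of the for loop?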

Here is my code:

for i in range(0, len(df['time_range'])):
    # Only categorize rows that are not already marked as denied
    if df.loc[i, 'time_frame'] != 'denied':
        if df.loc[i, 'time_range'] == 0.0:
            df.loc[i, 'time_frame'] = 'instant approval'
        elif (df.loc[i, 'time_range'] >= 1.0 and df.loc[i, 'time_range'] <= 29.0):
            df.loc[i, 'time_frame'] = '1 - 29 days'
        elif (df.loc[i, 'time_range'] >= 30.0 and df.loc[i, 'time_range'] <= 59.0):
            df.loc[i, 'time_frame'] = '30 - 59 days'
        elif (df.loc[i, 'time_range'] >= 60.0 and df.loc[i, 'time_range'] <= 89.0):
            df.loc[i, 'time_frame'] = '60 - 89 days'
        elif (df.loc[i, 'time_range'] >= 90.0 and df.loc[i, 'time_range'] <= 119.0):
            df.loc[i, 'time_frame'] = '90 - 119 days'
        elif (df.loc[i, 'time_range'] >= 120.0 and df.loc[i, 'time_range'] <= 149.0):
            df.loc[i, 'time_frame'] = '120 - 149 days'
        elif (df.loc[i, 'time_range'] >= 150.0 and df.loc[i, 'time_range'] <= 179.0):
            df.loc[i, 'time_frame'] = '150 - 179 days'
        else:
            df.loc[i, 'time_frame'] = '180+ days'

1 answer:

Answer 0 (score: 2)

Setup

import numpy as np
import pandas as pd

df = pd.DataFrame({
        'time_range': {0: 0, 1: 10, 2: 120, 3: 10, 4: 50, 5: 175, 6: 250},
        'time_frame': {0: np.nan, 1: np.nan, 2: np.nan, 3: 'denied', 4: np.nan, 5: np.nan, 6: np.nan}})

df
   time_range time_frame
0           0        NaN
1          10        NaN
2         120        NaN
3          10     denied
4          50        NaN
5         175        NaN
6         250        NaN

Use pd.cut, and mask the rows where time_frame is 'denied':

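# Bin edges are right-inclusive: (-inf, 0], (0, 29], (29, 59], ...
bins = [-np.inf, 0, 29, 59, 89, 119, 149, 179, np.inf]
labels = [
    'instant', '1-29 days', '30-59 days', '60-89 days',
    '90-119 days', '120-149 days', '150-179 days', '180+ days']

# Cast to object so 'denied', which is not one of the categories, can be filled in
df['time_frame'] = (
    pd.cut(df['time_range'], bins=bins, labels=labels, right=True)
      .astype(object)
      .where(df['time_frame'].ne('denied'), 'denied'))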

print(df)
   time_range    time_frame
0           0       instant
1          10     1-29 days
2         120  120-149 days
3          10        denied
4          50    30-59 days
5         175  150-179 days
6         250     180+ days
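
If you would rather stay close to the original if/elif chain, np.select is another vectorized option. What follows is only a rough sketch, not part of the answer above. It rebuilds the same sample data and assumes time_range is never negative and has no missing values (a missing value would fall through to the default label). Conditions are evaluated in order and the first match wins, so only the upper bound of each range needs to be checked.

```python
import numpy as np
import pandas as pd

# Same sample data as in the setup above.
df = pd.DataFrame({
    'time_range': [0, 10, 120, 10, 50, 175, 250],
    'time_frame': [np.nan, np.nan, np.nan, 'denied', np.nan, np.nan, np.nan],
})

days = df['time_range']

# Checked top to bottom like an if/elif chain; the first True condition wins.
# Assumes days >= 0, so after the == 0 check only upper bounds are needed.
conditions = [
    df['time_frame'].eq('denied'),  # keep rows already marked as denied
    days.eq(0),
    days.le(29),
    days.le(59),
    days.le(89),
    days.le(119),
    days.le(149),
    days.le(179),
]
choices = [
    'denied', 'instant', '1-29 days', '30-59 days', '60-89 days',
    '90-119 days', '120-149 days', '150-179 days',
]

# Rows matching none of the conditions get the default label.
df['time_frame'] = np.select(conditions, choices, default='180+ days')
print(df)
```

Compared with pd.cut, this keeps the 'denied' check in the same list as the range checks, at the cost of spelling out every boundary by hand.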