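I'm working with a dataset in which one column gives the number of days it took to approve a building permit (the time_range column). I'm trying to create another column (time_frame) that groups these approval times into categories such as 1-29 days, 30-59 days, and so on. The dataset also contains denied permits, and for those I have already filled the time_frame column with 'denied'. For the remaining entries, I'm trying to fill in the categories I created. When I run the cell in a Jupyter notebook, it just keeps running and never outputs anything. How can I rewrite the code to use fewer if-else statements and possibly get rid of the for loop?
Here is my code: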
for i in range(0, len(df['time_range'])):
    if df.loc[i, 'time_frame'] != 'denied':
        if df.loc[i, 'time_range'] == 0.0:
            df.loc[i, 'time_frame'] = 'instant approval'
        elif (df.loc[i, 'time_range'] >= 1.0 and df.loc[i, 'time_range'] <= 29.0):
            df.loc[i, 'time_frame'] = '1 - 29 days'
        elif (df.loc[i, 'time_range'] >= 30.0 and df.loc[i, 'time_range'] <= 59.0):
            df.loc[i, 'time_frame'] = '30 - 59 days'
        elif (df.loc[i, 'time_range'] >= 60.0 and df.loc[i, 'time_range'] <= 89.0):
            df.loc[i, 'time_frame'] = '60 - 89 days'
        elif (df.loc[i, 'time_range'] >= 90.0 and df.loc[i, 'time_range'] <= 119.0):
            df.loc[i, 'time_frame'] = '90 - 119 days'
        elif (df.loc[i, 'time_range'] >= 120.0 and df.loc[i, 'time_range'] <= 149.0):
            df.loc[i, 'time_frame'] = '120 - 149 days'
        elif (df.loc[i, 'time_range'] >= 150.0 and df.loc[i, 'time_range'] <= 179.0):
            df.loc[i, 'time_frame'] = '150 - 179 days'
        else:
            df.loc[i, 'time_frame'] = '180+ days'
Answer 0 (score: 2)
Setup
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'time_range': {0: 0, 1: 10, 2: 120, 3: 10, 4: 50, 5: 175, 6: 250},
    'time_frame': {0: np.nan, 1: np.nan, 2: np.nan, 3: 'denied', 4: np.nan, 5: np.nan, 6: np.nan}})
df
   time_range time_frame
0           0        NaN
1          10        NaN
2         120        NaN
3          10     denied
4          50        NaN
5         175        NaN
6         250        NaN
Use pd.cut, and mask the rows where time_frame is 'denied':
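bins = [-np.inf, 0, 29, 59, 89, 119, 149, 179, np.inf]
labels = [
    'instant', '1-29 days', '30-59 days', '60-89 days',
    '90-119 days', '120-149 days', '150-179 days', '180+ days']
# Cast the categorical result to object so that 'denied', which is not one of
# the labels, can be filled in by .where() without a category error.
df['time_frame'] = (
    pd.cut(df['time_range'], bins=bins, labels=labels, right=True)
    .astype(object)
    .where(df['time_frame'].ne('denied'), 'denied'))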
print(df)
   time_range    time_frame
0           0       instant
1          10     1-29 days
2         120  120-149 days
3          10        denied
4          50    30-59 days
5         175  150-179 days
6         250     180+ days
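To see how the bin edges behave, here is a minimal sketch that runs the same bins and labels over a few boundary values. The probe series is hypothetical and not part of the original data. With right=True each bin is closed on the right, so the intervals are (-inf, 0], (0, 29], (29, 59], and so on.

```python
import numpy as np
import pandas as pd

# Same edges and labels as in the answer above.
bins = [-np.inf, 0, 29, 59, 89, 119, 149, 179, np.inf]
labels = [
    'instant', '1-29 days', '30-59 days', '60-89 days',
    '90-119 days', '120-149 days', '150-179 days', '180+ days']

# Hypothetical probe values sitting on or next to the bin edges.
probe = pd.Series([0, 1, 29, 29.5, 30, 179, 180])

# right=True: a value equal to an edge belongs to the bin that ends at that edge.
result = pd.cut(probe, bins=bins, labels=labels, right=True).astype(object)

print(pd.DataFrame({'time_range': probe, 'time_frame': result}))
```

One difference from the loop in the question: a fractional value such as 29.5 matches none of the loop's elif ranges and ends up in the else branch ('180+ days'), whereas with these bins it lands in '30-59 days'. If time_range only ever holds whole days, the two approaches should agree apart from the label wording.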