I want to return a specific string based on two date columns that are passed in. My code so far:
def Risk_Bucket(x):
    if x <= 730:
        return '< 2YR'
    elif (x > 730 and x <= 1825):
        return '2YR_5YR'
    elif (x > 1825 and x <= 2555):
        return '5YR_7YR'
    elif (x > 2555 and x <= 3650):
        return '7YR_10YR'
    elif (x > 3650 and x <= 7300):
        return '10YR_20YR'
    elif (x > 7300):
        return '> 20YR'
    else:
        return "Check passed Date"

df_Date['Bucket'] = Risk_Bucket(df_Date['Days'])
print(df_Date.head(10))
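My function for converting the number of days to a string is shown above, and it raises an error: TypeError: invalid type comparison when printing the dataframe.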
I think this is because the Days column has the string ' days' in it?
How can I make the Days column numeric? Any suggestions for solving this and improving my code?
Answer 0 (score: 2)
I believe you need to convert the timedeltas to whole days with .dt.days, then apply the function to each value:
df_Date['Bucket'] = df_Date['Days'].dt.days.apply(Risk_Bucket)
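To make the conversion concrete, here is a minimal sketch. It assumes the Days column was produced by subtracting two datetime columns, which is what the printed ' days' suffix suggests; the three sample dates and the frame built here are only for illustration.

```python
import pandas as pd

def Risk_Bucket(x):
    # Same thresholds as the function in the question; x is one integer day count.
    if x <= 730:
        return '< 2YR'
    elif x <= 1825:
        return '2YR_5YR'
    elif x <= 2555:
        return '5YR_7YR'
    elif x <= 3650:
        return '7YR_10YR'
    elif x <= 7300:
        return '10YR_20YR'
    elif x > 7300:
        return '> 20YR'
    else:
        return "Check passed Date"

# Illustrative data only (assumption): a few maturity dates and a fixed "Today".
df_Date = pd.DataFrame({
    'maturity_date': pd.to_datetime(['2018-03-15', '2023-05-04', '2047-03-21']),
    'Today': pd.to_datetime(['2018-03-19'] * 3),
})

# Subtracting datetimes gives a timedelta column, which prints as "-4 days", etc.
df_Date['Days'] = df_Date['maturity_date'] - df_Date['Today']

# .dt.days extracts the integer day count from each timedelta.
day_counts = df_Date['Days'].dt.days

# apply() calls Risk_Bucket once per value instead of once on the whole column.
df_Date['Bucket'] = day_counts.apply(Risk_Bucket)
print(df_Date)
```

The original call failed because Risk_Bucket received the entire timedelta column at once, so `x <= 730` compared timedeltas with an integer. The ' days' text is only how pandas displays timedeltas, not a string stored in the column.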
An improved version using pd.cut:
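import numpy as np
import pandas as pd

# Bin edges in days; each bin includes its right edge, matching the <= tests in Risk_Bucket.
bins = [-np.inf, 730, 1825, 2555, 3650, 7300, np.inf]
labels = ['< 2YR', '2YR_5YR', '5YR_7YR', '7YR_10YR', '10YR_20YR', '> 20YR']
df_Date['Bucket'] = pd.cut(df_Date['Days'].dt.days, bins=bins, labels=labels)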
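As a quick check that the bins line up with the if/elif thresholds, the sketch below runs pd.cut on values sitting exactly on and just past a few edges. The `edge_days` values are made up for illustration; the behaviour shown relies on pd.cut's default `right=True`, which makes each interval closed on the right.

```python
import numpy as np
import pandas as pd

bins = [-np.inf, 730, 1825, 2555, 3650, 7300, np.inf]
labels = ['< 2YR', '2YR_5YR', '5YR_7YR', '7YR_10YR', '10YR_20YR', '> 20YR']

# Hypothetical day counts chosen to sit on and just past a few bin edges.
edge_days = pd.Series([730, 731, 1825, 1826, 7300, 7301])

# With the default right=True, 730 falls in (-inf, 730] and 731 in (730, 1825].
edge_buckets = pd.cut(edge_days, bins=bins, labels=labels)
print(pd.DataFrame({'Days': edge_days, 'Bucket': edge_buckets}))
```

The result is a categorical column whose categories keep the order given in `labels`, so sorting by bucket follows maturity order rather than alphabetical order.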
Verification:
bins = [-np.inf,730,1825,2555,3650,7300,np.inf]
labels = ['< 2YR', '2YR_5YR','5YR_7YR','7YR_10YR', '10YR_20YR', '> 20YR']
df_Date['Bucket'] = df_Date['Days'].dt.days.apply(Risk_Bucket)
df_Date['Bucket1'] = pd.cut(df_Date['Days'].dt.days, bins=bins, labels=labels)
print (df_Date)
          state maturity_date      Today       Days     Bucket    Bucket1
0   Traded Away    2018-03-15 2018-03-19    -4 days      < 2YR      < 2YR
10  Traded Away    2025-06-15 2018-03-19    72 days      < 2YR      < 2YR
12  Traded Away    2047-03-21 2018-03-19 10594 days     > 20YR     > 20YR
15  Traded Away    2166-03-15 2018-03-19 54052 days     > 20YR     > 20YR
17  Traded Away    2166-12-18 2018-03-19 54330 days     > 20YR     > 20YR
20  Traded Away    2023-05-04 2018-03-19  1872 days    5YR_7YR    5YR_7YR
22  Traded Away    2027-11-15 2018-03-19  3528 days   7YR_10YR   7YR_10YR
23  Traded Away    2025-03-15 2018-03-19  2553 days    5YR_7YR    5YR_7YR
25  Traded Away    2023-01-15 2018-03-19  1763 days    2YR_5YR    2YR_5YR
26  Traded Away    2166-05-01 2018-03-19 54099 days     > 20YR     > 20YR
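One place the two approaches can differ is missing dates, which the sample above does not contain. With apply, a NaN day count falls through to the else branch and yields "Check passed Date", while pd.cut leaves it as NaN. The sketch below shows one possible way to reproduce the fallback label with pd.cut; the missing date and the `FALLBACK` name are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

bins = [-np.inf, 730, 1825, 2555, 3650, 7300, np.inf]
labels = ['< 2YR', '2YR_5YR', '5YR_7YR', '7YR_10YR', '10YR_20YR', '> 20YR']
FALLBACK = 'Check passed Date'  # same text as the else branch of Risk_Bucket

# Illustrative maturity dates (assumption), one of them missing.
maturity = pd.Series(pd.to_datetime(['2018-05-30', None, '2047-03-21']))
days = maturity - pd.Timestamp('2018-03-19')   # timedelta column with one NaT

# .dt.days turns NaT into NaN, and pd.cut leaves NaN unbinned.
raw_buckets = pd.cut(days.dt.days, bins=bins, labels=labels)

# Register the fallback label as a category first, then fill the gaps with it.
buckets = raw_buckets.cat.add_categories(FALLBACK).fillna(FALLBACK)
print(buckets)
```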