I have a dataset of loan default information, and I am trying to build a neural network to predict defaults. The network is built like this:
form <- as.formula(paste("loan_status_fixed ~", paste(n[!n %in% "use"], collapse = " + ")))
The value of form is:
loan_status_fixed ~ addr_stateAK + addr_stateAL + addr_stateAR +
addr_stateAZ + addr_stateCA + addr_stateCO + addr_stateCT +
addr_stateDC + addr_stateDE + addr_stateFL + addr_stateGA +
addr_stateHI + addr_stateIA + addr_stateID + addr_stateIL +
addr_stateIN + addr_stateKS + addr_stateKY + addr_stateLA +
addr_stateMA + addr_stateMD + addr_stateME + addr_stateMI +
addr_stateMN + addr_stateMO + addr_stateNH + addr_stateNJ +
addr_stateNM + addr_stateNV + addr_stateNY + addr_stateOH +
addr_stateOK + addr_stateOR + addr_statePA + addr_stateRI +
addr_stateSC + addr_stateSD + addr_stateTN + addr_stateTX +
addr_stateUT + addr_stateVA + addr_stateVT + addr_stateWA +
addr_stateWI + addr_stateWV + annual_inc + collections_12_mths_ex_med +
delinq_2yrs + dti + `emp_length1 year` + `emp_length2 years` +
`emp_length3 years` + `emp_length4 years` + `emp_length5 years` +
`emp_length6 years` + `emp_length7 years` + `emp_length8 years` +
`emp_length9 years` + `emp_length10+ years` + `emp_lengthn/a` +
fico_averaged + funded_amnt + sub_gradeA1 + sub_gradeA2 +
sub_gradeA3 + sub_gradeA4 + sub_gradeA5 + sub_gradeB1 + sub_gradeB2 +
sub_gradeB3 + sub_gradeB4 + sub_gradeB5 + sub_gradeC1 + sub_gradeC2 +
sub_gradeC3 + sub_gradeC4 + sub_gradeC5 + sub_gradeD1 + sub_gradeD2 +
sub_gradeD3 + sub_gradeD4 + sub_gradeD5 + sub_gradeE1 + sub_gradeE2 +
sub_gradeE3 + sub_gradeE4 + home_ownershipMORTGAGE + home_ownershipOWN +
open_acc + pub_rec + purposecar + purposecredit_card + purposedebt_consolidation +
purposeeducational + purposehome_improvement + purposehouse +
purposemajor_purchase + purposemedical + purposemoving +
purposeother + purposesmall_business + purposevacation +
revol_util
fit <- neuralnet(form, data = train, linear.output = FALSE)
This call works, but when I try to run predictions from the fitted model:
results <- neuralnet::compute(fit, test)
Error in neurons[[i]] %*% weights[[i]] : non-conformable arguments
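Earlier questions about this error traced it to character or factor variables, but my data contains only numeric, integer and double types. Another earlier suggestion was that the data set must contain only the columns used in the computation; I have already corrected for that, and every column in the train and test data sets is included in the computation.
Below is the str of the train data set.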
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 654046 obs. of 104 variables:
$ loan_status_fixed : int 0 0 0 0 1 1 0 1 0 0 ...
$ addr_stateAK : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateAL : int 1 0 0 0 0 0 0 0 0 0 ...
$ addr_stateAR : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateAZ : int 0 0 0 0 0 0 0 0 1 0 ...
$ addr_stateCA : int 0 0 0 0 0 0 1 0 0 0 ...
$ addr_stateCO : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateCT : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateDC : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateDE : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateFL : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateGA : int 0 0 0 0 1 0 0 0 0 0 ...
$ addr_stateHI : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateIA : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateID : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateIL : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateIN : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateKS : int 0 0 0 0 0 0 0 0 0 1 ...
$ addr_stateKY : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateLA : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateMA : int 0 0 1 0 0 0 0 0 0 0 ...
$ addr_stateMD : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateME : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateMI : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateMN : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateMO : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateNH : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateNJ : int 0 0 0 1 0 0 0 0 0 0 ...
$ addr_stateNM : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateNV : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateNY : int 0 1 0 0 0 0 0 1 0 0 ...
$ addr_stateOH : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateOK : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateOR : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_statePA : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateRI : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateSC : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateSD : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateTN : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateTX : int 0 0 0 0 0 1 0 0 0 0 ...
$ addr_stateUT : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateVA : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateVT : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateWA : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateWI : int 0 0 0 0 0 0 0 0 0 0 ...
$ addr_stateWV : int 0 0 0 0 0 0 0 0 0 0 ...
$ annual_inc : num 58000 175000 66500 94800 64000 70000 95000 57000 67500 40000 ...
$ collections_12_mths_ex_med: int 0 0 0 0 0 0 0 0 0 0 ...
$ delinq_2yrs : int 0 0 0 1 0 0 0 0 0 2 ...
$ dti : num 28.7 14.1 13.7 14.5 26.1 ...
$ emp_length1 year : int 0 0 1 0 0 0 0 0 0 1 ...
$ emp_length2 years : int 0 0 0 0 0 0 0 0 1 0 ...
$ emp_length3 years : int 0 0 0 0 0 0 0 0 0 0 ...
$ emp_length4 years : int 0 0 0 0 1 0 0 0 0 0 ...
$ emp_length5 years : int 0 0 0 1 0 0 0 0 0 0 ...
$ emp_length6 years : int 0 0 0 0 0 0 0 0 0 0 ...
$ emp_length7 years : int 0 0 0 0 0 0 0 0 0 0 ...
$ emp_length8 years : int 0 0 0 0 0 0 0 0 0 0 ...
$ emp_length9 years : int 0 0 0 0 0 0 0 0 0 0 ...
$ emp_length10+ years : int 1 0 0 0 0 1 1 1 0 0 ...
$ emp_lengthn/a : int 0 0 0 0 0 0 0 0 0 0 ...
$ fico_averaged : int 712 722 777 677 727 757 687 687 677 687 ...
$ funded_amnt : int 17000 25000 8000 20000 29425 22000 11600 16000 26575 18000 ...
$ sub_gradeA1 : int 0 0 1 0 0 0 0 0 0 0 ...
$ sub_gradeA2 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeA3 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeA4 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeA5 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeB1 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeB2 : int 0 1 0 0 0 0 0 0 0 0 ...
$ sub_gradeB3 : int 0 0 0 0 0 1 0 0 0 0 ...
$ sub_gradeB4 : int 1 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeB5 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeC1 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeC2 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeC3 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeC4 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeC5 : int 0 0 0 1 0 0 0 0 0 0 ...
$ sub_gradeD1 : int 0 0 0 0 0 0 1 0 0 1 ...
$ sub_gradeD2 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeD3 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeD4 : int 0 0 0 0 0 0 0 0 1 0 ...
$ sub_gradeD5 : int 0 0 0 0 0 0 0 1 0 0 ...
$ sub_gradeE1 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeE2 : int 0 0 0 0 0 0 0 0 0 0 ...
$ sub_gradeE3 : int 0 0 0 0 1 0 0 0 0 0 ...
$ sub_gradeE4 : int 0 0 0 0 0 0 0 0 0 0 ...
$ home_ownershipMORTGAGE : int 1 0 1 1 0 1 0 0 1 1 ...
$ home_ownershipOWN : int 0 0 0 0 0 0 0 0 0 0 ...
$ open_acc : int 14 11 16 5 14 6 5 10 9 16 ...
$ pub_rec : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposecar : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposecredit_card : int 0 0 1 0 0 0 0 0 1 0 ...
$ purposedebt_consolidation : int 1 1 0 1 1 1 1 1 0 1 ...
$ purposeeducational : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposehome_improvement : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposehouse : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposemajor_purchase : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposemedical : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposemoving : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposeother : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposesmall_business : int 0 0 0 0 0 0 0 0 0 0 ...
$ purposevacation : int 0 0 0 0 0 0 0 0 0 0 ...
$ revol_util : num 45.1 50.1 29.7 93.4 66 0 96.5 68.2 88.4 28.6 ...
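Answer 0: (score: 0)
The reason for this error is that you are passing a tibble to the compute() function, but it expects a data frame or a matrix, as you can see from the argument definition:
compute(x, covariate, rep = 1)
Arguments
covariate: a dataframe or matrix containing the variables that had been used to train the neural network.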
The class of a tibble always looks like what you have for your test data:
Classes ‘tbl_df’, ‘tbl’ and 'data.frame'
By contrast, the class of a plain data frame always returns:
data.frame
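The class difference described above can be checked directly. The following is a minimal sketch: the two columns are illustrative stand-ins for the real test set, and the fallback branch only mimics a tibble's class vector so the example still runs when the tibble package is not installed.

```r
# Stand-in for the real test set; these two columns are illustrative only.
df <- data.frame(annual_inc = c(58000, 175000), dti = c(28.7, 14.1))

# Use a real tibble if the tibble package is available; otherwise mimic its
# class vector (an assumption made only so this runs with base R alone).
tb <- if (requireNamespace("tibble", quietly = TRUE)) {
  tibble::as_tibble(df)
} else {
  structure(df, class = c("tbl_df", "tbl", "data.frame"))
}

class(tb)                 # "tbl_df" "tbl" "data.frame"
class(as.data.frame(tb))  # "data.frame"
```

Running class(test) on your own data before calling compute() shows which of the two cases you are in.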
So the solution is simple: just convert the tibble to a data frame before passing it:
results <- neuralnet::compute(fit, as.data.frame(test))
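If the error persists after the conversion, it may also be worth confirming that the object passed to compute() holds exactly the covariates from the formula, in the same order and without the response column. The sketch below is one defensive way to do that. prepare_covariates is a hypothetical helper, not part of neuralnet, and the toy formula and data are assumptions that stand in for form and test from the question.

```r
# Hypothetical helper (not part of neuralnet): returns a plain data frame
# holding only the named covariates, in the given order.
prepare_covariates <- function(newdata, covariate_names) {
  newdata <- as.data.frame(newdata)
  missing_cols <- setdiff(covariate_names, names(newdata))
  if (length(missing_cols) > 0) {
    stop("Missing covariates: ", paste(missing_cols, collapse = ", "))
  }
  newdata[, covariate_names, drop = FALSE]
}

# Toy formula and data standing in for `form` and `test` from the question.
toy_form <- loan_status_fixed ~ annual_inc + `emp_length1 year`
toy_test <- data.frame(loan_status_fixed = c(0L, 1L),
                       `emp_length1 year` = c(0L, 1L),
                       annual_inc = c(58000, 64000),
                       check.names = FALSE)

# all.vars() returns the names without backticks; the first is the response.
covs <- all.vars(toy_form)[-1]
x <- prepare_covariates(toy_test, covs)

# With the objects from the question, the call would then look like:
# results <- neuralnet::compute(fit, prepare_covariates(test, all.vars(form)[-1]))
```

Selecting the columns by name rather than by position keeps the input aligned with the formula even when train and test were assembled in a different column order.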