Variable importance in Random Forest returns only 1 value

Time: 2018-03-13 02:48:56

Tags: python jupyter-notebook random-forest data-analysis

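For my project I am using a random forest classifier to predict the career paths of graduate students. I want to print the feature importances, but I get a list with a single value even though I have 19 variables. How can I get the variable importances from the RF? The output I get is [1.]. My code is: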

import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Load the CSVs with pandas (these frames are not used below)
traindt = pd.read_csv("Trainingdata11.csv")
testdt = pd.read_csv("Testingdata.csv")
cols = ['age', 'gender', 'f_occ', 'siblings', 'upb_style', 'medium', 'percent_10th', 'percent_12th', 'CGPA', 'loan', 'type_ins', 'opentopath', 'tech_rate', 'apt_rate', 'comm_rate', 'cca_rate', 'projects', 'internships', 'papers']
fin = ['path']

# Columns 0-18 are read as the 19 features; column 19 is read as the target
trainArr = np.genfromtxt('Trainingdata11.csv', skip_header=1, delimiter=',', dtype=None, filling_values=0, usecols=tuple(range(19)))
trainRes = np.genfromtxt('Trainingdata11.csv', skip_header=1, delimiter=',', dtype=None, filling_values=0, usecols=19)
testArr = np.genfromtxt('Testingdata.csv', skip_header=1, delimiter=',', dtype=None, filling_values=0, usecols=tuple(range(19)))
ActRes = np.genfromtxt('Testingdata.csv', skip_header=1, delimiter=',', dtype=None, filling_values=0, usecols=19)

# Add a new axis at position 1 to every array
trainArr = trainArr[:, None]
testArr = testArr[:, None]
trainRes = trainRes[:, None]
ActRes = ActRes[:, None]

rf = RandomForestClassifier(bootstrap=True, oob_score=True, n_estimators=400,
    criterion='gini', max_features='log2', random_state=1)
rf.fit(trainArr, trainRes.ravel())
importances = rf.feature_importances_
print(importances)  # output: [1.]
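
A possible explanation, which cannot be confirmed without the CSV files: the forest may be seeing only one feature column. `np.genfromtxt` with `dtype=None` returns a 1-D structured array when the column types differ, and `[:, None]` then reshapes it to `(n, 1)`. A forest fitted on a single column can only report one importance. The sketch below uses synthetic data as a stand-in for `Trainingdata11.csv` (the random values, the sample count of 200, and `n_estimators=50` are assumptions for illustration). It shows that a 2-D feature matrix of shape `(n_samples, 19)` yields 19 importances, while a one-column matrix yields a single value.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

cols = ['age', 'gender', 'f_occ', 'siblings', 'upb_style', 'medium',
        'percent_10th', 'percent_12th', 'CGPA', 'loan', 'type_ins',
        'opentopath', 'tech_rate', 'apt_rate', 'comm_rate', 'cca_rate',
        'projects', 'internships', 'papers']

# Synthetic stand-in for the training CSV (assumption: the real data is
# numeric, with 19 feature columns and a 'path' label column).
rng = np.random.RandomState(1)
train = pd.DataFrame(rng.randint(0, 10, size=(200, len(cols))), columns=cols)
train['path'] = rng.randint(0, 3, size=200)

# Select features by name so X is 2-D with one column per feature.
X = train[cols].to_numpy()      # shape (200, 19)
y = train['path'].to_numpy()    # shape (200,); no [:, None] needed
print(X.shape, y.shape)

rf = RandomForestClassifier(n_estimators=50, criterion='gini',
                            max_features='log2', random_state=1)
rf.fit(X, y)

# Pair each importance with its column name and sort for readability.
importances = pd.Series(rf.feature_importances_, index=cols)
print(importances.sort_values(ascending=False))

# For comparison: a forest that sees a single column reports one importance.
rf_one = RandomForestClassifier(n_estimators=50, random_state=1)
rf_one.fit(X[:, :1], y)
print(rf_one.feature_importances_)
```

With the real files, `pd.read_csv("Trainingdata11.csv")[cols]` would take the place of the synthetic frame, and checking `X.shape` before calling `fit` would show whether all 19 columns reach the model.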

0 answers:

No answers