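Hello, I ran a random forest on a dataset imported as df. Now I want to export the results (0-1 predictions) and the predicted probabilities (a two-dimensional array) and match them with my dataset df. Is that possible? So far I have only worked out how to export them separately to CSV. And yes, I'm not a pandas expert yet. Any hints?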
# Import pandas and the `RandomForestClassifier`
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
# Create the target array and the features DataFrame:
target = df["target"].values
features = df[["var1", "var2", "var3", "var4", "var5"]]
features_forest = features
# Building and fitting my_forest
forest = RandomForestClassifier(max_depth = 10, min_samples_split=2, n_estimators = 200, random_state = 1)
my_forest = forest.fit(features_forest, target)
# Print the score of the fitted random forest
print(my_forest.score(features_forest, target))
print(my_forest.feature_importances_)
results = my_forest.predict(features)
print(results)
# fit() returns the estimator itself, so forest and my_forest are the same fitted model
predicted_probs = my_forest.predict_proba(features)
print(predicted_probs)
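id_test = df['ID_CONTACT']
# predicted_probs is 2-D (one column per class), so it cannot be a single DataFrame column; keep the class-1 probability here
# Raw strings (r'...') stop Python from treating backslashes such as \U in the Windows path as escape sequences
pd.DataFrame({"id": id_test, "relevance": results, "probs": predicted_probs[:, 1]}).to_csv(r'C:\Users\me\Desktop\python\data\submission.csv', index=False)
pd.DataFrame(predicted_probs).to_csv(r'C:\Users\me\Desktop\python\data\submission_2.csv', index=False)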
Answer 0 (score: 1)
You should be able to do this:
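df['results'] = results
# index=df.index keeps the probability rows aligned with df even when df does not have a default 0..n-1 index
df = pd.concat([df, pd.DataFrame(predicted_probs, columns=['Col_1', 'Col_2'], index=df.index)], axis=1)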
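For reference, here is a minimal end-to-end sketch of the same idea on made-up data. The toy values, the non-default index, the smaller `n_estimators`, and the `prob_0` / `prob_1` column names are assumptions for illustration only; the column names `var1`..`var5`, `target` and `ID_CONTACT` follow the question. It attaches the predictions and both probability columns to a copy of `df` and exports everything as one CSV.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for the asker's df (values are made up for illustration).
n = 20
rng = np.random.default_rng(0)
feature_cols = ["var1", "var2", "var3", "var4", "var5"]
df = pd.DataFrame(rng.normal(size=(n, 5)), columns=feature_cols)
df["target"] = np.arange(n) % 2              # both classes are present
df["ID_CONTACT"] = np.arange(1000, 1000 + n)
df.index = np.arange(n) * 10                 # simulate a non-default index

# Fit the forest and get both kinds of output.
forest = RandomForestClassifier(max_depth=10, n_estimators=20, random_state=1)
forest.fit(df[feature_cols], df["target"])
results = forest.predict(df[feature_cols])                # shape (n,)
predicted_probs = forest.predict_proba(df[feature_cols])  # shape (n, 2)

# One probability column per class, in the order given by forest.classes_.
prob_cols = [f"prob_{c}" for c in forest.classes_]
probs_df = pd.DataFrame(predicted_probs, columns=prob_cols, index=df.index)

# Attach everything to a copy of df; the shared index keeps rows aligned.
out = df.copy()
out["results"] = results
out = pd.concat([out, probs_df], axis=1)

# With no path, to_csv returns the CSV text; pass a file path to write a file.
csv_text = out.to_csv(index=False)
```

Building `probs_df` with `index=df.index` is the part that matters: without it, `pd.concat(..., axis=1)` aligns on index labels, and a filtered or re-indexed `df` would end up with extra rows full of NaN.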